From the November 2015 issue of IEEE Transactions on Nanotechnology
An Energy-Efficient Nonvolatile In-Memory Computing Architecture for Extreme Learning Machine by Domain-Wall Nanowire Devices
by Yuhao Wang; Hao Yu ; Leibin Ni ; Guang-Bin Huang ; Mei Yan ; Chuliang Weng ; Wei Yang ; Junfeng Zhao T-NANO, Vol. 14, Issue 6, pp. 998 – 1012, November 2015.
Abstract: The data-oriented applications have introduced increased demands on memory capacity and bandwidth, which raises the need to rethink the architecture of the current computing platforms. The logic-in-memory architecture is highly promising as future logic-memory integration paradigm for high throughput data-driven applications. From memory technology aspect, as one recently introduced nonvolatile memory device, domain-wall nanowire (or race-track) not only shows potential as future power efficient memory, but also computing capacity by its unique physics of spintronics. This paper explores a novel distributed in-memory computing architecture where most logic functions are executed within the memory, which significantly alleviates the bandwidth congestion issue and improves the energy efficiency. The proposed distributed in-memory computing architecture is purely built by domain-wall nanowire, i.e., both memory and logic are implemented by domain-wall nanowire devices. As a case study, neural network-based image resolution enhancement algorithm, called DW-NN, is examined within the proposed architecture. We show that all operations involved in machine learning on neural network can be mapped to a logic-in-memory architecture by nonvolatile domain-wall nanowire. Domain-wall nanowire-based logic is customized for in machine learning within image data storage. As such, both neural network training and processing can be performed locally within the memory. The experimental results show that the domain-wall memory can reduce 92% leakage power and 16% dynamic power compared to main memory implemented by DRAM; and domain-wall logic can reduce 31% both dynamic and 65% leakage power under the similar performance compared to CMOS transistor-based logic. And system throughput in DW-NN is improved by 11.6x and the energy efficiency is improved by 56x when compared to conventional image processing system.