TY - GEN
T1 - A Charge Domain P-8T SRAM Compute-In-Memory with Low-Cost DAC/ADC Operation for 4-bit Input Processing
AU - Kim, Joonhyung
AU - Lee, Kyeongho
AU - Park, Jongsun
N1 - Funding Information:
This research was supported by the National Research Foundation of Korea grant funded by the Korea government (No. NRF-2020R1A2C3014820). The EDA tool was supported by the IC Design Education Center (IDEC), Korea.
Publisher Copyright:
© 2022 Copyright held by the owner/author(s).
PY - 2022/8/2
Y1 - 2022/8/2
N2 - This paper presents a low-cost PMOS-based 8T (P-8T) SRAM Compute-In-Memory (CIM) architecture that efficiently performs the multiply-accumulate (MAC) operations between 4-bit input activations and 8-bit weights. First, a bit-line (BL) charge-sharing technique is employed to design the low-cost and reliable digital-to-analog conversion of 4-bit input activations in the proposed SRAM CIM, where the charge-domain analog computing provides variation-tolerant and linear MAC outputs. The 16 local arrays are also effectively exploited to implement the analog multiplication unit (AMU) that simultaneously produces 16 multiplication results between 4-bit input activations and 1-bit weights. To reduce the hardware cost of the analog-to-digital converter (ADC) without sacrificing DNN accuracy, hardware-aware system simulations are performed to decide the ADC bit-resolutions and the number of activated rows in the proposed CIM macro. In addition, for the ADC operation, the AMU-based reference columns are utilized for generating ADC reference voltages, with which a low-cost 4-bit coarse-fine flash ADC has been designed. The 256×80 P-8T SRAM CIM macro implementation using a 28nm CMOS process shows that the proposed CIM achieves accuracies of 91.46% and 66.67% with the CIFAR-10 and CIFAR-100 datasets, respectively, with an energy efficiency of 50.07 TOPS/W.
AB - This paper presents a low-cost PMOS-based 8T (P-8T) SRAM Compute-In-Memory (CIM) architecture that efficiently performs the multiply-accumulate (MAC) operations between 4-bit input activations and 8-bit weights. First, a bit-line (BL) charge-sharing technique is employed to design the low-cost and reliable digital-to-analog conversion of 4-bit input activations in the proposed SRAM CIM, where the charge-domain analog computing provides variation-tolerant and linear MAC outputs. The 16 local arrays are also effectively exploited to implement the analog multiplication unit (AMU) that simultaneously produces 16 multiplication results between 4-bit input activations and 1-bit weights. To reduce the hardware cost of the analog-to-digital converter (ADC) without sacrificing DNN accuracy, hardware-aware system simulations are performed to decide the ADC bit-resolutions and the number of activated rows in the proposed CIM macro. In addition, for the ADC operation, the AMU-based reference columns are utilized for generating ADC reference voltages, with which a low-cost 4-bit coarse-fine flash ADC has been designed. The 256×80 P-8T SRAM CIM macro implementation using a 28nm CMOS process shows that the proposed CIM achieves accuracies of 91.46% and 66.67% with the CIFAR-10 and CIFAR-100 datasets, respectively, with an energy efficiency of 50.07 TOPS/W.
KW - BL Charge-sharing
KW - CIM
KW - Compute-In-Memory
KW - MAC operation
KW - SRAM
UR - http://www.scopus.com/inward/record.url?scp=85136259720&partnerID=8YFLogxK
U2 - 10.1145/3531437.3539718
DO - 10.1145/3531437.3539718
M3 - Conference contribution
AN - SCOPUS:85136259720
T3 - Proceedings of the International Symposium on Low Power Electronics and Design
BT - 2022 ACM/IEEE International Symposium on Low Power Electronics and Design, ISLPED 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 ACM/IEEE International Symposium on Low Power Electronics and Design, ISLPED 2022
Y2 - 1 August 2022 through 2 August 2022
ER -