Design of deep convolutional neural network accelerator based on low-cost FPGA
Affiliation: School of Microelectronics, Hefei University of Technology, Hefei 230601, China

CLC Number: TN46

Abstract:

Existing deep convolutional neural networks (DCNNs) generate large amounts of inter-layer feature data during inference, so maintaining real-time processing on embedded systems requires substantial on-chip storage to cache inter-layer feature maps. This paper proposes an inter-layer feature compression technique that significantly reduces off-chip memory access bandwidth. In addition, a generic convolution computation scheme tailored to the BRAM resources of FPGAs is proposed, with circuit-level optimizations that reduce memory accesses and improve DSP computational efficiency, thereby greatly increasing computation speed. Compared with running MobileNetV2 on a CPU, the proposed DCNN accelerator achieves a 6.3× performance improvement; compared with two other DCNN accelerators of the same type, it achieves DSP performance efficiency improvements of 17% and 156%, respectively.
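The abstract does not detail the paper's compression scheme. As an illustration only, the sketch below assumes simple zero-run-length encoding of quantized int8 feature maps (post-ReLU activations are typically sparse), which is one common way inter-layer compression can cut off-chip traffic; the function name `rle_encode_fmap` and the byte-pair stream format are hypothetical, not taken from the paper.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative zero-run-length encoder for quantized int8 feature maps.
 * Output stream (assumed format): pairs of bytes, each pair holding the
 * number of zeros preceding a value (saturating at 255) followed by the
 * value itself. A matching decoder that knows the feature-map length can
 * reconstruct the original data exactly. */
size_t rle_encode_fmap(const int8_t *fmap, size_t len, uint8_t *out)
{
    size_t o = 0;
    uint8_t zeros = 0;
    for (size_t i = 0; i < len; i++) {
        if (fmap[i] == 0 && zeros < 255) {
            zeros++;                    /* extend the current zero run */
        } else {
            out[o++] = zeros;           /* zero-run length before this value */
            out[o++] = (uint8_t)fmap[i];
            zeros = 0;
        }
    }
    if (zeros) {                        /* flush a trailing all-zero run */
        out[o++] = (uint8_t)(zeros - 1);
        out[o++] = 0;                   /* run of (zeros-1) plus this zero value */
    }
    return o;                           /* compressed size in bytes */
}
```

In a design like the one described, such an encoder would typically sit between the convolution engine's output and the DDR write channel, with the matching decoder on the read path, so that only the compressed stream crosses the off-chip memory interface.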

History
  • Online: September 12, 2024