In recent years, the rapid development of deep convolutional neural networks has made deep learning inference increasingly compute-intensive. Because of their limited resources, most current edge devices cannot run deep learning applications with low latency, low power consumption, and high accuracy. Model compression and acceleration of deep networks are therefore an effective solution, and network pruning, which simplifies a model by removing redundant parameters for the inference stage, has become a hot research topic in this field in recent years. This paper divides the work into six aspects for detailed analysis, surveys the latest progress in deep neural network pruning from the perspectives of pruning granularity and weight importance criteria, and finally points out the problems in current research and discusses future research directions in the field of pruning.
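To make the weight importance criterion concrete, the following is a minimal sketch (not from the surveyed work; the layer shape, sparsity level, and function name are hypothetical) of unstructured magnitude pruning in NumPy: weights whose absolute value falls below a quantile threshold are zeroed out via a binary mask, which is the simplest form of the redundancy removal the abstract describes.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a binary mask that zeroes out the smallest-magnitude weights.

    sparsity: fraction of weights to remove (e.g. 0.9 prunes 90%).
    """
    # Threshold at the `sparsity`-quantile of the absolute weight values.
    threshold = np.quantile(np.abs(weights), sparsity)
    return (np.abs(weights) >= threshold).astype(weights.dtype)

# Hypothetical fully connected layer: 256 inputs, 128 outputs.
rng = np.random.default_rng(0)
w = rng.standard_normal((128, 256))

mask = magnitude_prune(w, sparsity=0.9)  # remove 90% of the weights
w_pruned = w * mask                      # apply the mask for inference
print(f"kept {mask.mean():.1%} of weights")
```

Coarser granularities follow the same pattern: instead of masking individual weights, one scores and removes whole filters or channels (e.g., by the L1 norm of each filter), trading some accuracy flexibility for structured sparsity that standard hardware can exploit directly.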