Compression of Deep Learning Models via ADMM-Based Global Weight Pruning
Deep learning has recently shown excellent performance, but it requires a large amount of computation and memory. Model compression is therefore valuable: it reduces memory use and storage size while maintaining model performance. Weight pruning, one family of compression methods, reduces the number of edges in the network by removing weights deemed unnecessary for the computation. Existing ADMM-based weight pruning methods formulate an optimization problem by adding a pre-defined removal-ratio constraint for each layer. ADMM decomposes this problem into two subproblems, which are solved by gradient descent and by projection, respectively. However, the layer-by-layer removal ratios must be specified according to the model structure, which sharply increases training time because of the large number of parameters and makes these methods hardly feasible for the large models that actually need weight pruning. To address these shortcomings, the proposed method sets a single global removal ratio for the entire model, requires no prior knowledge of the model's structural characteristics, and achieves similar performance. To avoid performance degradation, the method prunes relatively few weights in the earlier layers, which are responsible for feature extraction. Experiments show that the method attains high-quality performance without layer-by-layer removal ratios. Further experiments with additional layers offer insight into feature extraction in pruned layers. Applied to the LeNet-5 model on MNIST data, the proposed method reaches a compression ratio of 99.3%, higher than those of existing algorithms. We also demonstrate the effectiveness of the proposed method on YOLOv4, an object detection model that requires substantial computation.
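To make the idea of a global removal ratio concrete, the following minimal sketch shows a magnitude-based projection applied to all layers at once, which is the kind of projection step an ADMM pruning scheme alternates with gradient descent. It is an illustration under stated assumptions, not the authors' implementation: the function names `global_prune` and `layer_sparsity` and the toy weight values are hypothetical, and layers are represented as flat Python lists.

```python
def global_prune(layers, removal_ratio):
    """Zero the smallest-magnitude weights across ALL layers at once.

    Hypothetical helper: `layers` is a list of flat weight lists and
    `removal_ratio` is the fraction of all weights to remove. Keeping
    the largest-magnitude entries is the Euclidean projection onto the
    set of weight vectors with a fixed number of nonzeros.
    """
    # Rank every weight in the model by magnitude, remembering its position.
    indexed = [
        (abs(w), li, wi)
        for li, layer in enumerate(layers)
        for wi, w in enumerate(layer)
    ]
    indexed.sort()
    n_remove = int(len(indexed) * removal_ratio)

    # Copy the layers so the input is left unchanged, then zero the losers.
    pruned = [list(layer) for layer in layers]
    for _, li, wi in indexed[:n_remove]:
        pruned[li][wi] = 0.0
    return pruned


def layer_sparsity(layers):
    """Fraction of zeroed weights in each layer."""
    return [sum(1 for w in layer if w == 0.0) / len(layer) for layer in layers]


# Toy values (assumed): an "early" layer with large weights and a
# "late" layer with mostly small weights.
early = [0.9, -0.8, 0.7, -0.6]
late = [0.05, -0.02, 0.5, 0.01, -0.03, 0.04, 0.02, -0.01]

pruned = global_prune([early, late], removal_ratio=0.5)
print(layer_sparsity(pruned))
```

In this toy setting a single global ratio yields different per-layer sparsities without any per-layer ratio being specified. Whether early layers are actually pruned less in a real network depends on the trained weight magnitudes.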