DenseNet-DC: Optimizing DenseNet Parameters Through Feature Map Generation Control
Keywords: CNN, DenseNet, Optimization
Abstract: Convolutional neural networks still demand substantial computational power, which often restricts their use on various platforms. We therefore propose a new optimization method for DenseNet, a convolutional neural network characterized by dense connectivity, in which each layer is connected to all subsequent layers. The method controls the generation of feature maps according to each layer's position in the network, aiming to reduce the network's size with minimal loss of accuracy. This control reduces the number of feature maps through a new parameter called the Decrease Control (dc) value, with the reduction applied from the midpoint of the layers onward. To validate the behavior of the proposed model, experiments were performed on several image datasets: MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100, CALTECH-101, Cats vs Dogs, and Tiny ImageNet. Among the results: on MNIST and Fashion-MNIST the method reduced parameters by 43%; on CIFAR-10 it achieved a 44% reduction, and on CIFAR-100 a 43% reduction; on CALTECH-101 the parameter reduction was 35%, on Cats vs Dogs 30%, and on Tiny ImageNet 31%.
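To make the Decrease Control idea concrete, the following is a minimal sketch of how the per-layer feature-map budget could be computed. The function name, the subtraction rule (growth rate minus dc), and the exact midpoint split are assumptions for illustration; the abstract states only that the reduction starts from half of the layers.

```python
def feature_maps_per_layer(num_layers, growth_rate, dc):
    """Hypothetical sketch of Decrease Control (dc).

    In the first half of the layers, each layer produces `growth_rate`
    new feature maps (as in standard DenseNet). From the midpoint
    onward, the number of new feature maps is reduced by `dc`
    (floored at 1), shrinking the parameter count of later layers.
    """
    maps = []
    for layer in range(num_layers):
        if layer < num_layers // 2:
            maps.append(growth_rate)          # standard growth rate
        else:
            maps.append(max(1, growth_rate - dc))  # reduced growth rate
    return maps

# Example: 8 layers, growth rate 12, dc = 4
print(feature_maps_per_layer(8, 12, 4))
```

Under these assumptions, the first four layers each add 12 feature maps and the last four add 8, which is where the overall parameter savings would come from.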