Theory of Deep Learning
Abstract

Deep learning has been widely applied and has brought breakthroughs in speech recognition, computer vision, and many other practical domains. The deep neural network architectures and computational issues involved have been studied extensively in the machine learning community, but a theoretical foundation for understanding the modelling, approximation, and generalization abilities of deep learning models and algorithms is still lacking. We are interested in deep convolutional neural networks (CNNs), which are powerful in practice for processing natural images and speech. Their convolutional architectures and structures distinguish deep CNNs in essential ways from fully-connected deep neural networks, so the classical theory for fully-connected networks developed around 30 years ago does not apply. This talk describes a theory of deep CNNs associated with the rectified linear unit (ReLU) activation function. In particular, we give the first proof of the universality of deep CNNs, meaning that a deep CNN can approximate any continuous function to arbitrary accuracy when the depth of the network is large enough. We also give explicit rates of approximation and show that the approximation ability of deep CNNs is at least as good as that of fully-connected multi-layer neural networks. Our quantitative estimates, stated tightly in terms of the number of free parameters to be trained, verify the efficiency of deep learning algorithms in dealing with high-dimensional data.
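
To make the architecture under discussion concrete, the following is a minimal illustrative sketch (not the speaker's construction) of a deep CNN with ReLU activations acting on a one-dimensional input vector. The filter lengths, depth, random parameters, and final linear readout are placeholder choices for demonstration only; the point is that each convolutional layer carries only a small number of free parameters, in contrast to a fully-connected layer whose weight matrix grows with the layer width.

```python
# Illustrative sketch of a deep 1-D CNN with ReLU activations.
# All sizes and parameters below are hypothetical choices for demonstration.
import numpy as np

def relu(x):
    # Rectified linear unit, applied componentwise.
    return np.maximum(x, 0.0)

def conv1d_full(x, w):
    # "Full" 1-D convolution: an input of length d and a filter of length s
    # produce an output of length d + s - 1, so the width grows with depth.
    return np.convolve(x, w, mode="full")

def deep_cnn(x, filters, biases):
    # Apply a stack of convolutional layers, each followed by ReLU.
    h = x
    for w, b in zip(filters, biases):
        h = relu(conv1d_full(h, w) + b)
    return h

rng = np.random.default_rng(0)
d, depth, s = 8, 5, 3                      # input dimension, number of layers, filter length
x = rng.standard_normal(d)                 # a toy input vector

filters, biases, width = [], [], d
for _ in range(depth):
    filters.append(rng.standard_normal(s))        # only s filter weights per layer
    width += s - 1                                 # output width after this layer
    biases.append(rng.standard_normal(width))      # one bias per output component

features = deep_cnn(x, filters, biases)
# A final linear functional on the last layer's output gives a scalar prediction.
c = rng.standard_normal(features.size)
print("final layer width:", features.size, "; prediction:", float(c @ features))
```

In this sketch each layer contributes only s filter weights plus its biases, so the total number of trainable parameters grows linearly with the depth; this is the kind of parameter count to which the quantitative approximation estimates in the abstract refer.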

Speaker: Professor Ding-Xuan ZHOU 
Date: 3 June 2020 (Wed)
Time: 11:00am - 12:00pm

Biography

Professor Ding-Xuan ZHOU received his B.Sc. and Ph.D. degrees in applied mathematics in 1988 and 1991, respectively, from Zhejiang University. He is currently a Chair Professor in the School of Data Science and the Department of Mathematics at CityU, where he also serves as Associate Dean of the School of Data Science and Director of the Liu Bie Ju Centre for Mathematical Sciences. He received a Humboldt Research Fellowship in 1993 and a Fund for Distinguished Young Scholars from the NSFC in 2005, and was rated a Highly Cited Researcher by Thomson Reuters/Clarivate Analytics in 2014-2017. His research interests include deep learning, machine learning theory, wavelet analysis, and the approximation theory of deep neural networks. He serves on the editorial boards of more than ten international journals, such as Applied and Computational Harmonic Analysis, Journal of Approximation Theory, Journal of Complexity, Econometrics and Statistics, and Frontiers in Mathematics of Computation and Data Science. He is Editor-in-Chief of the journals "Analysis and Applications" and "Mathematical Foundations of Computing", and of the book series "Progress in Data Science".