Data science has become a fourth approach to scientific discovery, alongside experimentation, theory, and simulation. From a mathematical perspective, a fundamental problem in data science is to approximate an unknown target function from its data. In this talk, I will give an overview of some of the fundamental issues in data science.
Modern machine learning has had tremendous success in a wide range of applications. However, its theoretical understanding remains elusive. The first part of this talk will focus on recent theoretical progress on neural network-based machine learning. It is widely known that the deeper a neural network is, the harder it is to train. Although there are many empirical and heuristic explanations of this phenomenon, little is known about it theoretically. A rigorous answer will be given by showing that a deep ReLU network will eventually die in probability as the depth goes to infinity.
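The talk abstract does not spell out the construction, but the dying-ReLU phenomenon it refers to can be observed in a short Monte Carlo experiment. The sketch below (an illustrative assumption, not the talk's actual analysis; widths, depths, and He-style initialization are my choices) estimates the probability that a randomly initialized narrow ReLU network maps every sampled input to the zero vector, i.e. is "born dead", and shows this probability growing with depth:

```python
import numpy as np

def born_dead(depth, width=2, n_inputs=10, rng=None):
    """True if a randomly initialized ReLU net of the given depth and
    width maps every one of n_inputs random inputs to the zero vector."""
    if rng is None:
        rng = np.random.default_rng()
    x = rng.standard_normal((n_inputs, width))
    for _ in range(depth):
        # He-style initialization, zero bias (illustrative choice)
        W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
        x = np.maximum(x @ W, 0.0)  # ReLU layer
        if np.all(x == 0.0):        # every unit dead for every input
            return True
    return False

def dead_prob(depth, trials=500, seed=0):
    """Monte Carlo estimate of the probability of being born dead."""
    rng = np.random.default_rng(seed)
    return sum(born_dead(depth, rng=rng) for _ in range(trials)) / trials

# The estimated probability of dying increases with depth.
for d in (2, 10, 50):
    print(d, dead_prob(d))
```

For fixed (small) width, the estimated probability climbs toward one as the depth grows, consistent with the claim that a deep ReLU network dies in probability as the depth goes to infinity.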
The second part of this talk will be devoted to approximation under different data collection scenarios. Depending on how the data are collected, different approximation methods must be applied in order to utilize the data properly. Two scenarios will be discussed: one for big data, and the other for corrupted data.
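The abstract does not name the methods it will present; as a toy illustration of why corrupted data calls for a different approximation method, the following sketch (all details are my assumptions) fits a one-parameter linear model by ordinary least squares and by a robust L1 fit (via simple iteratively reweighted least squares) when a few samples carry large outliers:

```python
import numpy as np

# Toy data: y = 3x plus small noise, with a few corrupted samples.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 3.0 * x + 0.01 * rng.standard_normal(50)
y[::10] += 5.0  # corrupt every tenth sample with a large outlier

# Least squares (closed form): pulled away from the true slope by outliers.
a_ls = (x @ y) / (x @ x)

# L1 (least absolute deviations) fit via iteratively reweighted
# least squares: large residuals get small weights, so outliers
# have little influence on the fitted slope.
a = a_ls
for _ in range(50):
    w = 1.0 / np.maximum(np.abs(y - a * x), 1e-8)
    a = (w * x @ y) / (w * x @ x)
a_l1 = a

print(a_ls, a_l1)  # the L1 fit stays much closer to the true slope 3.0
```

The point of the toy example matches the abstract's claim: the same data, handled with a method that ignores how they were corrupted, yields a biased approximation, while a method adapted to the corruption recovers the target function well.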