Universal approximation, the question of whether a given set of functions can approximate an arbitrary function in a specific function space, has been actively studied in recent years owing to the rapid development of neural networks. Different network architectures impose different structural constraints, and the range of functions that can be approximated varies with the architecture. In this thesis, we establish universal approximation theorems for two deep learning architectures: convolutional neural networks and recurrent neural networks.
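For concreteness, universal approximation is commonly formalized as density of the network class in the target function space; the following is an illustrative sketch in assumed notation (a class $\mathcal{N}$ of networks, a compact domain $K$, and the uniform norm), not the exact formulation used in the thesis:
\[
\forall f \in C(K,\mathbb{R}^{m}),\ \forall \varepsilon > 0,\ \exists\, g \in \mathcal{N} \ \text{such that} \ \sup_{x \in K} \lVert f(x) - g(x) \rVert < \varepsilon .
\]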
First, we prove the universality of convolutional neural networks. A convolution with padding produces an output of the same shape as its input; it is therefore natural to ask whether a convolutional neural network composed of such convolutions can approximate functions that preserve the shape of their input. We show that convolutional neural networks can approximate arbitrary continuous functions whose input and output have the same shape. In addition, we present the minimum depth required for this approximation and prove that it is optimal. We also verify that sufficiently deep convolutional neural networks retain universality even when the number of channels is bounded.
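As an illustrative sketch of this shape-preserving setting (the spatial size $d_1 \times d_2$, channel counts $c_{\mathrm{in}}$ and $c_{\mathrm{out}}$, compact domain $K$, and error bound $\varepsilon$ are assumed notation for this summary, not the thesis's own), the target class and approximation guarantee can be written as
\[
f \in C\!\left(K,\ \mathbb{R}^{d_1 \times d_2 \times c_{\mathrm{out}}}\right),\quad K \subset \mathbb{R}^{d_1 \times d_2 \times c_{\mathrm{in}}},\qquad \sup_{x \in K}\ \bigl\lVert f(x) - N(x) \bigr\rVert < \varepsilon ,
\]
where $N$ is a network built from padded convolutions and pointwise activations, so every intermediate feature map keeps the spatial size $d_1 \times d_2$.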
Second, we investigate the universality of recurrent neural networks. A recurrent neural network is past-dependent, in the sense that its output at each time step depends only on the current and past inputs, so we study universality within the space of past-dependent functions. Specifically, we demonstrate that a multilayer recurrent neural network with a bounded number of channels can approximate arbitrary past-dependent continuous functions and past-dependent $L^p$ functions. We also extend these results to bidirectional recurrent neural networks, GRUs, and LSTMs.
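To make the notion of past dependence concrete (the sequence notation $x_{1:t}$ and the time-indexed output below are assumed for illustration), a sequence-to-sequence function $f$ is past-dependent if its output at time $t$ is determined by the inputs up to time $t$:
\[
x_{1:t} = x'_{1:t} \ \Longrightarrow\ f(x)_t = f(x')_t \qquad \text{for all } t ,
\]
and the approximation results then state that the recurrent network matches such an $f$ to within $\varepsilon$ in the uniform norm (for continuous targets) or in the $L^p$ norm.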