Deep learning


Since most artificial intelligence systems don't come close to the ability of the human brain to solve many problems related to vision, speech recognition and natural language understanding, a lot of research has tried to draw inspiration from the brain when designing machine learning solutions to such tasks. One obvious property of the brain is its deep, layered connectivity, which is particularly apparent in the visual cortex. Yet, until the mid-2000s, attempts to train artificial neural networks with several hidden layers had mostly failed: they generally did not perform better than shallow neural networks with a single hidden layer.

In 2006, Geoffrey Hinton, Simon Osindero and Yee-Whye Teh designed the deep belief network, a probabilistic neural network, along with an efficient greedy procedure for successfully pre-training (i.e. initializing) it. This procedure relies on the learning algorithm of the restricted Boltzmann machine (RBM) to train the hidden layers one at a time, in an unsupervised fashion.
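
To make the layer-wise idea concrete, here is a minimal NumPy sketch of a binary RBM trained with one-step contrastive divergence (CD-1), the standard learning algorithm used for this kind of unsupervised layer training; the class and variable names are purely illustrative and not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary restricted Boltzmann machine trained with CD-1 (illustrative sketch)."""
    def __init__(self, n_visible, n_hidden, lr=0.1, rng=None):
        self.rng = rng or np.random.default_rng(0)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def cd1_update(self, v0):
        # Positive phase: hidden activations given the data.
        h0 = self.hidden_probs(v0)
        # Negative phase: one step of Gibbs sampling from the data.
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h0_sample @ self.W.T + self.b)
        h1 = self.hidden_probs(v1)
        # Approximate gradient step on the log-likelihood.
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * (h0 - h1).mean(axis=0)
```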

Later, in Greedy Layer-Wise Training of Deep Networks, Yoshua Bengio, Pascal Lamblin, Dan Popovici and I generalized this procedure by showing that the learning algorithm of the autoencoder can also be used for greedy layer-wise pre-training. Along with similar results from Yann LeCun's group, this initial work culminated in the creation of a new research topic: deep learning.
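
The greedy stacking itself is the same whether each layer is an RBM or an autoencoder: train one layer on the current representation, then feed its hidden activations to the next layer as input. Below is a hedged sketch reusing the RBM class from above; the function name and hyperparameter values are made up for illustration.

```python
def greedy_pretrain(X, layer_sizes, n_epochs=10, batch_size=100):
    """Greedy layer-wise pre-training: each layer is trained, unsupervised,
    on the representation produced by the layers below it (sketch)."""
    layers, inputs = [], X
    for n_hidden in layer_sizes:
        layer = RBM(n_visible=inputs.shape[1], n_hidden=n_hidden)
        for _ in range(n_epochs):
            for start in range(0, len(inputs), batch_size):
                layer.cd1_update(inputs[start:start + batch_size])
        layers.append(layer)
        # The trained layer's hidden probabilities become the next layer's input.
        inputs = layer.hidden_probs(inputs)
    # The resulting weights can then initialize a deep network for supervised fine-tuning.
    return layers
```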

Ever since, I've been interested in further studying the original pre-training procedure and deriving new deep learning algorithms as well. I give here a quick overview of some of the work I've been doing.

In An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation, along with many colleagues from the LISA lab, I empirically compared the performance of deep neural networks (based on RBMs or autoencoders) on several image classification problems of varying complexity, showing that they generally outperform shallow models.

In Extracting and Composing Robust Features with Denoising Autoencoders, Pascal Vincent, Yoshua Bengio, Pierre-Antoine Manzagol and I designed the denoising autoencoder, which outperforms both the regular autoencoder and the RBM as a pre-training module.
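
As a rough illustration of the idea, the sketch below trains a single denoising autoencoder layer with masking noise, tied weights and a cross-entropy reconstruction loss measured against the uncorrupted input. It reuses the sigmoid helper from the RBM sketch above, and all names and hyperparameter values are illustrative assumptions rather than the paper's exact setup.

```python
class DenoisingAutoencoder:
    """Denoising autoencoder with masking noise and tied weights (illustrative sketch)."""
    def __init__(self, n_visible, n_hidden, corruption=0.25, lr=0.1, rng=None):
        self.rng = rng or np.random.default_rng(0)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_hidden)    # encoder bias
        self.c = np.zeros(n_visible)   # decoder bias
        self.corruption = corruption
        self.lr = lr

    def encode(self, x):
        return sigmoid(x @ self.W + self.b)

    def step(self, x):
        # Corrupt the input by zeroing a random fraction of its components.
        mask = (self.rng.random(x.shape) > self.corruption).astype(float)
        x_tilde = x * mask
        # Encode the corrupted input and reconstruct.
        h = self.encode(x_tilde)
        x_hat = sigmoid(h @ self.W.T + self.c)
        # Backpropagate the cross-entropy loss measured against the *clean* input.
        d_out = x_hat - x                               # batch x visible
        d_hid = (d_out @ self.W) * h * (1.0 - h)        # batch x hidden
        n = x.shape[0]
        self.W -= self.lr * (d_out.T @ h + x_tilde.T @ d_hid) / n
        self.c -= self.lr * d_out.mean(axis=0)
        self.b -= self.lr * d_hid.mean(axis=0)
```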

In Deep Learning using Robust Interdependent Codes (joint work with Pascal Vincent and Dumitru Erhan), I investigated a variant of the denoising autoencoder that allows for more complex interactions between hidden units within the same hidden layer.

Deep Boltzmann machines, developed by Ruslan Salakhutdinov and Geoffrey Hinton, are another interesting alternative to deep belief networks and artificial neural networks for deep learning. In Efficient Learning of Deep Boltzmann Machines, Ruslan Salakhutdinov and I proposed a more efficient learning algorithm for the deep Boltzmann machine, based on an improved approximate inference procedure.
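
For context, here is a sketch of the standard mean-field (variational) inference updates for a two-hidden-layer deep Boltzmann machine, in which each hidden layer's variational parameters are updated given the layers directly above and below. This is the baseline procedure that our paper builds on and improves, not the improved algorithm itself, and the function signature is illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field(v, W1, W2, b1, b2, n_iters=10):
    """Mean-field inference for a two-hidden-layer DBM (illustrative sketch):
    alternately update each layer's variational parameters, conditioning
    on the layers directly above and below."""
    mu1 = sigmoid(v @ W1 + b1)          # initialize with a bottom-up pass
    mu2 = sigmoid(mu1 @ W2 + b2)
    for _ in range(n_iters):
        mu1 = sigmoid(v @ W1 + mu2 @ W2.T + b1)
        mu2 = sigmoid(mu1 @ W2 + b2)
    return mu1, mu2
```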

References


  • Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion [pdf]
    Pascal Vincent, Hugo Larochelle, Isabelle Lajoie, Yoshua Bengio and Pierre-Antoine Manzagol,
    Journal of Machine Learning Research, 11(Dec): 3371–3408, 2010

  • Efficient Learning of Deep Boltzmann Machines [pdf][code]
    Ruslan Salakhutdinov and Hugo Larochelle,
    Artificial Intelligence and Statistics, 2010

  • Exploring Strategies for Training Deep Neural Networks [pdf]
    Hugo Larochelle, Yoshua Bengio, Jérôme Louradour and Pascal Lamblin,
    Journal of Machine Learning Research, 10(Jan): 1–40, 2009

  • Deep Learning using Robust Interdependent Codes [pdf]
    Hugo Larochelle, Dumitru Erhan and Pascal Vincent,
    Artificial Intelligence and Statistics, 2009

  • Extracting and Composing Robust Features with Denoising Autoencoders [pdf]
    Pascal Vincent, Hugo Larochelle, Yoshua Bengio and Pierre-Antoine Manzagol,
    International Conference on Machine Learning proceedings, 2008

  • An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation [pdf][html]
    Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra and Yoshua Bengio,
    International Conference on Machine Learning proceedings, 2007

  • Greedy Layer-Wise Training of Deep Networks [pdf]
    Yoshua Bengio, Pascal Lamblin, Dan Popovici and Hugo Larochelle,
    Advances in Neural Information Processing Systems 19, 2007