Classification Learners

The learners.classification module contains Learners meant for classification problems. They will normally require (at least) the 'targets' metadata. The MLProblems for these Learners should be iterators over pairs of inputs and targets, where the target is a class index.

The currently implemented algorithms are:

  • BayesClassifier: Bayes classifier obtained from distribution estimators.
  • LinearSVM: Linear SVM trained with the Pegasos algorithm.
  • NNet: Neural Network for classification.
  • ClusteredWeightsNNet: Neural Network with cluster-dependent weights, for classification.
  • ClassNADE: NADE model for classification.
class learners.classification.BayesClassifier(estimators=[])[source]

Bayes classifier from distribution estimators.

Given one distribution learner per class (option estimators), this learner will train each one on a separate class and classify examples using Bayes’ rule.

Required metadata:

  • 'targets'
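
A minimal usage sketch follows. The Gaussian estimator class and the trainset/testset MLProblems are assumptions for illustration; any learner from the distribution module with the usual train interface should work, and test() is assumed to return the outputs and costs described below:

    from learners.classification import BayesClassifier
    from learners.distribution import Gaussian  # hypothetical estimator class

    # One estimator per class: each is trained on the examples of one class,
    # and examples are classified using Bayes' rule.
    n_classes = len(trainset.metadata['targets'])
    bayes = BayesClassifier(estimators=[Gaussian() for _ in range(n_classes)])
    bayes.train(trainset)
    outputs, costs = bayes.test(testset)  # class_id and error cost per example
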
train(trainset)[source]

Trains each estimator. Each call to train increments self.stage by 1. If self.stage == 0, the model is first initialized.

use(dataset)[source]

Outputs the class_id chosen by the algorithm, for each example in the dataset.

test(dataset)[source]

Outputs the class_id chosen by the algorithm and the classification error cost for each example in the dataset.

class learners.classification.LinearSVM(l2_regularization=0.1, n_stages=-1, termination_threshold=1e-05, learning_rate=-1.0, decrease_constant=0.0)[source]

Linear SVM trained with the Pegasos algorithm.

This class implements Pegasos applied to the multi-class version of the linear SVM. It implements the online variant, i.e. the parameters are updated one example at a time.

Option l2_regularization is the slack parameter (weight decay) (default=0.1).

Option n_stages is the number of training iterations on the training set (default=-1). If < 1, training stops once convergence is detected, as determined by option termination_threshold (see below).

Option termination_threshold is the threshold used to detect convergence of training (default=0.00001). If the difference in the SVM objective between two successive iterations is smaller than this threshold, training is stopped. This option is ignored if n_stages is > 0.

Option learning_rate is the (starting) learning rate to use (default=-1.0). If < 0, the learning rate schedule prescribed by Pegasos is used.

Option decrease_constant is a constant controlling how fast the learning rate schedule decreases (default=0). The learning rate used for the t^th update is learning_rate/(1.+decrease_constant*t).
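
For concreteness, the documented schedule as a small Python sketch:

    # Learning rate used for the t-th update under the documented schedule.
    def pegasos_lr(learning_rate, decrease_constant, t):
        return learning_rate / (1.0 + decrease_constant * t)

    # With learning_rate=0.1 and decrease_constant=0.01:
    # t=0 -> 0.1, t=100 -> 0.05, t=900 -> 0.01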

Required metadata:

  • 'input_size': Size of the input.
  • 'targets': Set of possible targets.
References:
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
Shalev-Shwartz, Singer and Srebro

Multi-Class Pegasos on a Budget
Wang, Crammer and Vucetic
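
A minimal usage sketch, assuming an MLProblem trainset carrying the 'input_size' and 'targets' metadata (variable names are illustrative):

    from learners.classification import LinearSVM

    svm = LinearSVM(l2_regularization=0.1,
                    n_stages=-1,                 # train until convergence
                    termination_threshold=1e-05)
    svm.train(trainset)
    outputs, costs = svm.test(testset)  # class_id and error cost per example
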
train(trainset)[source]

Trains the linear SVM using Pegasos. If self.stage == 0, first initializes the model.

use(dataset)[source]

Outputs the class_id chosen by the algorithm, for each example in the dataset.

test(dataset)[source]

Outputs the class_id chosen by the algorithm and the classification error cost for each example in the dataset.

class learners.classification.NNet(n_stages, learning_rate=0.01, decrease_constant=0, hidden_sizes=[100], activation_function='sigmoid', seed=1234, pretrained_parameters=None)[source]

Neural Network for classification.

Option n_stages is the number of training iterations.

Options learning_rate and decrease_constant correspond to the learning rate and decrease constant used for stochastic gradient descent.

Option hidden_sizes should be a list of positive integers specifying the number of hidden units in each hidden layer, from the first to the last.

Option activation_function should be a string describing the hidden unit activation function to use. Choices are 'sigmoid' (default), 'tanh' and 'reclin'.

Option seed determines the seed for randomly initializing the weights.

Option pretrained_parameters should be a pair made of the list of hidden layer weights and biases, to replace random initialization. If None (default), random initialization will be used.

Required metadata:

  • 'input_size': Size of the input.
  • 'targets': Set of possible targets.
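
A minimal construction sketch, assuming NNet follows the same train/use interface as the other learners in this module (variable names are illustrative):

    from learners.classification import NNet

    nnet = NNet(n_stages=50,
                learning_rate=0.01,
                hidden_sizes=[200, 100],    # two hidden layers
                activation_function='tanh',
                seed=1234)
    nnet.train(trainset)
    predictions = nnet.use(testset)         # class_id for each example
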
class learners.classification.ClusteredWeightsNNet(n_stages=10, learning_rate=0.01, decrease_constant=0, hidden_size=10, seed=1234, n_clusters=10, n_k_means_stages=10, n_k_means=1, n_k_means_inputs=-1, autoencoder_regularization=0, autoencoder_missing_fraction=0.1, activation_function='reclin')[source]

Neural Network with cluster-dependent weights, for classification.

Option n_stages is the number of training iterations.

Options learning_rate and decrease_constant correspond to the learning rate and decrease constant used for stochastic gradient descent.

Option hidden_size is the hidden layer size for each set of clustered weights.

Option seed determines the seed for randomly initializing the weights.

Option n_clusters is the number of clusters to extract with k-means.

Option n_k_means_stages is the number of training iterations for k-means.

Option n_k_means is the number of k-means clusterings to produce.

Option n_k_means_inputs is the number of randomly selected inputs for each k-means clustering. If < 1, all inputs are used.

Option autoencoder_regularization is the weight of regularization, based on the denoising autoencoder objective (default = 0).

Option autoencoder_missing_fraction is the fraction of inputs to mask and set to 0, for the denoising autoencoder objective (default = 0.1).

Option activation_function should be a string describing the hidden unit activation function to use. Choices are 'sigmoid', 'tanh' and 'reclin' (default).

Required metadata:

  • 'input_size': Size of the input.
  • 'targets': Set of possible targets.
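
A construction sketch illustrating the clustering options; the values and the trainset variable are illustrative, and the usual train interface is assumed:

    from learners.classification import ClusteredWeightsNNet

    cwnnet = ClusteredWeightsNNet(n_stages=10,
                                  hidden_size=10,
                                  n_clusters=10,         # clusters per k-means
                                  n_k_means=4,           # ensemble of 4 clusterings
                                  n_k_means_inputs=-1,   # use all inputs
                                  autoencoder_regularization=0.01)
    cwnnet.train(trainset)
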
fprop(input)[source]

Computes the output given some input. Puts the result in self.output.

bprop(target)[source]

Computes the loss derivatives with respect to all parameters times the current learning rate. It assumes that self.fprop(input) was called first. All the derivatives are put in their corresponding object attributes (i.e. self.d*).

update()[source]

Updates the model’s parameters. Assumes that self.fprop(input) and self.bprop(target) were called first.

It also sets all gradient information to 0.
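
Together, these three methods implement one step of stochastic gradient descent. A sketch of the implied inner loop (the loop itself is an illustration, not the documented train code):

    for input, target in trainset:
        model.fprop(input)    # forward pass; result stored in model.output
        model.bprop(target)   # scaled derivatives stored in the model.d* attributes
        model.update()        # apply the update, then reset gradients to 0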

class learners.classification.ClassNADE(n_stages=1, learning_rate=0.01, decrease_constant=0, hidden_size=500, seed=1234, activation_function='sigmoid', input_order=None, untied_weights=True, alpha=1, learn_activation_weighting=False, recursive_activation=False)[source]

Neural Autoregressive Distribution Estimator (NADE) for classification.

Option n_stages is the number of training iterations.

Option learning_rate is the learning rate.

Option decrease_constant is the decrease constant.

Option untied_weights is whether to untie the weights going into and out of the hidden units.

Option hidden_size is the number of hidden units.

Option input_order is the list of integers corresponding to the order for input modeling. If None (default), then a different input order is used for every training update. At test time, the original input ordering is used.

Option seed is the seed for randomly initializing the weights.

Option activation_function is the hidden unit activation function. Choices are 'sigmoid' (default), 'tanh' and 'reclin'.

Option alpha is the weight vector for the per-input generative costs.

Option learn_activation_weighting is whether to learn a separate multiplicative weight of the hidden layer activation, for each conditional.

Option recursive_activation is whether the hidden units should have a self (recursive) connection.

Required metadata:

  • 'input_size': Size of the input.
  • 'targets': Set of possible targets.
Reference:
The Neural Autoregressive Distribution Estimator
Larochelle and Murray
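
A minimal construction sketch, assuming the train interface shared by the other classifiers in this module (variable names are illustrative):

    from learners.classification import ClassNADE

    nade = ClassNADE(n_stages=20,
                     hidden_size=500,
                     input_order=None,    # new random input order per update
                     untied_weights=True)
    nade.train(trainset)
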
recursive_sigmoid(input, output)[source]

Computes sigmoidal units with self-connections, parameterized by self.beta.

drecursive_sigmoid(output, doutput, dinput)[source]

Propagates derivative through sigmoidal units with self-connections and computes the derivatives with respect to the parameters self.beta (stored in self.dbeta).

recursive_tanh(input, output)[source]

Computes tanh units with self-connections, parameterized by self.beta.

drecursive_tanh(output, doutput, dinput)[source]

Propagates derivative through tanh units with self-connections and computes the derivatives with respect to the parameters self.beta (stored in self.dbeta).

recursive_reclin(input, output)[source]

Computes rectified linear units with self-connections, parameterized by self.beta.

drecursive_reclin(output, doutput, dinput)[source]

Propagates derivative through rectified linear units with self-connections and computes the derivatives with respect to the parameters self.beta (stored in self.dbeta).

use_learner(example)[source]

Returns the predicted class, the output class probability distribution and the reconstruction probabilities.
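
For a single example, the returned triple can be unpacked as follows (variable names are illustrative):

    pred_class, class_probs, recon_probs = nade.use_learner(example)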