Distribution Learners

The learners.distribution module contains Learners meant for density or distribution estimation problems. The MLProblems for these Learners should be iterators over inputs.
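
As a rough illustration of what "iterators over inputs" means, the sketch below uses a hypothetical stand-in class, ToyProblem, that is not part of the library. It only assumes that a problem yields one input per iteration and carries a metadata dictionary with the 'input_size' entry that the learners on this page require.

```python
class ToyProblem:
    """Hypothetical stand-in for an MLProblem: an iterable of inputs plus metadata."""

    def __init__(self, data, metadata):
        self.data = list(data)
        self.metadata = dict(metadata)

    def __iter__(self):
        # Each iteration yields one input (no target), which is what a
        # distribution estimation problem provides.
        return iter(self.data)

    def __len__(self):
        return len(self.data)


inputs = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
trainset = ToyProblem(inputs, {'input_size': 3})
```

Iterating over trainset then gives the three binary input vectors in turn.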

The currently implemented algorithms are:

  • BagDistribution: a distribution estimation learner where each example is a bag of inputs.
  • NADE: the Neural Autoregressive Distribution Estimator (NADE) for multivariate binary distribution estimation.
  • PoissonNADE: the Neural Autoregressive Distribution Estimator (NADE) for multivariate Poisson observations.
  • FVSBN: a fully visible Sigmoid Belief Network (FVSBN) for binary distribution estimation.
class learners.distribution.BagDistribution(estimator=None)[source]

A distribution estimation learner where each example is a bag of inputs.

This learner trains a user-supplied distribution learner on all inputs in all bags. The supplied learner is assumed to output its estimate of the log-distribution when use(...) is called.

train(trainset)[source]

Trains the estimator on all examples in all bags. Each call to train increments self.stage by 1.

use(dataset)[source]

Outputs, for each bag (example), the sum of the distribution learner's outputs over all inputs in that bag.

test(dataset)[source]

Outputs the negative log-likelihood (NLL) of each example, normalized by the size of the example’s bag.
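
The train/use/test behaviour described above can be sketched without the library. In this minimal example both class names are hypothetical: ToyBernoulli is an assumed estimator with independent Bernoulli dimensions, and BagWrapper mimics the documented bag semantics. It is an illustration under those assumptions, not the library's implementation.

```python
import math


class ToyBernoulli:
    """Hypothetical estimator: independent Bernoulli per input dimension."""

    def __init__(self, input_size):
        self.p = [0.5] * input_size

    def train(self, inputs):
        inputs = list(inputs)
        n = len(inputs)
        # Smoothed frequency of ones in each dimension (add-one smoothing).
        self.p = [(sum(x[d] for x in inputs) + 1.0) / (n + 2.0)
                  for d in range(len(self.p))]

    def use(self, inputs):
        # One log-probability per input.
        return [sum(math.log(p if v else 1.0 - p) for v, p in zip(x, self.p))
                for x in inputs]


class BagWrapper:
    """Sketch of the bag behaviour described above (not the library's code)."""

    def __init__(self, estimator):
        self.estimator = estimator
        self.stage = 0

    def train(self, bags):
        # Flatten the bags: the estimator sees all inputs in all bags.
        self.estimator.train([x for bag in bags for x in bag])
        self.stage += 1

    def use(self, bags):
        # Sum of the estimator's log-probabilities within each bag.
        return [sum(self.estimator.use(bag)) for bag in bags]

    def test(self, bags):
        # Negative log-likelihood per bag, divided by the bag size.
        return [-total / len(bag) for total, bag in zip(self.use(bags), bags)]


bags = [[[1, 0], [1, 1]], [[0, 0]]]
learner = BagWrapper(ToyBernoulli(input_size=2))
learner.train(bags)
log_probs = learner.use(bags)
nlls = learner.test(bags)
```

Here log_probs holds one summed log-probability per bag, and nlls holds the same quantities negated and divided by the bag sizes (2 and 1).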

class learners.distribution.NADE(n_stages=1, learning_rate=0.01, decrease_constant=0, hidden_size=500, seed=1234, input_order=None, untied_weights=True, alpha=1, learn_activation_weighting=False, recursive_activation=False)[source]

Neural Autoregressive Distribution Estimator (NADE) for multivariate binary distribution estimation

Option n_stages is the number of training iterations.

Option learning_rate is the learning rate.

Option decrease_constant is the decrease constant of the learning rate.

Option untied_weights is whether to untie the weights going into and out of the hidden units.

Option hidden_size is the number of hidden units.

Option input_order is the list of integers corresponding to the order for input modeling. If None (default), then a different input order is used for every training update. At test time, the original input ordering is used.

Option seed is the seed for randomly initializing the weights.

Option alpha is the weight vector for each input generative cost.

Option learn_activation_weighting is whether to learn a separate multiplicative weight of the hidden layer activation, for each conditional.

Option recursive_activation is whether the hidden units should have a self (recursive) connection.
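
The text does not spell out how decrease_constant enters the update. A common convention for stochastic gradient descent, assumed here rather than taken from the source, divides the base learning rate by 1 + decrease_constant * t, where t counts the updates done so far. The helper name decayed_learning_rate is hypothetical.

```python
def decayed_learning_rate(learning_rate, decrease_constant, n_updates):
    """Assumed schedule: learning_rate / (1 + decrease_constant * n_updates)."""
    return learning_rate / (1.0 + decrease_constant * n_updates)


# With decrease_constant=0 (the default listed in the signature) the rate
# stays constant under this assumed schedule.
constant = [decayed_learning_rate(0.01, 0.0, t) for t in range(3)]

# With a positive constant the rate shrinks as updates accumulate.
decaying = [decayed_learning_rate(0.01, 0.5, t) for t in range(3)]
```

Under this assumption, a decrease constant of 0.5 halves the rate after two updates.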

Required metadata:

  • 'input_size'
Reference:
The Neural Autoregressive Distribution Estimator
Larochelle and Murray
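
To make the roles of hidden_size, untied_weights and input_order more concrete, here is a minimal sketch of how a NADE assigns a log-probability to a binary vector, following the conditionals in the referenced paper: each input is predicted from hidden units that see only the inputs earlier in the ordering. The function name nade_log_prob and the parameter layout (untied matrices W and V, biases b and c) are assumptions of this sketch, with random rather than trained parameters. It does not reproduce the library's code or its training procedure.

```python
import itertools
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def nade_log_prob(x, W, V, b, c, input_order=None):
    """Log-probability of binary vector x under a NADE with fixed parameters.

    W[j][i]: weight from input i to hidden unit j.
    V[i][j]: weight from hidden unit j to the conditional of input i.
    b[i], c[j]: input and hidden biases.
    """
    order = list(input_order) if input_order is not None else list(range(len(x)))
    activation = list(c)  # hidden pre-activation before any input is seen
    log_prob = 0.0
    for i in order:
        h = [sigmoid(a) for a in activation]
        p_one = sigmoid(b[i] + sum(V[i][j] * h[j] for j in range(len(h))))
        log_prob += math.log(p_one if x[i] else 1.0 - p_one)
        # Add input i's contribution so later conditionals can depend on it.
        for j in range(len(activation)):
            activation[j] += W[j][i] * x[i]
    return log_prob


rng = random.Random(1234)
n_inputs, n_hidden = 4, 3
W = [[rng.gauss(0, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
V = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_inputs)]
b = [rng.gauss(0, 1) for _ in range(n_inputs)]
c = [rng.gauss(0, 1) for _ in range(n_hidden)]

# Summing the probabilities of all 2**4 binary vectors checks normalization.
total = sum(math.exp(nade_log_prob(x, W, V, b, c, input_order=[2, 0, 3, 1]))
            for x in itertools.product([0, 1], repeat=n_inputs))
```

Because the model is a product of conditionals, total should equal 1 up to rounding for any ordering. Updating the hidden pre-activation incrementally keeps the cost of one evaluation proportional to the number of inputs times the number of hidden units.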
recursive_sigmoid(input, output)[source]

Computes sigmoidal units with self-connections, parameterized by self.beta.

drecursive_sigmoid(output, doutput, dinput)[source]

Propagates derivative through sigmoidal units with self-connections and computes the derivatives with respect to the parameters self.beta (stored in self.dbeta).

class learners.distribution.FVSBN(n_stages, **kw)[source]

A fully visible Sigmoid Belief Network (FVSBN) for binary distribution estimation

Option n_stages is the number of training iterations.

Option learning_rate is the learning rate.

Option decrease_constant is the decrease constant of the learning rate.

Option input_order is the list of integers corresponding to the order for input modeling.

Option seed is the seed for randomly initializing the weights.

Required metadata:

  • 'input_size'
Reference:
Connectionist Learning of Belief Networks
Neal
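
As an illustration of the model class, the sketch below evaluates the log-probability of a binary vector under an FVSBN, where each conditional is a logistic regression on the inputs that come earlier in the ordering. The function name fvsbn_log_prob, the weight layout and the parameter values are assumptions made for this example and are not taken from the library.

```python
import itertools
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def fvsbn_log_prob(x, W, b, input_order=None):
    """Log-probability of binary vector x under an FVSBN.

    W[i][j] is the weight from input j to the conditional of input i; only
    weights from inputs earlier in the ordering are used.
    """
    order = list(input_order) if input_order is not None else list(range(len(x)))
    log_prob = 0.0
    seen = []  # indices of inputs already modeled
    for i in order:
        p_one = sigmoid(b[i] + sum(W[i][j] * x[j] for j in seen))
        log_prob += math.log(p_one if x[i] else 1.0 - p_one)
        seen.append(i)
    return log_prob


# Hypothetical parameters for three inputs.
W = [[0.0, 0.5, -1.0],
     [2.0, 0.0, 0.3],
     [-0.7, 1.2, 0.0]]
b = [0.1, -0.4, 0.0]

first = fvsbn_log_prob([1, 0, 1], W, b)
total = sum(math.exp(fvsbn_log_prob(x, W, b))
            for x in itertools.product([0, 1], repeat=3))
```

As with any autoregressive factorization, the probabilities of all eight binary vectors sum to 1 up to rounding.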
class learners.distribution.PoissonNADE(n_stages, hidden_size=100, learning_rate=0.001, seed=1234, fTarget=True, fPoisson=True)[source]

Neural autoregressive Poisson distribution estimator for topic modeling.

Option n_stages is the number of training iterations.

Option hidden_size should be a positive integer specifying the number of hidden units (features).

Option learning_rate is the learning rate (default=0.001).

Option seed determines the seed for randomly initializing the weights.

Option fTarget indicates whether the data have targets.

Option fPoisson selects the output distribution: Poisson if True, sigmoid if False.

Required metadata:

  • 'input_size': Vocabulary size
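
The source does not describe how PoissonNADE computes its conditional rates, so the sketch below only illustrates the Poisson likelihood that such a model would sum over vocabulary entries. The function names, the word counts and the rates are hypothetical. In an autoregressive model each rate would depend on the counts earlier in the ordering, whereas here the rates are simply given.

```python
import math


def poisson_log_pmf(count, rate):
    """log p(count | rate) = count * log(rate) - rate - log(count!)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def count_vector_log_prob(counts, rates):
    """Sum of Poisson log-probabilities, one rate per vocabulary entry."""
    return sum(poisson_log_pmf(k, lam) for k, lam in zip(counts, rates))


word_counts = [2, 0, 1]    # hypothetical document over a 3-word vocabulary
rates = [1.5, 0.2, 0.7]    # hypothetical conditional rates, all positive
log_prob = count_vector_log_prob(word_counts, rates)
```

Using math.lgamma(count + 1) for log(count!) avoids computing large factorials when word counts are high.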