The learners.distribution module contains Learners meant for density or distribution estimation problems. The MLProblems for these Learners should be iterators over inputs.
The currently implemented algorithms are:
A distribution estimation learner where each example is a bag of inputs.
This learner trains a user-supplied distribution learner on all inputs in all bags. The distribution learner is assumed to output its estimate of the log-distribution when use(...) is called.
Trains the estimator on all examples in all bags. Each call to train increments self.stage by 1.
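The bag-flattening behaviour described above can be illustrated with a minimal sketch. BaggedSketch and GaussianLearner are hypothetical names made up for this example, not the module's classes, and the per-bag output of use(...) is an assumption about how results might be grouped.

```python
import math


class GaussianLearner:
    """Hypothetical toy distribution learner: a 1-D Gaussian fit by maximum likelihood."""

    def __init__(self):
        self.mean, self.var = 0.0, 1.0

    def train(self, trainset):
        xs = list(trainset)
        self.mean = sum(xs) / len(xs)
        # Floor the variance so the log-density stays finite.
        self.var = max(sum((x - self.mean) ** 2 for x in xs) / len(xs), 1e-6)

    def use(self, dataset):
        # Outputs the log-density of each input, as the wrapper expects.
        return [-0.5 * math.log(2 * math.pi * self.var)
                - (x - self.mean) ** 2 / (2 * self.var) for x in dataset]


class BaggedSketch:
    """Illustrative wrapper: each example is a bag (an iterable) of inputs."""

    def __init__(self, estimator):
        self.estimator = estimator
        self.stage = 0

    def train(self, trainset):
        # Flatten all bags and train the wrapped learner on every input.
        flat = [x for bag in trainset for x in bag]
        self.estimator.train(flat)
        self.stage += 1  # each call to train increments the stage by 1

    def use(self, dataset):
        # Assumption: return one list of log-densities per bag.
        return [self.estimator.use(bag) for bag in dataset]


bags = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
learner = BaggedSketch(GaussianLearner())
learner.train(bags)
log_densities = learner.use(bags)
```

Any object exposing train(...) and a log-density use(...) could stand in for GaussianLearner here.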
Neural Autoregressive Distribution Estimator (NADE) for multivariate binary distribution estimation.
Option n_stages is the number of training iterations.
Option learning_rate is the learning rate.
Option decrease_constant is the decrease constant of the learning rate.
Option untied_weights is whether to untie the weights going into and out of the hidden units.
Option hidden_size is the number of hidden units.
Option input_order is the list of integers corresponding to the order for input modeling. If None (default), then a different input order is used for every training update. At test time, the original input ordering is used.
Option seed is the seed for randomly initializing the weights.
Option alpha is the vector of weights applied to the generative cost of each input.
Option learn_activation_weighting is whether to learn a separate multiplicative weight of the hidden layer activation, for each conditional.
Option recursive_activation is whether the hidden units should have a self (recursive) connection.
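To make the roles of hidden_size, untied_weights and input_order concrete, here is a rough NumPy sketch of how a NADE-style model scores a binary vector as a product of conditionals. The function nade_log_prob and the parameter names W, V, b and c are assumptions made for this illustration, not the module's API. The options alpha, learn_activation_weighting and recursive_activation are not modeled, and input_order=None is taken here to mean the original ordering, as at test time.

```python
import itertools

import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def nade_log_prob(x, W, V, b, c, input_order=None):
    """Log-probability of a binary vector x under a NADE-style model.

    W: (hidden_size, input_size) input-to-hidden weights
    V: (input_size, hidden_size) hidden-to-output weights
       (pass W.T for tied weights, a separate matrix for untied weights)
    b: (input_size,) output biases, c: (hidden_size,) hidden biases
    """
    x = np.asarray(x, dtype=float)
    order = range(len(x)) if input_order is None else input_order
    a = np.array(c, dtype=float)  # running hidden pre-activation
    log_p = 0.0
    for i in order:
        h = sigmoid(a)                  # hidden units see only earlier inputs
        p_i = sigmoid(b[i] + V[i] @ h)  # p(x_i = 1 | inputs earlier in the order)
        log_p += x[i] * np.log(p_i) + (1 - x[i]) * np.log(1 - p_i)
        a += W[:, i] * x[i]             # add input i to the hidden pre-activation
    return log_p


rng = np.random.RandomState(1234)
hidden_size, input_size = 3, 4
W = 0.1 * rng.randn(hidden_size, input_size)
b = np.zeros(input_size)
c = np.zeros(hidden_size)

# Tied weights: the probabilities of all 2**input_size vectors sum to 1.
total = sum(np.exp(nade_log_prob(x, W, W.T, b, c))
            for x in itertools.product([0, 1], repeat=input_size))
```

Updating the hidden pre-activation incrementally keeps the cost of one evaluation at O(input_size * hidden_size) instead of recomputing each hidden layer from scratch.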
Required metadata:
A Fully Visible Sigmoid Belief Network (FVSBN) for binary distribution estimation.
Option n_stages is the number of training iterations.
Option learning_rate is the learning rate.
Option decrease_constant is the decrease constant of the learning rate.
Option input_order is the list of integers corresponding to the order for input modeling.
Option seed is the seed for randomly initializing the weights.
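The following sketch shows one way the options above could fit together in a stochastic gradient training loop for an FVSBN, where each input is predicted by logistic regression on the inputs that precede it in the order. FVSBNSketch is a hypothetical class written for this example. The decay schedule learning_rate / (1 + decrease_constant * t), the natural ordering used when input_order is None, and the weight initialization are all assumptions, not behaviour confirmed by the text.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


class FVSBNSketch:
    """Illustrative FVSBN: p(x_i = 1 | x_<i) = sigmoid(b_i + sum_{j<i} W_ij x_j)."""

    def __init__(self, n_stages=10, learning_rate=0.1, decrease_constant=0.0,
                 input_order=None, seed=1234):
        self.n_stages = n_stages
        self.learning_rate = learning_rate
        self.decrease_constant = decrease_constant
        self.input_order = input_order
        self.seed = seed
        self.stage = 0

    def _initialize(self, input_size):
        rng = np.random.RandomState(self.seed)
        # Assumption: fall back to the natural ordering when none is given.
        self.order = (np.arange(input_size) if self.input_order is None
                      else np.asarray(self.input_order))
        # Strictly lower-triangular mask: input i depends only on inputs j < i.
        self.mask = np.tril(np.ones((input_size, input_size)), k=-1)
        self.W = 0.01 * rng.randn(input_size, input_size) * self.mask
        self.b = np.zeros(input_size)
        self.n_updates = 0

    def _conditionals(self, x):
        return sigmoid(self.b + self.W @ x)

    def train(self, trainset):
        data = [np.asarray(x, dtype=float) for x in trainset]
        if self.stage == 0:
            self._initialize(len(data[0]))
        while self.stage < self.n_stages:
            for x in data:
                x = x[self.order]
                # Assumed schedule: the step size shrinks as updates accumulate.
                lr = self.learning_rate / (1.0 + self.decrease_constant * self.n_updates)
                err = x - self._conditionals(x)  # log-likelihood gradient w.r.t. pre-activations
                self.W += lr * np.outer(err, x) * self.mask
                self.b += lr * err
                self.n_updates += 1
            self.stage += 1

    def use(self, dataset):
        # Returns the log-probability of each input.
        outputs = []
        for x in dataset:
            x = np.asarray(x, dtype=float)[self.order]
            p = self._conditionals(x)
            outputs.append(float(np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))))
        return outputs


data = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]]
model = FVSBNSketch(n_stages=50, learning_rate=0.1, decrease_constant=0.001)
model.train(data)
log_probs = model.use(data)
```

Because the mask is reapplied on every update, the weight matrix stays strictly lower triangular, which is what keeps the model a valid autoregressive factorization.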
Required metadata:
Neural autoregressive Poisson distribution estimator for topic modeling.
Option n_stages is the number of training iterations.
Option hidden_size should be a positive integer specifying the number of hidden units (features).
Option learning_rate is the learning rate (default=0.001).
Option seed determines the seed for randomly initializing the weights.
Option fTarget indicates whether the data have targets.
Option fPoisson selects the output distribution: Poisson if True, sigmoid if False.
Required metadata: