Orange Learners

The package learners.third_party.orange contains modules for learning algorithms implemented in the Orange library. These modules all require that the Orange library be installed.

To install Orange, download the packed sources from http://orange.biolab.si/nightly_builds.html, unzip them, and run the following in the unzipped directory:

python setup.py build
sudo python setup.py install

or see http://orange.biolab.si/nightly_builds.html for other ways to install Orange.

Currently, learners.third_party.orange contains the following module:

  • learners.third_party.orange.classification: Classifiers from the Orange library.

Orange Classifiers

The learners.third_party.orange.classification module contains classifiers from the Orange library:

  • RandomForest: Random forest classifier.
  • BoostedTrees: Ensemble of boosted trees (AdaBoost.M1).

It also contains one helper function:

  • make_orange_dataset: converts an MLProblem into a classification dataset in Orange format.

learners.third_party.orange.classification.make_orange_dataset(dataset, domain=None)

Converts an MLProblem into a classification dataset in the Orange format. The domain of the dataset can be specified; by default (None), the domain is computed from the metadata.

class learners.third_party.orange.classification.RandomForest(n_trees=50, n_features_per_node=None, seed=1234)

Random forest classifier based on the Orange library.

Option n_trees is the number of trees to train in the ensemble (default = 50).

Option n_features_per_node is the number of inputs (features) to consider when splitting a tree node. The default (None) is to use the square root of the input size.
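To make the default concrete, here is a minimal sketch of how a square-root default could be computed. The helper name default_features_per_node is hypothetical, and rounding to the nearest integer (with a floor of 1) is an assumption of this sketch; the text above only says that the square root of the input size is used.

```python
import math


def default_features_per_node(n_inputs):
    """Illustrative default: square root of the input size.

    Rounding to the nearest integer and never going below 1 are
    assumptions of this sketch, not documented behavior.
    """
    return max(1, int(round(math.sqrt(n_inputs))))


# Show the value the sketch would pick for a few input sizes.
for n_inputs in (4, 16, 100, 784):
    print(n_inputs, default_features_per_node(n_inputs))
```

With 784 inputs, for instance, this sketch would consider 28 features at each node.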

Option seed is the seed of the random number generator (default = 1234).

Required metadata:

  • 'targets'
  • 'class_to_id'
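
As a rough illustration of these two metadata entries, the sketch below builds them from a list of raw labels. It assumes that 'targets' is the set of possible class labels and that 'class_to_id' maps each label to an integer index; the helper build_class_metadata is hypothetical and not part of the package.

```python
def build_class_metadata(labels):
    """Builds a metadata dict with 'targets' and 'class_to_id' from raw labels.

    The container types (a set and a dict) are assumptions of this sketch.
    """
    targets = set(labels)
    # Sort so that the class -> id mapping is deterministic across runs.
    class_to_id = {label: idx for idx, label in enumerate(sorted(targets))}
    return {'targets': targets, 'class_to_id': class_to_id}


metadata = build_class_metadata(['spam', 'ham', 'spam', 'ham', 'ham'])
print(metadata['class_to_id'])
```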

train(trainset)

Trains a random forest using Orange.

use(dataset)

Outputs the class predictions for dataset.

test(dataset)

Outputs the result of use(dataset) and the classification error cost for each example in the dataset.
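
The per-example classification error can be pictured as a 0/1 cost: 1 when the predicted class differs from the target, 0 otherwise. The sketch below shows that computation on plain Python lists. The function classification_errors is hypothetical, and the actual return format of test (array types, shapes) is not specified here.

```python
def classification_errors(predictions, targets):
    """Returns a 0/1 cost per example: 1 if the prediction is wrong, else 0."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return [int(p != t) for p, t in zip(predictions, targets)]


predictions = [0, 2, 1, 1]
targets = [0, 1, 1, 2]
costs = classification_errors(predictions, targets)
# Averaging the per-example costs gives the overall error rate.
error_rate = sum(costs) / float(len(costs))
print(costs, error_rate)
```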

class learners.third_party.orange.classification.BoostedTrees(n_trees=50, max_majority=1.0, max_depth=2, min_leaf_size=0, skip_prob=0)

Ensemble of decision trees based on AdaBoost.M1.

Option n_trees is the number of trees to train in the ensemble (default = 50).

Option max_majority is the maximum proportion of the majority class in a node. Once this proportion is exceeded, the node is not split further (default = 1.0).

Option max_depth is the maximum depth of the trees (default = 2).

Option min_leaf_size is a minimum threshold on the number of training examples in a node, below which a node is not split (default = 0).

Option skip_prob is the probability of skipping an input when considering splits for a node (default = 0).
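
The tree-growing options above can be read as a stopping rule plus a random filter over inputs. The following is a minimal sketch of that reading, not Orange's implementation: the function names are hypothetical, the root is assumed to be at depth 0, and "exceeded" is interpreted as a strict comparison.

```python
import random


def should_stop_splitting(class_counts, depth, max_majority=1.0,
                          max_depth=2, min_leaf_size=0):
    """Returns True if a node should become a leaf under the options above."""
    n_examples = sum(class_counts)
    if n_examples == 0:
        return True
    # Assumption: the root has depth 0, so max_depth=2 allows two split levels.
    if depth >= max_depth:
        return True
    # Too few training examples in the node to split it.
    if n_examples < min_leaf_size:
        return True
    # Assumption: "exceeded" is a strict comparison.
    majority = max(class_counts) / float(n_examples)
    return majority > max_majority


def candidate_inputs(n_inputs, skip_prob=0.0, rng=None):
    """Keeps each input with probability 1 - skip_prob when searching splits."""
    rng = rng or random.Random(0)
    return [i for i in range(n_inputs) if rng.random() >= skip_prob]


print(should_stop_splitting([9, 1], depth=0, max_majority=0.8))
print(candidate_inputs(5))
```

With the default skip_prob of 0, every input remains a candidate at every node.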

Required metadata:

  • 'targets'
  • 'class_to_id'

train(trainset)

Trains an ensemble of trees with AdaBoost.M1.
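
For readers unfamiliar with AdaBoost.M1, the sketch below shows one reweighting round as the algorithm is usually described in textbooks: correctly classified examples are scaled down by beta = error / (1 - error), and the weak learner receives a vote of log(1 / beta). This is an illustration of the general algorithm, not of how Orange implements it. The function name adaboost_m1_round is hypothetical, and raising an exception when the weighted error is 0 or at least 0.5 is a simplification.

```python
import math


def adaboost_m1_round(weights, is_correct):
    """One AdaBoost.M1 reweighting step.

    weights: current example weights, assumed to sum to 1.
    is_correct: booleans, True where the weak learner was right.
    Returns (new_weights, vote), where vote = log(1 / beta).
    """
    # Weighted error: total weight of the misclassified examples.
    error = sum(w for w, ok in zip(weights, is_correct) if not ok)
    if error <= 0.0 or error >= 0.5:
        raise ValueError("this sketch needs 0 < weighted error < 0.5")
    beta = error / (1.0 - error)
    # Shrink the weight of correctly classified examples, then renormalize.
    scaled = [w * beta if ok else w for w, ok in zip(weights, is_correct)]
    total = sum(scaled)
    return [w / total for w in scaled], math.log(1.0 / beta)


weights = [0.25, 0.25, 0.25, 0.25]
new_weights, vote = adaboost_m1_round(weights, [True, True, True, False])
print(new_weights, vote)
```

In this example the single misclassified example ends up holding half of the total weight, so the next tree concentrates on it.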

use(dataset)

Outputs the class predictions for dataset.

test(dataset)

Outputs the result of use(dataset) and the classification error cost for each example in the dataset.