Classes with tests

The two shortcut functions uses classes that actually prepare samples of posteriors, compute probabilities and plot the distributions.

Let nbc and j48 contain average performances of two methods for a collection of data sets. With shortcut functions, we computed the signed-ranks test with

>>> two_on_multiple(nbc, j48, rope=1)
(0.23014, 0.00674, 0.76312)

This is equivalent to calling SignedRankTest.probs:

>>> SignedRankTest.probs(nbc, j48, rope=1)
(0.23014, 0.00674, 0.76312)

We may choose a different test, the SignTest:

>>> SignTest.probs(nbc, j48, rope=1)
(0.26344, 0.13722, 0.59934)

To plot the distribution, we call

>>> fig = SignedRankTest.plot(nbc, j48, rope=1)

Or we may prefer to see as a histogram:

>>> fig = SignedRankTest.plot_histogram(nbc, j48, names=("nbc", "j48"))
_images/signedrank-histogram.png

Using test classes instead of shortcut functions offers more control and insight in what the tests do.

Single data set

The test for comparing two classifiers on a single data set is implemented in class CorrelatedTTest.

The class uses a Bayesian interpretation of the t-test (A Bayesian approach for comparing cross-validated algorithms on multiple data sets, G. Corani and A. Benavoli, Mach Learning 2015).

The test assumes that the classifiers were evaluated using cross validation. The number of folds is determined from the length of the vector of differences, as len(diff) / runs. Computation includes a correction for underestimation of variance due to overlapping training sets (Inference for the Generalization Error, C. Nadeau and Y. Bengio, Mach Learning 2003).

Multiple data sets

The library has three tests for comparisons on multiple data sets: a sign test (SignTest), a signed-rank test (SignedRankTest) and a hierarchical t-test (HierarchicalTest).

All classes have a common interface but differ in the computation of the posterior distribution. Consequently, some tests have specific additional parameters.

Common methods

The common behaviour of all tests is defined in the class Test.

Note that all methods are class methods. Classes are used as nested namespace. As described in the next section, it is impossible to construct an instance of a Test or derived classes.

Tests