Package nltk :: Package metrics :: Module confusionmatrix :: Class ConfusionMatrix

type ConfusionMatrix

object --+
         |
        ConfusionMatrix

The confusion matrix between a list of reference values and a corresponding list of test values. Entry [r,t] of this matrix is a count of the number of times that the reference value r corresponds to the test value t. E.g.:

>>> from nltk.metrics.confusionmatrix import ConfusionMatrix
>>> ref  = 'DET NN VB DET JJ NN NN IN DET NN'.split()
>>> test = 'DET VB VB DET NN NN NN IN DET NN'.split()
>>> cm = ConfusionMatrix(ref, test)
>>> print(cm['NN', 'NN'])
3

Note that the diagonal entries (r = t) of this matrix correspond to correct values, and the off-diagonal entries correspond to incorrect values.
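
The counting rule can be reproduced with the standard library alone. The following is a minimal sketch, not NLTK's implementation: it tallies (reference, test) pairs with collections.Counter and sums the diagonal. The names pair_counts, correct, and total are illustrative only.

```python
from collections import Counter

ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
test = 'DET VB VB DET NN NN NN IN DET NN'.split()

# Count how often each (reference, test) pair occurs.
pair_counts = Counter(zip(ref, test))

# Diagonal entries: the reference and test values agree.
correct = sum(n for (r, t), n in pair_counts.items() if r == t)
total = sum(pair_counts.values())

print(pair_counts['NN', 'NN'])   # 3, the same count as cm['NN', 'NN']
print(pair_counts['NN', 'VB'])   # 1, an off-diagonal (incorrect) entry
print(correct, total)            # 8 10
```

Here 8 of the 10 pairs fall on the diagonal; the two errors are NN tagged as VB and JJ tagged as NN.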

Instance Methods

__init__(self, reference, test, sort_by_count=False)
Construct a new confusion matrix from a list of reference values and a corresponding list of test values.

__getitem__(self, (li, lj))
Returns (int): The number of times that value li was expected and value lj was given.

__repr__(self)

__str__(self)

pp(self, show_percents=False, values_in_chart=True, truncate=None, sort_by_count=False)
Returns: A multi-line string representation of this confusion matrix.

key(self)

Instance Variables
  _values
A list of all values in reference or test.
  _indices
A dictionary mapping values in self._values to their indices.
  _confusion
The confusion matrix itself (as a list of lists of counts).
  _max_conf
The greatest count in self._confusion (used for printing).
  _total
The total number of values in the confusion matrix.
  _correct
The number of correct (on-diagonal) values in the matrix.
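
How these variables relate to one another can be sketched in plain Python. This is an illustration built from the descriptions above, not the class's actual code: it assumes the values are kept in sorted order, and the local names (values, indices, confusion, max_conf, total, correct) merely mirror the private attributes.

```python
ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
test = 'DET VB VB DET NN NN NN IN DET NN'.split()

# All values seen in reference or test (sorted order is an assumption).
values = sorted(set(ref + test))
# Map each value to its row/column index.
indices = {value: i for i, value in enumerate(values)}

# Square matrix of counts: rows are reference values, columns are test values.
confusion = [[0] * len(values) for _ in values]
for r, t in zip(ref, test):
    confusion[indices[r]][indices[t]] += 1

max_conf = max(max(row) for row in confusion)               # greatest single count
total = len(ref)                                            # number of value pairs
correct = sum(confusion[i][i] for i in range(len(values)))  # diagonal sum

print(values)                                   # ['DET', 'IN', 'JJ', 'NN', 'VB']
print(confusion[indices['NN']][indices['VB']])  # 1
print(max_conf, total, correct)                 # 3 10 8
```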

Method Details

__init__(self, reference, test, sort_by_count=False)
(Constructor)

Construct a new confusion matrix from a list of reference values and a corresponding list of test values.

Parameters:
  • reference (list) - An ordered list of reference values.
  • test (list) - A list of values to compare against the corresponding reference values.
Raises:
  • ValueError - If reference and test do not have the same length.
Overrides: object.__init__
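
The length requirement can be illustrated with a small stand-alone check. check_same_length is a hypothetical helper written for this example, not part of NLTK, and its error message is invented.

```python
def check_same_length(reference, test):
    """Raise ValueError unless the two lists pair up one-to-one."""
    if len(reference) != len(test):
        raise ValueError('reference and test must have the same length')

check_same_length(['DET', 'NN'], ['DET', 'VB'])   # equal lengths: no error

try:
    check_same_length(['DET', 'NN', 'VB'], ['DET', 'NN'])
    raised = False
except ValueError:
    raised = True

print(raised)   # True
```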

__getitem__(self, (li, lj))
(Indexing operator)

Returns: int
The number of times that value li was expected and value lj was given.
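
The (li, lj) parameter in this signature uses Python 2 tuple-parameter unpacking, which Python 3 does not accept. A minimal sketch of the equivalent indexing pattern follows; PairCounts is a hypothetical stand-in class for illustration, not NLTK's ConfusionMatrix.

```python
from collections import Counter

class PairCounts(object):
    """Hypothetical stand-in that supports obj[li, lj] lookups."""

    def __init__(self, reference, test):
        self._counts = Counter(zip(reference, test))

    def __getitem__(self, key):
        # obj[li, lj] passes the tuple (li, lj) as a single key.
        li, lj = key
        return self._counts[(li, lj)]

ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
test = 'DET VB VB DET NN NN NN IN DET NN'.split()
counts = PairCounts(ref, test)

print(counts['NN', 'NN'])   # 3: NN expected, NN given
print(counts['JJ', 'NN'])   # 1: JJ expected, NN given
print(counts['NN', 'JJ'])   # 0: this pair never occurs
```

Note that the order of the pair matters: the first element is the expected (reference) value and the second is the value that was given.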

__repr__(self)
(Representation operator)

Overrides: object.__repr__
(inherited documentation)

__str__(self)
(Informal representation operator)

Overrides: object.__str__
(inherited documentation)

pp(self, show_percents=False, values_in_chart=True, truncate=None, sort_by_count=False)

Parameters:
  • truncate (int) - If specified, then only show the specified number of values. Any sorting (e.g., sort_by_count) will be performed before truncation.
  • sort_by_count - If true, then sort by the count of each label in the reference data. I.e., labels that occur more frequently in the reference data will be towards the left edge of the matrix, and labels that occur less frequently will be towards the right edge.
Returns:
A multi-line string representation of this confusion matrix.
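
The interaction between sort_by_count and truncate can be sketched as follows. This illustrates only the ordering described above (sort by reference frequency, then truncate), not the string layout that pp produces. displayed_labels is a hypothetical helper, and the alphabetical default order and tie-break are assumptions.

```python
from collections import Counter

def displayed_labels(reference, test, truncate=None, sort_by_count=False):
    """Return the labels that would be shown, in left-to-right order."""
    labels = sorted(set(reference) | set(test))
    if sort_by_count:
        ref_counts = Counter(reference)
        # Most frequent reference labels first; sorted() is stable, so ties
        # keep their alphabetical order.
        labels = sorted(labels, key=lambda label: -ref_counts[label])
    if truncate is not None:
        # Truncation happens after any sorting.
        labels = labels[:truncate]
    return labels

ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
test = 'DET VB VB DET NN NN NN IN DET NN'.split()

print(displayed_labels(ref, test, truncate=2))
# ['DET', 'IN']
print(displayed_labels(ref, test, truncate=2, sort_by_count=True))
# ['NN', 'DET']
```

Because sorting comes first, truncating with sort_by_count keeps the most frequent reference labels rather than the first labels in the default order.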

To Do: add marginals?