Package nltk :: Module probability :: Class ConditionalFreqDist

type ConditionalFreqDist

object --+
         |
        ConditionalFreqDist

A collection of frequency distributions for a single experiment run under different conditions. Conditional frequency distributions are used to record the number of times each sample occurred, given the condition under which the experiment was run. For example, a conditional frequency distribution could be used to record the frequency of each word (type) in a document, given its length. Formally, a conditional frequency distribution can be defined as a function that maps from each condition to the FreqDist for the experiment under that condition.
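
The definition above can be modeled in plain Python. The following is a minimal sketch, not NLTK's implementation: it assumes collections.Counter as a stand-in for FreqDist, an ordinary dict as the mapping from conditions to distributions, and made-up counts keyed by word length.

```python
from collections import Counter

# Stand-in model: each condition (here, a word length) maps to its own
# Counter, which plays the role of a FreqDist. The counts are made up.
cfd_model = {
    3: Counter({"the": 2, "dog": 1, "saw": 1, "cat": 1}),
    4: Counter({"lazy": 1}),
    5: Counter({"quick": 1}),
}

# Look up the distribution for one condition, then query it.
three_letter = cfd_model[3]
count_the = three_letter["the"]                    # raw count under condition 3
total_three = sum(three_letter.values())           # outcomes recorded under condition 3
freq_the = count_the / total_three                 # relative frequency under condition 3
```

Here the relative frequency is the sample's count divided by the number of outcomes recorded under the same condition, not under all conditions.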

The frequency distribution for each condition is accessed using the indexing operator:

>>> cfdist[3]
<FreqDist with 73 outcomes>
>>> cfdist[3].freq('the')
0.4
>>> cfdist[3]['dog']
2

When the indexing operator is used to access the frequency distribution for a condition that has not been accessed before, ConditionalFreqDist creates a new empty FreqDist for that condition.

Conditional frequency distributions are typically constructed by repeatedly running an experiment under a variety of conditions, and incrementing the sample outcome counts for the appropriate conditions. For example, the following code will produce a conditional frequency distribution that encodes how often each word type occurs, given the length of that word type:

>>> cfdist = ConditionalFreqDist()
>>> for word in tokenize.whitespace(sent):
...     condition = len(word)
...     cfdist[condition].inc(word)

An equivalent way to do this is with the initializer:

>>> cfdist = ConditionalFreqDist((len(word), word) for word in tokenize.whitespace(sent))
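
The equivalence of the two construction styles can be checked with a small stand-alone model. This is a sketch under stated assumptions: build_from_pairs is a hypothetical helper, collections.Counter stands in for FreqDist, str.split() stands in for the tokenizer, and the sentence is made up.

```python
from collections import Counter, defaultdict

def build_from_pairs(cond_samples):
    """Hypothetical helper: tally (condition, sample) pairs per condition."""
    dists = defaultdict(Counter)
    for condition, sample in cond_samples:
        dists[condition][sample] += 1
    return dists

sent = "the dog and the cat sat"   # made-up sentence
words = sent.split()               # str.split() stands in for the tokenizer

# Incremental construction, mirroring the loop form.
by_loop = defaultdict(Counter)
for word in words:
    by_loop[len(word)][word] += 1

# Construction from a generator of pairs, mirroring the initializer form.
by_pairs = build_from_pairs((len(word), word) for word in words)

same_result = by_loop == by_pairs
```

Both forms count each (condition, sample) pair once, so the resulting mappings compare equal.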

Instance Methods

__init__(self, cond_samples=None)
    Construct a new conditional frequency distribution (empty unless cond_samples is given).
__getitem__(self, condition)
    Returns (FreqDist): The frequency distribution that encodes the frequency of each sample outcome, given that the experiment was run under the given condition.
conditions(self)
    Returns (list): A list of the conditions that have been accessed for this ConditionalFreqDist.
__len__(self)
    Returns (int): The number of conditions that have been accessed for this ConditionalFreqDist.
N(self)
    Returns (int): The total number of sample outcomes that have been recorded by this ConditionalFreqDist.
plot(self, *args, **kwargs)
    Plot the given samples from the conditional frequency distribution.
tabulate(self, *args, **kwargs)
    Tabulate the given samples from the conditional frequency distribution.
__eq__(self, other)
__ne__(self, other)
__le__(self, other)
__lt__(self, other)
__ge__(self, other)
__gt__(self, other)
__repr__(self)
    Returns (string): A string representation of this ConditionalFreqDist.

Method Details

__init__(self, cond_samples=None)
(Constructor)

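Construct a new conditional frequency distribution. If cond_samples is not given, the distribution is empty: the count for every sample, under every condition, is zero. Otherwise, each (condition, sample) pair in cond_samples is counted once under its condition.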

Parameters:
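  • cond_samples (Sequence of (condition, sample) tuples; a generator of such tuples also works, as in the initializer example above) - The samples to initialize the conditional frequency distribution with. Optional; defaults to None.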
Overrides: object.__init__

__getitem__(self, condition)
(Indexing operator)

Parameters:
  • condition (any) - The condition under which the experiment was run.
Returns: FreqDist
The frequency distribution that encodes the frequency of each sample outcome, given that the experiment was run under the given condition. If the frequency distribution for the given condition has not been accessed before, then this will create a new empty FreqDist for that condition.

conditions(self)

Returns: list
A list of the conditions that have been accessed for this ConditionalFreqDist. Use the indexing operator to access the frequency distribution for a given condition. Note that the frequency distributions for some conditions may contain zero sample outcomes.
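
The effect of access on the condition list can be illustrated with a small stand-in class. This is a sketch only: CondCountsSketch is a hypothetical name, it is not NLTK's class, and collections.Counter stands in for FreqDist.

```python
from collections import Counter

class CondCountsSketch:
    """Hypothetical stand-in: creates an empty Counter on first access."""

    def __init__(self):
        self._dists = {}

    def __getitem__(self, condition):
        # First access to a condition registers it with an empty distribution.
        if condition not in self._dists:
            self._dists[condition] = Counter()
        return self._dists[condition]

    def conditions(self):
        return list(self._dists)

    def __len__(self):
        return len(self._dists)

sketch = CondCountsSketch()
sketch[3]["dog"] += 1      # records one outcome under condition 3
_ = sketch[7]              # merely reading condition 7 registers it

accessed = sketch.conditions()
# Conditions that were accessed but hold no sample outcomes.
empty_conditions = [c for c in accessed if sum(sketch[c].values()) == 0]
```

In this model, a read-only lookup is enough to add a condition, which is why a listed condition may hold zero sample outcomes.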

__len__(self)
(Length operator)

Returns: int
The number of conditions that have been accessed for this ConditionalFreqDist.

N(self)

Returns: int
The total number of sample outcomes that have been recorded by this ConditionalFreqDist.
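
One way to read this total is as the sum of the per-condition outcome counts. The sketch below assumes that reading and uses collections.Counter as a stand-in for FreqDist; the counts are made up.

```python
from collections import Counter

# Made-up counts keyed by condition (word length); condition 5 was
# accessed but recorded no outcomes.
dists = {
    3: Counter({"the": 2, "dog": 1}),
    4: Counter({"lazy": 1}),
    5: Counter(),
}

# Number of outcomes recorded under each condition.
per_condition = {cond: sum(dist.values()) for cond, dist in dists.items()}

# Total across all conditions; empty conditions contribute nothing.
total_outcomes = sum(per_condition.values())
```

Note that the total counts outcomes (tokens), not distinct samples, and differs from the number of conditions reported by the length operator.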

plot(self, *args, **kwargs)

Plot the given samples from the conditional frequency distribution. For a cumulative plot, specify cumulative=True. (Requires Matplotlib to be installed.)

Parameters:
  • samples (list) - The samples to plot
  • title (str) - The title for the graph
  • conditions (list) - The conditions to plot (default is all)

tabulate(self, *args, **kwargs)

Tabulate the given samples from the conditional frequency distribution.

Parameters:
  • samples (list) - The samples to tabulate
  • title (str) - The title for the graph
  • conditions (list) - The conditions to tabulate (default is all)
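
To make the idea of tabulating samples by condition concrete, here is a minimal plain-Python sketch. The layout (one row per condition, one column per sample) is an assumption made for illustration, tabulate_sketch is a hypothetical function, and the output is not claimed to match what this method prints.

```python
from collections import Counter

def tabulate_sketch(dists, samples, conditions=None):
    """Hypothetical formatter: one row per condition, one column per sample."""
    if conditions is None:
        conditions = sorted(dists)               # default: all conditions
    width = max(len(str(s)) for s in samples) + 1
    label_width = max(len(str(c)) for c in conditions)
    # Header row: blank label cell followed by the sample names.
    lines = [" " * label_width + "".join(str(s).rjust(width) for s in samples)]
    for cond in conditions:
        counts = dists.get(cond, Counter())      # unseen condition: all zeros
        row = "".join(str(counts[s]).rjust(width) for s in samples)
        lines.append(str(cond).rjust(label_width) + row)
    return "\n".join(lines)

# Made-up counts keyed by word length; Counter stands in for FreqDist.
dists = {3: Counter({"the": 2, "dog": 1}), 4: Counter({"lazy": 1})}
table = tabulate_sketch(dists, samples=["the", "dog", "lazy"])
print(table)
```

Restricting the output is then a matter of passing a shorter samples list or an explicit conditions list.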

__repr__(self)
(Representation operator)

Returns: string
A string representation of this ConditionalFreqDist.
Overrides: object.__repr__