type NgramAssocMeasures
object --+
         |
        NgramAssocMeasures
- Known Subclasses:
-
An abstract class defining a collection of generic association measures.
Each public method returns a score, taking the following arguments:
    score_fn(count_of_ngram,
             (count_of_n-1gram_1, ..., count_of_n-1gram_j),
             (count_of_n-2gram_1, ..., count_of_n-2gram_k),
             ...,
             (count_of_1gram_1, ..., count_of_1gram_n),
             count_of_total_words)

See BigramAssocMeasures and TrigramAssocMeasures.
For all association measures defined here to be usable, inheriting
classes must define a property _n (the order of the ngram) and a method
_contingency, which calculates contingency values from marginals.
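The argument order described above can be illustrated for the bigram case with plain Python. This is a minimal sketch, not the library's code: the toy token list and the helper name `bigram_marginals` are assumptions made for illustration.

```python
from collections import Counter

# Toy corpus (an assumption for illustration only).
tokens = "the cat sat on the mat and the cat slept".split()

unigram_counts = Counter(tokens)
bigram_counts = Counter(zip(tokens, tokens[1:]))


def bigram_marginals(w1, w2):
    """Pack counts in the documented order for n = 2:
    (count_of_ngram, (count_of_1gram_1, count_of_1gram_2), count_of_total_words).
    Hypothetical helper, not part of the documented class.
    """
    return (bigram_counts[(w1, w2)],
            (unigram_counts[w1], unigram_counts[w2]),
            len(tokens))


marginals = bigram_marginals("the", "cat")
```

A tuple built this way could be unpacked into any score function that follows the documented convention, for example `score_fn(*marginals)`.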
_expected_values(cls, cont)
    Calculates expected values for a contingency table.
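For a bigram, the expected value of each cell under independence is the product of its row and column totals divided by the grand total. The sketch below assumes a 2x2 table given in the order (n_ii, n_oi, n_io, n_oo); that cell order and the function name `expected_values_2x2` are assumptions, not taken from the source.

```python
def expected_values_2x2(cont):
    """Expected cell counts under independence for a 2x2 table.

    Assumed cell order: (n_ii, n_oi, n_io, n_oo). The first index refers
    to word 1 and the second to word 2; 'i' means present, 'o' absent.
    """
    n_ii, n_oi, n_io, n_oo = cont
    n_xx = n_ii + n_oi + n_io + n_oo  # grand total
    n_ix = n_ii + n_io                # word 1 present
    n_ox = n_oi + n_oo                # word 1 absent
    n_xi = n_ii + n_oi                # word 2 present
    n_xo = n_io + n_oo                # word 2 absent
    return (n_ix * n_xi / n_xx,
            n_ox * n_xi / n_xx,
            n_ix * n_xo / n_xx,
            n_ox * n_xo / n_xx)


expected = expected_values_2x2((8, 24, 8, 984))
```

The expected values always sum to the grand total, which is a quick sanity check on any implementation.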
student_t(cls, *marginals)
    Scores ngrams using Student's t test with independence hypothesis for
    unigrams, as in Manning and Schutze 5.3.1.
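For bigrams, the t statistic under the independence hypothesis reduces to the difference between the observed and expected bigram counts, divided by the square root of the observed count. The following is a sketch of that textbook formula; the function name `student_t_bigram` is hypothetical and the code assumes a nonzero bigram count.

```python
import math


def student_t_bigram(n_ii, unigram_counts, n_xx):
    """t score for a bigram from its marginals (assumes n_ii > 0).

    n_ii: bigram count; unigram_counts: (count of word 1, count of word 2);
    n_xx: total number of words.
    """
    n_ix, n_xi = unigram_counts
    # Expected bigram count if the two words occurred independently.
    expected = n_ix * n_xi / n_xx
    # The sample variance is approximated by the observed count itself.
    return (n_ii - expected) / math.sqrt(n_ii)


score = student_t_bigram(8, (16, 32), 1024)
```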
chi_sq(cls, *marginals)
    Scores ngrams using Pearson's chi-square as in Manning and Schutze.
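Pearson's chi-square sums the squared difference between observed and expected counts, scaled by the expected count, over every cell of the contingency table. A bigram-only sketch follows; the name `chi_sq_bigram` and the cell order (n_ii, n_oi, n_io, n_oo) are assumptions, and the code assumes no expected count is zero.

```python
def chi_sq_bigram(n_ii, unigram_counts, n_xx):
    """Pearson's chi-square for a bigram, computed from its marginals."""
    n_ix, n_xi = unigram_counts
    # Observed 2x2 table in the assumed order (n_ii, n_oi, n_io, n_oo).
    n_oi = n_xi - n_ii
    n_io = n_ix - n_ii
    n_oo = n_xx - n_ii - n_oi - n_io
    observed = (n_ii, n_oi, n_io, n_oo)
    # Expected counts under independence: row total * column total / total.
    n_ox, n_xo = n_xx - n_ix, n_xx - n_xi
    expected = (n_ix * n_xi / n_xx, n_ox * n_xi / n_xx,
                n_ix * n_xo / n_xx, n_ox * n_xo / n_xx)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


score = chi_sq_bigram(8, (16, 32), 1024)
```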
pmi(cls, *marginals)
    Scores ngrams by pointwise mutual information, as in Manning and
    Schutze 5.4.
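For a bigram, pointwise mutual information is the base-2 logarithm of the ratio between the observed joint probability and the product of the two unigram probabilities. A minimal sketch, with the hypothetical name `pmi_bigram` and the assumption that all counts are positive:

```python
import math


def pmi_bigram(n_ii, unigram_counts, n_xx):
    """PMI in bits: log2( P(w1, w2) / (P(w1) * P(w2)) ), from raw counts."""
    n_ix, n_xi = unigram_counts
    # Multiplying through by n_xx**2 avoids forming the probabilities.
    return math.log2(n_ii * n_xx) - math.log2(n_ix * n_xi)


score = pmi_bigram(8, (16, 32), 1024)
```

With these counts the bigram occurs 16 times more often than independence would predict, so the score is 4 bits.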
likelihood_ratio(cls, *marginals)
    Scores ngrams using likelihood ratios as in Manning and Schutze
    5.3.4.
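One common form of the likelihood-ratio score is the G-squared statistic, twice the sum over table cells of the observed count times the natural log of observed over expected. The sketch below uses that textbook form for a 2x2 table; it is not claimed to match the documented method term for term, and the name `likelihood_ratio_2x2` and the cell order are assumptions.

```python
import math


def likelihood_ratio_2x2(observed):
    """G-squared statistic for a 2x2 table in the assumed order
    (n_ii, n_oi, n_io, n_oo). Cells with a zero count contribute nothing."""
    n_ii, n_oi, n_io, n_oo = observed
    n_xx = sum(observed)
    n_ix, n_ox = n_ii + n_io, n_oi + n_oo
    n_xi, n_xo = n_ii + n_oi, n_io + n_oo
    expected = (n_ix * n_xi / n_xx, n_ox * n_xi / n_xx,
                n_ix * n_xo / n_xx, n_ox * n_xo / n_xx)
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)


independent = likelihood_ratio_2x2((4, 12, 4, 12))
associated = likelihood_ratio_2x2((8, 24, 8, 984))
```

A table whose cells already equal their expected values scores zero, while a strongly associated pair scores well above zero.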
poisson_stirling(cls, *marginals)
    Scores ngrams using the Poisson-Stirling measure.
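The Poisson-Stirling measure is usually written as f * (log(f / e) - 1), where f is the observed ngram count and e its expected count under independence. The sketch below applies this to bigrams. The choice of a base-2 logarithm and the name `poisson_stirling_bigram` are assumptions; other presentations of the measure use the natural logarithm.

```python
import math


def poisson_stirling_bigram(n_ii, unigram_counts, n_xx):
    """Poisson-Stirling score for a bigram (assumes n_ii > 0).

    The log base (2) is an assumption made for this sketch.
    """
    n_ix, n_xi = unigram_counts
    expected = n_ix * n_xi / n_xx
    return n_ii * (math.log2(n_ii / expected) - 1)


score = poisson_stirling_bigram(8, (16, 32), 1024)
```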
jaccard(cls, *marginals)
    Scores ngrams using the Jaccard index.
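For a bigram, the Jaccard index can be read as the number of positions where both words occur together, divided by the number of positions where at least one of them occurs. A sketch under the assumed cell order (n_ii, n_oi, n_io, n_oo), with the hypothetical name `jaccard_2x2`:

```python
def jaccard_2x2(cont):
    """Jaccard index from a 2x2 contingency table.

    The last cell (neither word present) is ignored.
    """
    n_ii, n_oi, n_io, _n_oo = cont
    return n_ii / (n_ii + n_oi + n_io)


score = jaccard_2x2((8, 24, 8, 984))
```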
_contingency(*marginals)
    Calculates values of a contingency table from marginal values.
_marginals(*contingency)
    Calculates values of contingency table marginals from its values.
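The two conversions are inverses of each other. For bigrams they could look like the sketch below, where the marginals follow the documented order (bigram count, unigram counts, total) and the table order (n_ii, n_oi, n_io, n_oo) is an assumption. The function names are hypothetical stand-ins for what a subclass would provide.

```python
def bigram_contingency(n_ii, unigram_counts, n_xx):
    """Marginals -> 2x2 table in the assumed order (n_ii, n_oi, n_io, n_oo)."""
    n_ix, n_xi = unigram_counts
    n_oi = n_xi - n_ii  # word 2 occurs, word 1 does not precede it
    n_io = n_ix - n_ii  # word 1 occurs, word 2 does not follow it
    n_oo = n_xx - n_ii - n_oi - n_io
    return (n_ii, n_oi, n_io, n_oo)


def bigram_marginals_from_table(n_ii, n_oi, n_io, n_oo):
    """2x2 table -> marginals (n_ii, (n_ix, n_xi), n_xx)."""
    return (n_ii, (n_ii + n_io, n_ii + n_oi), n_ii + n_oi + n_io + n_oo)


table = bigram_contingency(8, (16, 32), 1024)
round_trip = bigram_marginals_from_table(*table)
```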
raw_freq(*marginals)
    Scores ngrams by their frequency.
mi_like(*marginals, **kwargs)
    Scores ngrams using a variant of mutual information.
mi_like(*marginals, **kwargs)
    Static method.
    Scores ngrams using a variant of mutual information. The keyword
    argument power sets an exponent (default 3) for the numerator. No
    logarithm of the result is calculated.
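Read literally, the description above gives a score equal to the ngram count raised to power, divided by the product of the unigram counts. A bigram sketch follows; the name `mi_like_bigram` is hypothetical, and power is written as an ordinary keyword parameter here rather than being read from **kwargs.

```python
def mi_like_bigram(n_ii, unigram_counts, n_xx, power=3):
    """Mutual-information-like score without a logarithm.

    n_xx is accepted only to keep the documented argument order;
    this formula does not use it.
    """
    n_ix, n_xi = unigram_counts
    return n_ii ** power / (n_ix * n_xi)


default_score = mi_like_bigram(8, (16, 32), 1024)
squared_score = mi_like_bigram(8, (16, 32), 1024, power=2)
```

Raising the exponent gives more weight to frequent ngrams relative to the unigram counts in the denominator.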