type BigramAssocMeasures
object --+
         |
        NgramAssocMeasures --+
                             |
                            BigramAssocMeasures
A collection of bigram association measures. Each association measure
is provided as a function with three arguments:
bigram_score_fn(n_ii, (n_ix, n_xi), n_xx)
The arguments constitute the marginals of a contingency table, counting
the occurrences of particular events in a corpus. The letter i in the
suffix refers to the appearance of the word in question, while x indicates
the appearance of any word. Thus, for example:
n_ii counts (w1, w2), i.e. the bigram being scored
n_ix counts (w1, *)
n_xi counts (*, w2)
n_xx counts (*, *), i.e. any bigram
This may be shown with respect to a contingency table:
            w1      ~w1
          ------  ------
      w2 | n_ii | n_oi | = n_xi
          ------  ------
     ~w2 | n_io | n_oo |
          ------  ------
          = n_ix          TOTAL = n_xx
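As a rough illustration of how these marginals arise, the sketch below counts them from a toy token list. The helper name bigram_marginals and the sample sentence are hypothetical and not part of this class; the counts follow the definitions above literally, so n_ix counts bigrams whose first word is w1.

```python
from collections import Counter


def bigram_marginals(tokens, w1, w2):
    """Return (n_ii, (n_ix, n_xi), n_xx) for the bigram (w1, w2).

    Hypothetical helper for illustration only; not part of the class.
    """
    bigrams = list(zip(tokens, tokens[1:]))
    counts = Counter(bigrams)
    n_ii = counts[(w1, w2)]                                    # (w1, w2)
    n_ix = sum(c for (a, _), c in counts.items() if a == w1)   # (w1, *)
    n_xi = sum(c for (_, b), c in counts.items() if b == w2)   # (*, w2)
    n_xx = len(bigrams)                                        # (*, *)
    return n_ii, (n_ix, n_xi), n_xx


tokens = "the cat sat on the mat and the cat slept".split()
print(bigram_marginals(tokens, "the", "cat"))  # (2, (3, 2), 9)
```

The returned tuple has the same shape as the three arguments expected by the scoring functions.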
phi_sq(cls, *marginals)
Scores bigrams using phi-square, the square of the Pearson
correlation coefficient.
chi_sq(cls, n_ii, (n_ix, n_xi), n_xx)
Scores bigrams using chi-square, i.e. phi-sq multiplied by the number
of bigrams.
Inherited from NgramAssocMeasures: jaccard, likelihood_ratio, pmi,
poisson_stirling, student_t
_contingency(n_ii, (n_ix, n_xi), n_xx)
Calculates values of a bigram contingency table from marginal values.
_marginals(n_ii, n_oi, n_io, n_oo)
Calculates the marginals of a contingency table from its cell values.
dice(n_ii, (n_ix, n_xi), n_xx)
Scores bigrams using Dice's coefficient.
Inherited from NgramAssocMeasures: mi_like, raw_freq
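The conversion between marginals and table cells, and Dice's coefficient, can be sketched as follows. This is a standalone approximation under stated assumptions, not the library source: the function names mirror the ones listed above without the class, and each function takes the (n_ix, n_xi) pair as a single argument and unpacks it in the body, since tuple parameters in a signature are only valid in Python 2.

```python
def contingency(n_ii, n_ix_xi, n_xx):
    """Cells (n_ii, n_oi, n_io, n_oo) from marginals. Illustrative sketch."""
    n_ix, n_xi = n_ix_xi
    n_oi = n_xi - n_ii                  # (~w1, w2)
    n_io = n_ix - n_ii                  # (w1, ~w2)
    n_oo = n_xx - n_ii - n_oi - n_io    # (~w1, ~w2)
    return n_ii, n_oi, n_io, n_oo


def marginals(n_ii, n_oi, n_io, n_oo):
    """Marginals (n_ii, (n_ix, n_xi), n_xx) from cells. Illustrative sketch."""
    n_ix = n_ii + n_io                  # column total for w1
    n_xi = n_ii + n_oi                  # row total for w2
    n_xx = n_ii + n_oi + n_io + n_oo    # grand total
    return n_ii, (n_ix, n_xi), n_xx


def dice(n_ii, n_ix_xi, n_xx):
    """Dice's coefficient: 2 * n_ii / (n_ix + n_xi). n_xx is unused."""
    n_ix, n_xi = n_ix_xi
    return 2 * n_ii / (n_ix + n_xi)


cells = contingency(2, (3, 2), 9)
print(cells)                  # (2, 0, 1, 6)
print(marginals(*cells))      # (2, (3, 2), 9)
print(dice(2, (3, 2), 9))     # 0.8
```

The two conversions are inverses of each other, which is why the round trip returns the original marginals.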
chi_sq(cls, n_ii, (n_ix, n_xi), n_xx)
Class Method
Scores bigrams using chi-square, i.e. phi-sq multiplied by the number
of bigrams (n_xx), as in Manning and Schütze, Section 5.3.3.
Overrides: NgramAssocMeasures.chi_sq
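The relationship between the two scores can be sketched as below. This is a minimal standalone version, assuming the usual 2x2 form of phi-square over the contingency cells; it is not copied from the library source, it takes the (n_ix, n_xi) pair as one argument, and it does not guard against a zero denominator (which occurs when a row or column of the table is empty).

```python
def phi_sq(n_ii, n_ix_xi, n_xx):
    """Phi-square for a 2x2 bigram table. Illustrative sketch only."""
    n_ix, n_xi = n_ix_xi
    # Recover the table cells from the marginals.
    n_oi = n_xi - n_ii
    n_io = n_ix - n_ii
    n_oo = n_xx - n_ii - n_oi - n_io
    numerator = (n_ii * n_oo - n_io * n_oi) ** 2
    # Product of the two column totals and the two row totals.
    denominator = (n_ii + n_io) * (n_ii + n_oi) * (n_io + n_oo) * (n_oi + n_oo)
    return numerator / denominator


def chi_sq(n_ii, n_ix_xi, n_xx):
    """Chi-square as phi-square multiplied by the number of bigrams."""
    return n_xx * phi_sq(n_ii, n_ix_xi, n_xx)


print(phi_sq(2, (3, 2), 9))   # 144 / 252, roughly 0.571
print(chi_sq(2, (3, 2), 9))   # 9 * 144 / 252, roughly 5.143
```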