Represents an annotation task, i.e. a set of coders (annotators) assigning
labels to items. The notation follows Artstein and Poesio (2007) where possible.
Coders and items can be represented as any hashable object. Integers,
for example, are fine, though strings are more readable. Labels must
support the distance function applied to them: a string edit distance
makes no sense if your labels are integers, whereas interval distance
needs numeric values. A notable case of this is the MASI metric, which
requires Python sets.
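
The pairing of label types and distance functions can be illustrated with a minimal sketch. The three functions below are hypothetical stand-ins written for this example, not the library's own implementations; the set-based one follows the usual description of MASI (Jaccard overlap scaled by a monotonicity weight of 1, 2/3, 1/3 or 0), and `frozenset` is used on the assumption that labels should stay hashable.

```python
def toy_binary_distance(a, b):
    """0.0 when the labels are equal, 1.0 otherwise (any comparable labels)."""
    return 0.0 if a == b else 1.0


def toy_interval_distance(a, b):
    """Squared difference; only meaningful for numeric labels."""
    return float((a - b) ** 2)


def toy_masi_distance(a, b):
    """Set-based distance: 1 - Jaccard overlap * monotonicity weight."""
    union = a | b
    if not union:  # two empty sets are treated as identical
        return 0.0
    inter = a & b
    if a == b:
        weight = 1.0
    elif inter == a or inter == b:  # one set contains the other
        weight = 2.0 / 3.0
    elif inter:  # overlap without containment
        weight = 1.0 / 3.0
    else:  # disjoint sets
        weight = 0.0
    return 1.0 - (len(inter) / len(union)) * weight


print(toy_binary_distance("noun", "verb"))                   # 1.0
print(toy_interval_distance(2, 5))                           # 9.0
print(toy_masi_distance(frozenset("ab"), frozenset("a")))    # about 0.667
```

Calling `toy_interval_distance("noun", "verb")` raises a `TypeError`, which is the kind of mismatch between labels and distance function described above.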

__init__(self, data=None, distance=binary_distance)
Initialize an empty annotation task.

agr(self, cA, cB, i)
Agreement between two coders on a given item.
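
As a rough illustration of per-item agreement, the sketch below assumes the data are (coder, item, label) triples and that agreement is one minus the distance between the two labels. Both are assumptions made for this example, and the helper names are hypothetical.

```python
def exact_match_distance(a, b):
    """Hypothetical distance: 0.0 for identical labels, 1.0 otherwise."""
    return 0.0 if a == b else 1.0


def item_agreement(data, coder_a, coder_b, item, distance):
    """Return 1 - distance between the labels two coders gave to one item."""
    # Map each coder to the label they assigned to this item.
    labels = {coder: label for coder, it, label in data if it == item}
    return 1.0 - float(distance(labels[coder_a], labels[coder_b]))


# Toy data: (coder, item, label) triples.
data = [
    ("c1", "i1", "pos"), ("c2", "i1", "pos"),
    ("c1", "i2", "pos"), ("c2", "i2", "neg"),
]

print(item_agreement(data, "c1", "c2", "i1", exact_match_distance))  # 1.0
print(item_agreement(data, "c1", "c2", "i2", exact_match_distance))  # 0.0
```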

N(self, k=None, i=None, c=None)
Implements the "n-notation" used in Artstein and Poesio (2007).
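
In Artstein and Poesio's notation, n_k is the total number of times label k was assigned, n_ik the number of coders who assigned k to item i, and n_ck the number of items to which coder c assigned k. A minimal sketch of these counts, assuming (coder, item, label) triples and a hypothetical helper `count_n`:

```python
def count_n(data, k, i=None, c=None):
    """Count assignments of label k, optionally restricted to item i or coder c."""
    if i is not None and c is not None:
        raise ValueError("give at most one of i and c")
    return sum(
        1
        for coder, item, label in data
        if label == k
        and (i is None or item == i)
        and (c is None or coder == c)
    )


# Toy data: (coder, item, label) triples.
data = [
    ("c1", "i1", "pos"), ("c2", "i1", "pos"),
    ("c1", "i2", "pos"), ("c2", "i2", "neg"),
]

print(count_n(data, k="pos"))          # n_k  -> 3
print(count_n(data, k="pos", i="i2"))  # n_ik -> 1
print(count_n(data, k="pos", c="c1"))  # n_ck -> 2
```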

Ao(self, cA, cB)
Observed agreement between two coders on all items.

avg_Ao(self)
Average observed agreement across all coders and items.
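
The two observed-agreement quantities can be sketched as follows, assuming every coder labelled every item, that per-item agreement is one minus the label distance, and that the overall figure is the mean over all unordered coder pairs. The function names are hypothetical.

```python
from itertools import combinations


def exact_match_distance(a, b):
    """Hypothetical distance: 0.0 for identical labels, 1.0 otherwise."""
    return 0.0 if a == b else 1.0


def observed_agreement(data, coder_a, coder_b, distance):
    """Mean per-item agreement between two coders."""
    labels = {(coder, item): label for coder, item, label in data}
    items = sorted({item for _, item, _ in data})
    total = sum(
        1.0 - distance(labels[coder_a, item], labels[coder_b, item])
        for item in items
    )
    return total / len(items)


def average_observed_agreement(data, distance):
    """Mean of the pairwise observed agreement over all coder pairs."""
    coders = sorted({coder for coder, _, _ in data})
    pairs = list(combinations(coders, 2))
    return sum(observed_agreement(data, a, b, distance) for a, b in pairs) / len(pairs)


# Toy data: three coders, two items.
data = [
    ("c1", "i1", "pos"), ("c2", "i1", "pos"), ("c3", "i1", "pos"),
    ("c1", "i2", "pos"), ("c2", "i2", "neg"), ("c3", "i2", "neg"),
]

print(observed_agreement(data, "c1", "c2", exact_match_distance))  # 0.5
print(observed_agreement(data, "c2", "c3", exact_match_distance))  # 1.0
print(average_observed_agreement(data, exact_match_distance))      # about 0.667
```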

Do_Kw_pairwise(self, cA, cB, max_distance=1.0)
The observed disagreement for the weighted kappa coefficient.

Do_Kw(self, max_distance=1.0)
The observed disagreement for the weighted kappa coefficient, averaged over all labelers.
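
A minimal sketch of the observed disagreement, assuming it is the summed label distance over all items divided by the number of items times `max_distance`, so that the result falls between 0 and 1 when `max_distance` is the largest distance the chosen function can return. The averaged version is assumed here to be the mean over all unordered coder pairs. The function names and the absolute-difference distance are hypothetical.

```python
from itertools import combinations


def abs_distance(a, b):
    """Hypothetical distance for ordinal labels: absolute difference."""
    return float(abs(a - b))


def pairwise_observed_disagreement(data, coder_a, coder_b, distance, max_distance=1.0):
    """Normalised summed distance between two coders' labels over all items."""
    labels = {(coder, item): label for coder, item, label in data}
    items = sorted({item for _, item, _ in data})
    total = sum(distance(labels[coder_a, it], labels[coder_b, it]) for it in items)
    return total / (len(items) * max_distance)


def mean_observed_disagreement(data, distance, max_distance=1.0):
    """Pairwise observed disagreement averaged over all coder pairs."""
    coders = sorted({coder for coder, _, _ in data})
    pairs = list(combinations(coders, 2))
    return sum(
        pairwise_observed_disagreement(data, a, b, distance, max_distance)
        for a, b in pairs
    ) / len(pairs)


# Toy data: labels on a 1..3 scale, so the largest absolute difference is 2.
data = [
    ("c1", "i1", 1), ("c2", "i1", 1), ("c3", "i1", 2),
    ("c1", "i2", 3), ("c2", "i2", 1), ("c3", "i2", 3),
]

print(pairwise_observed_disagreement(data, "c1", "c2", abs_distance, max_distance=2.0))  # 0.5
print(mean_observed_disagreement(data, abs_distance, max_distance=2.0))                  # 0.5
```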

weighted_kappa_pairwise(self, cA, cB, max_distance=1.0)
Weighted kappa between two coders (Cohen 1968).
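
Weighted kappa compares observed disagreement with the disagreement expected by chance: kappa_w = 1 - Do / De. The sketch below is one way to compute it under stated assumptions: data are (coder, item, label) triples, both coders labelled every item, and expected disagreement is built from each coder's own label frequencies. The function names and the absolute-difference distance are hypothetical.

```python
from collections import Counter


def abs_distance(a, b):
    """Hypothetical distance for ordinal labels: absolute difference."""
    return float(abs(a - b))


def weighted_kappa_sketch(data, coder_a, coder_b, distance, max_distance=1.0):
    """Return 1 - Do / De for two coders."""
    labels = {(coder, item): label for coder, item, label in data}
    items = sorted({item for _, item, _ in data})
    n_items = len(items)

    # Observed disagreement: normalised summed distance over the items.
    observed = sum(
        distance(labels[coder_a, it], labels[coder_b, it]) for it in items
    ) / (n_items * max_distance)

    # Expected disagreement: distance between every label pair, weighted by
    # how often each coder used each label.
    freq_a = Counter(labels[coder_a, it] for it in items)
    freq_b = Counter(labels[coder_b, it] for it in items)
    expected = sum(
        freq_a[j] * freq_b[l] * distance(j, l) for j in freq_a for l in freq_b
    ) / (max_distance * n_items ** 2)

    # Note: expected is 0 when both coders only ever use one identical label,
    # in which case this division fails; the sketch does not handle that case.
    return 1.0 - observed / expected


# Toy data: two coders, four items, labels on a 1..3 scale.
data = [
    ("c1", "i1", 1), ("c2", "i1", 1),
    ("c1", "i2", 2), ("c2", "i2", 2),
    ("c1", "i3", 3), ("c2", "i3", 3),
    ("c1", "i4", 3), ("c2", "i4", 1),
]

print(weighted_kappa_sketch(data, "c1", "c2", abs_distance, max_distance=2.0))  # 0.5
```

Here the observed disagreement is 0.25 and the expected disagreement is 0.5, giving a weighted kappa of 0.5.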