Package nltk :: Package classify :: Module rte_classify

Module rte_classify


Simple classifier for RTE corpus.

It calculates the overlap in words and named entities between the text and the hypothesis. It also checks whether the hypothesis contains words or named entities that do not occur in the text, since this indicates that the hypothesis is more informative than (i.e., not entailed by) the text.

TO DO: better Named Entity classification
TO DO: add lemmatization

Classes
RTEFeatureExtractor
Builds a bag of words for both the text and the hypothesis after discarding some stopwords, then calculates their overlap and difference.
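The bag-of-words comparison described above can be sketched with plain Python sets. This is a minimal illustration, not the module's own code: the names `STOPWORDS`, `bag_of_words`, and `overlap_features`, the stopword list, and the `\w+` tokenization are all assumptions made for the example.

```python
import re

# Hypothetical stopword list for illustration; not taken from the module.
STOPWORDS = {"a", "the", "of", "in", "to", "is", "and"}


def bag_of_words(sentence):
    """Return the set of tokens in `sentence`, minus stopwords."""
    tokens = re.findall(r"\w+", sentence)
    return {tok for tok in tokens if tok.lower() not in STOPWORDS}


def overlap_features(text, hypothesis):
    """Count shared words and words found only in the hypothesis."""
    text_words = bag_of_words(text)
    hyp_words = bag_of_words(hypothesis)
    overlap = hyp_words & text_words    # words in both sentences
    hyp_extra = hyp_words - text_words  # words the text does not support
    return {"overlap": len(overlap), "hyp_extra": len(hyp_extra)}


features = overlap_features("The cat sat in the garden", "The cat sat in Paris")
print(features)
```

A non-zero `hyp_extra` count is the signal the module description refers to: the hypothesis mentions something the text does not, which suggests it is not entailed.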
Functions
 
ne(token)
Assumes that words in all caps or in title case are named entities.
 
lemmatize(word)
Use morphy from WordNet to find the base form of verbs.
 
rte_features(rtepair)
 
rte_classifier(trainer, features=rte_features)
Classify RTEPairs.
 
demo_features()
 
demo_feature_extractor()
 
demo()
Function Details

ne(token)

Assumes that words in all caps or in title case are named entities.

Parameters:
  • token (str)
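
The capitalization heuristic described for ne(token) can be sketched with the built-in string methods `str.istitle()` and `str.isupper()`. The function name `looks_like_named_entity` is hypothetical, and the body is an assumption about how such a check might be written, not a copy of the module's source.

```python
def looks_like_named_entity(token):
    """Guess whether `token` is a named entity from its capitalization alone."""
    # Title case ("Paris") or all caps ("NATO") counts as a named entity.
    return token.istitle() or token.isupper()


for token in ["Paris", "NATO", "city", "The"]:
    print(token, looks_like_named_entity(token))
```

Because the check looks only at capitalization, a sentence-initial word such as "The" is also treated as a named entity, which is consistent with the TO DO note about better Named Entity classification.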