Package nltk :: Package classify :: Module maxent :: Class TadmEventMaxentFeatureEncoding

type TadmEventMaxentFeatureEncoding

             object --+        
                      |        
 MaxentFeatureEncodingI --+    
                          |    
BinaryMaxentFeatureEncoding --+
                              |
                             TadmEventMaxentFeatureEncoding

Instance Methods

__init__(self, labels, mapping, unseen_features=False, alwayson_features=False)

encode(self, featureset, label)
Returns: list of (int, number)
Given a (featureset, label) pair, return the corresponding vector of joint-feature values.

labels(self)
Returns: list
A list of the "known labels" -- i.e., all labels l such that self.encode(fs,l) can be a nonzero joint-feature vector for some value of fs.

describe(self, fid)
Returns: str
A string describing the value of the joint-feature whose index in the generated feature vectors is fid.

length(self)
Returns: int
The size of the fixed-length joint-feature vectors that are generated by this encoding.

Class Methods

train(cls, train_toks, count_cutoff=0, labels=None, **options)
Construct and return a new feature encoding, based on a given training corpus train_toks.
Method Details

__init__(self, labels, mapping, unseen_features=False, alwayson_features=False)
(Constructor)

Parameters:
  • labels - A list of the "known labels" for this encoding.
  • mapping - A dictionary mapping from (fname,fval,label) tuples to corresponding joint-feature indexes. These indexes must be exactly the set of integers from 0 through len(mapping)-1. If mapping[fname,fval,label]=id, then self.encode({..., fname:fval, ...}, label)[id] is 1; otherwise, it is 0.
  • unseen_features - If true, then include unseen value features in the generated joint-feature vectors.
  • alwayson_features - If true, then include always-on features in the generated joint-feature vectors.
Overrides: BinaryMaxentFeatureEncoding.__init__
(inherited documentation)
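
As a rough illustration of the mapping contract described in the parameters above, the following sketch uses plain Python and does not call NLTK. The helper names check_mapping and binary_encode and the sample feature last_letter are hypothetical; the code only mirrors the inherited description, in which the indexes cover 0 through len(mapping)-1 and a matching (fname, fval, label) triple yields the value 1.

```python
# Hypothetical mapping from (fname, fval, label) to a joint-feature index.
mapping = {
    ("last_letter", "a", "female"): 0,
    ("last_letter", "a", "male"): 1,
    ("last_letter", "k", "male"): 2,
}


def check_mapping(mapping):
    """Return True if the indexes are exactly 0 .. len(mapping) - 1."""
    return set(mapping.values()) == set(range(len(mapping)))


def binary_encode(mapping, featureset, label):
    """Return (index, 1) for every joint-feature that fires for this pair."""
    encoding = []
    for fname, fval in featureset.items():
        key = (fname, fval, label)
        if key in mapping:
            encoding.append((mapping[key], 1))
    return encoding


valid = check_mapping(mapping)
fired = binary_encode(mapping, {"last_letter": "a"}, "male")
```

Joint-features that do not appear in the mapping simply contribute nothing to the result, which is how the "otherwise, it is 0" case shows up in a sparse representation.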

encode(self, featureset, label)

Given a (featureset, label) pair, return the corresponding vector of joint-feature values. This vector is represented as a list of (index, value) tuples, specifying the value of each non-zero joint-feature.

Returns: list of (int, number)
Overrides: MaxentFeatureEncodingI.encode
(inherited documentation)
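
The sparse (index, value) representation described above can be expanded into a fixed-length vector when needed. The sketch below is plain Python with a hypothetical helper to_dense; the sample vector and the length of 6 are made up for illustration, and in practice the length would come from length().

```python
def to_dense(sparse, length):
    """Expand a list of (index, value) tuples into a dense list of numbers."""
    dense = [0] * length
    for index, value in sparse:
        dense[index] = value
    return dense


# Hypothetical output of encode(): joint-features 1 and 4 are non-zero.
sparse_vector = [(1, 1), (4, 2)]
dense_vector = to_dense(sparse_vector, 6)
```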

labels(self)

Returns: list
A list of the "known labels" -- i.e., all labels l such that self.encode(fs,l) can be a nonzero joint-feature vector for some value of fs.
Overrides: MaxentFeatureEncodingI.labels
(inherited documentation)

describe(self, fid)

Returns: str
A string describing the value of the joint-feature whose index in the generated feature vectors is fid.
Overrides: MaxentFeatureEncodingI.describe
(inherited documentation)
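
One way to picture describe is as a reverse lookup from a joint-feature index back to the triple that produced it. The sketch below is plain Python under that assumption; the function describe_fid, the sample mapping, and the exact wording of the returned string are all hypothetical and are not taken from this class.

```python
# Hypothetical mapping from (fname, fval, label) to a joint-feature index.
mapping = {
    ("last_letter", "a", "female"): 0,
    ("last_letter", "a", "male"): 1,
}


def describe_fid(mapping, fid):
    """Return a readable description of the joint-feature with index fid."""
    for (fname, fval, label), index in mapping.items():
        if index == fid:
            return "%s==%r and label is %r" % (fname, fval, label)
    raise KeyError("no joint-feature with index %d" % fid)


description = describe_fid(mapping, 1)
```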

length(self)

Returns: int
The size of the fixed-length joint-feature vectors that are generated by this encoding.
Overrides: MaxentFeatureEncodingI.length
(inherited documentation)

train(cls, train_toks, count_cutoff=0, labels=None, **options)
Class Method

Construct and return a new feature encoding, based on a given training corpus train_toks. See the class description for a description of the joint-features that will be included in this encoding.

Parameters:
  • train_toks - Training data, represented as a list of pairs, the first member of which is a feature dictionary, and the second of which is a classification label.
  • count_cutoff - A cutoff value that is used to discard rare joint-features. If a joint-feature has the value 1 fewer than count_cutoff times in the training corpus, then that joint-feature is not included in the generated encoding.
  • labels - A list of labels that should be used by the classifier. If not specified, then the set of labels attested in train_toks will be used.
  • options - Extra parameters for the constructor, such as unseen_features and alwayson_features.
Overrides: MaxentFeatureEncodingI.train
(inherited documentation)
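
The parameters above can be illustrated with a small plain-Python sketch of how a mapping might be built from a training corpus. It does not call NLTK; the function build_mapping and the sample data are hypothetical, and counting occurrences per (fname, fval, label) triple is an assumption based on the inherited description of count_cutoff rather than on this class's source.

```python
from collections import Counter


def build_mapping(train_toks, count_cutoff=0, labels=None):
    """Return (labels, mapping) for a list of (featureset, label) pairs."""
    train_toks = list(train_toks)  # the corpus is read more than once

    # Count how often each joint-feature occurs in the training corpus.
    counts = Counter()
    for featureset, label in train_toks:
        for fname, fval in featureset.items():
            counts[fname, fval, label] += 1

    # If no labels were given, use those attested in the corpus, in order.
    if labels is None:
        labels = []
        for _, label in train_toks:
            if label not in labels:
                labels.append(label)

    # Assign consecutive indexes to joint-features that meet the cutoff.
    mapping = {}
    for featureset, label in train_toks:
        for fname, fval in featureset.items():
            key = (fname, fval, label)
            if counts[key] >= count_cutoff and key not in mapping:
                mapping[key] = len(mapping)
    return labels, mapping


train_toks = [
    ({"last_letter": "a"}, "female"),
    ({"last_letter": "a"}, "female"),
    ({"last_letter": "k"}, "male"),
]
labels, mapping = build_mapping(train_toks, count_cutoff=2)
```

With count_cutoff=2 only the joint-feature seen twice survives; with the default of 0 every attested joint-feature receives an index.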