type StanfordTagger
     object --+
              |
    api.TaggerI --+
                  |
                 StanfordTagger
A class for POS tagging with the Stanford Tagger. The inputs are:
- the path to a model trained on training data
- (optionally) the path to the Stanford Tagger jar file. If not
  specified here, this jar file must be specified in the CLASSPATH
  environment variable.
- (optionally) the encoding of the training data (default: ASCII)
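The jar lookup described above could be sketched roughly as follows. This is only an illustration of the fallback order (explicit argument first, then CLASSPATH), not the tagger's actual implementation: the helper name `resolve_tagger_jar` and the default jar file name `stanford-postagger.jar` are assumptions made for the example.

```python
import os


def resolve_tagger_jar(path_to_jar=None, jar_name="stanford-postagger.jar"):
    """Hypothetical helper: pick the jar path the way the text describes.

    An explicit path wins; otherwise each CLASSPATH entry is checked
    for a file with the assumed jar name.
    """
    if path_to_jar is not None:
        return path_to_jar

    classpath = os.environ.get("CLASSPATH", "")
    for entry in classpath.split(os.pathsep):
        if os.path.basename(entry) == jar_name:
            return entry

    raise LookupError(
        "jar not given and %r not found in CLASSPATH" % jar_name
    )
```

With this sketch, passing `path_to_jar` bypasses the environment entirely, while omitting it makes a correctly set CLASSPATH mandatory.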
Example:
>>> st = StanfordTagger('bidirectional-distsim-wsj-0-18.tagger')
>>> st.tag('What is the airspeed of an unladen swallow ?'.split())
[('What', 'WP'), ('is', 'VBZ'), ('the', 'DT'), ('airspeed', 'NN'),
('of', 'IN'), ('an', 'DT'), ('unladen', 'JJ'), ('swallow', 'VB'), ('?', '.')]
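Because the result is a plain list of (token, tag) tuples, it can be processed with ordinary list operations. The sketch below reuses the output shown in the example above as literal data, so it runs without the tagger, a model, or Java installed; the variable names are illustrative only.

```python
# Tagged output copied from the example above.
tagged = [('What', 'WP'), ('is', 'VBZ'), ('the', 'DT'), ('airspeed', 'NN'),
          ('of', 'IN'), ('an', 'DT'), ('unladen', 'JJ'), ('swallow', 'VB'),
          ('?', '.')]

# Split the pairs back into parallel sequences.
tokens = [token for token, tag in tagged]
tags = [tag for token, tag in tagged]

# Keep only tokens whose tag starts with 'NN' (the noun tags).
nouns = [token for token, tag in tagged if tag.startswith('NN')]
```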
tag(self, tokens)
Determine the most appropriate tag sequence for the given token
sequence, and return a corresponding list of tagged tokens.
Returns: list of (token, tag)
Inherited from api.TaggerI: evaluate
__init__(self, path_to_model, path_to_jar=None, encoding=None, verbose=False)
(Constructor)
- Overrides: object.__init__ (inherited documentation)
tag(self, tokens)
Determine the most appropriate tag sequence for the given token
sequence, and return a corresponding list of tagged tokens. A tagged
token is encoded as a tuple (token, tag).
- Returns: list of (token, tag)
- Overrides: api.TaggerI.tag (inherited documentation)
batch_tag(self, sentences)
Apply self.tag() to each element of sentences.
I.e.:
return [self.tag(sent) for sent in sentences]
- Overrides: api.TaggerI.batch_tag (inherited documentation)
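The relationship between tag() and batch_tag() can be illustrated with a small stand-in class. `ToyTagger` and its lexicon are hypothetical and exist only for this example; the class does not call the Stanford Tagger, it merely follows the same contract of returning (token, tag) tuples.

```python
class ToyTagger:
    """Hypothetical stand-in tagger, not part of the documented API."""

    def __init__(self, lexicon, default_tag="NN"):
        self.lexicon = lexicon          # maps lowercased token -> tag
        self.default_tag = default_tag  # used for unknown tokens

    def tag(self, tokens):
        # One (token, tag) tuple per input token, in order.
        return [(tok, self.lexicon.get(tok.lower(), self.default_tag))
                for tok in tokens]

    def batch_tag(self, sentences):
        # Apply tag() to each sentence, as described above.
        return [self.tag(sent) for sent in sentences]


toy = ToyTagger({"the": "DT", "is": "VBZ"})
sentences = ["The swallow is unladen".split(), "the airspeed".split()]
result = toy.batch_tag(sentences)
```

Here `result` holds one tagged list per input sentence, in the same order as `sentences`.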