type HunposTagger
object --+
         |
api.TaggerI --+
              |
             HunposTagger
A class for POS tagging with HunPos. The constructor takes:
- the path to a model trained on training data
- (optionally) the path to the hunpos-tag binary
- (optionally) the encoding of the training data (default: ISO-8859-1)
Example:
>>> ht = HunposTagger('english.model')
>>> ht.tag('What is the airspeed of an unladen swallow ?'.split())
[('What', 'WP'), ('is', 'VBZ'), ('the', 'DT'), ('airspeed', 'NN'),
('of', 'IN'), ('an', 'DT'), ('unladen', 'NN'), ('swallow', 'VB'), ('?', '.')]
>>> ht.close()
This class communicates with the hunpos-tag binary via pipes. When the
tagger object is no longer needed, the close() method should be called to
free system resources. The class supports the context manager interface;
if used in a with statement, the close() method is invoked
automatically:
>>> with HunposTagger('english.model') as ht:
...     ht.tag('What is the airspeed of an unladen swallow ?'.split())
...
[('What', 'WP'), ('is', 'VBZ'), ('the', 'DT'), ('airspeed', 'NN'),
('of', 'IN'), ('an', 'DT'), ('unladen', 'NN'), ('swallow', 'VB'), ('?', '.')]
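The pipe-and-close pattern described above can be illustrated without the hunpos-tag binary. The following is a minimal sketch, not the HunposTagger implementation: `LineEchoClient` is a hypothetical class, and the child process is a small Python script (an assumption made so the example runs anywhere) that echoes each line back in upper case.

```python
import subprocess
import sys


class LineEchoClient:
    """Hypothetical stand-in for an object that talks to a child process via pipes."""

    # Child program (an assumption for this sketch): echo each input line upper-cased.
    _CHILD = (
        "import sys\n"
        "for line in sys.stdin:\n"
        "    sys.stdout.write(line.upper())\n"
        "    sys.stdout.flush()\n"
    )

    def __init__(self):
        self._closed = False
        # Start the child and connect to its stdin and stdout through pipes.
        self._proc = subprocess.Popen(
            [sys.executable, "-c", self._CHILD],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,
        )

    def send(self, word):
        # One line out, one line back: a line-based protocol.
        self._proc.stdin.write(word + "\n")
        self._proc.stdin.flush()
        return self._proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing stdin signals end of input; then reap the child and free the pipes.
        if not self._closed:
            self._proc.stdin.close()
            self._proc.wait()
            self._proc.stdout.close()
            self._closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and on exceptions, so resources are always released.
        self.close()


if __name__ == "__main__":
    with LineEchoClient() as client:
        print(client.send("swallow"))
```

Because `__exit__` calls `close()`, the child process is cleaned up even if the body of the `with` block raises.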
__init__(self, path_to_model, path_to_bin=None, encoding='ISO-8859-1', verbose=False)
Starts the hunpos-tag executable and establishes a connection with it.
close(self)
Closes the pipe to the hunpos executable.
__exit__(self, exc_type, exc_value, traceback)
tag(self, tokens)
Tags a single sentence: a list of words.
Returns: list of (token, tag)
Inherited from api.TaggerI: batch_tag, evaluate
__init__(self, path_to_model, path_to_bin=None, encoding='ISO-8859-1', verbose=False)
(Constructor)
Starts the hunpos-tag executable and establishes a connection with
it.
- Parameters:
- Overrides: object.__init__
tag(self, tokens)
Tags a single sentence: a list of words. The tokens should not contain
any newline characters.
- Returns: list of (token, tag)
- Overrides: api.TaggerI.tag
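Since tokens must not contain newline characters, it can help to validate a sentence before passing it to tag(). The sketch below uses a hypothetical helper, `check_tokens`, which is not part of the documented API. It also checks that each token can be encoded in the model's charset; that check rests on the assumption that tokens are converted to the configured encoding before being sent to hunpos-tag.

```python
def check_tokens(tokens, encoding="ISO-8859-1"):
    """Hypothetical pre-flight check; returns a list of (token, reason) problems."""
    problems = []
    for token in tokens:
        # A newline inside a token would break a one-token-per-line exchange.
        if "\n" in token:
            problems.append((token, "contains a newline"))
            continue
        # Assumption: tokens are encoded with the model's charset before sending.
        try:
            token.encode(encoding)
        except UnicodeEncodeError:
            problems.append((token, "not encodable as " + encoding))
    return problems


if __name__ == "__main__":
    sentence = "What is the airspeed of an unladen swallow ?".split()
    print(check_tokens(sentence))
    print(check_tokens(["air\nspeed", "\u20ac"]))
```

An empty result means the sentence passed both checks; otherwise each entry names the offending token and the reason.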