
type MacMorphoCorpusReader

      object --+        
               |        
api.CorpusReader --+    
                   |    
  TaggedCorpusReader --+
                       |
                      MacMorphoCorpusReader

A corpus reader for the MAC_MORPHO corpus. Each line of the corpus contains a single tagged word, with '_' separating the word from its tag. Sentence boundaries are determined by the end-of-sentence tag ('_.'). The corpus carries no paragraph information, so each paragraph returned by self.paras() and self.tagged_paras() contains exactly one sentence.
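
The line format described above can be illustrated with a small standalone sketch. This is not the reader's actual _read_block implementation; the helper name read_mac_morpho_sentences and the sample lines are hypothetical, chosen only to show how 'word_TAG' lines group into sentences at the '.' tag and how each sentence then becomes its own paragraph.

```python
def read_mac_morpho_sentences(lines, sep="_"):
    """Group 'word<sep>TAG' lines into sentences (hypothetical helper).

    A sentence is closed whenever a token carries the end-of-sentence
    tag '.', mirroring the '_.' convention described above.
    """
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the LAST separator so words containing '_' stay intact.
        word, found, tag = line.rpartition(sep)
        if not found:
            word, tag = line, None  # untagged token: assumption, keep the word
        current.append((word, tag))
        if tag == ".":
            sentences.append(current)
            current = []
    if current:  # trailing tokens without a closing '_.'
        sentences.append(current)
    return sentences


# Made-up sample data in the one-token-per-line layout.
sample = ["Jersei_N", "atinge_V", "média_N", "._.",
          "Outro_PROADJ", "exemplo_N", "._."]

sentences = read_mac_morpho_sentences(sample)
# With no paragraph markup, each paragraph wraps exactly one sentence.
paras = [[sentence] for sentence in sentences]
```

Splitting on the last separator is a design choice of this sketch, so that a word containing '_' would not be cut in the wrong place.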

Instance Methods
 
__init__(self, root, fileids, encoding=None, tag_mapping_function=None)
Construct a new Tagged Corpus reader for a set of documents located at the given root directory.
 
_read_block(self, stream)

Inherited from TaggedCorpusReader: paras, raw, sents, tagged_paras, tagged_sents, tagged_words, words

Inherited from api.CorpusReader: __repr__, abspath, abspaths, encoding, fileids, open, readme

Inherited from api.CorpusReader (private): _get_root

Deprecated since 0.9.7:
    Inherited from api.CorpusReader: files

Deprecated since 0.9.1:
    Inherited from api.CorpusReader: items
    Inherited from api.CorpusReader (private): _get_items

Instance Variables

Inherited from api.CorpusReader (private): _encoding, _fileids, _root

Properties

Inherited from api.CorpusReader: root

Method Details

__init__(self, root, fileids, encoding=None, tag_mapping_function=None)
(Constructor)

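Construct a new tagged corpus reader for a set of documents located at the given root directory. Example usage (the fileids regexp shown is illustrative):

>>> from nltk.corpus.reader.tagged import MacMorphoCorpusReader
>>> root = '/...path to corpus.../'
>>> reader = MacMorphoCorpusReader(root, r'.*\.txt')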
Parameters:
  • root - The root directory for this corpus.
  • fileids - A list or regexp specifying the fileids in this corpus.
Overrides: api.CorpusReader.__init__
(inherited documentation)