__init__(self, filename, startpos=0, **kwargs)
(Constructor)
Create a new corpus view based on the file fileid, whose contents
are read with block_reader. See the class documentation for
more information.
- Parameters:
fileid - The path to the file that is read by this corpus view.
fileid can either be a string or a PathPointer.
startpos - The file position at which the view will start reading. This can
be used to skip over preface sections.
encoding - The unicode encoding that should be used to read the file's
contents. If no encoding is specified, then the file's contents
will be read as a non-unicode string (i.e., a str).
source - If specified, then use a SourcedStringStream to annotate all strings read
from the file with information about their start offset, end
offset, and docid. The value of ``source`` will be used as the
docid.
- Overrides:
util.StreamBackedCorpusView.__init__
- (inherited documentation)
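
The roles of startpos, block_reader, and encoding described above can be illustrated with a minimal standard-library sketch. This is not the library's implementation: read_view and read_line_block are hypothetical names introduced only for illustration, and the sketch assumes that a block reader takes an open stream and returns a list of tokens.

```python
import os
import tempfile


def read_line_block(stream):
    """Hypothetical block reader: return the whitespace-separated
    tokens of the next line in the stream."""
    return stream.readline().split()


def read_view(fileid, block_reader, startpos=0, encoding=None):
    """Hypothetical stand-in for a corpus view (illustration only).

    Seeks to byte offset startpos, then calls block_reader until the
    end of the file. Tokens stay as byte strings unless an encoding
    is given, in which case each token is decoded.
    """
    tokens = []
    size = os.path.getsize(fileid)
    with open(fileid, "rb") as stream:
        stream.seek(startpos)  # skip anything before startpos, e.g. a preface
        while stream.tell() < size:
            block = block_reader(stream)
            if encoding is not None:
                block = [tok.decode(encoding) for tok in block]
            tokens.extend(block)
    return tokens


# Build a small file with a preface line that should be skipped.
preface = b"# preface: skip me\n"
body = b"the quick fox\n\njumps over\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(preface + body)
    path = tmp.name

try:
    raw_tokens = read_view(path, read_line_block, startpos=len(preface))
    text_tokens = read_view(path, read_line_block,
                            startpos=len(preface), encoding="utf-8")
finally:
    os.remove(path)
```

Here raw_tokens holds undecoded byte strings, mirroring the "no encoding specified" case, while text_tokens holds decoded strings. Passing startpos=len(preface) makes both reads begin after the preface line.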