Package nltk :: Module sourcedstring :: Class ConsecutiveCharStringSource

type ConsecutiveCharStringSource

  object --+    
           |    
StringSource --+
               |
              ConsecutiveCharStringSource

A StringSource that specifies the source of strings whose characters have consecutive offsets. In particular, the following two properties must hold for all valid indices:

These properties allow the source to be stored using just a start offset and an end offset (along with a docid).
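The compact storage described above can be illustrated with a small sketch. It assumes that "consecutive offsets" means each character is exactly one offset unit wide and that adjacent characters abut. The `Span` tuple and the `char_spans` helper are hypothetical names introduced for illustration and are not part of NLTK.

```python
from collections import namedtuple

# Hypothetical stand-in for illustration; not the NLTK class itself.
Span = namedtuple("Span", ["docid", "begin", "end"])


def char_spans(docid, begin, end):
    """Expand a (docid, begin, end) triple into one Span per character.

    Assumption: each character occupies exactly one offset unit.
    """
    return [Span(docid, i, i + 1) for i in range(begin, end)]


spans = char_spans("doc.txt", 5, 9)

# Adjacent characters abut and each character is one unit wide, so the
# whole list can be rebuilt from the first begin and the last end.
assert all(a.end == b.begin for a, b in zip(spans, spans[1:]))
assert all(s.end - s.begin == 1 for s in spans)
assert (spans[0].begin, spans[-1].end) == (5, 9)
```

Under these assumptions, nothing is lost by keeping only the docid and the two end points.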

This StringSource can be used to describe byte strings that are indexed using byte offsets or character offsets; or unicode strings that are indexed using character offsets.

Instance Methods
 
__init__(self, docid, begin, end)
Create a new StringSource.
 
__len__(self)
Return the length of the string described by this StringSource.
 
__getslice__(self, start, stop)
Return a StringSource describing the location where the specified substring was found.
 
__cmp__(self, other)
 
__repr__(self)

Inherited from StringSource: __getitem__, __hash__, __str__

Static Methods

Inherited from StringSource: __new__

Instance Variables

Inherited from StringSource: begin, docid, end

Properties
  offsets
A list of offsets specifying the location of each character in the document.

Method Details

__init__(self, docid, begin, end)
(Constructor)

Create a new StringSource. When the StringSource constructor is called directly, it automatically delegates to one of its two subclasses:

In both cases, the arguments must be specified as keyword arguments (not positional arguments).

Overrides: StringSource.__init__
(inherited documentation)

__len__(self)
(Length operator)

Return the length of the string described by this StringSource. Note that this may not be equal to self.end-self.begin for unicode strings described using byte offsets.

Overrides: StringSource.__len__
(inherited documentation)
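The caveat about byte offsets can be seen with plain Python strings, independent of NLTK. The sketch below only shows why a byte span can be longer than the character count of the text it covers. The sample word is an arbitrary choice.

```python
# "café": four characters, but the accented letter needs two bytes in UTF-8.
text = "caf\u00e9"
encoded = text.encode("utf-8")

char_length = len(text)     # number of characters
byte_length = len(encoded)  # number of bytes a byte-offset span would cover

# A source indexed by byte offsets would span 5 units for a 4-character string.
assert char_length == 4
assert byte_length == 5
```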

__getslice__(self, start, stop)
(Slicing operator)

Return a StringSource describing the location where the specified substring was found. In particular, if s is the string that this source describes, then return a StringSource describing the location of s[start:stop].

Overrides: StringSource.__getslice__
(inherited documentation)
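The slicing behaviour can be sketched with a simplified stand-in class. `ConsecutiveSourceSketch` is a hypothetical name, and this is not NLTK's implementation: it assumes one offset unit per character, and it uses `__getitem__` with slice objects because modern Python no longer calls `__getslice__`. The `repr` format is likewise an arbitrary choice.

```python
class ConsecutiveSourceSketch:
    """Hypothetical, simplified stand-in; not NLTK's implementation."""

    def __init__(self, docid, begin, end):
        self.docid, self.begin, self.end = docid, begin, end

    def __len__(self):
        # Assumption: one offset unit per character.
        return self.end - self.begin

    def __getitem__(self, index):
        if not isinstance(index, slice):
            raise TypeError("this sketch only handles slices")
        # Normalise negative and missing bounds against the length.
        start, stop, step = index.indices(len(self))
        if step != 1:
            raise ValueError("extended slices are not supported in this sketch")
        stop = max(start, stop)  # an empty slice yields an empty span
        # Shift the slice bounds by the start offset of this source.
        return ConsecutiveSourceSketch(
            self.docid, self.begin + start, self.begin + stop
        )

    def __repr__(self):
        return "%s[%d:%d]" % (self.docid, self.begin, self.end)


src = ConsecutiveSourceSketch("doc.txt", 10, 20)
sub = src[2:5]    # describes offsets 12 to 15 of doc.txt
tail = src[-3:]   # describes offsets 17 to 20 of doc.txt
```

Because the offsets are consecutive, the sliced source is again fully described by a begin and an end offset.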

__cmp__(self, other)
(Comparison operator)

Overrides: StringSource.__cmp__

__repr__(self)
(Representation operator)

Overrides: object.__repr__
(inherited documentation)

Property Details

offsets

A list of offsets specifying the location of each character in the document. The ith character of the string begins at offset offsets[i] and ends at offset offsets[i+1]. The length of the offsets list is one greater than the length of the string described by this StringSource.
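For a source with consecutive offsets, the list could plausibly be derived from the begin and end offsets alone. The following is a minimal sketch under that assumption. `consecutive_offsets` is a hypothetical helper, not an NLTK function.

```python
def consecutive_offsets(begin, end):
    """Return the offsets list implied by a (begin, end) pair.

    Hypothetical helper; assumes every character is one offset unit wide.
    """
    return list(range(begin, end + 1))


text = "word"                        # 4 characters
offsets = consecutive_offsets(5, 9)  # [5, 6, 7, 8, 9]

# The list holds one more offset than the string has characters.
assert len(offsets) == len(text) + 1

# Character i begins at offsets[i] and ends at offsets[i + 1].
char_bounds = [(offsets[i], offsets[i + 1]) for i in range(len(text))]
```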