Package nltk :: Package stem :: Module regexp :: Class RegexpStemmer

type RegexpStemmer

  object --+    
           |    
api.StemmerI --+
               |
              RegexpStemmer

A stemmer that uses regular expressions to identify morphological affixes. Any substrings that match the regular expressions will be removed.
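The removal behaviour described above can be illustrated with the standard `re` module alone. This is a minimal sketch rather than the class's source code; the pattern and the sample words are hypothetical and chosen only for illustration. It shows why affix patterns are usually anchored: every matching substring is removed, not only a trailing one.

```python
import re

# Hypothetical suffix pattern for illustration; not part of the documented API.
suffixes = re.compile(r"ing$|s$|able$")

# Anchored alternatives strip only a trailing affix.
print(suffixes.sub("", "walking"))    # walk
print(suffixes.sub("", "singing"))    # sing

# Without the `$` anchor, every matching substring is removed.
print(re.sub(r"ing", "", "singing"))  # s
```

Anchoring each alternative with `$` (or `^` for prefixes) keeps the substitution from deleting material in the middle of a word.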

Instance Methods
 
__init__(self, regexp, min=0)
Create a new regexp stemmer.
 
stem(self, word)
Strip affixes from the token and return the stem.
 
__repr__(self)

Method Details

__init__(self, regexp, min=0)
(Constructor)

Create a new regexp stemmer.

Parameters:
  • regexp (string or regexp) - The regular expression that should be used to identify morphological affixes.
  • min (int) - The minimum length of string to stem. Words shorter than this are returned unchanged.
Overrides: object.__init__
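The two constructor parameters can be sketched with a small stand-in class. `SketchRegexpStemmer` is a hypothetical name, not the NLTK class, and the code assumes that `regexp` may be either a pattern string or a compiled pattern and that words shorter than `min` are passed through untouched.

```python
import re


class SketchRegexpStemmer:
    """Illustrative stand-in for the documented interface, not NLTK's source."""

    def __init__(self, regexp, min=0):
        # Accept either a pattern string or an already compiled pattern.
        if isinstance(regexp, str):
            regexp = re.compile(regexp)
        self._regexp = regexp
        self._min = min

    def stem(self, word):
        # Assumption: words shorter than `min` are returned unchanged.
        if len(word) < self._min:
            return word
        # Remove every substring that matches the affix pattern.
        return self._regexp.sub("", word)


st = SketchRegexpStemmer(r"ing$|s$|e$|able$", min=4)
print(st.stem("cars"))       # car
print(st.stem("was"))        # was (shorter than min, left as-is)
print(st.stem("advisable"))  # advis
```

A nonzero `min` is a cheap guard against over-stemming short words such as "was" or "bee", whose final letters happen to match a suffix pattern.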

stem(self, word)

Strip affixes from the token and return the stem.

Parameters:
  • word - The token that should be stemmed.
Overrides: api.StemmerI.stem
(inherited documentation)

__repr__(self)
(Representation operator)

Overrides: object.__repr__
(inherited documentation)