
type GermanStemmer


          object --+            
                   |            
        api.StemmerI --+        
                       |        
_LanguageSpecificStemmer --+    
                           |    
            _StandardStemmer --+
                               |
                              GermanStemmer

The German Snowball stemmer.


Note: A detailed description of the German stemming algorithm can be found at http://snowball.tartarus.org/algorithms/german/stemmer.html.

Instance Methods
unicode stem(self, word)
Stem a German word and return the stemmed form.

Inherited from _StandardStemmer (private): _r1r2_standard, _rv_standard

Inherited from _LanguageSpecificStemmer: __init__, __repr__

Class Variables
unicode __vowels = u'aeiouyäöü'
The German vowels.
unicode __s_ending = u'bdfghklmnrt'
Letters that may directly appear before a word-final 's'.
unicode __st_ending = u'bdfghklmnt'
Letters that may directly appear before a word-final 'st'.
tuple __step1_suffixes = (u'ern', u'em', u'er', u'en', u'es', u'e', ...
Suffixes to be deleted in step 1 of the algorithm.
tuple __step2_suffixes = (u'est', u'en', u'er', u'st')
Suffixes to be deleted in step 2 of the algorithm.
tuple __step3_suffixes = (u'isch', u'lich', u'heit', u'keit', u'end'...
Suffixes to be deleted in step 3 of the algorithm.

Method Details

stem(self, word)


Stem a German word and return the stemmed form.

Parameters:
  • word (str, unicode) - The word to be stemmed.
Returns: unicode
The stemmed form.
Overrides: api.StemmerI.stem
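
The following is a minimal usage sketch of stem(), assuming NLTK is installed and that GermanStemmer can be constructed without arguments. The sample words are arbitrary illustrations, and no particular output is claimed for them. The import is guarded so the snippet still runs where NLTK is missing.

```python
try:
    from nltk.stem.snowball import GermanStemmer
except ImportError:
    GermanStemmer = None  # NLTK is not installed; nothing is stemmed below

# Arbitrary sample words chosen for illustration only.
words = [u"Häuser", u"laufen", u"Freundlichkeit"]
stems = []

if GermanStemmer is not None:
    stemmer = GermanStemmer()
    # stem() takes one word at a time and returns its stem as a Unicode string.
    stems = [stemmer.stem(word) for word in words]
    for word, stem in zip(words, stems):
        print(u"%s -> %s" % (word, stem))
```

Because stem() handles a single word per call, tokenizing a sentence is left to the caller.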

Class Variable Details

__step1_suffixes

Suffixes to be deleted in step 1 of the algorithm.
Type:
tuple
Value:
(u'ern', u'em', u'er', u'en', u'es', u'e', u's')
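
How such a suffix tuple might be applied can be sketched as follows. This is a simplified illustration, not NLTK's implementation: the helper names r1_start and strip_step1_suffix are hypothetical, and the R1 region and the "delete the longest matching suffix if it lies in R1" rule are taken from the Snowball algorithm description rather than from this page. The sketch leaves out the algorithm's other rules (for example the 'ß' replacement, the 'niss' special case, and the final umlaut removal).

```python
# Values copied from the class variables documented on this page.
VOWELS = u"aeiouyäöü"
S_ENDING = u"bdfghklmnrt"
STEP1_SUFFIXES = (u"ern", u"em", u"er", u"en", u"es", u"e", u"s")


def r1_start(word):
    """Hypothetical helper: index at which the region R1 begins.

    R1 starts after the first non-vowel that follows a vowel; the German
    rules additionally require at least three letters before R1.
    """
    for i in range(1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return max(i + 1, 3)
    return len(word)  # no such position: R1 is empty


def strip_step1_suffix(word):
    """Hypothetical helper: delete the longest step 1 suffix found in R1."""
    start = r1_start(word)
    # Try longer suffixes first so that 'ern' wins over 'n'-less variants.
    for suffix in sorted(STEP1_SUFFIXES, key=len, reverse=True):
        if not word.endswith(suffix):
            continue
        cut = len(word) - len(suffix)
        if cut < start:
            break  # longest match is not inside R1: leave the word alone
        if suffix == u"s" and word[cut - 1] not in S_ENDING:
            break  # 's' is only deleted after a valid s-ending letter
        return word[:cut]
    return word


print(strip_step1_suffix(u"hauses"))  # haus
print(strip_step1_suffix(u"bergs"))   # berg
print(strip_step1_suffix(u"autos"))   # autos ('o' is not a valid s-ending)
```

The s-ending check is what keeps the final 's' in 'autos' while still removing it from 'bergs'.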

__step3_suffixes

Suffixes to be deleted in step 3 of the algorithm.
Type:
tuple
Value:
(u'isch', u'lich', u'heit', u'keit', u'end', u'ung', u'ig', u'ik')