Package nltk :: Package stem :: Module snowball :: Class DanishStemmer

type DanishStemmer


          object --+            
                   |            
        api.StemmerI --+        
                       |        
_LanguageSpecificStemmer --+    
                           |    
        _ScandinavianStemmer --+
                               |
                              DanishStemmer

The Danish Snowball stemmer.


Note: A detailed description of the Danish stemming algorithm can be found at http://snowball.tartarus.org/algorithms/danish/stemmer.html.

Instance Methods
unicode
stem(self, word)
Stem a Danish word and return the stemmed form.

Inherited from _ScandinavianStemmer (private): _r1_scandinavian

Inherited from _LanguageSpecificStemmer: __init__, __repr__
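
The inherited _r1_scandinavian helper is private and is not described on this page. As a rough illustration of what such a helper computes, the sketch below follows the Snowball definition of the R1 region for the Scandinavian stemmers: R1 begins after the first non-vowel that follows a vowel, adjusted so that at least three letters precede it. The names r1_region and DANISH_VOWELS are hypothetical, and this is not NLTK's source code. It is written for Python 3, where str is already Unicode.

```python
# Vowel inventory copied from the __vowels class variable documented on this page.
DANISH_VOWELS = "aeiouyæåø"


def r1_region(word, vowels=DANISH_VOWELS):
    """Return the R1 region of `word` (hypothetical helper, not NLTK's code).

    R1 is the text after the first non-vowel that follows a vowel.
    The start of R1 is pushed right if needed so that the part of the
    word before R1 contains at least three letters.
    """
    for i in range(1, len(word)):
        if word[i] not in vowels and word[i - 1] in vowels:
            start = max(i + 1, 3)  # keep at least three letters before R1
            return word[start:]
    # No vowel followed by a non-vowel: R1 is the empty region at the end.
    return ""


if __name__ == "__main__":
    for sample in ("husene", "abe", "aben"):
        print(sample, "->", repr(r1_region(sample)))
```

With this sketch, 'husene' yields the region 'ene', while the three-letter word 'abe' yields an empty region because all of its letters fall before the adjusted start.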

Class Variables
unicode __vowels = u'aeiouyæåø'
The Danish vowels.
unicode __consonants = u'bcdfghjklmnpqrstvwxz'
The Danish consonants.
tuple __double_consonants = (u'bb', u'cc', u'dd', u'ff', u'gg', u'hh...
The Danish double consonants.
unicode __s_ending = u'abcdfghjklmnoprtvyzå'
Letters that may appear directly before a word-final 's'.
tuple __step1_suffixes = (u'erendes', u'erende', u'hedens', u'ethed'...
Suffixes to be deleted in step 1 of the algorithm.
tuple __step2_suffixes = (u'gd', u'dt', u'gt', u'kt')
Suffixes to be deleted in step 2 of the algorithm.
tuple __step3_suffixes = (u'elig', u'løst', u'lig', u'els', u'ig')
Suffixes to be deleted in step 3 of the algorithm.
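
The step 2 tuple is short enough to show how such a suffix list might be consumed. In the published Snowball description of the Danish algorithm, step 2 searches R1 for one of these suffixes and, if one is found, removes only the last letter of the word rather than the whole suffix. The sketch below assumes that behaviour. The names apply_step2 and STEP2_SUFFIXES and the r1 argument are hypothetical and are not part of the documented class.

```python
# Values copied from the __step2_suffixes class variable documented on this page.
STEP2_SUFFIXES = ("gd", "dt", "gt", "kt")


def apply_step2(word, r1):
    """Sketch of step 2: drop the final letter if R1 ends in a listed suffix.

    `word` is the current form of the word and `r1` its R1 region.
    Returns the (possibly shortened) word together with the updated R1.
    """
    for suffix in STEP2_SUFFIXES:
        if r1.endswith(suffix):
            # Only the last letter is removed, e.g. 'dt' becomes 'd'.
            return word[:-1], r1[:-1]
    return word, r1


if __name__ == "__main__":
    print(apply_step2("abcdt", "dt"))
    print(apply_step2("hus", ""))
```

Passing R1 explicitly keeps the suffix test restricted to that region, so a short word whose R1 is empty is returned unchanged.
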
Method Details

stem(self, word)


Stem a Danish word and return the stemmed form.

Parameters:
  • word (str, unicode) - The word to be stemmed.
Returns: unicode
The stemmed form.
Overrides: api.StemmerI.stem
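
A minimal usage sketch for stem follows. It assumes NLTK is installed and that the class can be imported from nltk.stem.snowball, as the module path at the top of this page suggests. The helper stem_tokens, the demo function, and the sample words are illustrative only and do not come from the library. The code targets Python 3, where the returned value is a str.

```python
def stem_tokens(stemmer, tokens):
    """Apply `stemmer.stem` to each token and return the stems as a list.

    `stemmer` may be any object with a `stem(word)` method, for example
    an instance of DanishStemmer.
    """
    return [stemmer.stem(token) for token in tokens]


def demo():
    """Print a few Danish words next to their stems (requires NLTK)."""
    # Imported here so that stem_tokens stays usable without NLTK.
    from nltk.stem.snowball import DanishStemmer

    stemmer = DanishStemmer()
    tokens = ["husene", "kvinderne", "venlig"]  # arbitrary sample words
    for token, stem in zip(tokens, stem_tokens(stemmer, tokens)):
        print(token, "->", stem)
```

Calling demo() prints each sample word alongside the stem returned by stem. Keeping the loop in stem_tokens makes it easy to swap in another stemmer that implements the same method.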

Class Variable Details

__double_consonants

The Danish double consonants.
Type:
tuple
Value:
(u'bb',
 u'cc',
 u'dd',
 u'ff',
 u'gg',
 u'hh',
 u'jj',
 u'kk',
...

__step1_suffixes

Suffixes to be deleted in step 1 of the algorithm.
Type:
tuple
Value:
(u'erendes',
 u'erende',
 u'hedens',
 u'ethed',
 u'erede',
 u'heden',
 u'heder',
 u'endes',
...