Package nltk :: Package stem :: Module snowball :: Class FinnishStemmer

type FinnishStemmer

          object --+            
                   |            
        api.StemmerI --+        
                       |        
_LanguageSpecificStemmer --+    
                           |    
            _StandardStemmer --+
                               |
                              FinnishStemmer

The Finnish Snowball stemmer.


Note: A detailed description of the Finnish stemming algorithm is available at http://snowball.tartarus.org/algorithms/finnish/stemmer.html.

Instance Methods
unicode
stem(self, word)
Stem a Finnish word and return the stemmed form.

Inherited from _StandardStemmer (private): _r1r2_standard, _rv_standard

Inherited from _LanguageSpecificStemmer: __init__, __repr__
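
The inherited helper _r1r2_standard is not documented on this page. Judging by its name, it presumably computes the regions R1 and R2 used throughout the Snowball algorithms: R1 is the part of the word after the first non-vowel that follows a vowel, and R2 is the same construction applied to R1. The following is a minimal sketch of that definition, not NLTK's code; the function name r1_r2 and the constant FINNISH_VOWELS are hypothetical names chosen for illustration.

```python
# Hypothetical constant: the same characters as the __vowels class variable.
FINNISH_VOWELS = "aeiouyäö"


def r1_r2(word, vowels=FINNISH_VOWELS):
    """Return (R1, R2) for word under the standard Snowball definition.

    Illustrative sketch only; not taken from the NLTK source.
    """
    def region_after_first_vowel_consonant(text):
        # Find the first non-vowel that directly follows a vowel and
        # return everything after it; empty string if there is none.
        for i in range(1, len(text)):
            if text[i] not in vowels and text[i - 1] in vowels:
                return text[i + 1:]
        return ""

    r1 = region_after_first_vowel_consonant(word)
    r2 = region_after_first_vowel_consonant(r1)
    return r1, r2


# "beautiful" is the example word used in the Snowball documentation.
print(r1_r2("beautiful"))  # ('iful', 'ul')
```

Suffix removal in Snowball stemmers is generally restricted to these regions, which keeps very short words from being stripped down to nothing.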

Class Variables
unicode __vowels = u'aeiouyäö'
The Finnish vowels.
unicode __restricted_vowels = u'aeiouäö'
A subset of the Finnish vowels.
tuple __long_vowels = (u'aa', u'ee', u'ii', u'oo', u'uu', u'ää', u'öö')
The Finnish vowels in their long forms.
unicode __consonants = u'bcdfghjklmnpqrstvwxz'
The Finnish consonants.
tuple __double_consonants = (u'bb', u'cc', u'dd', u'ff', u'gg', u'hh...
The Finnish double consonants.
tuple __step1_suffixes = (u'kaan', u'kään', u'sti', u'kin', u'han', ...
Suffixes to be deleted in step 1 of the algorithm.
tuple __step2_suffixes = (u'nsa', u'nsä', u'mme', u'nne', u'si', u'n...
Suffixes to be deleted in step 2 of the algorithm.
tuple __step3_suffixes = (u'siin', u'tten', u'seen', u'han', u'hen',...
Suffixes to be deleted in step 3 of the algorithm.
tuple __step4_suffixes = (u'impi', u'impa', u'impä', u'immi', u'imma...
Suffixes to be deleted in step 4 of the algorithm.
Method Details

stem(self, word)

Stem a Finnish word and return the stemmed form.

Parameters:
  • word (str, unicode) - The word to be stemmed.
Returns: unicode
The stemmed form.
Overrides: api.StemmerI.stem

Class Variable Details

__double_consonants

The Finnish double consonants.
Type:
tuple
Value:
(u'bb',
 u'cc',
 u'dd',
 u'ff',
 u'gg',
 u'hh',
 u'jj',
 u'kk',
...

__step1_suffixes

Suffixes to be deleted in step 1 of the algorithm.
Type:
tuple
Value:
(u'kaan',
 u'kään',
 u'sti',
 u'kin',
 u'han',
 u'hän',
 u'ko',
 u'kö',
...

__step2_suffixes

Suffixes to be deleted in step 2 of the algorithm.
Type:
tuple
Value:
(u'nsa', u'nsä', u'mme', u'nne', u'si', u'ni', u'an', u'än', u'en')
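
To illustrate how a suffix tuple like this one can be used, here is a simplified longest-match removal sketch. It is not the NLTK implementation: as far as the Snowball description goes, the real step 2 only removes a suffix when it lies inside R1 and applies additional per-suffix conditions, all of which are omitted here. The names STEP2_SUFFIXES and strip_longest_suffix are hypothetical.

```python
# Copied from the __step2_suffixes value shown above.
STEP2_SUFFIXES = ("nsa", "nsä", "mme", "nne", "si", "ni", "an", "än", "en")


def strip_longest_suffix(word, suffixes):
    """Remove the longest suffix in suffixes that word ends with.

    Simplified illustration: ignores the R1 restriction and the
    per-suffix conditions of the actual algorithm.
    """
    # Try longer suffixes first so that "nsa" wins over a shorter match.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix):
            return word[:-len(suffix)]
    return word


print(strip_longest_suffix("talonsa", STEP2_SUFFIXES))  # talo
print(strip_longest_suffix("kirja", STEP2_SUFFIXES))    # kirja (no match)
```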

__step3_suffixes

Suffixes to be deleted in step 3 of the algorithm.
Type:
tuple
Value:
(u'siin',
 u'tten',
 u'seen',
 u'han',
 u'hen',
 u'hin',
 u'hon',
 u'hän',
...

__step4_suffixes

Suffixes to be deleted in step 4 of the algorithm.
Type:
tuple
Value:
(u'impi',
 u'impa',
 u'impä',
 u'immi',
 u'imma',
 u'immä',
 u'mpi',
 u'mpa',
...