
Module snowball


Snowball stemmers and accompanying demo function

This module provides a port of the Snowball stemmers developed by Dr Martin Porter. It also includes a demo function that demonstrates the different algorithms and can be invoked directly from the command line. For more information, see the class SnowballStemmer.
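As a minimal sketch of typical usage, assuming NLTK is installed and that SnowballStemmer takes a language name and exposes a stem() method and a languages attribute (as in the class documentation), a stemmer can be created and applied like this. The sample words are chosen only for illustration.

```python
from nltk.stem.snowball import SnowballStemmer

# The class attribute `languages` lists the language names the class accepts.
print(" ".join(SnowballStemmer.languages))

# Create a stemmer by language name and stem a few illustrative words.
stemmer = SnowballStemmer("english")
stems = [stemmer.stem(word) for word in ["running", "rights"]]
print(stems)
```

Selecting the stemmer by name keeps calling code independent of the concrete language class.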


Author: Peter Michael Stahl

Contacts:
pemistahl@gmail.com, http://twitter.com/pemistahl
Classes
SnowballStemmer
A word stemmer based on the Snowball stemming algorithms.
_LanguageSpecificStemmer
This helper subclass allows a specific stemmer to be invoked directly.
PorterStemmer
A word stemmer based on the original Porter stemming algorithm.
_ScandinavianStemmer
This subclass encapsulates a method for defining the string region R1.
_StandardStemmer
This subclass encapsulates two methods for defining the standard versions of the string regions R1, R2, and RV.
DanishStemmer
The Danish Snowball stemmer.
DutchStemmer
The Dutch Snowball stemmer.
EnglishStemmer
The English Snowball stemmer.
FinnishStemmer
The Finnish Snowball stemmer.
FrenchStemmer
The French Snowball stemmer.
GermanStemmer
The German Snowball stemmer.
HungarianStemmer
The Hungarian Snowball stemmer.
ItalianStemmer
The Italian Snowball stemmer.
NorwegianStemmer
The Norwegian Snowball stemmer.
PortugueseStemmer
The Portuguese Snowball stemmer.
RomanianStemmer
The Romanian Snowball stemmer.
RussianStemmer
The Russian Snowball stemmer.
SpanishStemmer
The Spanish Snowball stemmer.
SwedishStemmer
The Swedish Snowball stemmer.
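The language-specific classes listed above can also be instantiated directly. The following sketch, assuming NLTK is installed, compares GermanStemmer with the equivalent stemmer obtained by name through SnowballStemmer; the sample word is an illustrative choice.

```python
from nltk.stem.snowball import GermanStemmer, SnowballStemmer

# Instantiate the German stemmer directly ...
direct = GermanStemmer()
# ... or select it by language name via the generic class.
by_name = SnowballStemmer("german")

word = "Autobahnen"
print(direct.stem(word), by_name.stem(word))
```

Both routes are expected to yield the same stem, so the direct classes are mainly a convenience when the language is fixed in advance.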
Functions
demo()
This function provides a demonstration of the Snowball stemmers.
Function Details

demo()

This function provides a demonstration of the Snowball stemmers.

After the function is invoked and a language is specified, it stems an excerpt of the Universal Declaration of Human Rights (which is part of the NLTK corpus collection) and then prints both the original and the stemmed text.
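The behaviour described above can be approximated as follows. This is a rough sketch of the same idea, not the module's actual demo() code: the helper name stem_excerpt is hypothetical, and a short hard-coded sentence stands in for the corpus excerpt so that the example runs without downloading any corpus data.

```python
from nltk.stem.snowball import SnowballStemmer


def stem_excerpt(words, language):
    """Return (original_text, stemmed_text) for a list of word tokens.

    Hypothetical helper for illustration; not part of NLTK.
    """
    stemmer = SnowballStemmer(language)
    stemmed = [stemmer.stem(word) for word in words]
    return " ".join(words), " ".join(stemmed)


# Assumption: a hard-coded stand-in for the excerpt that the real demo
# reads from the NLTK corpus collection.
sample = "All human beings are born free and equal in dignity and rights".split()

original, stemmed = stem_excerpt(sample, "english")
print("Original:", original)
print("Stemmed: ", stemmed)
```

Printing the two texts one after the other makes it easy to see which words the stemmer changed.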