Package nltk :: Package tokenize :: Module texttiling

Module texttiling

Classes
TextTilingTokenizer
A section tokenizer based on the TextTiling algorithm.
TokenTableField
A field in the token table holding the parameters for each token, used later in the process.
TokenSequence
A token list with its original length and its index.
Functions
 
smooth(x, window_len=11, window='flat')
Smooth the data using a window of the requested size.
 
demo(text=None)
Variables
  DEFAULT_SMOOTHING = [0]
  BLOCK_COMPARISON = 0
  HC = 1
  LC = 0
  VOCABULARY_INTRODUCTION = 1
Function Details

smooth(x, window_len=11, window='flat')

Smooth the data using a window of the requested size.

This method is based on the convolution of a scaled window with the signal.
The signal is prepared by introducing reflected copies of the signal
(with the window size) at both ends so that transient parts are minimized
at the beginning and end of the output signal.
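
The approach described above can be sketched as follows. This is a minimal illustration that assumes NumPy is available; the name smooth_sketch, the WINDOWS table, and the use of np.pad with mode="reflect" are assumptions made for the example, not the module's own code, so values near the edges may differ from what smooth() returns.

```python
import numpy as np

# Window generators named in the docstring; 'flat' is handled separately.
WINDOWS = {
    "hanning": np.hanning,
    "hamming": np.hamming,
    "bartlett": np.bartlett,
    "blackman": np.blackman,
}


def smooth_sketch(x, window_len=11, window="flat"):
    """Hypothetical helper: reflect-pad a 1-D signal, then convolve it
    with a window scaled to sum to 1."""
    x = np.asarray(x, dtype=float)
    if x.ndim != 1:
        raise ValueError("smooth_sketch only accepts 1-D signals")
    if window_len < 3:
        return x  # nothing to smooth
    if window_len % 2 == 0:
        raise ValueError("window_len should be an odd integer")
    if x.size < window_len:
        raise ValueError("signal must be at least as long as the window")

    if window == "flat":
        w = np.ones(window_len)  # moving average
    elif window in WINDOWS:
        w = WINDOWS[window](window_len)
    else:
        raise ValueError("unknown window: %r" % (window,))

    # Reflected copies at both ends reduce edge transients.
    half = window_len // 2
    padded = np.pad(x, half, mode="reflect")

    # 'valid' keeps only fully overlapping positions, so the output
    # has the same length as the input.
    return np.convolve(padded, w / w.sum(), mode="valid")


if __name__ == "__main__":
    t = np.arange(-2, 2, 0.1)
    noisy = np.sin(t) + np.random.randn(len(t)) * 0.1
    print(smooth_sketch(noisy, window_len=11, window="hanning").shape)
```

Because the window is scaled to sum to 1, a constant signal passes through unchanged, and with the flat window each output sample is the mean of the window_len samples centered on it.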

input:
    x: the input signal 
    window_len: the dimension of the smoothing window; should be an odd integer
    window: the type of window, one of 'flat', 'hanning', 'hamming', 'bartlett', 'blackman'
        A flat window produces a moving-average smoothing.

output:
    the smoothed signal

example:

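from numpy import arange, sin
from numpy.random import randn
t = arange(-2, 2, 0.1)  # step of 0.1; linspace's third argument is a sample count, not a step
x = sin(t) + randn(len(t)) * 0.1  # noisy sine wave
y = smooth(x)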

see also: 

numpy.hanning, numpy.hamming, numpy.bartlett, numpy.blackman, numpy.convolve
scipy.signal.lfilter

TODO: the window parameter could be the window itself, given as an array instead of a string.