Package nltk :: Package toolbox :: Module toolbox :: Class ToolboxData

type ToolboxData

    object --+    
             |    
StandardFormat --+
                 |
                ToolboxData

Instance Methods

parse(self, grammar=None, **kwargs)

_record_parse(self, key=None, **kwargs)
Returns: ElementTree._ElementInterface
Returns an element tree structure corresponding to a toolbox data file with all markers at the same level.

_tree2etree(self, parent)

_chunk_parse(self, grammar=None, top_node='record', trace=0, **kwargs)
Returns: ElementTree._ElementInterface
Returns an element tree structure corresponding to a toolbox data file parsed according to the chunk grammar.

Inherited from StandardFormat: __init__, close, fields, open, open_string, raw_fields

Method Details

_record_parse(self, key=None, **kwargs)

Returns an element tree structure corresponding to a toolbox data file with all markers at the same level.

Thus the following Toolbox database:

   \_sh v3.0  400  Rotokas Dictionary
   \_DateStampHasFourDigitYear
   
   \lx kaa
   \ps V.A
   \ge gag
   \gp nek i pas
   
   \lx kaa
   \ps V.B
   \ge strangle
   \gp pasim nek

yields, after parsing, the same structure (ignoring extra whitespace) that ElementTree would produce from the following XML fragment:

   <toolbox_data>
       <header>
           <_sh>v3.0  400  Rotokas Dictionary</_sh>
           <_DateStampHasFourDigitYear/>
       </header>

       <record>
           <lx>kaa</lx>
           <ps>V.A</ps>
           <ge>gag</ge>
           <gp>nek i pas</gp>
       </record>
       
       <record>
           <lx>kaa</lx>
           <ps>V.B</ps>
           <ge>strangle</ge>
           <gp>pasim nek</gp>
       </record>
   </toolbox_data>

Parameters:
  • key (string) - Name of the key marker at the start of each record. If set to None (the default value), the first marker that doesn't begin with an underscore is assumed to be the key.
  • kwargs (keyword arguments dictionary) - Keyword arguments passed to StandardFormat.fields()
Returns: ElementTree._ElementInterface
Contents of toolbox data divided into header and records
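
The header/record split described above can be illustrated with a minimal, self-contained sketch. It is not the NLTK implementation: `iter_fields` and `record_parse_sketch` are hypothetical names, only the standard library is used, and each field is assumed to fit on a single line.

```python
import re
from xml.etree import ElementTree as ET

SAMPLE = r"""\_sh v3.0  400  Rotokas Dictionary
\_DateStampHasFourDigitYear

\lx kaa
\ps V.A
\ge gag
\gp nek i pas

\lx kaa
\ps V.B
\ge strangle
\gp pasim nek
"""


def iter_fields(text):
    """Yield (marker, value) pairs; hypothetical stand-in for a field reader."""
    for line in text.splitlines():
        match = re.match(r"\\(\S+)\s*(.*)", line)
        if match:
            yield match.group(1), match.group(2).strip()


def record_parse_sketch(text, key=None):
    """Build a flat tree: one header element followed by one element per record."""
    root = ET.Element("toolbox_data")
    current = ET.SubElement(root, "header")
    for marker, value in iter_fields(text):
        # With key=None, the first marker not starting with "_" becomes the key.
        if key is None and not marker.startswith("_"):
            key = marker
        # Every occurrence of the key marker opens a new record.
        if marker == key:
            current = ET.SubElement(root, "record")
        field = ET.SubElement(current, marker)
        field.text = value or None  # markers without a value become empty elements
    return root


root = record_parse_sketch(SAMPLE)
print(ET.tostring(root, encoding="unicode"))
```

All markers of a record end up as siblings directly under its record element, which is what "all markers at the same level" refers to.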

_chunk_parse(self, grammar=None, top_node='record', trace=0, **kwargs)

Returns an element tree structure corresponding to a toolbox data file parsed according to the chunk grammar.

Parameters:
  • grammar (string) - Contains the chunking rules used to parse the database. See chunk.RegExp for documentation.
  • top_node (string) - The node value that should be used for the top node of the chunk structure.
  • trace (int) - The level of tracing to use when parsing: 0 generates no tracing output, 1 generates normal tracing output, and 2 or higher generates verbose tracing output.
  • kwargs (keyword arguments dictionary) - Keyword arguments passed to toolbox.StandardFormat.fields()
Returns: ElementTree._ElementInterface
Contents of toolbox data parsed according to the rules in grammar
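
To give a rough idea of what grouping by a chunk grammar does to a flat record, the sketch below nests a run of fields under a new element. It is a simplification under stated assumptions: `chunk_record_sketch` is a hypothetical helper, the pattern is an ordinary Python regular expression over the marker sequence rather than NLTK's chunk grammar syntax, and the group name `sense` is chosen only for illustration.

```python
import re
from xml.etree import ElementTree as ET


def chunk_record_sketch(record, chunk_name, pattern):
    """Nest each run of fields whose marker sequence matches `pattern`.

    The marker sequence is written as a string such as "<lx><ps><ge><gp>".
    The pattern must match whole markers; a match that starts or ends
    in the middle of a marker raises KeyError.
    """
    fields = list(record)
    tags = "".join("<%s>" % field.tag for field in fields)

    # Map the character offset of each marker boundary to a field index.
    starts = {}
    offset = 0
    for index, field in enumerate(fields):
        starts[offset] = index
        offset += len(field.tag) + 2
    starts[offset] = len(fields)

    chunked = ET.Element(record.tag)
    position = 0
    for match in re.finditer(pattern, tags):
        first, last = starts[match.start()], starts[match.end()]
        chunked.extend(fields[position:first])  # fields outside any chunk
        group = ET.SubElement(chunked, chunk_name)
        group.extend(fields[first:last])        # fields inside the chunk
        position = last
    chunked.extend(fields[position:])
    return chunked


record = ET.fromstring(
    "<record><lx>kaa</lx><ps>V.A</ps><ge>gag</ge><gp>nek i pas</gp></record>"
)
chunked = chunk_record_sketch(record, "sense", r"<ps>(?:<ge>|<gp>)+")
print(ET.tostring(chunked, encoding="unicode"))
```

Here the part-of-speech field and the glosses that follow it are wrapped in one `sense` element, while `lx` stays directly under the record.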