A TECHNIQUE TO PARSE DOCUMENTS IN PARALLEL BY GENERATING A META-DFA TO ENABLE SPECULATIVE PARSING

   
 

LEAD INVENTOR:

Kenneth Chiu

CONTACT INFORMATION:

Scott Hancock
Assistant Director for Licensing, Technology Transfer and Innovation Partnerships
Tel: 607-777-5874
Fax: 607-777-5788
shancock@binghamton.edu

DESCRIPTION:

XML has emerged as the standard for data storage and management; however, some of the characteristics that led to XML’s success, such as its verbose and self-descriptive nature, can incur significant performance penalties. To be usable, the data within an XML document must be parsed: divided, analyzed, and categorized. The recent adoption of multicore processors offers an opportunity to increase XML parsing speeds. Parsing XML in parallel is challenging, however, because it is difficult to start parsing in the middle of an XML document without the preceding content. The meta-DFA invention, a series of software-implemented automata, addresses this problem by simultaneously considering all possibilities while parsing an XML document, in combination with a skeleton outline generated by a parallel preparsing approach. Thus, the preceding context is not needed, because all possible preceding contexts are considered simultaneously. An initial implementation of meta-DFA on Linux shows a 3.3-times speedup of preparsing on a four-core processor.
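
The idea of considering all preceding contexts at once can be illustrated with a minimal Python sketch. It is not the inventors' implementation: the three-state DFA for a simplified XML-like syntax, and the names `step`, `run_meta_dfa`, `split_chunks`, and `parallel_end_state`, are all assumptions made for illustration. Each chunk is scanned once while tracking, for every possible starting state, the state the DFA would end in; the per-chunk results are then stitched together in order from the known initial state.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy DFA states for a simplified XML-like syntax.
STATES = ("content", "tag", "quoted")


def step(state, ch):
    """Apply one transition of the toy DFA."""
    if state == "content":
        return "tag" if ch == "<" else "content"
    if state == "tag":
        if ch == ">":
            return "content"
        return "quoted" if ch == '"' else "tag"
    # state == "quoted": only a closing quote leaves the attribute value
    return "tag" if ch == '"' else "quoted"


def run_meta_dfa(chunk):
    """Scan a chunk from every possible start state at once.

    Returns a dict mapping each start state to its end state, so the
    chunk can be processed without knowing the preceding content.
    """
    current = {s: s for s in STATES}
    for ch in chunk:
        current = {start: step(state, ch) for start, state in current.items()}
    return current


def split_chunks(text, n):
    """Split text into at most n pieces of roughly equal size."""
    size = max(1, -(-len(text) // n))  # ceiling division
    return [text[i:i + size] for i in range(0, len(text), size)]


def parallel_end_state(text, workers=4, initial="content"):
    """Scan chunks concurrently, then resolve the true states in order."""
    chunks = split_chunks(text, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mappings = list(pool.map(run_meta_dfa, chunks))
    state = initial
    for mapping in mappings:
        state = mapping[state]  # pick the branch that matches reality
    return state


def sequential_end_state(text, initial="content"):
    """Reference result: an ordinary left-to-right scan."""
    state = initial
    for ch in text:
        state = step(state, ch)
    return state


if __name__ == "__main__":
    doc = '<a href="x>y"><b>text</b></a>'
    print(parallel_end_state(doc), sequential_end_state(doc))
```

In this sketch the threads only show the structure of the computation; in CPython they would not by themselves yield a multicore speedup. The cost of the approach is that each chunk carries one DFA state per possible start state, which stays cheap only while the number of states is small.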

 

POTENTIAL APPLICATIONS:

  • Distributed computing (e.g., scientific modeling, airline ticketing, bioinformatics, logistics)
  • Ubiquitous computing (e.g., communication between mobile devices and a computer)
  • Embedded applications (e.g., “smart routers” for networking use at enterprise level)
  • Management of large data structures (e.g., financial transactions and records, crime statistics, court records)

 

ADVANTAGES:

  • The technology should remain useful over the long term as multicore processors become more widely used and more capable.
  • Meta-DFA complements other XML acceleration techniques and can be implemented alongside them, with no additional hardware required.

LIMITATIONS:

  • A multicore computer is required to implement meta-DFA.

ADDITIONAL INFORMATION:

  • Computer Science Professor Chiu is an expert in enhancing XML performance, especially in wide-scale, loosely coupled distributed systems for scientific computing applications. This invention is part of a portfolio that includes RB-259, “A Technique to Enable Parallel XML Parsing by Generating an Outline of the XML Document Using Preparsing Pass,” and RB-261, “A Technique to Structurally Partition an XML Document for Parallel Parsing using an Outline of the XML Document.”
  • Yinfei Pan, Ying Zhang, Kenneth Chiu and Wei Lu, “Parallel XML Parsing Using Meta-DFAs,” 3rd IEEE International Conference on e-Science and Grid Computing, Bangalore, India, December 10-13, 2007.

 

PATENT STATUS:

Patenting strategy is under evaluation.