Portland Perl Mongers -- XML with Xtra X

Free Geek
1731 SE 10th Avenue
Portland, OR 97214, US
Public WiFi

Access Notes

Please register for the class via Eventbrite (https://freegeek.eventbrite.com). Please check in at the front desk when you arrive to let them know you are here for the class. Bags must be checked at the front entrance.

Description

How to learn to parse huge XML documents by doing it wrong for 5 years

Speaker: Tyler Riddle

When an XML document can't fit into memory, the vast majority of the solutions on CPAN are no longer available to you. When the documents are so large that they take up to 16 hours to process with the standard tools for handling large documents, your hands are tied even more. Tyler will cover what he learned while creating the Parse::MediaWikiDump and MediaWiki::DumpFile modules, which are made to handle the 24-gigabyte English Wikipedia dump files in a reasonable time frame.

1) Real-world benchmarks of C and Perl libraries used to process huge XML documents.

2) The dirty little secret about XS and what it means for you in this context.

3) The evolution of the implementation of a nice interface around event-oriented (SAX-style) XML parsing.

4) Why XML::LibXML::Reader and XML::CompactTree are your friends and how to tame them.

As always, the meeting will be followed by social hour at the Lucky Lab.
