Parsing XML with regular expressions – Part 1

Many applications can now produce XML reports. Perl does have modules available to parse this information, but I find regular expressions are faster on extremely large data sets.
A small example.
<date>
  <start>Thu Mar 4 23:27:03 2010</start>
  <end>Thu Mar 4 23:58:37 2010</end>
</date>
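Read the XML you need to parse into an array. You can use Perl's open or a shell command to do so.
$target = "/path/to/what/you/want.xml";
# Three-argument open with a lexical filehandle; $! reports why the open failed
open(my $fh, '<', $target) || die("Could not open $target: $!");
@file = <$fh>;
close($fh);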
OR
@file = `cat /path/to/what/you/want.xml`;
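As a minimal sketch of the first approach, the open-and-read step can be wrapped in a small helper. The name read_lines is hypothetical, and the temporary file only stands in for a real report so the snippet runs anywhere:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return every line of a file as a list.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Could not open $path: $!";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# Write the sample report to a temporary file (a stand-in for a real report).
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} <<'XML';
<date>
<start>Thu Mar 4 23:27:03 2010</start>
<end>Thu Mar 4 23:58:37 2010</end>
</date>
XML
close($tmp_fh);

my @file = read_lines($tmp_path);
print scalar(@file), " lines read\n";
```

Unlike the cat version, this reports the reason for a failed open and does not spawn a shell.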
Now you just need to step through the array; a foreach loop works well for this.
foreach (@file) {
    # $_ holds the current line
}
While going through the file, you can use the three-dot range operator (...) to match only the lines between two markers, and then pick out the values you are looking for.
foreach (@file) {
    # True from the line containing <date> through the line containing </date>
    if (/<date>/ ... /<\/date>/) {
        if (/<start>(.+)<\/start>/) { $report_start = $1; }
        if (/<end>(.+)<\/end>/)     { $report_end = $1; }
    }
}
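To see the whole thing run end to end, here is a self-contained sketch under a few assumptions of mine. The report is held in a string and read through an in-memory filehandle instead of a real file, lines are read one at a time with while rather than loaded into an array, and the last line of the sample data is invented to show that matches outside the date block are skipped.

```perl
use strict;
use warnings;

# Sample report from the example above. The final <start> line is invented
# for illustration: it sits outside <date> and should be ignored.
my $xml = <<'XML';
<date>
<start>Thu Mar 4 23:27:03 2010</start>
<end>Thu Mar 4 23:58:37 2010</end>
</date>
<start>outside any date block</start>
XML

# Open the string as if it were a file; a real script would pass a path here.
open(my $fh, '<', \$xml) or die "Could not open in-memory report: $!";

my ($report_start, $report_end);
while (my $line = <$fh>) {
    # The range is true from the <date> line through the </date> line.
    if ($line =~ /<date>/ ... $line =~ /<\/date>/) {
        if ($line =~ /<start>(.+?)<\/start>/) { $report_start = $1; }
        if ($line =~ /<end>(.+?)<\/end>/)     { $report_end   = $1; }
    }
}
close($fh);

print "Report ran from $report_start to $report_end\n";
```

Reading with while keeps only one line in memory at a time, which may matter on the extremely large files mentioned at the start. The non-greedy (.+?) is a small precaution in case two elements ever share a line.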