Irregular Expressions

Mar 20 2010   8:37PM GMT

Parsing XML with regular expressions – Part 1

Dan O'Connor

Many applications can now produce XML reports. Perl does have modules available to parse this information, but I find that regular expressions are faster on extremely large data sets.

A small example.

<date>
	<start>Thu Mar  4 23:27:03 2010</start>
	<end>Thu Mar  4 23:58:37 2010</end>
</date>

First, read the XML you need to parse into an array. You can use Perl's open or a shell command to do so.

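$target = "/path/to/what/you/want.xml";
# Three-argument open with a lexical filehandle; $! reports why the open failed.
open(my $fh, '<', $target) or die("Could not open $target: $!");
@file = <$fh>;
close($fh);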

OR

@file = `cat /path/to/what/you/want.xml`;

Now you just need to step through the array. A foreach loop works for this.

foreach(@file) {
}
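
Since the motivation here is very large files, it may be worth noting that the array is optional: a while loop can read the filehandle one line at a time instead. The following is only a minimal sketch. It uses an in-memory filehandle and a made-up $xml string as a stand-in for a real report, and $line_count is a name introduced purely for illustration.

```perl
use strict;
use warnings;

# Stand-in for a real report file (an assumption for this sketch).
my $xml = "<date>\n"
        . "\t<start>Thu Mar  4 23:27:03 2010</start>\n"
        . "\t<end>Thu Mar  4 23:58:37 2010</end>\n"
        . "</date>\n";

# Opening a reference to a scalar gives an in-memory filehandle,
# which keeps the example self-contained.
open(my $fh, '<', \$xml) or die("Could not open in-memory file: $!");

my $line_count = 0;
while (my $line = <$fh>) {
    # Only the current line is held in $line here, not the whole file.
    $line_count++;
}
close($fh);

print "Read $line_count lines\n";
```

With a real file you would pass the path to open instead of \$xml; the loop itself stays the same.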

While going through the file, you can use Perl's '...' range operator (often called the flip-flop operator) between two regexes to match only the lines between two markers, and then get down to what you are looking for.

foreach(@file) {

	if(/<date>/ ... /<\/date>/) {
		if(/<start>(.+)<\/start>/) {
			$report_start = $1;
		}
		if(/<end>(.+)<\/end>/) {
			$report_end = $1;
		}
	}
}
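
To try the approach end to end, here is a small self-contained sketch that runs the same loop over the sample XML from the top of this post. The sub name extract_report_dates, the @sample array, and the extra <report> wrapper lines are assumptions added for illustration only. The stray <start> line after </date> is there to show that lines outside the marked range are skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (start, end) found inside a <date> block.
sub extract_report_dates {
    my @lines = @_;
    my ($report_start, $report_end);
    foreach (@lines) {
        # True from the line matching <date> through the line matching </date>.
        if (/<date>/ ... /<\/date>/) {
            $report_start = $1 if /<start>(.+)<\/start>/;
            $report_end   = $1 if /<end>(.+)<\/end>/;
        }
    }
    return ($report_start, $report_end);
}

# Sample input, one array element per line, as if read from a file.
my @sample = (
    "<report>\n",
    "<date>\n",
    "\t<start>Thu Mar  4 23:27:03 2010</start>\n",
    "\t<end>Thu Mar  4 23:58:37 2010</end>\n",
    "</date>\n",
    "<start>outside the date block, so not captured</start>\n",
    "</report>\n",
);

my ($start, $end) = extract_report_dates(@sample);
print "start: $start\n";
print "end:   $end\n";
```

Note that this only works when each element sits on its own line, as in the sample above. If a report puts <date> and </date> on the same line, or wraps a value across lines, the patterns would need to change.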
