Irregular Expressions

Mar 20 2010   8:37PM GMT

Parsing XML with regular expressions – Part 1



Posted by: Dan O'Connor
Tags:
parse xml perl
perl
perl xml
xml

Many applications can now produce XML reports. Perl does have modules available to parse this information, but I find regular expressions are faster on extremely large data sets.

A small example.

<date>
	<start>Thu Mar  4 23:27:03 2010</start>
	<end>Thu Mar  4 23:58:37 2010</end>
</date>

Read the XML you need to parse into an array. You can use Perl's open or a shell command to do so.

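$target = "/path/to/what/you/want.xml";
# Three-argument open with a lexical filehandle; $! reports why the open failed.
open(my $fh, '<', $target) or die("Could not open $target: $!");
@file = <$fh>;
close($fh);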

OR

@file = `cat /path/to/what/you/want.xml`;

Now you just need to step through the array. A foreach loop works for this.

foreach(@file) {
}
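
If the report is too large to hold in an array comfortably, a while loop over the filehandle reads one line at a time instead. This is only a minimal sketch: the in-memory string $xml stands in for the real file, and $count is an illustrative name that is not part of the original script.

```perl
use strict;
use warnings;

# Stand-in for the real report (assumption); in practice open the file path instead.
my $xml = "<date>\n\t<start>Thu Mar  4 23:27:03 2010</start>\n\t<end>Thu Mar  4 23:58:37 2010</end>\n</date>\n";

# Opening a reference to a scalar gives an in-memory filehandle.
open(my $fh, '<', \$xml) or die "Could not open in-memory file: $!";

my $count = 0;
while (my $line = <$fh>) {
    # Only the current line is held in memory, unlike slurping into @file.
    $count++;
}
close($fh);

print "Read $count lines\n";
```

The body of the loop is where the matching shown next would go, using $line in place of the default $_.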

Now, going through the file, you can use the '...' range operator (Perl's flip-flop, not a regex itself) to match only the lines between two markers, and then get down to what you are looking for.

foreach(@file) {

	if(/<date>/ ... /<\/date>/) {
		if(/<start>(.+)<\/start>/) {
			$report_start = $1;
		}
		if(/<end>(.+)<\/end>/) {
			$report_end = $1;
		}
	}
}
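
To try the loop above without a report file, here is a self-contained sketch. The sample data is the small example from earlier, wrapped in a hypothetical <report> element with one extra <start> line outside the <date> block (both are assumptions added for illustration) to show that the flip-flop ignores matches outside the two markers.

```perl
use strict;
use warnings;

# Sample lines, split the way reading a filehandle in list context would return them.
# The <report> wrapper and the stray <start> line are illustrative assumptions.
my @file = split /^/, <<'XML';
<report>
<date>
	<start>Thu Mar  4 23:27:03 2010</start>
	<end>Thu Mar  4 23:58:37 2010</end>
</date>
<start>not a report date</start>
</report>
XML

my ($report_start, $report_end);

foreach (@file) {
    # True from the line matching <date> through the line matching </date>.
    if (/<date>/ ... /<\/date>/) {
        $report_start = $1 if /<start>(.+?)<\/start>/;
        $report_end   = $1 if /<end>(.+?)<\/end>/;
    }
}

print "Start: $report_start\n";
print "End:   $report_end\n";
```

Note that this approach assumes each element sits on its own line, as in the sample; a tag that spans several lines would not match.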
