Irregular Expressions

Mar 28 2010   9:14PM GMT

Parsing XML with regular expressions – Part 2



Posted by: Dan O'Connor
Tags:
parse xml
perl regex
regex group
regex xml

You can also run into XML formatted like this.

                <global>
                        <pref name="trusted_ca" value="cacert.pem" />
                        <pref name="hide_toolbar" value="no" />
                        <pref name="hide_msglog" value="no" />
                        <pref name="auto_enable_new_plugins" value="yes" />
                        <pref name="use_client_cert" value="no" />
                        <pref name="nessusd_port" value="yes" />
                        <pref name="nessusd_user" value="openvas" />
                        <pref name="paranoia_level" value="yes" />
                        <pref name="targets" value="192.168.0.197" />
                        <pref name="name" value="Report 20100304-235837" />
                        <pref name="comment" value="" />
                </global>

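While this might look daunting, it’s easy to pull out anything you want with almost the same code as in the last example. Here the only line that matches is the one whose value starts with “Report”, so we capture the pref name and the report timestamp.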

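my ($name, $report);
foreach (@file) {
	# Only look at lines between <global> and </global>.
	if(/<global>/ ... /<\/global>/) {
		# Group 1 is the pref name, group 2 is the text after "Report ".
		if(/<pref name="(.+)" value="Report (.+)" \/>/) {
			$name = $1;
			$report = $2;
		}
	}
}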

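In a regex, “()” marks a capture group; in Perl you refer to the captured text with numbered variables starting at $1, counted by opening parenthesis from left to right. A “.” is a wildcard that matches any single character except a newline, and a “+” means the preceding item must match at least once and will keep matching for as long as it can (it is greedy).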
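
Because “.+” is greedy, it can run past the quote you meant it to stop at when a line has more than one attribute. Here is a rough sketch of one way around that, assuming you want every pref and not just the report line. The %prefs hash and the shortened sample @file are made up for illustration; in the real script @file would hold the lines of your XML file.

```perl
use strict;
use warnings;

# Sample input, shortened from the XML shown above (illustration only).
my @file = (
    '<global>',
    '    <pref name="nessusd_user" value="openvas" />',
    '    <pref name="targets" value="192.168.0.197" />',
    '    <pref name="name" value="Report 20100304-235837" />',
    '    <pref name="comment" value="" />',
    '</global>',
);

# Greedy versus non-greedy on a single line.
my $line = $file[3];
my ($greedy) = $line =~ /name="(.+)"/;     # runs on to the last quote on the line
my ($lazy)   = $line =~ /name="(.+?)"/;    # stops at the first closing quote

my %prefs;    # hypothetical name: maps each pref name to its value
foreach (@file) {
    # Only look at lines between <global> and </global>.
    if (/<global>/ ... /<\/global>/) {
        # [^"]+ is one or more characters that are not a quote, so it
        # cannot spill into the next attribute; [^"]* also accepts an
        # empty value such as value="".
        if (/<pref name="([^"]+)" value="([^"]*)" \/>/) {
            $prefs{$1} = $2;
        }
    }
}

print "$_ = '$prefs{$_}'\n" for sort keys %prefs;
```

The negated character class is usually the safer choice for attribute values, since it does not depend on what else happens to be on the line.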
