I have an older application that publishes an RSS feed. The content of the <description> element is an abstract from a RichText field. I have a recurring problem with users pasting in content from MS Word that contains characters XML parsers can't handle.
Is there a simple (or even not so simple) way to scrub the RichText field and replace/delete these characters? Currently I am doing a substitution for higher ASCII values (ie: ) and CDATA tags.
I've also changed the encoding from UTF-8 to ISO-8859-1, but this hasn't cured the underlying problem.
Be glad to kiss the virtual feet for a solution... :-)
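To make the kind of scrubbing being asked about concrete, here is a minimal sketch in Python. The language is an assumption (the post doesn't say what the application is written in), and the names `scrub_for_xml`, `WORD_PUNCTUATION`, and `INVALID_XML_CHARS` are hypothetical. It maps a few common Word "smart" punctuation characters to plain ASCII, drops anything outside the character ranges XML 1.0 allows, and escapes the markup characters so the result can go straight into <description>. The punctuation table is illustrative, not exhaustive.

```python
import re
from xml.sax.saxutils import escape

# A few common MS Word "smart" punctuation characters mapped to plain ASCII.
# Illustrative only; extend the table for whatever actually shows up in the feed.
WORD_PUNCTUATION = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # ellipsis
    "\u00a0": " ",    # non-breaking space
}

# Matches any character outside the ranges permitted by XML 1.0:
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def scrub_for_xml(text: str) -> str:
    """Return text that is safe to place inside an XML element."""
    # 1. Replace Word punctuation with ASCII equivalents.
    text = text.translate(str.maketrans(WORD_PUNCTUATION))
    # 2. Delete characters an XML 1.0 parser would reject.
    text = INVALID_XML_CHARS.sub("", text)
    # 3. Escape &, < and > so the text cannot break the markup.
    return escape(text)
```

Running the abstract through something like `scrub_for_xml(abstract)` before writing it into the feed would take the place of the manual substitutions. Note that this does not address a mismatch between the declared encoding and the bytes actually written, which is a separate thing to check.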