I have an older application that publishes an RSS feed. The content of the <description> element is an abstract taken from a RichText field. I have a recurring problem with users pasting in content from MS Word that contains characters XML parsers can't handle.
Is there a simple (or even not so simple) way to scrub the RichText field and replace/delete these characters? Currently I am doing a substitution for higher ASCII values (ie: ) and CDATA tags.
I've also changed the encoding from UTF-8 to ISO-8859-1, but this hasn't cured the underlying problem.
Be glad to kiss the virtual feet for a solution... :-)
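For illustration, here is a minimal sketch of the kind of scrub being asked about, written in Python because the post does not say what language the application uses. The function name `scrub_for_xml` and the `WORD_CHARS` table are hypothetical, and the table is deliberately not exhaustive. It assumes the abstract is already available as a decoded Unicode string: Word "smart" punctuation is mapped to plain ASCII, characters outside the XML 1.0 legal range are dropped, markup characters are escaped, and anything still non-ASCII is emitted as a numeric character reference so that the result does not depend on the feed's declared encoding.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical, non-exhaustive table of Word "smart" punctuation -> plain ASCII.
WORD_CHARS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
    "\u2026": "...",                # ellipsis
    "\u00a0": " ",                  # non-breaking space
}

# Matches any character outside the XML 1.0 "Char" production.
ILLEGAL_XML = re.compile(
    "[^\u0009\u000a\u000d\u0020-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)

def scrub_for_xml(text):
    """Return text that is safe to place inside an XML element (not CDATA)."""
    # 1. Replace common Word punctuation with ASCII equivalents.
    text = text.translate(str.maketrans(WORD_CHARS))
    # 2. Drop control characters and other code points XML 1.0 forbids.
    text = ILLEGAL_XML.sub("", text)
    # 3. Escape &, < and > so stray markup cannot break the feed.
    text = escape(text)
    # 4. Emit remaining non-ASCII characters as numeric character references.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

Because step 4 produces character references, this sketch is meant for plain element content; references are not interpreted inside a CDATA section, so that step would need to be skipped if the <description> stays wrapped in CDATA.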