I have an older application that publishes an RSS feed. The content of the <description> element is an abstract taken from a RichText field. I have a recurring problem with users pasting in content from MS Word that contains characters XML parsers can't handle.
Is there a simple (or even not so simple) way to scrub the RichText field and replace/delete these characters? Currently I am doing a substitution for higher ASCII values (ie: ) and CDATA tags.
I've also changed the encoding from UTF-8 to ISO-8859-1, but this hasn't cured the underlying problem.
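To make the question concrete, this is roughly the kind of scrub I have in mind, sketched in Python purely for illustration. The names `scrub`, `WORD_CHARS` and `INVALID_XML` are made up, the character table is only my guess at what Word typically pastes in, and the regex is meant to follow the Char production of XML 1.0. Treat it as a sketch under those assumptions, not a tested fix.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical table: typographic characters Word tends to insert,
# mapped to plain ASCII. Extend as needed.
WORD_CHARS = {
    "\u2018": "'", "\u2019": "'",    # curly single quotes
    "\u201c": '"', "\u201d": '"',    # curly double quotes
    "\u2013": "-", "\u2014": "-",    # en dash, em dash
    "\u2026": "...",                 # ellipsis
    "\u00a0": " ",                   # non-breaking space
    # The same characters when Windows-1252 bytes were misread as ISO-8859-1
    "\x91": "'", "\x92": "'", "\x93": '"', "\x94": '"',
    "\x96": "-", "\x97": "-", "\x85": "...",
}
_TABLE = str.maketrans(WORD_CHARS)

# Matches anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
INVALID_XML = re.compile(
    "[^\u0009\u000a\u000d\u0020-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)

def scrub(text):
    """Return text that is safe to place inside an XML element."""
    text = text.translate(_TABLE)     # replace Word typography with ASCII
    text = INVALID_XML.sub("", text)  # drop characters XML 1.0 forbids
    return escape(text)               # escape &, < and >

print(scrub("\u201cHi\u201d \u2013 it\u2019s <b>\x0bfine\x92"))
# prints: "Hi" - it's &lt;b&gt;fine'
```

Something along these lines would replace my current substitution step. Whether dropping the forbidden characters outright is acceptable, or whether they should be replaced with a placeholder, is part of what I'm asking.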
Be glad to kiss the virtual feet for a solution... :-)