I am parsing an XML file whose records contain, among other values, a URL to an XML file and a URL to a text file. OPENXML appears unable to read either the XML URL or the text-file URL, although it reads the PDF and HTML URLs fine. How can I get the XML URL into my SQL table as text? My next step will be to parse the XML behind each record's URL to retrieve some data, but that will come later; first I need to be able to read the URL. Right now all I get is NULL values.
Here is a sample of one record (there are about 5000 in the XML file):
<document id="1028" doc_type="3">
  <url_xml>https://www.someplace.com/docs/xml/something.xml</url_xml>
  <url_pdf>https://www.someplace.com/docs/pdf/something.pdf</url_pdf>
  <url_html>https://www.someplace.com/docs/html/something.html</url_html>
  <url_txt>https://www.someplace.com/docs/html2/something.txt</url_txt>
  <headline>Some Text</headline>
  <dateadded>20080924</dateadded>
  <active>1</active>
  <published>1</published>
</document>
Here is my code:
-- Load the whole file into the working table as a single blob
INSERT INTO #WorkingTable
SELECT * FROM OPENROWSET (BULK 'C:ProcessesBlueMatrixBMdocs.xml', SINGLE_BLOB) AS data;

-- Copy the file contents into the XML variable
SELECT @XML = Data FROM #WorkingTable;

-- Parse the document and get a handle for OPENXML
EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML;

-- Flag 2 = element-centric mapping; id and doc_type are read from attributes
SELECT getdate(), id, doc_type, url_xml, url_pdf, url_html, url_text, headline, dateadded, active
FROM OPENXML(@hDoc, '/doc_url/document', 2)
WITH (
    id int '@id',
    doc_type tinyint '@doc_type',
    url_xml varchar(200),
    url_pdf varchar(200),
    url_html varchar(200),
    url_text varchar(200),
    headline varchar(200),
    dateadded datetime,
    active tinyint
);

-- Release the parsed document
EXEC sp_xml_removedocument @hDoc;
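One way to narrow this down is a self-contained repro that takes the file load out of the picture. The sketch below is only that: it inlines a trimmed copy of the sample record, wraps it in a doc_url root element (an assumption based on the row pattern '/doc_url/document' used above), and gives the two problem columns an explicit column pattern. With element-centric mapping and no column pattern, OPENXML looks for an element with the same name as the column, so a column called url_text will not pick up an element called url_txt unless the pattern says so.

```sql
-- Minimal repro sketch: one inlined record instead of the bulk-loaded file.
-- The <doc_url> root is assumed from the row pattern in the original query.
DECLARE @hDoc int;
DECLARE @XML xml = N'<doc_url>
  <document id="1028" doc_type="3">
    <url_xml>https://www.someplace.com/docs/xml/something.xml</url_xml>
    <url_txt>https://www.someplace.com/docs/html2/something.txt</url_txt>
  </document>
</doc_url>';

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML;

SELECT id, doc_type, url_xml, url_text
FROM OPENXML(@hDoc, '/doc_url/document', 2)
WITH (
    id       int          '@id',
    doc_type tinyint      '@doc_type',
    url_xml  varchar(200) 'url_xml',  -- explicit pattern, same name as the element
    url_text varchar(200) 'url_txt'   -- column name differs from the element name
);

EXEC sp_xml_removedocument @hDoc;
```

If url_xml comes back populated here but is still NULL against the real file, the difference is probably in the file itself rather than in OPENXML, for example element names that differ in case (XML names are case-sensitive) or a namespace declared on the root element.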
Thanks for any help you can provide.
Software/Hardware used:
Windows Server 2008 R2, SQL Server 2008
ASKED:
June 28, 2011 5:21 PM
UPDATED:
June 28, 2011 7:00 PM