HTML Tables to Spreadsheet

25 pts.
Tags: HTML
Is anyone aware of a way to take HTML tables from a poorly coded website and convert them directly into a spreadsheet format?
I've been using an online HTML cleanup tool, but I have THOUSANDS of items to inventory.

Answer Wiki


Discuss This Question: 4 Replies

 
  • Subhendu Sen
    If I understand correctly, you can save the web page as "Web page, HTML only" and then convert it to your desired format, such as CSV.
    141,290 points
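One way the "save the page, then convert" route could look, as a minimal sketch using only the Python standard library. The class name `TableExtractor`, the helper `table_to_csv`, and the sample markup are all made up for illustration; nothing here comes from the site in question. The parser is deliberately forgiving about missing `</td>` and `</tr>` tags, on the assumption that the pages are sloppy in that particular way. It does not handle nested tables or `rowspan`/`colspan`.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a document as a list of rows of cell text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tables = []   # one list of rows per <table>, in document order
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._close_row()          # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()         # tolerate a missing </td> or </th>
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def close(self):
        super().close()
        self._close_row()              # flush a row left open at end of file

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            # Collapse runs of whitespace inside the cell to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row and self.tables:
            self.tables[-1].append(self._row)
        self._row = None


def table_to_csv(html_text, index=0):
    """Return table number `index` (counting from 0) as CSV text."""
    parser = TableExtractor()
    parser.feed(html_text)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.tables[index])
    return buffer.getvalue()


# Hypothetical sample with unclosed cells and rows, standing in for a saved page.
sample = """
<table>
  <tr><th>Item<th>Qty
  <tr><td>Widget &amp; bolt</td><td>3</td></tr>
  <tr><td>Gasket<td>12
</table>
"""
print(table_to_csv(sample))
```

For a page saved to disk, the same function could be fed the file's contents, for example `open(path, encoding="utf-8", errors="replace").read()`, where `path` and the encoding are whatever applies to the saved file.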
  • TheRealRaven
    Saving the pages might be the best choice, but if they are "poorly coded", the results can be almost anything. If they were created with tools like some of the older Microsoft HTML generators (exporting out of Word, etc.), the code might be truly terrible.

    If they were mostly manually created, they might not even be "tables" at all. Without examining a significant number of pages, we can't know what might work at all, much less work well.
    36,340 points
  • Cabrona
    Thank you for your input.

    I have found a way to scrape the HTML into a Google Sheet, which is easy to convert to CSV. Happily, the tables seem to scrape without error, so maybe the <td> markup is good!

    The only hangups I have:
    1. The imported data does not display images.
    2. I can't get a full inventory. The page itself has subpages within subpages, and list items in addition to the vast number of tables. I am currently mapping out the website, verifying that I have the tables imported, and doing a line-by-line crosscheck to verify I got all of the <li> items as well.

    The scraping formula (no images) looks like this: 

    =ImportHtml(URL, "table", num)

    The URL must be enclosed in quotation marks.
    I put 0 as the "num" argument; it just works better.

    The article about this can be found at:
    https://eagereyes.org/data/scrape-tables-using-google-docs
    25 points
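For the two hangups in the reply above (images not coming through, and list items and subpages being hard to inventory), a rough sketch of how the crosscheck might be partly automated with the Python standard library. The class name `PageInventory`, the sample markup, and the `example.com` URL are hypothetical. The idea is simply to record every `<li>` text, every image address, and every link target on a page, so the links can serve as a to-do list of subpages to visit next.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageInventory(HTMLParser):
    """Record <li> text, <img> addresses, and <a> targets found on one page."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url   # used to turn relative addresses into full URLs
        self.list_items = []
        self.images = []
        self.links = []
        self._item = None          # text fragments of the <li> being read, or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            self._close_item()     # tolerate a missing </li>
            self._item = []
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag in ("li", "ul", "ol"):
            self._close_item()

    def handle_data(self, data):
        if self._item is not None:
            self._item.append(data)

    def close(self):
        super().close()
        self._close_item()         # flush an item left open at end of file

    def _close_item(self):
        if self._item is not None:
            text = " ".join("".join(self._item).split())
            if text:
                self.list_items.append(text)
        self._item = None


# Hypothetical sample standing in for one page of the site.
sample = """
<ul>
  <li>Blue vase <img src="img/vase.jpg">
  <li><a href="/catalog/lamps.html">Lamps</a></li>
</ul>
"""
inventory = PageInventory("http://example.com/catalog/index.html")
inventory.feed(sample)
inventory.close()
print(inventory.list_items)
print(inventory.images)
print(inventory.links)
```

Nested lists are flattened here: an inner `<li>` ends the outer item's text, which may or may not suit a line-by-line crosscheck.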
  • Cabrona
    Since the tables I have scraped seem to be coded well enough, I am definitely going to try to save as "web page format, html only" as Subhendu suggested. Not sure why I didn't think of that! I will let you know what happens! Thank you.

    25 points
