[Solved] Error reading data from the internet

Posted: Thu Nov 03, 2016 2:29 pm
by go2visions

Is it possible to automatically fetch (one time) an HTML page, scrape the page and input the data into Calc?

If we can, please direct me to the correct Help documentation section.

Re: Scrape an HTML page

Posted: Thu Nov 03, 2016 3:03 pm
by Villeroy
menu:Insert>Link to external data...
Paste the URL and wait a few seconds until the table sections are analysed.
Select the section you need.

However, this is a very bad approach. The table data in your HTML comes from some other data source and is dumped into the HTML in order to be read by humans. You should access that data source directly.
If you have no access to the original source because you are harvesting data from another company's site, there are far more efficient tools than an office suite.
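
As a rough illustration of what a dedicated tool can look like, here is a minimal Python sketch that pulls the tables out of a saved HTML page and turns one of them into CSV text that Calc can open. It uses only the standard library. The names TableExtractor, html_tables and table_to_csv are made up for this example, and it assumes simple, non-nested tables.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text.

    Assumption: tables are not nested inside each other.
    """

    def __init__(self):
        super().__init__()
        self.tables = []    # one list of rows per <table> seen so far
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def _flush_cell(self):
        # Store the current cell with runs of whitespace collapsed.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _flush_row(self):
        # Store the current row, closing any cell left open.
        self._flush_cell()
        if self._row is not None and self.tables:
            self.tables[-1].append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._flush_row()   # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._flush_cell()  # tolerate a missing </td> or </th>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag in ("tr", "table"):
            self._flush_row()


def html_tables(html):
    """Return all tables found in an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


def table_to_csv(rows):
    """Render one table (a list of rows) as CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()
```

Calling html_tables() on the text of a saved page gives one list of rows per table; writing table_to_csv(tables[0]) to a .csv file gives something Calc can open directly.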

Re: Scrape an HTML page

Posted: Thu Nov 03, 2016 7:04 pm
by go2visions
This is the process I currently use:

1. I go to the external website
2. I press [CTRL]+[A] to select the entire page
3. Next, [CTRL]+[C]
4. Go to an OpenOffice sheet and [CTRL]+[V]
5. This copies the external HTML sheet into OpenOffice
6. It then converts the HTML data to the data I can use

This is very time-consuming, as I sometimes have 20-30+ pages to convert a day.

As a test, the website I am trying to fetch through Calc is
This link requires user id and password validation.
When I am logged in, clicking the pasted link in Calc opens the page in Firefox without error.

The problem is when I go to menu:Insert>Link to external data, I receive the error message
Error reading data from the Internet.
Server error message:.

I have Java 1.8.0_111 installed and running

I would like to insert a column of hyperlinks and have Calc read each page externally, automatically.
Will this be possible? Or what do you suggest to be my best alternative approach?
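
One possible alternative, sketched in Python using only the standard library: loop over a list of URLs, fetch each page with a user id and password, and keep the pages that failed separate from the ones that loaded. This assumes the site accepts HTTP Basic authentication, which may not be true (many sites use a login form and cookies instead). The function names make_opener, opener_fetch and fetch_pages are hypothetical, chosen for this example.

```python
import urllib.request


def make_opener(base_url, user, password):
    """Build an opener that answers HTTP Basic auth challenges.

    Assumption: the site uses HTTP Basic authentication.
    """
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, base_url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    return urllib.request.build_opener(handler)


def opener_fetch(opener, timeout=30):
    """Wrap an opener as a simple fetch(url) -> str function."""
    def fetch(url):
        with opener.open(url, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    return fetch


def fetch_pages(urls, fetch):
    """Fetch every URL; return (pages, errors) keyed by URL."""
    pages, errors = {}, {}
    for url in urls:
        try:
            pages[url] = fetch(url)
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            errors[url] = str(exc)
    return pages, errors
```

In use, the list of URLs would come from the column of hyperlinks (for example exported as a text file, one URL per line), and fetch would be opener_fetch(make_opener(base_url, user, password)). A page that cannot be read ends up in errors with the server's message instead of stopping the whole run.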

Re: Error reading data from the internet

Posted: Fri Nov 04, 2016 4:34 pm
by jrkrideau

Re: [Solved] Error reading data from the internet

Posted: Wed Feb 19, 2020 5:16 pm
by Chainmailguy
There seems to be an issue with authentication. I get the same failure when visiting https:// and success when visiting http://
