[Solved] Converting XML Feed to CSV
[Solved] Converting XML Feed to CSV
Hi friends,
I'm struggling here a bit.
I have an 8mb xml feed that I'd like to convert to CSV.
The feed looks like this:
<channel>
<title>W3Schools Home Page</title>
<link>https://www.w3schools.com</link>
<description>Free web building tutorials</description>
<item>
<title>RSS Tutorial</title>
<link>https://www.w3schools.com/xml/xml_rss.asp</link>
<description>New RSS tutorial on W3Schools</description>
</item>
<item>
<title>XML Tutorial</title>
<link>https://www.w3schools.com/xml</link>
<description>New XML tutorial on W3Schools</description>
</item>
</channel>
What are the steps to import the xml into calc and then get into columns eg:
Title, Link, Description etc?
I have tried to apply the same techniques with videos shown for Excel but can't find anything options relating to Calc.
Please help. Cheers
I'm struggling here a bit.
I have an 8mb xml feed that I'd like to convert to CSV.
The feed looks like this:
<channel>
<title>W3Schools Home Page</title>
<link>https://www.w3schools.com</link>
<description>Free web building tutorials</description>
<item>
<title>RSS Tutorial</title>
<link>https://www.w3schools.com/xml/xml_rss.asp</link>
<description>New RSS tutorial on W3Schools</description>
</item>
<item>
<title>XML Tutorial</title>
<link>https://www.w3schools.com/xml</link>
<description>New XML tutorial on W3Schools</description>
</item>
</channel>
What are the steps to import the xml into calc and then get into columns eg:
Title, Link, Description etc?
I have tried to apply the same techniques with videos shown for Excel but can't find anything options relating to Calc.
Please help. Cheers
Last edited by MrProgrammer on Sun Jan 03, 2021 10:03 pm, edited 1 time in total.
Reason: Tagged ✓ [Solved]
Reason: Tagged ✓ [Solved]
OpenOffice 4.1.5 on Windows 10
Re: Converting XML Feed to CSV
It is not possible unless you are a programmer. A programmer would not even need Calc nor Excel.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Re: Converting XML Feed to CSV
I'm irritated again.
Basically XML describes trees. It also has means to embed tables into these trees. The given example is a tree that cannot be represented by a table, imo, except tables in table cells are allowed. Text processors (and html editors) can do it. Spreadsheets (and csv) can't. In what way am I wrong?
Basically XML describes trees. It also has means to embed tables into these trees. The given example is a tree that cannot be represented by a table, imo, except tables in table cells are allowed. Text processors (and html editors) can do it. Spreadsheets (and csv) can't. In what way am I wrong?
Code: Select all
<channel>
<title>W3Schools Home Page
</title>
<link>https://www.w3schools.com
</link>
<description>Free web building tutorials
</description>
<item>
<title>RSS Tutorial
</title>
<link>https://www.w3schools.com/xml/xml_rss.asp
</link>
<description>New RSS tutorial on W3Schools
</description>
</item>
<item>
<title>XML Tutorial
</title>
<link>https://www.w3schools.com/xml
</link>
<description>New XML tutorial on W3Schools
</description>
</item>
</channel>
On Windows 10: LibreOffice 24.2 (new numbering) and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
---
Lupp from München
Re: Converting XML Feed to CSV
Right. There is one channel with one title, one link, one description but multiple items. This requires at least 2 normalized tables for items and channels.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Re: Converting XML Feed to CSV
This is the sort of thing for which regular expressions are excellent. You may need to split the file into several smaller parts to make processing faster.
A regular expressions search with <[^>]+> will strip all tags leaving your example as
It should now be pretty easy to structure it so it can be put into csv form with another regular expression search which recognises the web addresses (Google for many examples) and deletes the previous and the following line feeds, and replaces them by, say, %.
You now have lines like
Name the file fred.csv and open it with Calc. Sort on any column to gather all the blank rows together and remove them. Save as a .ods.
See attached file.
See [Tutorial] How to record a macro (and Regular Expressions) for more information.
Showing that a problem has been solved helps others searching so, if your problem is now solved, please view your first post in this thread and click the Edit button (top right in the post) and add [Solved] in front of the subject.
A regular expressions search with <[^>]+> will strip all tags leaving your example as
Code: Select all
W3Schools Home Page
https://www.w3schools.com
Free web building tutorials
RSS Tutorial
https://www.w3schools.com/xml/xml_rss.asp
New RSS tutorial on W3Schools
XML Tutorial
https://www.w3schools.com/xml
New XML tutorial on W3Schools
Edit: Not required. Use the Alternate Find and Replace add-on to search for empty paragraphs and replace them by £. Now replace end of paragraphs by %. With a little thought you will quickly find searches which get the three items onto a single line separated by % used as the separator. |
Code: Select all
W3Schools Home Page%https://www.w3schools.com%Free web building tutorials%
%
RSS Tutorial%https://www.w3schools.com/xml/xml_rss.asp%New RSS tutorial on W3Schools%
%
%
XML Tutorial%https://www.w3schools.com/xml%New XML tutorial on W3Schools%
See attached file.
See [Tutorial] How to record a macro (and Regular Expressions) for more information.
Showing that a problem has been solved helps others searching so, if your problem is now solved, please view your first post in this thread and click the Edit button (top right in the post) and add [Solved] in front of the subject.
- Attachments
-
- three lines.ods
- (9.48 KiB) Downloaded 89 times
LO 6.4.4.2, Windows 10 Home 64 bit
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
Re: Converting XML Feed to CSV
Applaudung to this ingenious solution I would nonetheless insist on keeping the different levels apart.
The XML file/string/stream presentd by the OP is not convertible to a table.
We can, however define one level (I would suggest the second one) to be the mother of our table.
For getting XML streams and for extracting content based on the Xpath syntax Calc in LibreOffice V4.2 and higher provides the functions WEBSERVICE() and FILTERXML() for compatibility with MS Excel.
Against what I would find reasonable XPATH actually allows to extract something like "columns by name" independent of the node level.
See https://www.w3schools.com/xml/xml_xpath.asp, https://en.wikipedia.org/wiki/XPath, or better https://de.wikipedia.org/wiki/XPath.
Concerning the approach by John_Ha I would also point another time to the coming LibreOffice V6.2.0 and its new REGEX() function. Presently you can test with V6.2.0.0Beta1 which is still a Dev version and installs in parallel.
The attached example using FILTERXML() will only run under LibO V4.2 or higher, as already mentioned.
The XML file/string/stream presentd by the OP is not convertible to a table.
We can, however define one level (I would suggest the second one) to be the mother of our table.
For getting XML streams and for extracting content based on the Xpath syntax Calc in LibreOffice V4.2 and higher provides the functions WEBSERVICE() and FILTERXML() for compatibility with MS Excel.
Against what I would find reasonable XPATH actually allows to extract something like "columns by name" independent of the node level.
See https://www.w3schools.com/xml/xml_xpath.asp, https://en.wikipedia.org/wiki/XPath, or better https://de.wikipedia.org/wiki/XPath.
Concerning the approach by John_Ha I would also point another time to the coming LibreOffice V6.2.0 and its new REGEX() function. Presently you can test with V6.2.0.0Beta1 which is still a Dev version and installs in parallel.
The attached example using FILTERXML() will only run under LibO V4.2 or higher, as already mentioned.
- Attachments
-
- aoo96107_FilerXml_Xpath_demo1.ods
- (19.92 KiB) Downloaded 87 times
On Windows 10: LibreOffice 24.2 (new numbering) and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
---
Lupp from München
Re: Converting XML Feed to CSV
Another simpler method is to put all the lines into a spreadsheet as below and use the spreadsheet sort.
Now delete all the blank rows. Adding a working column B which numbers all rows. Add another C which records if the line is blank or not. Sort on C then B. This places all blank rows together. Delete the blank rows. The remaining rows are still in the original order.
Now just move the 2nd cell A2 to B1 and the third cell A3 to C1. Repeat for all rows.
As always the quality of the result will depend on the quality of the data. This method assumes we always have three lines separated by one or more blank lines
Showing that a problem has been solved helps others searching so, if your problem is now solved, please view your first post in this thread and click the Edit button (top right in the post) and add [Solved] in front of the subject.
Code: Select all
W3Schools Home Page
https://www.w3schools.com
Free web building tutorials
RSS Tutorial
https://www.w3schools.com/xml/xml_rss.asp
New RSS tutorial on W3Schools
XML Tutorial
https://www.w3schools.com/xml
New XML tutorial on W3Schools
Now just move the 2nd cell A2 to B1 and the third cell A3 to C1. Repeat for all rows.
As always the quality of the result will depend on the quality of the data. This method assumes we always have three lines separated by one or more blank lines
Showing that a problem has been solved helps others searching so, if your problem is now solved, please view your first post in this thread and click the Edit button (top right in the post) and add [Solved] in front of the subject.
LO 6.4.4.2, Windows 10 Home 64 bit
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
- MrProgrammer
- Moderator
- Posts: 4895
- Joined: Fri Jun 04, 2010 7:57 pm
- Location: Wisconsin, USA
Re: Converting XML Feed to CSV
Calc was not designed for XML. You are using the wrong tool so the task will be difficult, like driving screws with a hammer.nguyenmm wrote:I have an 8mb xml feed that I'd like to convert to CSV.
A programmer would use XSLT, which was designed for XML manipulation. A non-programmer would use a tool like QXmlEdit.Villeroy wrote:A programmer would not even need Calc nor Excel.
But we could use an unnormalized table. In this case the OP would want to filter the table by Key=Idescription. Also see [Solved] XML file to Calc columns.Villeroy wrote:This requires at least 2 normalized tables for items and channels.
If this solved your problem please go to your first post use the Edit button and add [Solved] to the start of the title. You can select the green checkmark icon at the same time.
[Tutorial] Ten concepts that every Calc user should know
Mr. Programmer
AOO 4.1.7 Build 9800, MacOS 13.6.3, iMac Intel. The locale for any menus or Calc formulas in my posts is English (USA).
AOO 4.1.7 Build 9800, MacOS 13.6.3, iMac Intel. The locale for any menus or Calc formulas in my posts is English (USA).
Re: Converting XML Feed to CSV
As long as the hammer is heavy enough that works quite well for meMrProgrammer wrote:the task will be difficult, like driving screws with a hammer.
As for spreading jam - I agree a knife performs better than a hammer.
LO 6.4.4.2, Windows 10 Home 64 bit
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.
Remember: Always save your Writer files as .odt files. - see here for the many reasons why.