Question about character encoding


Post by lis123 » Sun Sep 24, 2017 7:21 pm

Hello, I am looking for some information, and hoping one of you kind people could help me.

What character encoding does an .odt file use? Is it UTF-8? Also, what is the default character set? Is it ISO 8859-1?

Any insight into the encoding and character set used by .odt would be super helpful.

OpenOffice 4.1.1 on Windows
Posts: 1
Joined: Sun Sep 24, 2017 7:17 pm

Re: Question about character encoding

Post by John_Ha » Sun Sep 24, 2017 9:03 pm

I don't know, but an .odt file is actually a ZIP file. Unzip it and look at content.xml. Does that answer your question? If not, see the Apache OpenOffice Wiki.
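
As a rough illustration of the unzip-and-look step, here is a minimal Python sketch using the standard zipfile module. The function names xml_prolog and make_demo_odt are made up for this example, and the demo archive is only a stand-in that contains a content.xml entry, not a complete ODF document.

```python
import io
import zipfile


def xml_prolog(odt_file, member="content.xml", size=80):
    """Return the first `size` bytes of an XML member inside an .odt (ZIP) file.

    `odt_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(odt_file) as archive:
        with archive.open(member) as xml_file:
            return xml_file.read(size)


def make_demo_odt():
    """Build a tiny in-memory ZIP that stands in for a real .odt file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "content.xml",
            '<?xml version="1.0" encoding="UTF-8"?>\n<doc/>'.encode("utf-8"),
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # With a real document, pass its path instead of the demo buffer.
    print(xml_prolog(make_demo_odt()))
```

The first bytes of content.xml normally hold the XML declaration, which is where any encoding name would appear.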
AOO 4.1.5, Windows 7 Home 64 bit

See the Writer Manual, the Writer FAQ, the Writer Tutorials and the Writer guide.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
Posts: 5689
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: Question about character encoding

Post by Jan_J » Sun Sep 24, 2017 10:33 pm

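The editor works internally with Unicode.

Inside the .odt ZIP container, most of the content is stored as XML files. Under the XML rules, a file with no encoding declaration is read as UTF-8 (or as UTF-16 if it starts with a byte order mark), unless the XML declaration at the top of the file explicitly names a different encoding.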
Code:
<?xml version="1.0" encoding="current_encoding_name_here" ?>
In practice, my documents are always UTF-8.
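
To check a particular file, the declaration can be read from the first bytes of the XML. The following is a simplified Python sketch, not a full XML parser: the function name xml_encoding is invented for this example, it falls back to UTF-8 when no encoding is declared, and it reports UTF-16 when the data starts with a UTF-16 byte order mark. It does not handle UTF-32 or other exotic cases.

```python
import codecs
import re

# Matches an XML declaration at the very start of the data and captures
# the value of its encoding pseudo-attribute, if one is present.
_DECLARATION = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def xml_encoding(data: bytes) -> str:
    """Guess the encoding of XML data from its BOM or XML declaration."""
    # A UTF-16 byte order mark takes precedence (simplified check).
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16"
    # Skip an optional UTF-8 byte order mark before the declaration.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    match = _DECLARATION.match(data)
    if match:
        return match.group(1).decode("ascii")
    # No declaration and no BOM: assume the UTF-8 default.
    return "UTF-8"


if __name__ == "__main__":
    print(xml_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
    print(xml_encoding(b'<?xml version="1.0"?><a/>'))
```

Feeding it the first bytes of content.xml from an .odt file shows what that file declares.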
JJ ∙
LO (5.0|5.1) ∙ AOO 4.1.2 ∙ Python (2.7|3.5) ∙ Unicode 8 ∙ LaTeX 2ε ∙ XML ∙ Unix tools ∙ Linux (2.6|3.x) ∙ Fedora ∙ CentOS ∙ SUSE
Posts: 140
Joined: Wed Apr 29, 2009 1:42 pm
Location: Poland
