Question about character encoding

lis123
Posts: 1
Joined: Sun Sep 24, 2017 7:17 pm

Question about character encoding

Post by lis123 »

Hello, I am looking for some information, and hoping one of you kind people could help me.

What character encoding does an .odt file use? Is it UTF-8? Also, what is the default character set? Is it ISO 8859-1?

Any insight into the encoding and character set used by .odt would be super helpful.

Thanks!!!
OpenOffice 4.1.1 on Windows
John_Ha
Volunteer
Posts: 9584
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: Question about character encoding

Post by John_Ha »

I don't know, but an .odt file is actually a ZIP file. Unzip it and look at content.xml. Does that answer your question? If not, see the Apache OpenOffice Wiki.
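The unzip-and-look step can also be done from a script. The following is only a minimal sketch in Python using the standard zipfile module; the helper names read_xml_prolog and build_sample_package are hypothetical, and the sample package built in memory is a stand-in for illustration, not a complete .odt file. With a real document you would pass its path instead.

```python
import io
import zipfile


def read_xml_prolog(odt_source, member="content.xml", length=100):
    """Return the first bytes of an XML member inside an .odt (ZIP) package.

    odt_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(odt_source) as package:
        with package.open(member) as xml_file:
            return xml_file.read(length)


def build_sample_package():
    """Build a tiny stand-in ZIP in memory (hypothetical, not a full .odt)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr(
            "content.xml",
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<office:document-content '
            'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"/>',
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Replace build_sample_package() with the path to a real .odt file.
    print(read_xml_prolog(build_sample_package()))
```

The first line of content.xml is where an encoding declaration, if present, would appear.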
LO 6.4.4.2, Windows 10 Home 64 bit

See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
Jan_J
Posts: 167
Joined: Wed Apr 29, 2009 1:42 pm
Location: Poland

Re: Question about character encoding

Post by Jan_J »

The editor works internally with Unicode.

Inside the .odt ZIP container, most of the content is stored as XML files. The XML rules state that the encoding is UTF-8 (or UTF-16, signalled by a byte order mark) unless the XML declaration at the start of the file explicitly says otherwise.

<?xml version="1.0" encoding="current_encoding_name_here" ?>

In practice, my documents are always UTF-8.
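To check which encoding a given XML file declares, something like the following might work. It is a rough sketch, not a full XML parser: the function name declared_encoding is hypothetical, and it assumes the bytes are in an ASCII-compatible encoding (so it does not handle UTF-16) and falls back to UTF-8 when no declaration is present.

```python
import re

# Matches an XML declaration and captures the optional encoding name.
XML_DECLARATION = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']+["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
)


def declared_encoding(xml_bytes, default="utf-8"):
    """Return the encoding named in the XML declaration, lower-cased.

    Assumes an ASCII-compatible byte stream; returns `default` when the
    declaration or its encoding attribute is missing.
    """
    # Skip a UTF-8 byte order mark if one is present.
    if xml_bytes.startswith(b"\xef\xbb\xbf"):
        xml_bytes = xml_bytes[3:]
    match = XML_DECLARATION.match(xml_bytes)
    if match and match.group(1):
        return match.group(1).decode("ascii").lower()
    return default


if __name__ == "__main__":
    print(declared_encoding(b'<?xml version="1.0" encoding="UTF-8"?><doc/>'))
    print(declared_encoding(b'<?xml version="1.0"?><doc/>'))
```

Feeding it the first bytes of content.xml from an .odt package should report the encoding that file declares.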
JJ ∙ https://forum.openoffice.org/pl/
LO (7.6) ∙ Python (3.11|3.10) ∙ Unicode 15 ∙ LᴬTEX 2ε ∙ XML ∙ Unix tools ∙ Linux (Rocky|CentOS)