[Solved] Save international text in file as Unicode UTF-8

Creating a macro - Writing a Script - Using the API (OpenOffice Basic, Python, BeanShell, JavaScript)
Evgeniy
Posts: 43
Joined: Thu Jan 09, 2020 9:31 pm
Location: Russia

[Solved] Save international text in file as Unicode UTF-8

Post by Evgeniy »

I want to save a text string to a file, for example one containing Chinese characters or special German characters.
If I use Open ... For Output, these characters are saved as "??????,????. ?????,?????!", but if I use MsgBox for debugging, it shows the correct characters.

Code: Select all

	text="MICV —機械化步兵作戰車輛。"
	FileNo = Freefile
	Open Filename For Output As #FileNo
	Print #FileNo, "<HTML><BODY>"
	Print #FileNo, text
	Print #FileNo, "</BODY></HTML>"
	Close #FileNo
How to save text as Unicode UTF-8?
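Editor's note: the question marks are what you typically get when text is forced into a single-byte legacy code page that has no slot for those characters. The sketch below is plain Python run outside the office. It assumes, rather than confirms, that Print # writes in the system code page, and `cp1251` is only a guess based on a Russian Windows setup.

```python
# Minimal sketch: what happens when CJK text is squeezed into a legacy code page.
# "cp1251" is an assumed code page, not something confirmed in this thread.
text = "MICV —機械化步兵作戰車輛。"

# Characters the code page cannot represent are replaced with "?".
legacy = text.encode("cp1251", errors="replace")

# UTF-8 can represent every character, so the round trip is lossless.
utf8 = text.encode("utf-8")

print(legacy)
print(utf8.decode("utf-8") == text)
```

MsgBox works on the in-memory string, which would explain why it still shows the right characters.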
Last edited by Evgeniy on Sat Jan 18, 2020 10:57 am, edited 1 time in total.
OpenOffice 4.1.7 OS: Win10 x32 + Win10 x64
JeJe
Volunteer
Posts: 2785
Joined: Wed Mar 09, 2016 2:40 pm

Re: Save international text in file as Unicode UTF-8

Post by JeJe »

Try writing the UTF-8 BOM (the bytes EF BB BF) to the file before your string.

Then when an application loads the file it will recognise it as UTF-8.

http://www.unicode.org/faq/utf_bom.html#BOM
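Editor's note: for reference, here is a minimal sketch of a UTF-8 file that starts with those three bytes, in plain Python outside the office. The helper name `write_utf8_with_bom` and the temporary file are made up for the illustration. The BOM only labels the file; the text after it still has to be encoded as UTF-8 for the label to be true.

```python
import codecs
import os
import tempfile

def write_utf8_with_bom(path, text):
    """Write the UTF-8 BOM (EF BB BF), then the text encoded as UTF-8."""
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8)
        f.write(text.encode("utf-8"))

# Hypothetical demo file in a fresh temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "bom_demo.html")
write_utf8_with_bom(demo_path, "MICV —機械化步兵作戰車輛。")

with open(demo_path, "rb") as f:
    raw = f.read()

print(raw[:3].hex())  # the three BOM bytes
```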

Edit:
also try converting the string to a byte array before writing to the file


Edit: scrap that. Use Writer

viewtopic.php?f=7&t=56793
Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)
JeJe
Volunteer
Posts: 2785
Joined: Wed Mar 09, 2016 2:40 pm

Re: Save international text in file as Unicode UTF-8

Post by JeJe »

As you're on Windows, there is the WideCharToMultiByte Windows API function.
There is a page on using it here, but the code won't run in OO, which doesn't have StrPtr or VarPtr. You'd need to rewrite it without those, if that's even possible.

https://di-mgt.com.au/howto-convert-vba ... -utf8.html

The only way may be to put the string in a Writer document and save it with a UTF-8 filter.
JeJe
Volunteer
Posts: 2785
Joined: Wed Mar 09, 2016 2:40 pm

Re: Save international text in file as Unicode UTF-8

Post by JeJe »

You can do UTF-16 easily. "MICV —機械化步兵作戰車輛。" will be saved correctly.

Code: Select all


sub test

pth ="C:\tmp\tesssst.txt" 'change to whatever
st= "MICV —機械化步兵作戰車輛。"
setUnicodeFileString pth,st

 getunicodefilestring pth,st
 msgbox st
end sub

sub setUnicodeFileString(pth as string,st as string)

	dim b() as byte,f,flen as long
	f=freefile

	if dir(pth,0) <>"" then
		KILL PTH
	end if

	if left(st,1) <> chr(&HFEFF) then st = chr(&HFEFF)& st
	B= ST 'convert string to byte array and save file
	Open pth For binary access write As #f
	put #f,,b
	Close #f

end sub

sub getUnicodeFileString(pth as string,st as string) 

	dim b() as byte,f,flen as long

	if dir(pth,0) <>"" then
		f=freefile

		flen = filelen(pth)

		if flen<>0 then

			redim b(flen -1) 'redim byte array to size of file
			Open pth For binary access read As #f
			Seek #f,1
			get #f,,b
			Close #f
			st =b 'convert to string
		end if
	end if
end sub


Edit: added file open sub
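Editor's note: the same round trip can be sketched in plain Python outside the office, to show what ends up in the file. This assumes that assigning a Basic string to a byte array yields UTF-16 little-endian bytes, which the thread does not state. The function names simply mirror the Basic subs above.

```python
import os
import tempfile

def set_unicode_file_string(path, text):
    """Prefix U+FEFF if missing, then write the text as UTF-16 little-endian."""
    if not text.startswith("\ufeff"):
        text = "\ufeff" + text
    with open(path, "wb") as f:
        f.write(text.encode("utf-16-le"))

def get_unicode_file_string(path):
    """Read the file back; the 'utf-16' codec uses the BOM to pick the byte order and drops it."""
    with open(path, "rb") as f:
        return f.read().decode("utf-16")

# Hypothetical demo file in a fresh temporary directory.
utf16_path = os.path.join(tempfile.mkdtemp(), "utf16_demo.txt")
set_unicode_file_string(utf16_path, "MICV —機械化步兵作戰車輛。")
restored = get_unicode_file_string(utf16_path)

with open(utf16_path, "rb") as f:
    first_two = f.read(2)

print(first_two.hex(), restored)
```

Note that this produces UTF-16, not the UTF-8 that was asked for, so a program reading the file has to expect UTF-16.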
musikai
Volunteer
Posts: 294
Joined: Wed Nov 11, 2015 12:19 am

Re: Save international text in file as Unicode UTF-8

Post by musikai »

Evgeniy wrote:

Code: Select all

	text="MICV —機械化步兵作戰車輛。"
	FileNo = Freefile
	Open Filename For Output As #FileNo
	Print #FileNo, "<HTML><BODY>"
	Print #FileNo, text
	Print #FileNo, "</BODY></HTML>"
	Close #FileNo
How to save text as Unicode UTF-8?
This works only for ASCII text. (And don't use "text" as a variable name.)
You either have to try to understand JeJe's solution (I don't) or write text files this way:

Code: Select all

sub write_utf8
strDatnam = "C:\Users\Yourname\Desktop\testit.html"   'edit it !!!!!!
textstring="MICV —機械化步兵作戰車輛。"

oSFA = CreateUnoService("com.sun.star.ucb.SimpleFileAccess")
If oSFA.exists( strDatnam ) Then oSFA.kill( strDatnam )
oTextoutputStream = CreateUnoService("com.sun.star.io.TextOutputStream")
ooutputStream = oSFA.openFileWrite(strDatnam)
oTextoutputStream.setOutputStream(ooutputStream)

oTextoutputStream.writeString("<HTML><BODY>" & CHR$(13) & CHR$(10)) 
oTextoutputStream.writeString(textstring & CHR$(13) & CHR$(10)) 
oTextoutputStream.writeString("</BODY></HTML>" & CHR$(13) & CHR$(10)) 

oTextoutputStream.closeOutput()
end sub
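Editor's note: as far as I know, TextOutputStream encodes as UTF-8 unless told otherwise, which would be why this macro produces the wanted file. For comparison, here is a minimal sketch of the same three lines written as UTF-8 with CRLF endings in plain Python, outside the office. The helper name `write_utf8` and the temporary path are made up for the illustration.

```python
import os
import tempfile

CRLF = "\r\n"

def write_utf8(path, lines):
    """Write each line as UTF-8, terminated by an explicit CRLF."""
    # newline="" stops Python from translating the explicit CRLF endings.
    with open(path, "w", encoding="utf-8", newline="") as f:
        for line in lines:
            f.write(line + CRLF)

# Hypothetical demo file in a fresh temporary directory.
html_path = os.path.join(tempfile.mkdtemp(), "testit.html")
write_utf8(html_path, ["<HTML><BODY>", "MICV —機械化步兵作戰車輛。", "</BODY></HTML>"])

with open(html_path, "rb") as f:
    html_bytes = f.read()

print(html_bytes.decode("utf-8"))
```

Reading the bytes back and decoding them as UTF-8 is a quick way to check a file written by either version.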
Win7 Pro, Lubuntu 15.10, LO 4.4.7, OO 4.1.3
Free Project: LibreOffice Songbook Architect (LOSA)
http://struckkai.blogspot.de/2015/04/li ... itect.html
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Save international text in file as Unicode UTF-8

Post by Villeroy »

Write the HTML into a text document and store it as plain text.

Code: Select all

	Dim p(0) as new com.sun.star.beans.PropertyValue
	p(0).Name = "FilterName"
	p(0).Value = "Text"
	ThisComponent.storeToURL(sURL, p())
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
Evgeniy
Posts: 43
Joined: Thu Jan 09, 2020 9:31 pm
Location: Russia

Re: Save international text in file as Unicode UTF-8

Post by Evgeniy »

Wow! Thanks for the replies. I'll try these methods next week.
Evgeniy
Posts: 43
Joined: Thu Jan 09, 2020 9:31 pm
Location: Russia

Re: [Solved] Save international text in file as Unicode UTF-

Post by Evgeniy »

Thanks to all!

I used SFA (SimpleFileAccess). It works correctly.

Function:

Code: Select all

Sub SaveFile( path As String, content As String )
	oSFA = CreateUnoService("com.sun.star.ucb.SimpleFileAccess")
	' delete file if it exist
	If oSFA.exists( strDatnam ) Then oSFA.kill(path)
	oTextoutputStream = CreateUnoService("com.sun.star.io.TextOutputStream")
	oOutputStream = oSFA.openFileWrite(path)
	oTextoutputStream.setOutputStream(oOutputStream)
	oTextoutputStream.writeString(content)
	oTextoutputStream.closeOutput()
End Sub
Example of use:

Code: Select all

	Dim CRLF As String
	CRLF = chr(13)+chr(10)

	SaveFile(FilePickSave,"<HTML><BODY>"+CRLF+html+"</BODY></HTML>")
musikai
Volunteer
Posts: 294
Joined: Wed Nov 11, 2015 12:19 am

Re: [Solved] Save international text in file as Unicode UTF-

Post by musikai »

Evgeniy wrote: Thanks to all!

I used SFA (SimpleFileAccess). It works correctly.

Function:

Code: Select all

Sub SaveFile( path As String, content As String )
	oSFA = CreateUnoService("com.sun.star.ucb.SimpleFileAccess")
	' delete file if it exist
	If oSFA.exists( strDatnam ) Then oSFA.kill(path)
	oTextoutputStream = CreateUnoService("com.sun.star.io.TextOutputStream")
	oOutputStream = oSFA.openFileWrite(path)
	oTextoutputStream.setOutputStream(oOutputStream)
	oTextoutputStream.writeString(content)
	oTextoutputStream.closeOutput()
End Sub
Example of using:

Code: Select all

	Dim CRLF As String
	CRLF = chr(13)+chr(10)

	SaveFile(FilePickSave,"<HTML><BODY>"+CRLF+html+"</BODY></HTML>")
Great! Only this line

Code: Select all

If oSFA.exists( strDatnam ) Then oSFA.kill(path)
must be

Code: Select all

If oSFA.exists( path ) Then oSFA.kill(path)
because otherwise you will get an error when it tries to delete a nonexistent file.