Page 1 of 1
IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Fri Nov 10, 2023 11:45 pm
by Lupp
In RAM living spreadsheets represent numbers in
IEEE 754 Double standard.
For the View there are lots o formats.
Writing to the file a conversion to a decimal representation is used.
There is not and cannot be a conversion between dyadic and decimal representations preserving accuracy/resolution completely, and therefore a roundtrip between these representations never can be assured to work reliably.
In addition for historical reasons the decimal representation is "mid-endian" if so-called scientific notation is (needs to be) used. First comes the most significant sign (if any), then a mantissa, and then the more significant magnitude of the amount.
Since the live representation is dyadic, it can rather easily be converted to a representation in hexadfecadic digits. The reverse conversion is also feasible (even with a bit less effort).
And mainly: Both conversions are round-trip-proof (if I not am, on error).
Some time ago I wrote raw Basic code for the mentioned conversions:
Code: Select all
REM The "Calc" representation of a number here means the default.
REM This is IEEE 754 Double in RAM, but always shown in a specific format
REM in spreadsheets, and stored to files in a rounded decimal format.
REM The "DblHEX" representation means "like in RAM", everyone of the 8 bytes
REM converted to two hexadecadic digits. Everything "more significant first".
Function calcToDblHex(pNum As Double) As String
calcToDblHex = ":fail:"
On Local Error Goto fail
amount = Abs(pNum)
sign = Sgn(pNum)
rawDyadicExponent = Int(Log(amount) / Log(2))
rawFloatCut = (amount*2^-rawDyadicExponent - 1)
currMantissa = rawFloatCut*(&H1000000)*(&H10000000)'+2^52
offsetDyaExponent = Int(rawDyadicExponent + 1023)
lead12 = IIf(sign<0, &H800, 0) + offsetDyaExponent
startH = Right(Hex(&H1000 + lead12), 3)
out = ""
For j = 0 To 5
quot = Int(currMantissa/&H100)
remi = currMantissa - (quot*&H100)
currMantissa = quot
out = Right(Hex(&H100 + remi), 2) & out
Next j
quot = currMantissa\&H10
remi = currMantissa - (quot*&H10)
currMantissa = quot
out = startH & Right(Hex(&H10 + remi), 1) & out
calcToDblHex = out
fail:
End Function
Function dblHexToCalc(pDblHex As String) As Variant
dblHexToCalc = ":fail:"
If Len(pDblHex)<>16 Then Exit Function
On Local Error Goto fail
Dim bytes(7) As Byte
For j = 0 To 7
bytes(j) = Val("&H" & Mid(pDblHex, j*2+1, 2))
Next j
signBit = (bytes(0)>127)
eLeft = bytes(0) AND &H7F
eRight = (bytes(1) AND &HF0) \ 16
dyaExponent = (eLeft*16 + eRight) - 1023
bytes(1) = (bytes(1) AND &H0F) OR &H10 REM !!!!
Dim mantissa As Double
manStart = CDbl(bytes(1)) / 16
mantissa = CDbl(0)
For k = 7 To 2 Step -1
mantissa = mantissa/256 + CDbl(bytes(k))
Next k
mantissa = (mantissa/(256*16) + manStart)
res = mantissa * (2^dyaExponent) * IIf(signBit, -1, 1)
dblHexToCalc = res
fail:
End Function
The above code should even work in very old versions of "our software".
The demos below contain randomly generated examples. The AOO version uses the volatile
pseudo-random-generators while the other version using the
.NV versions will only run in a rather recent LibreOffice.
(Of course there will be a more efficient solution with some Python module.)
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Sat Nov 11, 2023 8:47 pm
by karolus
Hallo
Of course there will be a more efficient solution with some Python module
No need for »some module« only builtin-methods of float required:
Code: Select all
x = 4908163662587.91
h = x.hex().upper()
print( f"{x = } convert to hex: {h = } , {float.fromhex(h) == x = }")
### x = 4908163662587.91 convert to hex: h = '0X1.1DB15680BEFA4P+42' , float.fromhex(h) == x = True
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Sat Nov 11, 2023 11:41 pm
by Lupp
Yes. The world is full of funny ideas.
Here:
Write the mantissa of a normalized floating-point representation of a Double number with the point as the "hex-separator" in HEX, but append the magnitude as a power of 2 with a
decimal integer as the exponent and "P" (and a sign?) as the separator.
Present this like a
.hex() property of the number.
(I didn't understand the ".upper()". Shall it simply cause the letters occurring in the mantissa to be output in upper case?)
BTW:
The hex() format reproduces the mid-endian-mistake of the traditional "normalized scientific format".
Where would I get the sign if negative? Why is the sign suppressed for non-negative numbers?
It's questionable, of course, if we should continue this conversation in the SNIPPETS branch.
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Sun Nov 12, 2023 3:59 am
by karolus
Hallo
The …upper() was for the convinience of you, to compare the outcome for
Code: Select all
x hex_by_Lupp hex_by_python_float.hex().upper()
4908163662587,91 4291DB15680BEFA4 0X1.1DB15680BEFA4P+42
you should be able to compare your prefix '42' with 'P+42'
roundtrip? → you should be able to read and understand
float.fromhex(h) == x = True
@Wolfgang Jäger: woher nimmst du eigentlich die Hybris zu meinen alle anderen einschliesslich der Python-entwickler sind blöd und du bist der einzige ders richtig macht?
Natürlich funktioniert der Roundtrip wie du schon an der print-Ausgabe sehen kannst. und natürlich funktionierts mit negativen Zahlen.
Du weisst genausogut wie ich daß 64bit_floats nicht mehr als ~15signifikante Dezimalstellen hergeben.
Und du darfst das gerne selbst ausprobieren.
Edit: Courtesy translation for viewers, since this is an English-speaking forum -- MrProgrammer, forum moderator:
@Wolfgang Jäger: where do you get the hubris to think that everyone else, including the Python developers, is stupid and you are the only one who does it right?
Of course the roundtrip works, as you can see from the print output, and of course it works with negative numbers.
You know as well as I do that 64bit_floats don't give more than ~15 significant decimal places.
And you are welcome to try it out yourself.
|
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Sun Nov 12, 2023 9:34 am
by Zizi64
Here is an another approach without the many calculations (Pipe technology). The first two function is for the conversion and the others are the helper functions:
Code: Select all
REM ***** BASIC *****
Option Explicit
Function myPipeDouble2HexStr(myValue as double) as string
Dim oPipe as object
Dim i As Integer
Dim sResult as string
oPipe = myCreatePipe()
oPipe.Predecessor.writeDouble(myValue) REM: '.writeFloat' for the Single precision numbers
Do While oPipe.Successor.available() > 0
i = oPipe.Successor.readByte()
i = i AND 255
If i < 16 Then sResult = sResult & "0"
sResult = sResult & Hex(i)
Loop
myClosePipe(oPipe)
myPipeDouble2HexStr = sResult
End function
'_____________________________________________________________________________________________
Function myPipeHexStr2Double(sHexStr as string) as double
Dim oPipe
Dim myValue As double
Dim i as integer
Dim aBytes(8) as integer
Dim sTwoChars as string
If myCheckHexStr(sHexStr) <> 16 then REM: '<> 8 then' for the Single precision numbers
myPipeHexStr2Double = 0
MsgBox("Error in the Hexa string",0, "myPipeHexStr2Double:")
Exit Function
end if
For i = 0 To 7
sTwoChars = left(sHexStr, 2)
aBytes(i) = myHEX2DEC(sTwoChars)
sHexStr = right(sHexStr, len(sHexStr) - 2)
next i
oPipe = myCreatePipe()
oPipe.Predecessor.writeBytes(aBytes())
myValue = oPipe.Successor.readDouble() REM: '.readFloat' for the Single precision numbers
myClosePipe(oPipe)
myPipeHexStr2Double = myValue
end function
'_____________________________________________________________________________________________
'_____________________________________________________________________________________________
Function myCreatePipe() As Object
Dim oPipe ' Pipe Service.
Dim oDataInp ' DataInputStream Service.
Dim oDataOut ' DataOutputStream Service.
oPipe = createUNOService ("com.sun.star.io.Pipe")
oDataInp = createUNOService ("com.sun.star.io.DataInputStream")
oDataOut = createUNOService ("com.sun.star.io.DataOutputStream")
oDataInp.setInputStream(oPipe)
oDataOut.setOutputStream(oPipe)
myCreatePipe = oPipe
End Function
'_____________________________________________________________________________________________
Sub myClosePipe(oPipe)
oPipe.Successor.closeInput
oPipe.Predecessor.closeOutput
oPipe.closeInput
oPipe.closeOutput
End Sub
'_____________________________________________________________________________________________
function myHex2Dec(sHexa as string) as integer
Dim oFunctionAccess as object
Dim Arg
oFunctionAccess = createUnoService( "com.sun.star.sheet.FunctionAccess" )
Arg = Array(sHexa)
myHex2Dec = oFunctionAccess.CallFunction( "HEX2DEC", Arg)
end function
'_____________________________________________________________________________________________
function myCheckHexStr(sHexStr as string) as integer
Dim i as integer
Dim iLen as integer
iLen = len(sHexStr)
For i = 1 to iLen
if instr("0123456789aAbBcCdDeEfF", mid(sHexStr,i,1)) = 0 then
myCheckHexStr = -1
Exit function
end if
next i
myCheckHexStr = iLen
end function
'_____________________________________________________________________________________________
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Sun Nov 12, 2023 11:27 am
by karolus
@Zizi64:
the »pipe«technic looks nice, but it is very slow…
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Sun Nov 12, 2023 4:40 pm
by Lupp
karolus wrote: ↑Sun Nov 12, 2023 3:59 am
Hallo ...
Concerning some supposed errors and also concerning the slightly offensive remarks (in German mostly) I answered to @karolus via PM.
In my original post I missed to state clearly:
The snippets supplied here weren't supposed to be useful in everyday standard context.
Interested users who understand what I called "
the roundtrip issue" may
in rare cases want to store standard numbers from Calc cells (IEEE 754 Double) to the file in a textual format simulating a RAM-dump, and thus allowing to get them back to RAM later without any changed bit.
My original concern was only clear to those who studied the demo sheet showing that roundtrips via decimal standard representation can't be assured to work without any (tiny) error.
@Zizi64: Thanks for posting the alternative. I never used that service.
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Sun Nov 12, 2023 10:48 pm
by Lupp
I could meanwhile verify that the .hex() method of Python applicable to Double numbers actually uses three different bases:
The mantissa is given in HEX (with a leading 1. due to noramlisation).
The order of magnitude is given as a power of 2 (base implicit to the separator "P"),
but the exponent followimg the "P" is given as a decimal integer. We have base 10 now.
Be sure to understand that the example given by @karolus in his post from 2023-11-12 03:59 (MET ?) ist misleading. The "42" which he called a "prefix" contained in the 16-HEX-digit-representation of the example 4908163662587.91 (decimal) is none. It simply is the 2-digit-HEX for the most significant byte of the IEEE Double as it is represented in RAM. Its highest bit gives the sign ("0" for non-negative in this case). The remaining 7 bits need to be combined with the higher half-byte following (represented by the HEX "9" in the example) to get an 11-bit exponent designator (base 2). To get the actual twos-exponent the bias (offset) 1023 specified for IEEE 754 Double must be subtracted. It is a misleading accident that the result (dyadic still) is 00000101010 which makes 42 when converted to decimal. To compare this pair of decimal digits to the pair of leading HEX digits also shown as "42" has no fundament.
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Sun Nov 12, 2023 11:12 pm
by karolus
Hallo
Sorry for not digging into any detail, in a first test, I had accidentally taken a number where the two leading characters of your implementation match the exponent of the python-representation and took the wrong conclusions.
However, this does not change the fact that python can "calculate" back and forth just as clearly as your implementation.
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Sun Nov 12, 2023 11:53 pm
by Lupp
karolus wrote: ↑Sun Nov 12, 2023 11:12 pm
Sorry for not digging into any detail, in a first test, I had accidentally taken a number where the two leading characters of your implementation match the exponent of the python-representation and took the wrong conclusions.
No problem. Errors are one of our deeply built-in issues. And any human activity would go to a halt if we never took the chance to decide based on insufficient information or without having checked every also thinkable alternative.
karolus wrote: ↑Sun Nov 12, 2023 11:12 pm
However, this does not change the fact that python can "calculate" back and forth just as clearly as your implementation.
If we replace "clearly" by "correctly": No doubt.
Or, if any, I would rather doubt my own code. Python has the better developers, of course, and Basic supports no "absolute" access to RAM content. (The service Zizi64 suggested may do in a sense.)
This doesn't mean that I'm convinced they (Python) found te best way to present the result of the
.hex() method "cleanly".
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Mon Nov 13, 2023 9:09 am
by karolus
Hallo
found te best way to present
There is no one and only
best way to representing a float in hexadecimal, python simply follows its own convention to start
any hexadecimal with `0x` the `1.` may tell the parser `hey I'm a float`, and the "tail" `p[+-]decimal_exponent` is possibly a more robust demarcation to random characters behind?
Re: IEEE 754 Double: Conversion to textual representations and the roundtrip issue
Posted: Tue Nov 14, 2023 4:11 pm
by Lupp
karolus wrote: ↑Mon Nov 13, 2023 9:09 am
Hallo
found the best way to present
There is no one and only
best way ...

Fortunately there is a clearly
worst way to program for OOo/LibO documents, and that is to use Basic for whatever.

May it be that there even is a
best way with this very special respect?
It should be a good idea anyway to respect everybody's opinion where opinions are acceptable at all, and to look for probably relevant reasons (or ask for them) where an argument might be of interest.
In the given case developers (Python project) decided not just for themselves, but also decided what a user is offered, and for what they will need additional steps.
If I had been asked for an appropriate output format of a
hex() method or function I had easily accepted the starting "0x". This is arbitrary in a sense since there isn't yet an ISO standard afaik. Surely I had dissuaded from abandoning the "decreasing-significance-principle" and from using a decimal representation in the output for a
part of the information. There is additional reasoning, of course, and I'm prepared to explain my thoughts if somebody asks for.
However, this should better be continued under "General Discussion" then instead of in this forum titled "Code Snippets" and starting with the "no-questions-rule".