Hello. I need to understand how the spell checker parses suffix files,
especially the Arabic one, as it seems to me to be the most complex one.
It starts like this:
FLAG long
AF 333
AF TbTc # 1
AF TbTcff # 2
AF TaTbTcTdTeTfThTiTjTkTlTmTnToTpTqTrTsTtTuTvTxTycc # 3
AF TbTcTdTeTf # 4
AF TbTcTe # 5
So what does AF mean?
What does TbTcTe mean?
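From what I can tell from the Hunspell manual (hunspell(5)), `FLAG long` means every flag is exactly two characters, and the `AF` lines are flag aliases: the first `AF` line only gives the count, and each following line is numbered from 1. So `/5` after a word in the .dic file would stand for the flag set `TbTcTe`, i.e. the three flags `Tb`, `Tc` and `Te`. A minimal Python sketch of that lookup, assuming this reading is right (the function names are made up for illustration and are not part of Hunspell):

```python
def parse_af_aliases(aff_lines):
    """Collect AF flag-set aliases; alias numbers start at 1."""
    aliases = []
    seen_count = False
    for line in aff_lines:
        parts = line.split()
        if len(parts) < 2 or parts[0] != "AF":
            continue
        if not seen_count:
            seen_count = True  # first AF line is only the count ("AF 333")
            continue
        aliases.append(parts[1])  # anything after the flag set is a comment
    return aliases


def split_long_flags(flag_string):
    """With FLAG long, every flag is exactly two characters."""
    return [flag_string[i:i + 2] for i in range(0, len(flag_string), 2)]


def flags_for_alias(aliases, number):
    """Resolve the number after '/' in a .dic entry to its list of flags."""
    return split_long_flags(aliases[number - 1])


sample = [
    "FLAG long",
    "AF 333",
    "AF TbTc # 1",
    "AF TbTcff # 2",
    "AF TaTbTcTdTeTfThTiTjTkTlTmTnToTpTqTrTsTtTuTvTxTycc # 3",
    "AF TbTcTdTeTf # 4",
    "AF TbTcTe # 5",
]
aliases = parse_af_aliases(sample)
print(flags_for_alias(aliases, 5))
```

With the five aliases quoted above, alias 5 resolves to the flags `Tb`, `Tc` and `Te`.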
AM الإضافية ####
What does AM mean?
IGNORE ًٌٍَُِّْـٰ
KEY ضصثقفغعهخحجد¦شسيبلاتنمكط¦ئءؤرﻻىةوزظ¦ضشئ¦صسء¦ثيؤ¦قبر¦فلﻻ¦غاى¦عتة¦هنو¦خمز¦حكظ¦جط
What do IGNORE and KEY do?
ICONV ﻼ لا
MAP ضص
REP ^هى$ هي
What do ICONV, MAP, and REP mean?
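As far as I can tell from hunspell(5): `IGNORE` lists characters that are dropped from dictionary words, affixes and input words (here the Arabic diacritics and the tatweel), and `ICONV` is an input conversion table (here the lam-alef ligature is replaced by the two separate letters). `KEY`, `MAP` and `REP` only seem to matter for suggestions: neighbouring keys on the keyboard, related characters, and common replacement patterns. A rough Python sketch of the first two, with a shortened IGNORE list and hypothetical function names. Applying ICONV before IGNORE and using plain substring replacement are my assumptions, not something I have confirmed against Hunspell itself:

```python
def parse_input_rules(aff_lines):
    """Read IGNORE characters and ICONV pairs from .aff lines."""
    ignore = set()
    iconv = []
    for line in aff_lines:
        parts = line.split()
        if len(parts) == 2 and parts[0] == "IGNORE":
            ignore.update(parts[1])
        elif len(parts) == 3 and parts[0] == "ICONV":
            iconv.append((parts[1], parts[2]))  # (pattern, replacement)
        # a two-field "ICONV 1" line is only the entry count and is skipped
    return ignore, iconv


def normalize(word, ignore, iconv):
    """Apply ICONV replacements, then drop IGNORE characters (assumed order)."""
    for pattern, replacement in iconv:
        word = word.replace(pattern, replacement)
    return "".join(ch for ch in word if ch not in ignore)


# Shortened sample: only fatha (U+064E) and tatweel (U+0640) are ignored,
# and the lam-alef ligature (U+FEFC) becomes lam (U+0644) + alef (U+0627).
sample_aff = [
    "IGNORE \u064e\u0640",
    "ICONV 1",
    "ICONV \ufefc \u0644\u0627",
]
ignore, iconv = parse_input_rules(sample_aff)
cleaned = normalize("\u0633\u064e\ufefc\u0645", ignore, iconv)
print(cleaned)
```

The escapes are used only so that the right-to-left characters do not get scrambled when the code is pasted.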
PFX and SFX are explained here, but still very poorly: http://www.openoffice.org/lingucomponent/affix.readme
OK, also, for example, how do I parse these?
SFX AD وء وءه/309 وء
SFX BA 0 0/299 .
I mean, there must be some generic rules for parsing all these suffixes etc.
Where can I find them?
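Going by hunspell(5), each affix class has a header line `SFX flag cross_product number` followed by rule lines of the form `SFX flag stripping affix[/flags] condition`. A `0` means "nothing", and a `.` condition matches any word. When `AF` aliases are in use, the part after `/` is an alias number rather than a literal flag string. Read that way, the first rule above says: for words ending in وء, strip وء, append وءه, and give the result the flags of alias 309. The second adds nothing and only attaches the flags of alias 299. A small Python sketch of splitting such a line, assuming that layout (`AffixRule` and `parse_affix_rule` are names I made up):

```python
from dataclasses import dataclass


@dataclass
class AffixRule:
    kind: str          # "PFX" or "SFX"
    flag: str          # flag that selects this rule
    strip: str         # characters removed from the word ("" if none)
    add: str           # characters added to the word ("" if none)
    continuation: str  # flags or AF alias number after "/", "" if absent
    condition: str     # "." means no condition


def parse_affix_rule(line):
    """Parse one PFX/SFX rule line; return None for headers and other lines."""
    parts = line.split()
    # Header lines ("SFX AD Y 1") have only four fields, rules have five.
    if len(parts) < 5 or parts[0] not in ("PFX", "SFX"):
        return None
    kind, flag, strip, affix, condition = parts[:5]
    add, _, continuation = affix.partition("/")
    return AffixRule(
        kind,
        flag,
        "" if strip == "0" else strip,
        "" if add == "0" else add,
        continuation,
        condition,
    )


# The two rules quoted above, written with escapes to avoid bidi scrambling.
rule_ad = parse_affix_rule("SFX AD \u0648\u0621 \u0648\u0621\u0647/309 \u0648\u0621")
rule_ba = parse_affix_rule("SFX BA 0 0/299 .")
print(rule_ad)
print(rule_ba)
```

This only splits the fields. It does not interpret the condition or resolve the alias number.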
Here is the Arabic aff file: http://pastebin.com/KkdwBsH1
Here are a few examples from the .dic file:
تتجلطين/231
تتجلط/233
تتجلطان/240
يتجلطون/239
يتجلطا/237
تتجلطا/237
تجلطان/256
تجلطنا/232
نتجلطن/232
تجلطتا/230
تجلطن/262
تجلطي/256
أتجلط/243
يتجلطنان/230
تجلطتما/242
تتجلطوا/236
تتمحوران/240
تمحورت/230
Need urgent help to understand how to parse suffix files
OpenOffice 3.1 on Windows 8
Re: Need urgent help to understand how to parse suffix files
You should consult the Hunspell documentation at
https://sourceforge.net/projects/hunspe ... mentation/
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
Re: Need urgent help to understand how to parse suffix files
I have been working on this since yesterday.
RoryOF wrote: You should consult the Hunspell documentation at
https://sourceforge.net/projects/hunspe ... mentation/
Hunspell has an unmunch command, but it fails for UTF-8 dictionaries.
Is there any way to do this properly,
i.e. something that will read each line of the dictionary and generate all possible words that are determined by the aff file?
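To make it clearer what I am after, here is a very simplified Python sketch of the expansion step. It is not how unmunch works internally. It applies a single level of suffix rules only, with no prefixes, no cross products and no continuation flags, and it treats the condition as a regular expression anchored at the end of the word, which is my assumption. The rules and words are toy data, not taken from the Arabic files:

```python
import re


def expand_word(word, flags, suffix_rules):
    """Return the word plus every form produced by one matching suffix rule.

    suffix_rules is a list of (flag, strip, add, condition) tuples, using
    "" for an empty strip/add and "." for an empty condition.
    """
    forms = [word]
    for flag, strip, add, condition in suffix_rules:
        if flag not in flags:
            continue  # the word does not carry this rule's flag
        if not word.endswith(strip):
            continue  # nothing to strip, so the rule cannot apply
        if not re.search("(?:" + condition + ")$", word):
            continue  # condition must match the end of the word
        stem = word[:len(word) - len(strip)]
        forms.append(stem + add)
    return forms


# Toy rules for a hypothetical two-character flag "Aa":
#   SFX Aa 0 s   [^y]
#   SFX Aa y ies [^aeiou]y
toy_rules = [
    ("Aa", "", "s", "[^y]"),
    ("Aa", "y", "ies", "[^aeiou]y"),
]
print(expand_word("cat", ["Aa"], toy_rules))
print(expand_word("try", ["Aa"], toy_rules))
```

Since Python 3 strings are Unicode, reading the .aff and .dic files with `encoding="utf-8"` should avoid the UTF-8 problem, but the full rule set (prefixes, continuation flags, aliases) would still need to be handled on top of this.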
OpenOffice 3.1 on Windows 8