IsAlpha string function for OO
Posted: Mon Sep 04, 2023 4:14 pm
by JeJe
Some posts in another thread were discussing ScriptForge, which got me looking at its string functions and the OO i18n module.
ScriptForge is part of LibreOffice only, though.
Here's an IsAlpha function for OO users. It differs slightly from ScriptForge's, which discards the API function's ability to work on only part of a string. There is a Windows API function for this too; it isn't cross-platform, but it only needs the Declare statement.
I notice ScriptForge has a Capitalize function, which looks to be covered already by strConv.
Feel free to post a better version than mine, or other string functions...
Edit: note I've only used UnicodeType.UPPERCASE_LETTER and UnicodeType.LOWERCASE_LETTER to decide whether a character is alphabetic.
TITLECASE_LETTER may be needed for some languages. The constants are here:
https://www.openoffice.org/api/docs/com ... eType.html
Code:
Sub Main
    MsgBox IsAlphaOO("àén66ΣlPµp9(", 0, 2)
    MsgBox IsAlphaOO("àén66ΣlPµp9(", 3, 4)
End Sub

' Returns True if every character from zero-based index zerobasedA to zerobasedB
' (inclusive) is an uppercase or lowercase letter. The whole string is tested
' when the range is omitted.
Function IsAlphaOO(st As String, Optional zerobasedA As Long, Optional zerobasedB As Long) As Boolean
    Dim n As Long, aLocale, i As Long, CharClassification, a As Long, b As Long, lenst As Long
    CharClassification = createUnoService("com.sun.star.i18n.CharacterClassification")
    aLocale = ThisComponent.CharLocale
    lenst = Len(st)
    If lenst > 0 Then
        If IsMissing(zerobasedA) = True Then
            a = 0
            b = lenst - 1
        Else
            a = zerobasedA
            b = zerobasedB
            ' An invalid range returns False.
            If (a < 0 Or a >= lenst Or b < a Or b < 0 Or b >= lenst) Then Exit Function
        End If
        For i = a To b
            n = CharClassification.getType(st, i, aLocale)
            ' 1 = com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER, 2 = .LOWERCASE_LETTER
            If (n <> 1 And n <> 2) Then Exit Function
        Next
        IsAlphaOO = True
    End If
End Function
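On the TITLECASE_LETTER caveat in the edit note above, here is a minimal sketch in Python (the language that comes up later in this thread) rather than Basic. It assumes that Python's unicodedata general categories "Lu", "Ll" and "Lt" correspond to UnicodeType.UPPERCASE_LETTER, LOWERCASE_LETTER and TITLECASE_LETTER; the variable names are only illustrative.

```python
import unicodedata

# U+01C5 is a single titlecase letter (general category "Lt").
dz = "\u01c5"
category = unicodedata.category(dz)

# A test that accepts only upper and lower case rejects this letter...
passes_upper_lower_test = category in ("Lu", "Ll")

# ...while a test that also accepts titlecase lets it through.
passes_with_titlecase = category in ("Lu", "Ll", "Lt")
```

If the same holds for getType in OO, a string containing such a letter would fail the IsAlphaOO check as written unless the titlecase type is also accepted.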
Re: IsAlpha string function for OO
Posted: Tue Sep 05, 2023 12:25 am
by JeJe
An IsAlpha function is just one case of checking that every character is of certain Unicode types (upper or lower case in my original post).
The more general function ContainsOnlyUnicodeTypes below tests whether every character is of a chosen Unicode type, or of one of a chosen array of Unicode types.
ContainsUnicodeTypes is a general function that tests whether the string contains at least one character of a given Unicode type or array of Unicode types.
Edit: changed the first function's name to the less confusing ContainsOnlyUnicodeTypes.
Edit 2: minor correction in the test sub descriptions.
Code:
Option Explicit
REM ***** BASIC *****
Sub testSub
'''''''ContainsOnlyUnicodeTypes - every character of chosen unicode types
dim UnicodeTypes
'for isalpha choose unicode types upper and lower case
' UnicodeTypes = array(com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER,com.sun.star.i18n.UnicodeType.LOWERCASE_LETTER)
' msgbox ContainsOnlyUnicodeTypes("777ppppp888",UnicodeTypes)
' msgbox ContainsOnlyUnicodeTypes("777ppppp888",UnicodeTypes,3,5)
' msgbox ContainsOnlyUnicodeTypes("777ppppp888",UnicodeTypes,1,5)
' UnicodeTypes = com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER
' msgbox ContainsOnlyUnicodeTypes("777ppppp888",UnicodeTypes)
' msgbox ContainsOnlyUnicodeTypes("PRTUER",UnicodeTypes)
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'''''''ContainsUnicodeTypes - contain at least one character of chosen unicode types
' UnicodeTypes = com.sun.star.i18n.UnicodeType.DECIMAL_DIGIT_NUMBER
' msgbox ContainsUnicodeTypes("777ppppp888",UnicodeTypes)
' msgbox ContainsUnicodeTypes("pppppp",UnicodeTypes)
' UnicodeTypes = array(com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER,com.sun.star.i18n.UnicodeType.LOWERCASE_LETTER)
' msgbox ContainsUnicodeTypes("9K832737",UnicodeTypes,1,3)
end sub
Function ContainsOnlyUnicodeTypes(st As String, UnicodeTypes, Optional zerobasedA As Long, Optional zerobasedB As Long) As Boolean
    Dim n As Long, aLocale, i As Long, CharClassification, lenst As Long, ub As Long, j As Long, found As Boolean, a As Long, b As Long
    CharClassification = createUnoService("com.sun.star.i18n.CharacterClassification")
    aLocale = ThisComponent.CharLocale
    lenst = Len(st)
    If lenst > 0 Then
        If IsMissing(zerobasedA) = True Then
            a = 0
            b = lenst - 1
        Else
            a = zerobasedA
            b = zerobasedB
            If (a < 0 Or a >= lenst Or b < a Or b < 0 Or b >= lenst) Then Exit Function
        End If
        If VarType(UnicodeTypes) > 8192 Then 'array
            ub = UBound(UnicodeTypes)
            ContainsOnlyUnicodeTypes = True
            For i = a To b
                n = CharClassification.getType(st, i, aLocale)
                found = False
                For j = 0 To ub
                    If n = UnicodeTypes(j) Then
                        found = True
                        Exit For
                    End If
                Next
                If found = False Then
                    ContainsOnlyUnicodeTypes = False
                    Exit For
                End If
            Next
        Else
            ContainsOnlyUnicodeTypes = True
            For i = a To b
                n = CharClassification.getType(st, i, aLocale)
                If n <> UnicodeTypes Then
                    ContainsOnlyUnicodeTypes = False
                    Exit For
                End If
            Next
        End If
    End If
End Function
Function ContainsUnicodeTypes(st As String, UnicodeTypes, Optional zerobasedA As Long, Optional zerobasedB As Long) As Boolean
    Dim n As Long, aLocale, i As Long, CharClassification, lenst As Long, ub As Long, j As Long, a As Long, b As Long
    CharClassification = createUnoService("com.sun.star.i18n.CharacterClassification")
    aLocale = ThisComponent.CharLocale
    lenst = Len(st)
    If lenst > 0 Then
        If IsMissing(zerobasedA) = True Then
            a = 0
            b = lenst - 1
        Else
            a = zerobasedA
            b = zerobasedB
            If (a < 0 Or a >= lenst Or b < a Or b < 0 Or b >= lenst) Then Exit Function
        End If
        If VarType(UnicodeTypes) > 8192 Then 'array
            ub = UBound(UnicodeTypes)
            For i = a To b
                n = CharClassification.getType(st, i, aLocale)
                For j = 0 To ub
                    If n = UnicodeTypes(j) Then
                        ContainsUnicodeTypes = True
                        Exit Function
                    End If
                Next
            Next
        Else
            For i = a To b
                n = CharClassification.getType(st, i, aLocale)
                If n = UnicodeTypes Then
                    ContainsUnicodeTypes = True
                    Exit For
                End If
            Next
        End If
    End If
End Function
Re: IsAlpha string function for OO
Posted: Tue Sep 05, 2023 3:39 am
by JeJe
CharacterClassification's parseAnyToken is another way to write an IsAlpha function.
Code:
Option Explicit
'https://www.openoffice.org/api/docs/common/ref/com/sun/star/i18n/KParseTokens.html
sub testContainsOnlyParseTokens
dim parseTokens
with com.sun.star.i18n.KParseTokens 'For IsAlpha result use below tokens perhaps
parseTokens = .ASC_UPALPHA or .ASC_LOALPHA or .UNI_UPALPHA or .UNI_LOALPHA
end with
msgbox ContainsOnlyParseTokens("TEoooEREu",parseTokens)
msgbox ContainsOnlyParseTokens("TEooo EREu",parseTokens)
msgbox ContainsOnlyParseTokens("6",parseTokens)
msgbox ContainsOnlyParseTokens("Tu",parseTokens)
end sub
Function ContainsOnlyParseTokens(aText, parseTokens) As Boolean
    Dim aLocale, nPos, nStartCharFlags, aUserDefinedCharactersStart, nContCharFlags, aUserDefinedCharactersCont, ret, CharClassification
    If Len(aText) > 0 Then
        CharClassification = createUnoService("com.sun.star.i18n.CharacterClassification")
        aLocale = ThisComponent.CharLocale
        nPos = 0
        nStartCharFlags = parseTokens
        aUserDefinedCharactersStart = ""
        nContCharFlags = parseTokens
        aUserDefinedCharactersCont = ""
        ret = CharClassification.parseAnyToken(aText, nPos, aLocale, nStartCharFlags, aUserDefinedCharactersStart, nContCharFlags, aUserDefinedCharactersCont)
        ' The string qualifies only if a single identifier token spans all of it.
        If ret.TokenType = com.sun.star.i18n.KParseType.IDENTNAME Then
            ContainsOnlyParseTokens = (ret.CharLen = Len(aText))
        End If
    End If
End Function
Or we could trim the function in that code down to just:
Code:
Function ContainsOnlyParseTokens(aText, parseTokens) As Boolean
    Dim ret, CharClassification
    If Len(aText) > 0 Then
        CharClassification = createUnoService("com.sun.star.i18n.CharacterClassification")
        ret = CharClassification.parseAnyToken(aText, 0, ThisComponent.CharLocale, parseTokens, "", parseTokens, "")
        If ret.TokenType = com.sun.star.i18n.KParseType.IDENTNAME Then ContainsOnlyParseTokens = (ret.CharLen = Len(aText))
    End If
End Function
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 6:02 pm
by MrProgrammer
In an OpenOffice spreadsheet one can test if a cell's content is alphabetic, alphanumeric, numeric, or hexadecimal using the SEARCH function, as long as the option "Enable regular expressions in formulas" is set. Options are set with OpenOffice → Preferences on a Mac, Tools → Options on other platforms.
To test that a cell contains only alphabetic characters, SEARCH looks for a non-alphabetic character, that is, [^[:alpha:]]. If one is found, the test fails; if not, the test succeeds.
I think that JeJe's functions are for use in a macro and are not intended to be called by a Calc formula, but I can imagine situations where it's helpful to perform these tests in a spreadsheet.
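The search-for-a-counterexample idea described above can be sketched outside a spreadsheet too. The Python below is only an illustration under stated assumptions: it handles ASCII letters only (OpenOffice's [:alpha:] class covers more than that), it treats an empty string as non-alphabetic, and the helper name is hypothetical.

```python
import re

# Assumption: ASCII letters only; this is narrower than OpenOffice's [:alpha:].
NON_ALPHA = re.compile(r"[^A-Za-z]")

def is_alpha_by_search(text):
    """Hypothetical helper: pass only if no non-alphabetic character is found."""
    if not text:
        return False  # assumption: empty content does not count as alphabetic
    # The test succeeds when the search for a non-alphabetic character fails.
    return NON_ALPHA.search(text) is None
```

Calling is_alpha_by_search on a cell's text mirrors the SEARCH logic: one offending character is enough to fail the test.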
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 6:52 pm
by JeJe
Calc's SEARCH is available for strings via FunctionAccess, but there's the same requirement to enable regular expressions under Options.
https://wiki.openoffice.org/wiki/Docume ... H_function
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 8:04 pm
by Villeroy
RegularExpressions, MatchWholeCell etc. are properties of service FunctionAccess.
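Code:
Function isAlnum(strVar) As Boolean
    ' Note: the pattern below accepts alphabetic characters only;
    ' use [[:alnum:]] instead to accept digits as well.
    ofa = createUnoService("com.sun.star.sheet.FunctionAccess")
    ofa.RegularExpressions = True
    x = False
    On Error Resume Next ' a failed SEARCH leaves the result False
    x = ofa.callFunction("SEARCH", Array("^[[:alpha:]]+$", strVar))
    isAlnum = x
End Function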
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 10:00 pm
by JeJe
Interesting. Looking at that, I also notice that FunctionAccess can, in a limited way, find a string spanning more than one paragraph, something you can't do with a regular Writer document search.
E.g., with a Writer document containing this text:
Blah blah
happy
bats
Blah blah
the string from "happy" to "bats", including the paragraph mark characters, is found.
Code:
ofa = createUnoService("com.sun.star.sheet.FunctionAccess")
ofa.RegularExpressions = true
str2 =thiscomponent.text.string
STR1 = "happy" & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & "bats"
x = ofa.callFunction("SEARCH", array(STR1,STR2))
msgbox x '=14
A regular search yields void:
Code:
oSearch = thiscomponent.createSearchDescriptor()
With oSearch
.SearchString ="happy" & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & "bats"
.SearchRegularExpression = True
End With
oFound = thiscomponent.findFirst(oSearch)
mri ofound 'void
Edit: testing on OO
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 10:06 pm
by karolus
Maybe you should take a look at the string methods that have been around for 20 years in Python, instead of trying to recreate them somehow (like the guys from the "script-forge" site).
https://docs.python.org/3/library/stdty ... tr.isalnum
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 10:21 pm
by JeJe
Those functions look a bit like my second post: searching using the Unicode character types. Except mine is a more flexible general function which allows any choice or array of those. From a short look, there doesn't seem to be one that does that. I'd never have coded it so quickly if I'd been faffing on with macros in text files rather than the Basic IDE.
Still, using Python is another way indeed...
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 10:54 pm
by karolus
JeJe wrote:
Except mine is a more flexible general function which allows any choice or array of those
You're sure?!
Code:
somestring = "0123456789௨௦௨௧٢٠٢١२०२१"
somestring.isnumeric()
-> True
Besides the digits 0-9, the string contains the number "2021" in Tamil, Arabic-Indic and Devanagari digits.
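To see why isnumeric() accepts the whole string, the sketch below inspects each character with the standard unicodedata module. It is only an illustration: it collects the general category of every character and converts each one to its decimal value, whatever the script.

```python
import unicodedata

somestring = "0123456789௨௦௨௧٢٠٢١२०२१"

# General category of every character; "Nd" means decimal digit.
categories = {unicodedata.category(ch) for ch in somestring}

# unicodedata.digit() returns the digit value of a character in any script.
values = "".join(str(unicodedata.digit(ch)) for ch in somestring)

print(categories)
print(values)
```

Printing values shows the ASCII digits followed by the same four-digit number three times, once per script.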
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 11:08 pm
by JeJe
Did you look at my second post?
The unicode types are here again:
http://www.openoffice.org/api/docs/comm ... eType.html
How, using your Python link, can you test that a string consists only of characters from any chosen array of those?
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 11:29 pm
by karolus
If you need something like "contains-upper", you just use a simple regular expression. So what?
Re: IsAlpha string function for OO
Posted: Fri Sep 08, 2023 11:44 pm
by JeJe
That's not the same thing. The answer is you can't. And my function is more flexible. And you didn't look at it.
Re: IsAlpha string function for OO
Posted: Sat Sep 09, 2023 7:31 am
by karolus
JeJe wrote: ↑Fri Sep 08, 2023 11:44 pm
The answer is you can't. … And you didn't look at it.
If you believe that... never mind!
No, it's too stupid for me to dig through twelve meters of code, and then to write something generic that does nothing but test a set of numbers to see if they fit between various upper and lower bounds.
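For readers following the argument, a generic category check of the kind being discussed can be sketched with Python's standard unicodedata module. This is not code from either poster. It assumes that the general categories "Lu" and "Ll" play the role of UnicodeType.UPPERCASE_LETTER and LOWERCASE_LETTER, and the function names are hypothetical; the test strings are taken from the test sub earlier in the thread.

```python
import unicodedata

def contains_only(text, categories):
    """True if text is non-empty and every character is in one of the categories."""
    allowed = set(categories)
    return bool(text) and all(unicodedata.category(ch) in allowed for ch in text)

def contains_any(text, categories):
    """True if at least one character is in one of the categories."""
    allowed = set(categories)
    return any(unicodedata.category(ch) in allowed for ch in text)

letters = {"Lu", "Ll"}  # uppercase and lowercase letters

all_letters = contains_only("777ppppp888", letters)   # digits are present too
all_upper = contains_only("PRTUER", {"Lu"})           # uppercase letters only
has_letter = contains_any("9K832737", letters)        # one uppercase letter
```

Unlike the Basic functions, this sketch has no start and end index; slicing the string before the call would cover that case.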
Re: IsAlpha string function for OO
Posted: Sat Sep 09, 2023 8:24 am
by JeJe
You're protesting loudly.
If I want to see if a string is comprised only of UPPERCASE_LETTER, DECIMAL_DIGIT_NUMBER, or MATH_SYMBOL there's no looking through twelve meters of code, or writing anything else except easily writing the following call for my function:
Code:
UnicodeTypes = array(com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER,com.sun.star.i18n.UnicodeType.DECIMAL_DIGIT_NUMBER, com.sun.star.i18n.UnicodeType.MATH_SYMBOL )
msgbox ContainsOnlyUnicodeTypes("HHH∂∃∄∅888",UnicodeTypes)
It's more flexible and concise to have a generic sub or function rather than individual subs (and it would be a lot of subs) for each possible combination, only a few of which are available at your link.
Re: IsAlpha string function for OO
Posted: Sat Sep 09, 2023 9:39 am
by karolus
Hello
For example, I hate it when I have to scroll right and left to read this code, so the first thing I would do (if I would):
[Attached image: thats_python.png]