IsAlpha string function for OO

JeJe
Volunteer
Posts: 2785
Joined: Wed Mar 09, 2016 2:40 pm

Post by JeJe »

Some posts in another thread were discussing ScriptForge, which got me looking at its string functions and at the OO i18n module.
ScriptForge is part of LibreOffice only, though.

Here's an IsAlpha function for OO users. It differs slightly from ScriptForge's, which dropped the API function's ability to work on only part of a string. There is a Windows API function for this too; it isn't cross-platform, but it only needs a Declare statement.

I notice ScriptForge has a Capitalize function, which looks to be covered already by StrConv.

Feel free to post a better version than mine, or other string functions...



Edit: note that I've only used UnicodeType.UPPERCASE_LETTER and UnicodeType.LOWERCASE_LETTER to decide whether a character is alphabetic.

TITLECASE_LETTER may be needed for some languages. The constants are listed here:

https://www.openoffice.org/api/docs/com ... eType.html

Code: Select all

Sub Main
	MsgBox IsAlphaOO("àén66ΣlPµp9(", 0, 2)	' tests positions 0 to 2
	MsgBox IsAlphaOO("àén66ΣlPµp9(", 3, 4)	' tests positions 3 to 4
End Sub

' Returns True if every character of st in the zero-based, inclusive range
' zerobasedA..zerobasedB is an uppercase or lowercase letter. Omitted bounds
' default to the start and end of the string.
' Empty strings and invalid ranges return False.
Function IsAlphaOO(st As String, Optional zerobasedA As Long, Optional zerobasedB As Long) As Boolean
	Dim n As Long, i As Long, a As Long, b As Long, lenst As Long
	Dim aLocale, CharClassification
	IsAlphaOO = False
	lenst = Len(st)
	If lenst = 0 Then Exit Function
	a = 0
	b = lenst - 1
	If Not IsMissing(zerobasedA) Then a = zerobasedA
	If Not IsMissing(zerobasedB) Then b = zerobasedB
	If a < 0 Or b < a Or b >= lenst Then Exit Function
	CharClassification = createUnoService("com.sun.star.i18n.CharacterClassification")
	aLocale = ThisComponent.CharLocale
	For i = a To b
		n = CharClassification.getType(st, i, aLocale)
		If n <> com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER And _
		   n <> com.sun.star.i18n.UnicodeType.LOWERCASE_LETTER Then Exit Function
	Next
	IsAlphaOO = True
End Function
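
For readers who prefer Python, the range check above can be sketched with the standard unicodedata module. This is only an illustration, not a port of the UNO call: the name is_alpha_range is hypothetical, and the sketch assumes that the Unicode general categories "Lu", "Ll" and "Lt" correspond to UPPERCASE_LETTER, LOWERCASE_LETTER and TITLECASE_LETTER.

```python
import unicodedata

def is_alpha_range(text, first=0, last=None, categories=("Lu", "Ll")):
    """Return True if every character of text[first..last] (zero-based,
    inclusive) has one of the given Unicode general categories.

    Hypothetical helper: empty strings and invalid ranges return False,
    mirroring the Basic function above.
    """
    if last is None:
        last = len(text) - 1
    if not text or first < 0 or last < first or last >= len(text):
        return False
    return all(unicodedata.category(ch) in categories
               for ch in text[first:last + 1])

sample = "\u00e0\u00e9n66\u03a3lP\u00b5p9("   # the test string from Sub Main
print(is_alpha_range(sample, 0, 2))           # positions 0..2 are all letters
print(is_alpha_range(sample, 3, 4))           # positions 3..4 are the digits "66"
# U+01C5 is a titlecase letter, so it only passes once "Lt" is allowed
print(is_alpha_range("\u01c5"))
print(is_alpha_range("\u01c5", categories=("Lu", "Ll", "Lt")))
```

The last two calls show why TITLECASE_LETTER may matter: a check limited to upper and lower case rejects a titlecase character.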

Windows 10, Openoffice 4.1.11, LibreOffice 7.4.0.3 (x64)

Post by JeJe »

An IsAlpha function is just one case of checking that every character belongs to certain Unicode types (uppercase or lowercase letters in my original post).

The more general function ContainsOnlyUnicodeTypes below checks that every character is of a chosen Unicode type, or of one of a chosen array of Unicode types.

ContainsUnicodeTypes is a general function that checks whether the string contains at least one character of a given Unicode type or array of Unicode types.

Edit: changed the first function's name to the less confusing ContainsOnlyUnicodeTypes.
Edit 2: minor correction in the test sub descriptions.

Code: Select all

Option Explicit
	REM  *****  BASIC  *****

Sub testSub

'''''''ContainsOnlyUnicodeTypes - every character of chosen unicode types
	dim UnicodeTypes
	
	'for isalpha choose unicode types upper and lower case
'	UnicodeTypes = array(com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER,com.sun.star.i18n.UnicodeType.LOWERCASE_LETTER)
'	msgbox ContainsOnlyUnicodeTypes("777ppppp888",UnicodeTypes)
'	msgbox ContainsOnlyUnicodeTypes("777ppppp888",UnicodeTypes,3,5)
'	msgbox ContainsOnlyUnicodeTypes("777ppppp888",UnicodeTypes,1,5)

'	UnicodeTypes = com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER
'	msgbox ContainsOnlyUnicodeTypes("777ppppp888",UnicodeTypes)
'	msgbox ContainsOnlyUnicodeTypes("PRTUER",UnicodeTypes)

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

'''''''ContainsUnicodeTypes - contain at least one character of chosen unicode types

'	UnicodeTypes = com.sun.star.i18n.UnicodeType.DECIMAL_DIGIT_NUMBER
'	msgbox ContainsUnicodeTypes("777ppppp888",UnicodeTypes)
'	msgbox ContainsUnicodeTypes("pppppp",UnicodeTypes)

'	UnicodeTypes = array(com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER,com.sun.star.i18n.UnicodeType.LOWERCASE_LETTER)
'	msgbox ContainsUnicodeTypes("9K832737",UnicodeTypes,1,3)
	
end sub


function ContainsOnlyUnicodeTypes(st as string,UnicodeTypes,optional zerobasedA as long,optional zerobasedB as long) as boolean
	dim n as long, aLocale,i as long,CharClassification,lenst as long,ub as long, j as long, found as boolean,a as long, b as long

	CharClassification = createUNOService("com.sun.star.i18n.CharacterClassification")
	aLocale = ThisComponent.CharLocale
	lenst= len(st)
	if lenSt >0 then
		'omitted bounds default to the start and end of the string
		a = 0
		b = lenst-1
		if not ismissing(zerobasedA) then a = zerobasedA
		if not ismissing(zerobasedB) then b = zerobasedB
		if (a<0 or b<a or b>= lenst) then exit function

		if IsArray(UnicodeTypes) then 'array of unicode types
			ub = ubound(Unicodetypes)
			ContainsOnlyUnicodeTypes =true

			For i = a to b
				n = CharClassification.getType(st, i, aLocale)
				found = false
				for j = 0 to ub
					if n= UnicodeTypes(j) then
						found = true
						exit for
					end if
				next
				if found = false then
					ContainsOnlyUnicodeTypes = false
					exit for
				end if
			Next

		else

			ContainsOnlyUnicodeTypes =true
			For i = a to b
				n = CharClassification.getType(st, i, aLocale)
				if n<> UnicodeTypes then
					ContainsOnlyUnicodeTypes = false
					exit for
				end if
			Next

		end if
	end if
End function



function ContainsUnicodeTypes(st as string,UnicodeTypes,optional zerobasedA as long,optional zerobasedB as long) as boolean
	dim n as long, aLocale,i as long,CharClassification,lenst as long,ub as long, j as long,a as long, b as long

	CharClassification = createUNOService("com.sun.star.i18n.CharacterClassification")
	aLocale = ThisComponent.CharLocale
	lenst= len(st)
	if lenSt >0 then
		'omitted bounds default to the start and end of the string
		a = 0
		b = lenst-1
		if not ismissing(zerobasedA) then a = zerobasedA
		if not ismissing(zerobasedB) then b = zerobasedB
		if (a<0 or b<a or b>= lenst) then exit function

		if IsArray(UnicodeTypes) then 'array of unicode types
			ub = ubound(Unicodetypes)

			For i = a to b
				n = CharClassification.getType(st, i, aLocale)
				for j = 0 to ub
					if n= UnicodeTypes(j) then
						ContainsUnicodeTypes = true
						exit function
					end if
				next
			Next

		else

			For i = a to b
				n = CharClassification.getType(st, i, aLocale)
				if n= UnicodeTypes then
					ContainsUnicodeTypes = true
					exit for
				end if
			Next
		end if
	end if
End function


Post by JeJe »

CharacterClassification's parseAnyToken offers another way to write an IsAlpha function.

Code: Select all

Option Explicit

'https://www.openoffice.org/api/docs/common/ref/com/sun/star/i18n/KParseTokens.html

sub testContainsOnlyParseTokens

	dim parseTokens
	with com.sun.star.i18n.KParseTokens 'For IsAlpha result use below tokens perhaps
		parseTokens = .ASC_UPALPHA or .ASC_LOALPHA or .UNI_UPALPHA or .UNI_LOALPHA
	end with

	msgbox ContainsOnlyParseTokens("TEoooEREu",parseTokens)
	msgbox ContainsOnlyParseTokens("TEooo EREu",parseTokens)
	msgbox ContainsOnlyParseTokens("6",parseTokens)
	msgbox ContainsOnlyParseTokens("Tu",parseTokens)

end sub

Function ContainsOnlyParseTokens (aText,parseTokens) as boolean
	dim alocale,npos,nStartCharFlags,aUserDefinedCharactersStart,nContCharFlags ,aUserDefinedCharactersCont,ret,CharClassification
	if len(atext) >0 then
		CharClassification = createUNOService("com.sun.star.i18n.CharacterClassification")
		aLocale = ThisComponent.CharLocale
		nPos =0
		nStartCharFlags = parseTokens
		aUserDefinedCharactersStart = ""
		nContCharFlags = parseTokens
		aUserDefinedCharactersCont = ""
		ret = CharClassification.parseAnyToken( aText,nPos,aLocale,nStartCharFlags,aUserDefinedCharactersStart,nContCharFlags,aUserDefinedCharactersCont )
		if ret.tokentype = com.sun.star.i18n.KParseType.IDENTNAME then
			ContainsOnlyParseTokens=( ret.CharLen= len(aText))
		end if
	end if
End function
Or we could trim the function in that code down to just:

Code: Select all


Function ContainsOnlyParseTokens (aText,parseTokens) as boolean
	dim ret,CharClassification
	if len(atext) >0 then
		CharClassification = createUNOService("com.sun.star.i18n.CharacterClassification")
		ret = CharClassification.parseAnyToken( aText,0,ThisComponent.CharLocale,parseTokens,"",parseTokens,"")
		if ret.tokentype = com.sun.star.i18n.KParseType.IDENTNAME then ContainsOnlyParseTokens=( ret.CharLen= len(aText))
	end if
End function
MrProgrammer
Moderator
Posts: 4909
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

Post by MrProgrammer »

In an OpenOffice spreadsheet one can test if a cell's content is alphabetic, alphanumeric, numeric, or hexadecimal using the SEARCH function, as long as the option "Enable regular expressions in formulas" is set. Options are set with OpenOffice → Preferences on a Mac, Tools → Options on other platforms. To test that a cell contains only alphabetic characters, SEARCH looks for a non-alphabetic character, that is, [^[:alpha:]]. If found, the test fails; if not found, the test succeeds.
[Attachment: 202309062050.ods, Tests using SEARCH function]

I think that JeJe's functions are for use in a macro and are not intended to be called by a Calc formula, but I can imagine situations where it's helpful to perform these tests in a spreadsheet.
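
Outside a spreadsheet, the same "search for an offending character" logic can be sketched in Python. This is an illustration only: Python's re module has no [:alpha:] class, so the pattern [\W\d_] (a non-word character, a decimal digit, or an underscore) is used here as an approximation of [^[:alpha:]], and the function name is hypothetical.

```python
import re

# Approximation of [^[:alpha:]]: anything that is not a word character,
# plus decimal digits and the underscore (which \w would otherwise accept).
NON_ALPHA = re.compile(r"[\W\d_]")

def is_alpha_by_search(text):
    """Hypothetical helper: succeed when no non-alphabetic character is found."""
    return NON_ALPHA.search(text) is None

print(is_alpha_by_search("OpenOffice"))   # no offending character is found
print(is_alpha_by_search("Calc4"))        # the digit is found, so the test fails
print(is_alpha_by_search("two words"))    # the space is found, so the test fails
```

As with any test that looks for a counter-example, an empty string passes, so check for that separately if it matters.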
Mr. Programmer
AOO 4.1.7 Build 9800, MacOS 13.6.3, iMac Intel.   The locale for any menus or Calc formulas in my posts is English (USA).

Post by JeJe »

Calc's SEARCH is available for strings via FunctionAccess, but with the same requirement to enable regular expressions under Options.

https://wiki.openoffice.org/wiki/Docume ... H_function
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Post by Villeroy »

RegularExpressions, MatchWholeCell etc. are properties of service FunctionAccess.

Code: Select all

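Function isAlnum(strVar) As Boolean
	' Note: despite the name, the pattern below accepts alphabetic characters only;
	' use "^[[:alnum:]]+$" to accept digits as well.
	Dim ofa, x
	ofa = createUnoService("com.sun.star.sheet.FunctionAccess")
	ofa.RegularExpressions = True
	x = False
	' A match returns its position (non-zero, hence True); otherwise the function returns False.
	On Error Resume Next
	x = ofa.callFunction("SEARCH", Array("^[[:alpha:]]+$", strVar))
	isAlnum = x
End Function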
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice

Post by JeJe »

Interesting. Looking at that, I also notice that FunctionAccess can, in a limited way, find a string that spans more than one paragraph, something you can't do with a regular Writer document search.

E.g., with a Writer document containing this text:
Blah blah

happy




bats
Blah blah
the string from "happy" to "bats", including the paragraph-mark characters, is found.

Code: Select all


	ofa = createUnoService("com.sun.star.sheet.FunctionAccess")
    ofa.RegularExpressions = true
    str2 =thiscomponent.text.string
	STR1 = "happy" & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10)  & "bats"
	x = ofa.callFunction("SEARCH", array(STR1,STR2))
	msgbox x '=14
A regular search yields a void

Code: Select all


 oSearch = thiscomponent.createSearchDescriptor()
 With oSearch
 .SearchString ="happy" & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10) & chr(13) & chr(10)  & "bats"
 .SearchRegularExpression = True
 End With
 oFound = thiscomponent.findFirst(oSearch)
 mri ofound 'void
Edit: testing on OO
karolus
Volunteer
Posts: 1160
Joined: Sat Jul 02, 2011 9:47 am

Post by karolus »

Maybe you should take a look at the string methods that have been around for 20 years in Python, instead of (like the guys from the "script-forge" site) trying to recreate them somehow.
https://docs.python.org/3/library/stdty ... tr.isalnum
AOO4, Libreoffice 6.1 on Rasbian OS (on ARM)
Libreoffice 7.4 on Debian 12 (Bookworm) (on RaspberryPI4)
Libreoffice 7.6 flatpak on Debian 12 (Bookworm) (on RaspberryPI4)

Post by JeJe »

karolus wrote: Fri Sep 08, 2023 10:06 pm
Maybe you should take a look at the string methods that have been around for 20 years in Python, instead of (like the guys from the "script-forge" site) trying to recreate them somehow.
https://docs.python.org/3/library/stdty ... tr.isalnum

Those functions look a bit like my second post: they test the Unicode character types. Mine, however, is a more flexible, general function that accepts any single type or array of types. From a short look, there doesn't seem to be one at that link that does that. I'd never have coded it so quickly if I'd been faffing about with macros in text files rather than the Basic IDE.

Still, using Python is indeed another way...

Post by karolus »

"Except mine is a more flexible general function which allows any choice or array of those"

Are you sure?!

Code: Select all

somestring = "0123456789௨௦௨௧٢٠٢١२०२१"
print(somestring.isnumeric())  # -> True
Besides the digits 0-9, the string contains the number "2021" written in Tamil, Arabic-Indic and Devanagari digits.

Post by JeJe »

Did you look at my second post?

The unicode types are here again:

http://www.openoffice.org/api/docs/comm ... eType.html

How, using the functions at your Python link, can you test that a string consists only of characters from any chosen array of those types?

Post by karolus »

If you need something like "contains-upper", you just use a simple regular expression. So what?

Post by JeJe »

That's not the same thing. The answer is you can't. And my function is more flexible. And you didn't look at it.

Post by karolus »

JeJe wrote: Fri Sep 08, 2023 11:44 pm
The answer is you can't. … And you didn't look at it.

If you believe that... never mind!

No, it's too stupid for me to dig through twelve meters of code, only to then write something generic that does nothing but test a set of numbers to see whether they fit between various upper and lower bounds.

Post by JeJe »

You're protesting loudly.

If I want to see whether a string consists only of UPPERCASE_LETTER, DECIMAL_DIGIT_NUMBER, or MATH_SYMBOL characters, there's no looking through twelve meters of code and nothing else to write except the following simple call to my function:

Code: Select all

	UnicodeTypes = array(com.sun.star.i18n.UnicodeType.UPPERCASE_LETTER,com.sun.star.i18n.UnicodeType.DECIMAL_DIGIT_NUMBER, com.sun.star.i18n.UnicodeType.MATH_SYMBOL  )
	msgbox ContainsOnlyUnicodeTypes("HHH∂∃∄∅888",UnicodeTypes)
It's more flexible and concise to have one generic sub or function than an individual sub for each possible combination. That would be a lot of subs, and only a few of them are available at your link.
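
For comparison, the same generic idea can be sketched in Python with the standard unicodedata module. The function names are hypothetical, and the sketch assumes that the general categories "Lu", "Nd" and "Sm" correspond to UPPERCASE_LETTER, DECIMAL_DIGIT_NUMBER and MATH_SYMBOL; it is not claimed to classify every character exactly as CharacterClassification does.

```python
import unicodedata

def contains_only_categories(text, categories):
    """True if text is non-empty and every character is in one of the categories."""
    return bool(text) and all(unicodedata.category(ch) in categories for ch in text)

def contains_any_category(text, categories):
    """True if at least one character of text is in one of the categories."""
    return any(unicodedata.category(ch) in categories for ch in text)

wanted = {"Lu", "Nd", "Sm"}                      # uppercase, decimal digit, math symbol
sample = "HHH\u2202\u2203\u2204\u2205888"        # the string from the call above
print(contains_only_categories(sample, wanted))  # every character is Lu, Sm or Nd
print(contains_only_categories("HHh", wanted))   # the lowercase h is Ll, so this fails
print(contains_any_category("9K832737", {"Lu", "Ll"}))  # K is an uppercase letter
```

Passing the categories as a set keeps the call as short as the Basic one, whichever combination is needed.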

Post by karolus »

Hello
For example, I hate having to scroll right and left to read this code, so the first thing I would do (if I did anything at all):
[Image: thats_python.png]