Incorporated RegEx engine: How to get full access?

Lupp
Volunteer
Posts: 3552
Joined: Sat May 31, 2014 7:05 pm
Location: München, Germany

Incorporated RegEx engine: How to get full access?

Post by Lupp »

Is there an interface giving full access to the incorporated RegEx engine (ICU, or a simplified derivative?) rather than only to the parameters exposed by the Find & Replace ("F&R") dialog?

(Alternatively:)

Is there an extension supplying additional Calc functions that use RegEx at an advanced level for efficient "F&R", for example an enhanced

SUBSTITUTE.REGEX(TextToWorkOn, SearchFor, ReplaceWith, OptionsAndFlags)

and

SEARCH.REGEX(RegExToSearchFor, TextToSearchIn, ArrayOfMatches)

where ArrayOfMatches gives at least the positions and lengths of the matches and the information captured when parentheses were used in the RegEx?

Is there an external program based on the ICU RegEx engine, controllable in a handy way from user code for AOO (or LibO)?
On Windows 10: LibreOffice 24.2 (new numbering) and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Incorporated RegEx engine: How to get full access?

Post by Villeroy »

It might be easy (without diving into the sources) to build an extension that exposes the programming language's regex functions as callable add-in functions. Python, for instance, supports Perl-ish regexes out of the box. The difficult part would be the packaging rather than the programming.
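As a rough illustration of the Python route, the two functions asked for in the opening post could be sketched with the standard re module as follows. The names substitute_regex and search_regex and the flag encoding (bit 1 = ignore case) are assumptions made for this example, not an existing Calc add-in, and the packaging as an extension is left out entirely.

```python
import re


def _compile(pattern, flags=0):
    # Hypothetical flag encoding for this sketch: bit 1 = ignore case.
    re_flags = re.IGNORECASE if flags & 1 else 0
    return re.compile(pattern, re_flags)


def substitute_regex(text, search_for, replace_with, flags=0):
    """Replace every match; replace_with may use \\1, \\2, ... for groups."""
    return _compile(search_for, flags).sub(replace_with, text)


def search_regex(search_for, text, flags=0):
    """Return one (start, length, groups) tuple per match."""
    return [
        (m.start(), m.end() - m.start(), m.groups())
        for m in _compile(search_for, flags).finditer(text)
    ]


print(substitute_regex("ABCdefg", "(c)", r"$\1%", flags=1))
print(search_regex(r"(\d+)-(\d+)", "10-20 and 300-4"))
```

Returning plain tuples keeps the result easy to map onto a cell range later: one row per match, with position, length and the captured groups.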
With HSQL2 you can use Java regexes to split and replace strings. Just save your strings in a database. It's not as difficult as packaging UNO extensions.
See the functions starting with REGEXP_ at http://hsqldb.org/doc/guide/builtinfunc ... _functions
Example DB: Splitting strings using SQL regexp functions in HSQL 2.3.3
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice

Re: Incorporated RegEx engine: How to get full access?

Post by Lupp »

Thanks for your interest and the links!
MrProgrammer
Moderator
Posts: 4906
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

Re: Incorporated RegEx engine: How to get full access?

Post by MrProgrammer »

I have not investigated the extension "[Calc][oxt] A function for all python string methods", but it might be worth a look.
Mr. Programmer
AOO 4.1.7 Build 9800, MacOS 13.6.3, iMac Intel.   The locale for any menus or Calc formulas in my posts is English (USA).

Re: Incorporated RegEx engine: How to get full access?

Post by Lupp »

Thank you for your interest. I am actually not happy with the idea of additionally depending on software that is provided by an independent foundation and that may be, or may become, incompatible with something incorporated in AOO (or LibO). Now and then I have come across a notice that something no longer worked as expected under a new version of Python...
Do you know which RegEx engine Python uses? Is it developed and maintained by the Python project itself, or by someone else?
Does anyone know for certain that the UNO API does not provide an object (interface) offering at least a relevant subset of RegEx-related methods, covering the functionality of the functions/methods described in paragraphs 7.2.2 through 7.2.5 of https://docs.python.org/2/library/re.html ? I could not find one, but when relying on the UNO API documentation I rarely feel that my results are trustworthy. (In my understanding it is bureaucratic and often uninformative.)

Re: Incorporated RegEx engine: How to get full access?

Post by MrProgrammer »

Lupp wrote:Do you know the RegEx engine used by Python?
I don't use Python myself, but apparently its standard library provides the re module (Secret Labs' Regular Expression engine = SRE), which is maintained as part of Python itself. There is also a separate regex module, but that is a third-party package and not part of the standard distribution.

Re: Incorporated RegEx engine: How to get full access?

Post by Lupp »

Thanks again!

Re: Incorporated RegEx engine: How to get full access?

Post by Villeroy »

All these extensions that work with one office version and fail with another (pystring: Err:504) are perfect examples of why virtually nobody develops anything serious for Open/LibreOffice.
Another extension that came up today is AdjustRowHeight extension. I can read the source code. I can read the related documentation. I stare at it over and over again. For the life of me, I do not understand how it works or how one gets the idea to put things together like this. This is by far too complicated. I have never seen anything that complicated when writing add-ins for MS Office.
hanya
Volunteer
Posts: 885
Joined: Fri Nov 23, 2007 9:27 am
Location: Japan

Re: Incorporated RegEx engine: How to get full access?

Post by hanya »

Villeroy wrote:Another extension that came up today is AdjustRowHeight extension. I can read the source code. I can read the related documentation. I stare at it over and over again. For the life of me, I do not understand how it works or how one gets the idea to put things together like this. This is by far too complicated. I have never seen anything that complicated when writing add-ins for MS Office.
I'm the one who wrote that strange extension; it is on my GitHub. A few lines of macro code are enough to provide such a function. But I also wanted the menu entry to show the state of the IsAdjustHeightEnabled property. To add such an entry, we need a protocol handler that supports feature events (see the Dispatcher._notify method). It was just a small experiment to test my knowledge of extension creation.
Please, edit this thread's initial post and add "[Solved]" to the subject line if your problem has been solved.
Apache OpenOffice 4-dev on Xubuntu 14.04

Re: Incorporated RegEx engine: How to get full access?

Post by Villeroy »

Of course I know who writes this stuff. In most cases it's you. MS Office has millions of developers who are able to turn ideas into office add-ins.
Just one question: I want LibreOffice to handle hyperlinks with the protocol vnd.sun.star.script: like OpenOffice does. Could I use your protocol handler as a template? Could I replace the Python code, some names and mytools.calc.AdjustHeight:* with vnd.sun.star.script:* and get a working extension which calls macros from hyperlinks?

Re: Incorporated RegEx engine: How to get full access?

Post by hanya »

Villeroy wrote:Of course I know who writes this stuff. In most cases it's you. MS Office has millions of developers who are able to turn ideas into office add-ins.
Just one question: I want LibreOffice to handle hyperlinks with the protocol vnd.sun.star.script: like OpenOffice does. Could I use your protocol handler as a template? Could I replace the Python code, some names and mytools.calc.AdjustHeight:* with vnd.sun.star.script:* and get a working extension which calls macros from hyperlinks?
See the following file, which declares the protocol handlers in the office: http://opengrok.adfinis-sygroup.org/sou ... ler.xcu#54
There we can see vnd.sun.star.script and other protocols defined together with their protocol handlers. So a URI scheme can be implemented by registering a protocol handler for it. My extension can be used as a template for such an extension; it was released into the public domain.

Re: Incorporated RegEx engine: How to get full access?

Post by hanya »

There is the com.sun.star.util.TextSearch service, which provides a way to use the internal regular expression engine through the API. A substitution function can be written with it in Basic, like the following. I have not tested the code well, so it might contain some bugs.

Code: Select all

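Sub Main
  ' Replace the first case-insensitive match of "(c)" in "ABCdefg";
  ' "\1" in the replacement stands for capture group 1.
  a = RegExSubst("ABCdefg", "(c)", "$\1%", 1)
  MsgBox a
End Sub


' Replaces the first match of searchForStr in targetStr with replaceStr.
' "\0" .. "\n" in replaceStr refer to the whole match and its groups.
' flags: bit 1 = ignore case (prepended to the pattern as "(?i)").
Function RegExSubst(targetStr as string, searchForStr as string, replaceStr as string, flags as integer) As String
  Dim options as new com.sun.star.util.SearchOptions
  options.algorithmType = com.sun.star.util.SearchAlgorithms.REGEXP
  flagOp = ""
  if (flags and 1) = 1 then
    flagOp = "i"
  end if
  flagStr = ""
  if len(flagOp) > 0 then
    flagStr = "(?" & flagOp & ")"
  end if
  options.searchForString = flagStr & searchForStr
  ts = CreateUnoService("com.sun.star.util.TextSearch")
  ts.setOptions(options)
  r = ts.searchForward(targetStr, 0, Len(targetStr))
  if r.subRegExpressions = 0 then
    ' No match: return the input unchanged.
    RegExSubst = targetStr
  else
    subCount = r.subRegExpressions
    replaceText = replaceStr
    i = 0
    start = 1
    Do
      ref = "\" & CStr(i)
      n = InStr(start, replaceText, ref)
      if n > 0 then
        ' Offsets are 0-based, Mid is 1-based; the length is end minus start of the same group.
        refStr = mid(targetStr, r.startOffset(i) + 1, r.endOffset(i) - r.startOffset(i))
        replaceText = left(replaceText, n - 1) & refStr & mid(replaceText, n + len(ref))
        ' Continue behind the inserted text so that it is not scanned again for this reference.
        start = n + len(refStr)
      else
        ' No further occurrence of this reference: go on with the next group from the beginning.
        i = i + 1
        start = 1
      end if
    Loop While i < subCount

    RegExSubst = left(targetStr, r.startOffset(0)) & replaceText & mid(targetStr, r.endOffset(0) + 1)
  end if
End Function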
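
To cross-check the back-reference handling of a routine like RegExSubst outside the office, the same steps (find the first match, expand \0 to \n from the group offsets, splice the result into the target) can be mirrored in Python. This is only a sketch: it uses Python's re engine rather than the office's own, so pattern syntax may differ in details, and the name regex_subst is made up for this example.

```python
import re


def regex_subst(target, search_for, replace, flags=0):
    # Bit 1 of flags requests case-insensitive matching, as in the Basic code.
    m = re.compile(search_for, re.IGNORECASE if flags & 1 else 0).search(target)
    if m is None:
        return target  # no match: return the input unchanged

    def expand(ref):
        # ref is the match object for one "\<digits>" reference in `replace`.
        index = int(ref.group(1))
        if index > m.re.groups:
            return ref.group(0)  # unknown group: keep the text as written
        start, end = m.span(index)
        return target[start:end] if start >= 0 else ""

    # Expand all references in one pass so inserted text is never rescanned.
    replacement = re.sub(r"\\(\d+)", expand, replace)
    return target[:m.start()] + replacement + target[m.end():]


print(regex_subst("ABCdefg", "(c)", r"$\1%", 1))
```

Unlike the Basic loop, this version expands all references in a single pass, so group text that happens to contain something like "\2" is not expanded a second time.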