Incorporated RegEx engine: How to get full access?

Posted: Sat Jul 02, 2016 11:35 am
by Lupp
Is there an interface giving full access to the incorporated RegEx engine (ICU or a simplified derivative?), not only to the parameters controlled via a "F&R" dialogue?

(Alternatively:)

Is there an extension supplying additional functions for Calc that use RegEx on an advanced level, doing an efficient "F&R" based on RegEx, for example an enhanced

Code:

SUBSTITUTE.REGEX(TextToWorkOn, SearchFor, ReplaceWith, OptionsAndFlags)

and

Code:

SEARCH.REGEX(RegExToSearchFor, TextToSearchIn, ArrayOfMatches)

where ArrayOfMatches gives at least the positions and lengths of the matches and the information captured when parentheses were used in the RegEx?

Is there an external program based on the ICU RegEx engine, controllable in a handy way from user code for AOO (or LibO)?

Re: Incorporated RegEx engine: How to get full access?

Posted: Sat Jul 02, 2016 1:14 pm
by Villeroy
It might be easy (without diving into the sources) to build an extension that calls the programming language's regex functions from callable add-in functions. Python, for instance, supports Perl-ish regexes out of the box. The difficult part would be the packaging rather than the programming.
With HSQL2 you can use Java regexes to split and replace strings. Just save your strings in a database. It's not as difficult as packaging UNO extensions.
http://hsqldb.org/doc/guide/builtinfunc ... _functions starting with REGEXP_
Example DB: Splitting strings using SQL regexp functions in HSQL 2.3.3
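
As a rough illustration of the Python route mentioned above, the two functions asked for in the first post could be sketched with the standard re module roughly as follows. The names substitute_regex and search_regex, the argument order and the flag letters "i", "m" and "s" are made up for this sketch, and the add-in packaging (the hard part) is not shown. Note that this uses Python's own engine, not ICU, so pattern syntax may differ in details.

```python
import re


def _compile(pattern, flags=""):
    """Compile pattern; the flag letters are an assumption of this sketch."""
    re_flags = 0
    if "i" in flags:
        re_flags |= re.IGNORECASE
    if "m" in flags:
        re_flags |= re.MULTILINE
    if "s" in flags:
        re_flags |= re.DOTALL
    return re.compile(pattern, re_flags)


def substitute_regex(text, search_for, replace_with, flags=""):
    """Replace every match; replace_with may use \\1, \\2, ... or \\g<0>."""
    return _compile(search_for, flags).sub(replace_with, text)


def search_regex(search_for, text, flags=""):
    """Return one row per match: (position, length, group 1, group 2, ...).

    Positions are 0-based; groups that did not take part in the match are None.
    """
    rows = []
    for m in _compile(search_for, flags).finditer(text):
        rows.append((m.start(), m.end() - m.start()) + m.groups())
    return rows


if __name__ == "__main__":
    print(substitute_regex("ABCdefg", "(c)", r"$\1%", "i"))
    print(search_regex(r"(\d+)-(\d+)", "10-20 and 3-4"))
```

Returning one row per match keeps the result easy to hand back to Calc as an array, which is close to the ArrayOfMatches idea from the first post.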

Re: Incorporated RegEx engine: How to get full access?

Posted: Sat Jul 02, 2016 2:34 pm
by Lupp
Thanks for your interest and the links!

Re: Incorporated RegEx engine: How to get full access?

Posted: Sun Jul 03, 2016 7:12 am
by MrProgrammer
I have not investigated "[Calc][oxt] A function for all python string methods", but it might be worth a look.

Re: Incorporated RegEx engine: How to get full access?

Posted: Sun Jul 03, 2016 9:54 pm
by Lupp
Thank you for your interest. I am actually not happy with the idea of additionally depending on software provided by an independent foundation, which may be or may become incompatible with something incorporated in AOO (or LibO). Now and then I have come across a notice that under a new version of Python something no longer worked as expected...
Do you know the RegEx engine used by Python? Is it developed and maintained by the Python project itself, or by someone else?
Does anyone know for certain that the UNO API does not provide an object (interface) offering at least a relevant subset of RegEx-related methods, covering the functionality of the functions/methods described in sections 7.2.2 through 7.2.5 of https://docs.python.org/2/library/re.html ? I could not find one, but when relying on the UNO API documentation I rarely feel that my results are trustworthy. (Bureaucratic and often uninformative, in my understanding.)

Re: Incorporated RegEx engine: How to get full access?

Posted: Mon Jul 04, 2016 5:20 am
by MrProgrammer
Lupp wrote: Do you know the RegEx engine used by Python?
I don't use Python myself, but apparently the language provides the re library (Secret Labs' Regular Expression engine = SRE) in its standard library. There is also a separate regex library, which is a third-party package, not part of the standard library and not PCRE.

Re: Incorporated RegEx engine: How to get full access?

Posted: Mon Jul 04, 2016 10:10 am
by Lupp
Thanks again!

Re: Incorporated RegEx engine: How to get full access?

Posted: Mon Jul 04, 2016 11:49 pm
by Villeroy
All these extensions that work with some office versions and don't work with others (pystring: Err:504) are perfect examples of why virtually nobody develops anything serious for Open/LibreOffice.
Another extension that came up today is the AdjustRowHeight extension. I can read the source code. I can read the related documentation. I stare at it over and over again. For the life of me, I do not understand how it works or how one gets the idea to put things together like this. This is far too complicated. I have never seen anything that complicated when writing add-ins for MS Office.

Re: Incorporated RegEx engine: How to get full access?

Posted: Tue Jul 05, 2016 12:51 pm
by hanya
Villeroy wrote: Another extension that came up today is the AdjustRowHeight extension. I can read the source code. I can read the related documentation. I stare at it over and over again. For the life of me, I do not understand how it works or how one gets the idea to put things together like this. This is far too complicated. I have never seen anything that complicated when writing add-ins for MS Office.
I'm the one who wrote that strange extension; it is on my GitHub. Just a few lines are enough to provide such a function with macros. But I tried to provide a way to see the state of the IsAdjustHeightEnabled property in the menu entry. If we want to add such an entry, we need a protocol handler that supports feature events (this can be seen in the Dispatcher._notify method). It was just a small experiment to test my knowledge of extension creation.

Re: Incorporated RegEx engine: How to get full access?

Posted: Tue Jul 05, 2016 11:41 pm
by Villeroy
Of course I know who writes this stuff. In most cases it's you. MS Office has millions of developers who are able to turn ideas into office add-ins.
Just one question: I want LibreOffice to handle hyperlinks with the protocol vnd.sun.star.script: like OpenOffice does. Could I use your protocol handler as a template? Could I replace the Python code, some names and mytools.calc.AdjustHeight:* with vnd.sun.star.script:* and get a working extension which calls macros from hyperlinks?

Re: Incorporated RegEx engine: How to get full access?

Posted: Wed Jul 06, 2016 4:16 pm
by hanya
Villeroy wrote: Of course I know who writes this stuff. In most cases it's you. MS Office has millions of developers who are able to turn ideas into office add-ins.
Just one question: I want LibreOffice to handle hyperlinks with the protocol vnd.sun.star.script: like OpenOffice does. Could I use your protocol handler as a template? Could I replace the Python code, some names and mytools.calc.AdjustHeight:* with vnd.sun.star.script:* and get a working extension which calls macros from hyperlinks?
See the following file, which declares the protocol handlers in the office: http://opengrok.adfinis-sygroup.org/sou ... ler.xcu#54
We can see vnd.sun.star.script and other protocols defined there together with their protocol handlers. So such a thing can be implemented as a URI handled by its own protocol handler. My extension can be used as a template for such an extension. It was released into the public domain.

Re: Incorporated RegEx engine: How to get full access?

Posted: Thu Jul 14, 2016 8:33 pm
by hanya
There is the com.sun.star.util.TextSearch service, which provides a way to use the internal regular expression engine through the API. A substitution function can be written with it, like the following in Basic. I have not tested the code well, so it might contain some bugs.

Code:

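Sub Main
  ' Case-insensitive (flags = 1): "(c)" matches the "C" in "ABCdefg".
  a = RegExSubst("ABCdefg", "(c)", "$\1%", 1)
  MsgBox a  ' should show AB$C%defg
End Sub


' Replaces the first match of searchForStr in targetStr with replaceStr.
' In replaceStr, \0 stands for the whole match and \1, \2, ... for the groups.
' flags: bit 1 = case-insensitive, passed to the engine as an inline "(?i)".
Function RegExSubst(targetStr as string, searchForStr as string, replaceStr as string, flags as integer) As String
  Dim options as new com.sun.star.util.SearchOptions
  options.algorithmType = com.sun.star.util.SearchAlgorithms.REGEXP
  flagOp = ""
  if (flags and 1) = 1 then
    flagOp = "i"
  end if
  flagStr = ""
  if len(flagOp) > 0 then
    flagStr = "(?" & flagOp & ")"
  end if
  ' The field of the SearchOptions struct is named searchString.
  options.searchString = flagStr & searchForStr
  ts = CreateUnoService("com.sun.star.util.TextSearch")
  ts.setOptions(options)
  r = ts.searchForward(targetStr, 0, Len(targetStr))
  if r.subRegExpressions = 0 then
    ' No match: return the input unchanged.
    RegExSubst = targetStr
  else
    ' Index 0 is the whole match, the following indexes are the groups.
    subCount = r.subRegExpressions
    replaceText = replaceStr
    i = 0
    start = 1
    Do
      ref = "\" & CStr(i)
      n = InStr(start, replaceText, ref)
      if n > 0 then
        ' Length of group i is its own end offset minus its own start offset.
        refStr = mid(targetStr, r.startOffset(i) + 1, r.endOffset(i) - r.startOffset(i))
        replaceText = left(replaceText, n - 1) & refStr & right(replaceText, len(replaceText) - (n + len(ref)) + 1)
        ' Continue behind the inserted text to look for further \i references.
        start = n + len(refStr)
      else
        ' No more \i references: go on with the next group, from the beginning.
        i = i + 1
        start = 1
      end if
    Loop While i < subCount

    RegExSubst = left(targetStr, r.startOffset(0)) & replaceText & mid(targetStr, r.endOffset(0) + 1, 65535)
  end if
End Function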
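
The back-reference expansion in the Basic function above can also be tried outside the office. The following Python sketch mirrors the same idea, with the spans of a re match object standing in for the startOffset/endOffset arrays of the search result. The function name regex_subst_first is made up for this sketch, only single-digit references \0 to \9 are handled, and Python's re is not the ICU engine, so this only illustrates the substitution logic, not the engine's behaviour.

```python
import re


def regex_subst_first(target, search_for, replace_with, ignore_case=False):
    """Replace the first match only; \\0 is the whole match, \\1..\\9 are groups."""
    m = re.search(search_for, target, re.IGNORECASE if ignore_case else 0)
    if m is None:
        return target  # no match: return the input unchanged

    # Offsets of the whole match (index 0) and of every group, comparable to
    # startOffset(i) / endOffset(i) in the Basic code.
    spans = [m.span(i) for i in range(m.re.groups + 1)]

    def expand(ref):
        i = int(ref.group(1))
        if i >= len(spans):
            return ref.group(0)  # unknown group: leave the reference as it is
        start, end = spans[i]
        # A group that did not take part in the match has the span (-1, -1).
        return target[start:end] if start >= 0 else ""

    # One left-to-right pass, so inserted text is never scanned again.
    replacement = re.sub(r"\\(\d)", expand, replace_with)
    return target[:spans[0][0]] + replacement + target[spans[0][1]:]


if __name__ == "__main__":
    print(regex_subst_first("ABCdefg", "(c)", r"$\1%", ignore_case=True))
```

Doing the expansion in a single pass avoids one pitfall of replacing group by group: text copied in from one group that happens to look like a reference to another group is left alone.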