[Calc] Split string into an array of single words

Creating a macro - Writing a Script - Using the API

[Calc] Split string into an array of single words

Postby jari0471 » Fri Aug 13, 2010 4:46 am

Split string into an array of single words (similar to separation of "arguments" in shell command line)

Code: Select all   Expand viewCollapse view
Sub test
  Dim p As String
  p1="aa bb  cc   dd"
  a1=Split(p1," ")
  Print a1(0)
  Print a1(1)
  Print a1(2)
  Print a1(3)
End Sub

Displayed is "aa", "bb", " " and "cc".
Displayed should be "aa", "bb", "cc" and "dd".

There is 2 spaces between "bb" and "cc" and 3 spaces between "cc" and "dd" in p1.
Is there another command or different option for this command that can do separation of words (arguments)?
If a large file is read to calc sheet, it will take a long time if the splitting function is slow. Thus fast (and simple) operation is requested.
Last edited by jari0471 on Fri Aug 13, 2010 6:59 am, edited 1 time in total.
Open Office 3.2.1
Red Hat Enterprise Linux Client release 5.4 (Tikanga)
jari0471
 
Posts: 8
Joined: Fri Aug 13, 2010 4:21 am

Re: Split string into an array of single words

Postby Charlie Young » Fri Aug 13, 2010 5:39 am

I get aa bb cc dd running the code copied exactly from above, but where are the strings coming from? If they are in, or could be gotten into spreadsheet cells, you could use Data > Text to Columns.
Apache OpenOffice 4.1.1
Windows XP
User avatar
Charlie Young
Volunteer
 
Posts: 1559
Joined: Fri May 14, 2010 1:07 am

Re: Split string into an array of single words

Postby jari0471 » Fri Aug 13, 2010 6:55 am

The forum seemed to remove additional "spaces" between the quotes.
Last edited by jari0471 on Fri Aug 13, 2010 7:09 am, edited 1 time in total.
Open Office 3.2.1
Red Hat Enterprise Linux Client release 5.4 (Tikanga)
jari0471
 
Posts: 8
Joined: Fri Aug 13, 2010 4:21 am

Re: Split string into an array of single words

Postby jari0471 » Fri Aug 13, 2010 6:59 am

There is 2 spaces between "bb" and "cc" and 3 spaces between "cc" and "dd" in p1 (the forum posting seems to remove the additional spaces automatically).
Open Office 3.2.1
Red Hat Enterprise Linux Client release 5.4 (Tikanga)
jari0471
 
Posts: 8
Joined: Fri Aug 13, 2010 4:21 am

Re: Split string into an array of single words

Postby RoryOF » Fri Aug 13, 2010 7:15 am

jari0471 wrote:There is 2 spaces between "bb" and "cc" and 3 spaces between "cc" and "dd" in p1 (the forum posting seems to remove the additional spaces automatically).


In that case, and in all cases where the layout/spacing is important, you should post using the "Code" option in the Full Editor window. This option is not available in the Quick Reply window - you have to switch to Full Editor.
Apache OpenOffice 4.1.7 on Xubuntu 18.04.3 (mostly 64 bit version) and very infrequently on Win2K/XP
User avatar
RoryOF
Moderator
 
Posts: 29861
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: Split string into an array of single words

Postby FJCC » Fri Aug 13, 2010 7:27 am

Here is a not-very-elegant solution for the example given
Code: Select all   Expand viewCollapse view
Sub test
p1="aa bb  cc   dd"
a1=Split(p1," ")
count = 0
Dim a2(ubound(a1))
For Each Word in a1
   If Word <> "" then
      a2(count) = Word
      count = count+1
   end if
next Word
Redim Preserve a2(count)
Print a2(0)
Print a2(1)
Print a2(2)
Print a2(3)
End Sub
Windows 10 and Linux Mint, since 2017
If your question is answered, please go to your first post, select the Edit button, and add [Solved] to the beginning of the title.
FJCC
Moderator
 
Posts: 7376
Joined: Sat Nov 08, 2008 8:08 pm
Location: Colorado, USA

Re: Split string into an array of single words

Postby jari0471 » Fri Aug 13, 2010 8:20 am

The solution you give can count the number of "words", but can be come "painfully" slow do to the for-loop, when there is 1000's of strings to work with... (Thus I try find internal function, java -code or any other solution that gives very fast speed also).
Open Office 3.2.1
Red Hat Enterprise Linux Client release 5.4 (Tikanga)
jari0471
 
Posts: 8
Joined: Fri Aug 13, 2010 4:21 am

Re: Split string into an array of single words

Postby RoryOF » Fri Aug 13, 2010 9:08 am

If you take your strings into a word processor, such as Writer, its Find and Replace can quickly simplify multiple spaces to single spaces, then change them to new lines or other delimiters, so that the file can be reopened in Calc using the CSV import mechanism.
Apache OpenOffice 4.1.7 on Xubuntu 18.04.3 (mostly 64 bit version) and very infrequently on Win2K/XP
User avatar
RoryOF
Moderator
 
Posts: 29861
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: Split string into an array of single words

Postby jari0471 » Fri Aug 13, 2010 9:47 am

The idea using writer is possible. However the text is read from a file "iFile" with Line Input (that gives single string for a line).
The original file(s) should not be modified, since they are data files.
Code: Select all   Expand viewCollapse view
 
  iFile = Freefile
  Open sPath+"/"+sFile For Input As iFile

  While (not eof(iFile))

    Line Input #iFile,sValue
    sValue=Trim(sValue)
    If sValue<>"" Then

Maybe there is another way reading the files that can do the splitting the words.
Some of the lines (documentation, etc. other data) in the original file are not intended to be used, thus the read-routine is complicated.
Open Office 3.2.1
Red Hat Enterprise Linux Client release 5.4 (Tikanga)
jari0471
 
Posts: 8
Joined: Fri Aug 13, 2010 4:21 am

Re: Split string into an array of single words

Postby Hagar Delest » Fri Aug 13, 2010 10:06 am

jari0471 wrote:The forum seemed to remove additional "spaces" between the quotes.

Use BBCode (I've edited your first post).
AOO 4.1.7 on Xubuntu 19.10 and 4.1.5 on Windows 10 (with winPenPack port).
User avatar
Hagar Delest
Moderator
 
Posts: 28620
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: [Calc] Split string into an array of single words

Postby RoryOF » Fri Aug 13, 2010 11:45 am

The idea using writer is possible. However the text is read from a file "iFile" with Line Input (that gives single string for a line).
The original file(s) should not be modified, since they are data files


In that case, you should be working on copies of the files, not on the originals. The Writer path then seems most practical.
Apache OpenOffice 4.1.7 on Xubuntu 18.04.3 (mostly 64 bit version) and very infrequently on Win2K/XP
User avatar
RoryOF
Moderator
 
Posts: 29861
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: [Calc] Split string into an array of single words

Postby RPG » Fri Aug 13, 2010 12:45 pm

Hello

Maybe you can use the function replace. It is not document but I believe it is normal function. I'm not sure if it is only in the go-oo version or in all version of OOo.
As far I knew it can have six parameters
a) the string
b) The new part
c) the old part
d) where to start
e) How often to change a part
f) Case sensitive

Code: Select all   Expand viewCollapse view
Sub test
  Dim p As String
  p1="aa bb    cc   dd"
  p1=replace(p1,"  "," ")
  a1=Split(p1," ")
  msgbox  a1(0) & chr(13) & _
   a1(1) & chr(13) & _
   a1(2) & chr(13) & _
   a1(3)
End Sub



Romke
LibreOffice 6.2.5.2 on openSUSE Leap 15
RPG
Volunteer
 
Posts: 2176
Joined: Tue Apr 14, 2009 7:15 pm
Location: Netherlands

Re: [Calc] Split string into an array of single words

Postby B Marcelly » Fri Aug 13, 2010 2:37 pm

Hi,
I took this as a recreation challenge :P

Here is a general solution, it works with words separated by any length of space.
It tries to replace a long series of spaces by one space, then tries with a shorter series, etc.
See how powerful is split/join 8-)
Code: Select all   Expand viewCollapse view
Dim phrase As String
phrase = "hello OOoBasic programmers, this   is       a  solution"

Dim nbSpaces As Long, v As Variant

nbSpaces = 5  ' initial value, greater or equal to 2
Do While nbSpaces > 1
  v = split(phrase, space(nbSpaces))
  if UBound(v()) = 0  then
    nbSpaces = nbSpaces -1
  else
    phrase = join(v, " ")
  end if
Loop
v = split(phrase, " ")
' *** display of vector v() ***
MsgBox(join(v, chr(13)), 0, "List of words")

The initial value of nbSpaces can optimize the number of loops, if you know what length of spaces you may usually have. But even a start value of 2 can do the job.
Bernard

OpenOffice.org 1.1.5 / Apache OpenOffice 4.1.1 / LibreOffice 5.0.5
MS-Windows 7 Home SP1
B Marcelly
Volunteer
 
Posts: 1160
Joined: Mon Oct 08, 2007 1:26 am
Location: France, Paris area

Re: [Calc] Split string into an array of single words

Postby Charlie Young » Fri Aug 13, 2010 8:10 pm

OK, here's my attempt, which doesn't use the Basic Split function. I, um, borrowed Bernard's trick for displaying the results.

Code: Select all   Expand viewCollapse view
Sub TestStrSplit
   Dim a
   Dim s As String
   Dim i As Integer
   
   s = "hello OOoBasic programmers, this   is       another  solution"

   a = StrSplit(s," ")
   MsgBox(join(a,chr(13))
End Sub

Function StrSplit(s As String, SplitChar As String)
   Dim i As Integer
   Dim n As Integer
   Dim m As String
   Dim t As String
   Dim a(0) As String
   Dim InWord As Boolean
      
   t =""
   InWord = False
   i = 1
   n = 0
      
   do while i   <=   len(s)
      m = mid(s,i,1)
      if m <> SplitChar then
         InWord = True
         t = t & m
      else
         if InWord then
            Redim Preserve a(n) As String
            a(n) = t
            n = n + 1
            t = ""
            InWord = False
         endif
      endif
      i = i + 1
   loop
   if InWord then
      Redim Preserve a(n) As String
      a(n) = t
   endif
   
   StrSplit = a
End Function
Apache OpenOffice 4.1.1
Windows XP
User avatar
Charlie Young
Volunteer
 
Posts: 1559
Joined: Fri May 14, 2010 1:07 am

Re: [Calc] Split string into an array of single words

Postby jari0471 » Sat Aug 14, 2010 6:07 am

Since the original data is in text file (and used line input to place it to a string line by line),
it could be also possible to read the original text file word by word and no split is required at all.
But I could not find how to do this either.
Open Office 3.2.1
Red Hat Enterprise Linux Client release 5.4 (Tikanga)
jari0471
 
Posts: 8
Joined: Fri Aug 13, 2010 4:21 am

Re: [Calc] Split string into an array of single words

Postby eddyparkinson » Tue May 30, 2017 8:46 am

Bug fix - handle blank cells
Code: Select all   Expand viewCollapse view
REM  *****  BASIC  *****
Sub TestStrSplit_clone
   Dim a
   Dim s As String
   Dim i As Integer
   
   s = "hello OOoBasic programmers, this   is       another  solution"

   a = StrSplit(s," ")
   MsgBox(join(a,chr(13))
   a = StrSplit(Nothing," ")
   MsgBox(join(a,chr(13))
End Sub

Function StrSpliter(s, SplitChar As String)
  StrSpliter = "a"
End Function


Function StrSplit(SplitThis, SplitChar As String)
   Dim s As String

   if IsNull(SplitThis) or IsEmpty(SplitThis) then s = ""  else s =  SplitThis
   Dim i As Integer
   Dim n As Integer
   Dim m As String
   Dim t As String
   Dim a(0) As String
   Dim InWord As Boolean
     
   t =""
   InWord = False
   i = 1
   n = 0
     
   do while i   <=   len(s)
      m = mid(s,i,1)
      if m <> SplitChar then
         InWord = True
         t = t & m
      else
         if InWord then
            Redim Preserve a(n) As String
            a(n) = t
            n = n + 1
            t = ""
            InWord = False
         endif
      endif
      i = i + 1
   loop
   if InWord then
      Redim Preserve a(n) As String
      a(n) = t
   endif
   
   StrSplit = a
End Function
Open Office 4.1.2 - Windows 10
eddyparkinson
 
Posts: 10
Joined: Thu Mar 09, 2017 5:08 am

Re: [Calc] Split string into an array of single words

Postby RPG » Wed Jun 14, 2017 10:47 am

Hello

It seems to me that in the last post is not understand that B Marcelly and Charlie Young did play a little with some commands. I think it is also good to make a real short example of how to solve the problem.

In starBASIC there are two commands JOIN and SPLIT who are real powerful as also B Marcelly pointed out. Most of the time those two commands are used with a single character to join or split values. But it is also possible to use more characters then one character, B Marcelly did use it in his example in the split part.

I do prefer the command replace. In the time when this thread started the replace command was not in all version of OOo. It is now in all new version of OpenOffice. I am not sure if it works in all cases but I have test it as much I can. It is also only an example to learn to use REPLACE and SPLIT.

It is me not clear what people want do with the StrSplit function. I want only place here how to use replace and split. When you knew the function replace, split, join then it is more easy to make short and easy code.
Code: Select all   Expand viewCollapse view
Function StrSplit(SplitThis, SplitChar As String)
dim tempory  ' we define a tempory variable
' Be aware that we use the variabele first for the string value.
' We set it as an empty string
tempory="" ' Make sure we have an empty string.
if typename(SplitThis)="String" then
' The parameter is a string in this case we need a string   
   tempory=replace(SplitThis,"  ", " ") ' erase first all double spaces
' Be aware that that tempory is change from a single variabele to an array variabele
   tempory =SPLIT(tempory,SplitChar) ' Split string to array
end if
StrSplit=tempory ' Give back to the function an empty string or an array as the original function was
End Function


Romke
LibreOffice 6.2.5.2 on openSUSE Leap 15
RPG
Volunteer
 
Posts: 2176
Joined: Tue Apr 14, 2009 7:15 pm
Location: Netherlands

Re: [Calc] Split string into an array of single words

Postby jarkky » Tue Nov 12, 2019 9:52 am

At the end (after 9 years) the below is the latest solution:

Code: Select all   Expand viewCollapse view
Sub Test_string
  Print parse_string("a b c")(2)
  Print parse_string("")(0)
End Sub

Function parse_string(stri as String) as Array

  lst=Len(stri)
  Dim ssv(30) as String
  Dim ssvr(0) as String

  i=0 : n=0
  Do While i<lst
    Do While i<lst
      f=0
      If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
        f=1
        i=i+1
      End If
      If f=0 Then
        Exit Do
      End If
    Loop
    If i=lst Then
      Exit Do
    End If
    If Mid(stri,i+1,1)="'" Then
      f=0 : i=i+1
      ssv(n)=""
      Do While i<lst
        If Mid(stri,i+1,1)="'" Then
          f=1 : i=i+1
          Exit Do
        Else
          ssv(n)=ssv(n)+Mid(stri,i+1,1)
        End If
        i=i+1
      Loop
      If f=0 Then
        Print "String has no closing quotation!"
        parse_string="@closing_quotation_missing"
        Exit Function
      End If
      If i<lst Then
        f=0
        If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          Print "After closing quotation at column "+Str(i)+" string separator is required!"
          parse_string="@string_sepator_missing"
          Exit Function
        End If
        i=i+1
      End If
      n=n+1
    Else
      ssv(n)=Mid(stri,i+1,1)
      i=i+1
      Do While i<lst
        f=0
        If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          If Mid(stri,i+1,1)="'" Then
            Print "Quotation character at column "+Str(i)+" not allowed in middle of string!"
            parse_string="@wrong_quotation_type"
            Exit Function
          End If
          ssv(n)=ssv(n)+Mid(stri,i+1,1)
        End If
        i=i+1
        If f=1 Then
          Exit Do
        End If
      Loop
      n=n+1
    End If
  Loop
  If (n>0) Then
    Redim ssvr(n-1) as String
  End If
  For j=0 To n-1
    ssvr(j)=ssv(j)
  Next j
  parse_string=ssvr
 
End Function




It looks like the "parse_string" command as this is missing in the openoffice macro commands.
Last edited by RusselB on Wed Nov 13, 2019 12:08 am, edited 1 time in total.
Reason: Code tags added
OpenOffice 3.1 on CentOS Linux release 8.0.1905 (Core)
jarkky
 
Posts: 7
Joined: Sat Oct 12, 2019 5:18 pm

Re: [Calc] Split string into an array of single words

Postby jarkky » Tue Nov 12, 2019 10:26 am

Below is possibly the right method.
To test if string is empty it looks like requiring using ubound&lbound.
Not sure if there is any test command for empty arrays, such as "If v=[] Then ... End If"

Code: Select all   Expand viewCollapse view
Sub Test_string
  Print parse_string("a b c")(2)
  ev=parse_string(" ")
  Print lbound(ev),":",ubound(ev)
End Sub

Function parse_string(stri as String) as Array

  lst=Len(stri)
  Dim ssv(30) as String
  Dim ssvr() as String

  i=0 : n=0
  Do While i<lst
    Do While i<lst
      f=0
      If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
        f=1
        i=i+1
      End If
      If f=0 Then
        Exit Do
      End If
    Loop
    If i=lst Then
      Exit Do
    End If
    If Mid(stri,i+1,1)="'" Then
      f=0 : i=i+1
      ssv(n)=""
      Do While i<lst
        If Mid(stri,i+1,1)="'" Then
          f=1 : i=i+1
          Exit Do
        Else
          ssv(n)=ssv(n)+Mid(stri,i+1,1)
        End If
        i=i+1
      Loop
      If f=0 Then
        Print "String has no closing quotation!"
        parse_string="@closing_quotation_missing"
        Exit Function
      End If
      If i<lst Then
        f=0
        If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          Print "After closing quotation at column "+Str(i)+" string separator is required!"
          parse_string="@string_sepator_missing"
          Exit Function
        End If
        i=i+1
      End If
      n=n+1
    Else
      ssv(n)=Mid(stri,i+1,1)
      i=i+1
      Do While i<lst
        f=0
        If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          If Mid(stri,i+1,1)="'" Then
            Print "Quotation character at column "+Str(i)+" not allowed in middle of string!"
            parse_string="@wrong_quotation_type"
            Exit Function
          End If
          ssv(n)=ssv(n)+Mid(stri,i+1,1)
        End If
        i=i+1
        If f=1 Then
          Exit Do
        End If
      Loop
      n=n+1
    End If
  Loop
  If (n>0) Then
    Redim ssvr(n-1) as String
  End If
  For j=0 To n-1
    ssvr(j)=ssv(j)
  Next j
  parse_string=ssvr
 
End Function
Last edited by RusselB on Wed Nov 13, 2019 12:09 am, edited 1 time in total.
Reason: Code tags added
OpenOffice 3.1 on CentOS Linux release 8.0.1905 (Core)
jarkky
 
Posts: 7
Joined: Sat Oct 12, 2019 5:18 pm

Re: [Calc] Split string into an array of single words

Postby jarkky » Tue Nov 12, 2019 10:54 am

Using "Redim preserve" at the last part will then help avoid to copy the vector contents into another vector:

Code: Select all   Expand viewCollapse view
Sub Test_string
  Print parse_string("a b c")(2)
  ev=parse_string(" ")
  Print lbound(ev),":",ubound(ev)
End Sub

Function parse_string(stri as String) as Array

  lst=Len(stri)
  Dim ssv(30) as String
  Dim ssvr() as String

  i=0 : n=0
  Do While i<lst
    Do While i<lst
      f=0
      If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
        f=1
        i=i+1
      End If
      If f=0 Then
        Exit Do
      End If
    Loop
    If i=lst Then
      Exit Do
    End If
    If Mid(stri,i+1,1)="'" Then
      f=0 : i=i+1
      ssv(n)=""
      Do While i<lst
        If Mid(stri,i+1,1)="'" Then
          f=1 : i=i+1
          Exit Do
        Else
          ssv(n)=ssv(n)+Mid(stri,i+1,1)
        End If
        i=i+1
      Loop
      If f=0 Then
        Print "String has no closing quotation!"
        parse_string="@closing_quotation_missing"
        Exit Function
      End If
      If i<lst Then
        f=0
        If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          Print "After closing quotation at column "+Str(i)+" string separator is required!"
          parse_string="@string_sepator_missing"
          Exit Function
        End If
        i=i+1
      End If
      n=n+1
    Else
      ssv(n)=Mid(stri,i+1,1)
      i=i+1
      Do While i<lst
        f=0
        If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          If Mid(stri,i+1,1)="'" Then
            Print "Quotation character at column "+Str(i)+" not allowed in middle of string!"
            parse_string="@wrong_quotation_type"
            Exit Function
          End If
          ssv(n)=ssv(n)+Mid(stri,i+1,1)
        End If
        i=i+1
        If f=1 Then
          Exit Do
        End If
      Loop
      n=n+1
    End If
  Loop
  If (n>0) Then
    Redim preserve ssv(n-1) as String
    parse_string=ssv
  Else
    parse_string=ssvr
  End If
 
End Function
Last edited by RusselB on Wed Nov 13, 2019 12:09 am, edited 1 time in total.
Reason: Code tags added
OpenOffice 3.1 on CentOS Linux release 8.0.1905 (Core)
jarkky
 
Posts: 7
Joined: Sat Oct 12, 2019 5:18 pm

Re: [Calc] Split string into an array of single words

Postby jarkky » Tue Nov 12, 2019 12:25 pm

I have tried to check more and thinking if function error return should occur with global variables (to return full error descriptions).
This case if openoffice had global global variable for error return for each command I could use those same variable here. But this may require more checking...
Also if the Mid()-function index starts at index=1 rather than at index=0 can be problem as well as that redim can not make empty strings.


Code: Select all   Expand viewCollapse view
Sub Test_string
  Print parse_string("a b c")(2)
  Print erf
  ev=parse_string(" ")
  Print lbound(ev),":",ubound(ev)
  Print erf
  ev=parse_string("a b 'c")
  Print erf
  If erf=True Then
    Print ers
  End If
End Sub

REM Function error return is defined at the global variable level.
REM This could be possible method for all the functions.
REM erf : Flag TRUE/FALSE (error occur or not)
REM ers : Error name as string
REM erm : Error message as string
Global erf as Boolean
Global ers as String
Global erm as String

Function parse_string(stri as String) as Array

  lst=Len(stri)
  Dim ssv(30) as String
  Dim ssvr() as String

  i=0 : n=0
  Do While i<lst
    Do While i<lst
      f=0
      REM The Mid()-command requires i+1 argument when i starts from zero. Can this be problem?
      If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
        f=1
        i=i+1
      End If
      If f=0 Then
        Exit Do
      End If
    Loop
    If i=lst Then
      Exit Do
    End If
    If Mid(stri,i+1,1)="'" Then
      f=0 : i=i+1
      ssv(n)=""
      Do While i<lst
        If Mid(stri,i+1,1)="'" Then
          f=1 : i=i+1
          Exit Do
        Else
          ssv(n)=ssv(n)+Mid(stri,i+1,1)
        End If
        i=i+1
      Loop
      If f=0 Then
        erm="String has no closing quotation!"
        ers="@closing_quotation_missing"
        erf=TRUE
        parse_string=ssvr
        Print erm
        Exit Function
      End If
      If i<lst Then
        f=0
        If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          erm="After closing quotation at column "+Str(i)+" string separator is required!"
          ers="@string_sepator_missing"
          erf=TRUE
          parse_string=ssvr
          Print erm
          Exit Function
        End If
        i=i+1
      End If
      n=n+1
    Else
      ssv(n)=Mid(stri,i+1,1)
      i=i+1
      Do While i<lst
        f=0
        If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          If Mid(stri,i+1,1)="'" Then
            erm="Quotation character at column "+Str(i)+" not allowed in middle of string!"
            ers="@wrong_quotation_placement"
            erf=TRUE
            parse_string=ssvr
            Print erm
            Exit Function
          End If
          ssv(n)=ssv(n)+Mid(stri,i+1,1)
        End If
        i=i+1
        If f=1 Then
          Exit Do
        End If
      Loop
      n=n+1
    End If
  Loop
  REM redim does not seem to operate with negative value (to make empty string)!
  If (n>-1) Then
    Redim preserve ssv(n-1) as String
    parse_string=ssv
  Else
    parse_string=ssvr
  End If
  erm=""
  ers=""
  erf=FALSE
End Function
Last edited by RusselB on Wed Nov 13, 2019 12:10 am, edited 1 time in total.
Reason: Code tags added
OpenOffice 3.1 on CentOS Linux release 8.0.1905 (Core)
jarkky
 
Posts: 7
Joined: Sat Oct 12, 2019 5:18 pm

Re: [Calc] Split string into an array of single words

Postby jarkky » Tue Nov 12, 2019 1:30 pm

The error handling in the user functions would be probably using different variables that as "erl" & "err" by openoffice own functions:

Code: Select all   Expand viewCollapse view
Sub Test_String
  Print parse_string("a b c")(2)
  Print erf
  ev=parse_string(" ")
  Print lbound(ev),":",ubound(ev)
  Print erf
  ev=parse_string("a b 'c")
  Print erf
  If erf=True Then
    Print ers
  End If
End Sub

REM Function error return is defined at the global variable level.
REM This could be possible method for all the functions.
REM The user function can not stop into the error automatically or detect error line number.
REM Errors must be checked from the erf-flag. erf is operated by all commands, while
REM erc, ers and erm are filled in the function only in case erf=TRUE
REM If erc, ers and erm would be arrays, then all the errors in the program could be logged
REM into memory or a file for later inspection. Also I can not fill the "erl" and "err"
REM variables for user defined functions. That is if I have my own error message outside
REM the openoffice errors the program will not halt for that case.
REM Error handling should be available for the user defined functions also...
REM erf : Flag TRUE/FALSE (error occurred in last command or not)
REM erc : Command that produced the error
REM ers : Error name as string
REM erm : Error message as string
Global erf as Boolean
Global erc as String
Global ers as String
Global erm as String
REM The above parameters could be initialized globally, but how?

Function parse_string(stri as String) as Array

  lst=Len(stri)
  Dim ssv(30) as String
  Dim ssvr() as String

  i=1 : n=0
  Do While i<=lst
    Do While i<=lst
      f=0
      If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
        f=1
        i=i+1
      End If
      If f=0 Then
        Exit Do
      End If
    Loop
    If i>lst Then
      Exit Do
    End If
    If Mid(stri,i,1)="'" Then
      f=0 : i=i+1
      ssv(n)=""
      Do While i<=lst
        If Mid(stri,i,1)="'" Then
          f=1 : i=i+1
          Exit Do
        Else
          ssv(n)=ssv(n)+Mid(stri,i,1)
        End If
        i=i+1
      Loop
      If f=0 Then
        erc="parse_string()"
        erm="String has no closing quotation!"
        ers="@closing_quotation_missing"
        erf=TRUE
        parse_string=ssvr
        Print erm
        Exit Function
      End If
      If i<=lst Then
        f=0
        If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          erc="parse_string()"
          erm="After closing quotation at column "+Str(i)+" string separator is required!"
          ers="@string_sepator_missing"
          erf=TRUE
          parse_string=ssvr
          Print erm
          Exit Function
        End If
        i=i+1
      End If
      n=n+1
    Else
      ssv(n)=Mid(stri,i,1)
      i=i+1
      Do While i<=lst
        f=0
        If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
          f=1
        End If
        If f=0 Then
          If Mid(stri,i,1)="'" Then
            erc="parse_string()"
            erm="Quotation character at column "+Str(i)+" not allowed in middle of string!"
            ers="@wrong_quotation_placement"
            erf=TRUE
            parse_string=ssvr
            Print erm
            Exit Function
          End If
          ssv(n)=ssv(n)+Mid(stri,i,1)
        End If
        i=i+1
        If f=1 Then
          Exit Do
        End If
      Loop
      n=n+1
    End If
  Loop
  REM redim does not seem to operate with negative value (to make empty string)!
  If (n>0) Then
    Redim preserve ssv(n-1) as String
    parse_string=ssv
  Else
    parse_string=ssvr
  End If
  erf=FALSE
End Function
Last edited by RusselB on Wed Nov 13, 2019 12:12 am, edited 1 time in total.
Reason: Code tags added
OpenOffice 3.1 on CentOS Linux release 8.0.1905 (Core)
jarkky
 
Posts: 7
Joined: Sat Oct 12, 2019 5:18 pm

Re: [Calc] Split string into an array of single words

Postby jarkky » Tue Nov 12, 2019 3:38 pm

It looks like the "split string" could work as below:


Code: Select all   Expand viewCollapse view
Sub Test_String
  Print parse_string("a b c")(2)
  Print erf
  ev=parse_string(" ")
  Print lbound(ev),":",ubound(ev)
  Print erf
  ev=parse_string("a b 'c")
  Print erf
  If erf=True Then
    Print ers
  End If
End Sub

REM Function error return is defined at the global variable level.
REM This could be possible method for all the functions.
REM The user function can not stop into the error automatically or detect error line number.
REM Errors must be checked from the erf-flag. erf is operated by all commands, while
REM erc, ers and erm are filled in the function only in case erf=TRUE
REM If erc, ers and erm would be arrays, then all the errors in the program could be logged
REM into memory or a file for later inspection. Also I can not fill the "erl" and "err"
REM variables for user defined functions. That is if I have my own error message outside
REM the openoffice errors the program will not halt for that case.
REM Error handling should be available for the user defined functions also...
REM erf : Flag TRUE/FALSE (error occurred in last command or not)
REM erc : Command that produced the error
REM ers : Error name as string
REM erm : Error message as string
Global erf as Boolean
Global erc as String
Global ers as String
Global erm as String
REM The above parameters could be initialized globally, but how?

Function parse_string(stri as String) as Array

  lst=Len(stri)
  Dim ssv(30) as String
  Dim ssvr() as String

  i=1 : n=0
  Do While i<=lst
    Do While (Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9))
      i=i+1
      If i>lst Then
        Exit Do
      End If
    Loop
    If i>lst Then
      Exit Do
    End If
    If Mid(stri,i,1)="'" Then
      f=0 : i=i+1
      ssv(n)=""
      Do While i<=lst
        If Mid(stri,i,1)="'" Then
          f=1 : i=i+1
          Exit Do
        Else
          ssv(n)=ssv(n)+Mid(stri,i,1)
        End If
        i=i+1
      Loop
      If f=0 Then
        erc="parse_string()"
        erm="String has no closing quotation!"
        ers="@closing_quotation_missing"
        erf=TRUE
        parse_string=ssvr
        Print erm
        Exit Function
      End If
      If i<=lst Then
        If Not(Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9)) Then
          erc="parse_string()"
          erm="After closing quotation at column "+Str(i)+" string separator is required!"
          ers="@string_sepator_missing"
          erf=TRUE
          parse_string=ssvr
          Print erm
          Exit Function
        End If
        i=i+1
      End If
    Else
      ssv(n)=Mid(stri,i,1)
      i=i+1
      Do While i<=lst
        If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
          i=i+1
          Exit Do
        End If
        If Mid(stri,i,1)="'" Then
          erc="parse_string()"
          erm="Quotation character at column "+Str(i)+" not allowed in middle of string!"
          ers="@wrong_quotation_placement"
          erf=TRUE
          parse_string=ssvr
          Print erm
          Exit Function
        End If
        ssv(n)=ssv(n)+Mid(stri,i,1)
        i=i+1
      Loop
    End If
    n=n+1
  Loop
  REM redim does not seem to operate with negative value (to make empty string)!
  If (n>0) Then
    Redim preserve ssv(n-1) as String
    parse_string=ssv
  Else
    parse_string=ssvr
  End If
  erf=FALSE
End Function



Not sure of the function error handling and also it looks like the "Redim" can not make empty array...
Last edited by RusselB on Wed Nov 13, 2019 12:12 am, edited 1 time in total.
Reason: Code tags added
OpenOffice 3.1 on CentOS Linux release 8.0.1905 (Core)
jarkky
 
Posts: 7
Joined: Sat Oct 12, 2019 5:18 pm

Re: [Calc] Split string into an array of single words

Postby JeJe » Wed Nov 13, 2019 1:52 am

Getting rid of suplus spaces in the initial post can be done easily. Turning a string into a byte array *should* make things faster than using mid...

Code: Select all   Expand viewCollapse view


Sub test
   Dim p As String
   p1="aa bb  cc   dd"

   dim b() as byte, c as long,start as boolean
   b() = p1: c =-2

   for i = 0 to ubound(b) step 2
      if b(i) = 32 and b(i+1) = 0 then
         if start = true then
            c= c+2
            b(c) =b(i)
            b(c+1) =0
            start = false
         end if
      else
         c= c+2
         b(c) =b(i)
         b(c+1) =b(i+1)
         start = true
      end if
   next
   redim preserve b(c)
   p1= b

   a1=Split(p1," ")
   Print a1(0)
   Print a1(1)
   Print a1(2)
   Print a1(3)

End Sub


Openoffice 4.1.2
Windows 8
JeJe
Volunteer
 
Posts: 623
Joined: Wed Mar 09, 2016 2:40 pm

Re: [Calc] Split string into an array of single words

Postby jarkky » Wed Nov 13, 2019 12:40 pm

That looks possible, but I extended to remove the "tabular" or "\t" also:


Code: Select all   Expand viewCollapse view
Sub Test_String
  Print parse_string("""a b c"" ""\"""" c")(1)
  Print parse_string("a b c")(2)
  Print erf
  ev=parse_string(" ")
  Print lbound(ev),":",ubound(ev)
  Print erf
  ev=parse_string("a b 'c")
  Print erf
  If erf=True Then
    Print ers
  End If
End Sub

REM Function error return is defined at the global variable level.
REM This could be possible method for all the functions.
REM The user function can not stop into the error automatically or detect error line number.
REM Errors must be checked from the erf-flag. erf is operated by all commands, while
REM erc, ers and erm are filled in the function only in case erf=TRUE
REM If erc, ers and erm would be arrays, then all the errors in the program could be logged
REM into memory or a file for later inspection. Also I can not fill the "erl" and "err"
REM variables for user defined functions. That is if I have my own error message outside
REM the openoffice errors the program will not halt for that case.
REM Error handling should be available for the user defined functions also...
REM erf : Flag TRUE/FALSE (error occurred in last command or not)
REM erc : Command that produced the error
REM ers : Error name as string
REM erm : Error message as string
Global erf as Boolean
Global erc as String
Global ers as String
Global erm as String
REM The above parameters could be initialized globally, but how?

Function parse_string(stri as String) as Array
  If Not (TypeName(stri)="String") Then
    erc="parse_string()"
    erm="Input argument can be only string type!"
    ers="@wrong_argument_type"
    erf=TRUE
    parse_string=""
    Print erm
    Exit Function   
  End If
  lst=Len(stri)
  Dim ssv(30) as String
  Dim ssvr() as String
  i=1 : n=0
  Do While i<=lst
    Do While (Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9))
      i=i+1
      If i>lst Then
        Exit Do
      End If
    Loop
    If i>lst Then
      Exit Do
    End If
    If Mid(stri,i,1)="""" Then
      f=0 : i=i+1
      ssv(n)=""
      Do While i<=lst
        If Mid(stri,i,1)="""" Then
          If Mid(stri,i-1,1)="\" Then
            ssv(n)=Left(ssv(n),Len(ssv(n))-1)
          Else
            f=1 : i=i+1
            Exit Do
          End If
        End If
        ssv(n)=ssv(n)+Mid(stri,i,1)
        i=i+1
      Loop
      If f=0 Then
        erc="parse_string()"
        erm="String has no closing quotation!"
        ers="@closing_quotation_missing"
        erf=TRUE
        parse_string=ssvr
        Print erm
        Exit Function
      End If
      If i<=lst Then
        If Not(Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9)) Then
          erc="parse_string()"
          erm="After closing quotation at column "+Str(i)+" string separator is required!"
          ers="@string_sepator_missing"
          erf=TRUE
          parse_string=ssvr
          Print erm
          Exit Function
        End If
        i=i+1
      End If
    Else
      ssv(n)=Mid(stri,i,1)
      i=i+1
      Do While i<=lst
        If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
          i=i+1
          Exit Do
        End If
        If Mid(stri,i,1)="""" Then
          If Mid(stri,i-1,1)="\" Then
            ssv(n)=Left(ssv(n),Len(ssv(n))-1)
          Else
            erc="parse_string()"
            erm="Quotation character at column "+Str(i)+" not allowed in middle of string!"
            ers="@wrong_quotation_placement"
            erf=TRUE
            parse_string=ssvr
            Print erm
            Exit Function
          End If
        End If
        ssv(n)=ssv(n)+Mid(stri,i,1)
        i=i+1
      Loop
    End If
    n=n+1
  Loop
  REM redim does not seem to operate with negative value (to make empty string)!
  If (n>0) Then
    Redim preserve ssv(n-1) as String
    parse_string=ssv
  Else
    parse_string=ssvr
  End If
  erf=FALSE
End Function



Here also even input arguments are defined as "String" -type the User should check that they come as "string"-type into the function...

 Edit: Please use the CODE tags around program code for readability and easy copying 
OpenOffice 3.1 on CentOS Linux release 8.0.1905 (Core)
jarkky
 
Posts: 7
Joined: Sat Oct 12, 2019 5:18 pm

Re: [Calc] Split string into an array of single words

Postby KaduLeite » Tue Dec 03, 2019 7:03 pm

jari0471 wrote:Split string into an array of single words (similar to separation of "arguments" in shell command line)

Code: Select all   Expand viewCollapse view
Sub test
  Dim p As String
  p1="aa bb  cc   dd"
  a1=Split(p1," ")
  Print a1(0)
  Print a1(1)
  Print a1(2)
  Print a1(3)
End Sub

Displayed is "aa", "bb", " " and "cc".
Displayed should be "aa", "bb", "cc" and "dd".

There is 2 spaces between "bb" and "cc" and 3 spaces between "cc" and "dd" in p1.
Is there another command or different option for this command that can do separation of words (arguments)?
If a large file is read to calc sheet, it will take a long time if the splitting function is slow. Thus fast (and simple) operation is requested.



As an exercise, I think this solve the problem with a smaller and simpler code.
To ignore other elements, like points or commas, just add them to the test with AND.

Code: Select all   Expand viewCollapse view
Function SpaceSplit(p1 as String) as Array
  Dim i as Integer, a1(), a2()
  a1=Split(Trim(p1)," ")
  For i = 0 to Ubound(a1) 'For each element in a1
    If a1(i)<>"" Then 'If element is different to ""
       Redim Preserve a2(UBound(a2)+1) 'Increases the size of a2 in 1 element
       a2(UBound(a2))=a1(i) 'Add the non "" element to a2
    End If
  Next
  SpaceSplit = a2 ' Function returns a2, an array with each word
End Function


sub test
   dim s1()
   s1 = spacesplit("aa      bb   cc    dd    ee")
end sub


Kadu Leite
LibreOffice 5.2 on Windows 10 / LibreOffice 6.2 on Ubuntu 19.10
KaduLeite
 
Posts: 5
Joined: Fri Apr 06, 2018 8:35 pm

Re: [Calc] Split string into an array of single words

Postby JeJe » Wed Dec 04, 2019 12:43 pm

Redim Preserve is an expensive operation to have in a loop... that doesn't matter with just a few items of course...
Openoffice 4.1.2
Windows 8
JeJe
Volunteer
 
Posts: 623
Joined: Wed Mar 09, 2016 2:40 pm

Re: [Calc] Split string into an array of single words

Postby KaduLeite » Thu Dec 05, 2019 7:55 pm

JeJe wrote:Redim Preserve is an expensive operation to have in a loop... that doesn't matter with just a few items of course...

Je Je,

You're absolutely right.
What about that?

Code: Select all   Expand viewCollapse view
Function SpaceSplit(p1 as String) as Array
  Dim i as Integer, j as Integer, a1()
  a1=Split(Trim(p1)," ")
  Dim a2(Ubound(a1))
  For i = 0 to Ubound(a1)
    If a1(i)<>"" Then
       a2(j)=a1(i)
       j = j+1
    End If
  Next
  Redim Preserve a2(j-1)
  SpaceSplit = a2
End Function


It uses the same logic, but writes to an array with the same size as a1 and just when a2 is OK, it redim preserve with the correct size.

Kadu Leite
LibreOffice 5.2 on Windows 10 / LibreOffice 6.2 on Ubuntu 19.10
KaduLeite
 
Posts: 5
Joined: Fri Apr 06, 2018 8:35 pm


Return to Macros and UNO API

Who is online

Users browsing this forum: No registered users and 5 guests