Page 1 of 1
[Calc] Split string into an array of single words
Posted: Fri Aug 13, 2010 4:46 am
by jari0471
Split string into an array of single words (similar to separation of "arguments" in shell command line)
Code: Select all
Sub test
Dim p As String
p1="aa bb cc dd"
a1=Split(p1," ")
Print a1(0)
Print a1(1)
Print a1(2)
Print a1(3)
End Sub
Displayed is "aa", "bb", " " and "cc".
Displayed should be "aa", "bb", "cc" and "dd".
There is 2 spaces between "bb" and "cc" and 3 spaces between "cc" and "dd" in p1.
Is there another command or different option for this command that can do separation of words (arguments)?
If a large file is read to calc sheet, it will take a long time if the splitting function is slow. Thus fast (and simple) operation is requested.
Re: Split string into an array of single words
Posted: Fri Aug 13, 2010 5:39 am
by Charlie Young
I get aa bb cc dd running the code copied exactly from above, but where are the strings coming from? If they are in, or could be gotten into spreadsheet cells, you could use Data > Text to Columns.
Re: Split string into an array of single words
Posted: Fri Aug 13, 2010 6:55 am
by jari0471
The forum seemed to remove additional "spaces" between the quotes.
Re: Split string into an array of single words
Posted: Fri Aug 13, 2010 6:59 am
by jari0471
There is 2 spaces between "bb" and "cc" and 3 spaces between "cc" and "dd" in p1 (the forum posting seems to remove the additional spaces automatically).
Re: Split string into an array of single words
Posted: Fri Aug 13, 2010 7:15 am
by RoryOF
jari0471 wrote:There is 2 spaces between "bb" and "cc" and 3 spaces between "cc" and "dd" in p1 (the forum posting seems to remove the additional spaces automatically).
In that case, and in all cases where the layout/spacing is important, you should post using the "Code" option in the Full Editor window. This option is not available in the Quick Reply window - you have to switch to Full Editor.
Re: Split string into an array of single words
Posted: Fri Aug 13, 2010 7:27 am
by FJCC
Here is a not-very-elegant solution for the example given
Code: Select all
Sub test
p1="aa bb cc dd"
a1=Split(p1," ")
count = 0
Dim a2(ubound(a1))
For Each Word in a1
If Word <> "" then
a2(count) = Word
count = count+1
end if
next Word
Redim Preserve a2(count)
Print a2(0)
Print a2(1)
Print a2(2)
Print a2(3)
End Sub
Re: Split string into an array of single words
Posted: Fri Aug 13, 2010 8:20 am
by jari0471
The solution you give can count the number of "words", but can be come "painfully" slow do to the for-loop, when there is 1000's of strings to work with... (Thus I try find internal function, java -code or any other solution that gives very fast speed also).
Re: Split string into an array of single words
Posted: Fri Aug 13, 2010 9:08 am
by RoryOF
If you take your strings into a word processor, such as Writer, its Find and Replace can quickly simplify multiple spaces to single spaces, then change them to new lines or other delimiters, so that the file can be reopened in Calc using the CSV import mechanism.
Re: Split string into an array of single words
Posted: Fri Aug 13, 2010 9:47 am
by jari0471
The idea using writer is possible. However the text is read from a file "iFile" with Line Input (that gives single string for a line).
The original file(s) should not be modified, since they are data files.
Code: Select all
iFile = Freefile
Open sPath+"/"+sFile For Input As iFile
While (not eof(iFile))
Line Input #iFile,sValue
sValue=Trim(sValue)
If sValue<>"" Then
Maybe there is another way reading the files that can do the splitting the words.
Some of the lines (documentation, etc. other data) in the original file are not intended to be used, thus the read-routine is complicated.
Re: Split string into an array of single words
Posted: Fri Aug 13, 2010 10:06 am
by Hagar Delest
jari0471 wrote:The forum seemed to remove additional "spaces" between the quotes.
Use BBCode (I've edited your first post).
Re: [Calc] Split string into an array of single words
Posted: Fri Aug 13, 2010 11:45 am
by RoryOF
The idea using writer is possible. However the text is read from a file "iFile" with Line Input (that gives single string for a line).
The original file(s) should not be modified, since they are data files
In that case, you should be working on copies of the files, not on the originals. The Writer path then seems most practical.
Re: [Calc] Split string into an array of single words
Posted: Fri Aug 13, 2010 12:45 pm
by RPG
Hello
Maybe you can use the function
replace. It is not document but I believe it is normal function. I'm not sure if it is only in the go-oo version or in all version of OOo.
As far I knew it can have six parameters
a) the string
b) The new part
c) the old part
d) where to start
e) How often to change a part
f) Case sensitive
Code: Select all
Sub test
Dim p As String
p1="aa bb cc dd"
p1=replace(p1," "," ")
a1=Split(p1," ")
msgbox a1(0) & chr(13) & _
a1(1) & chr(13) & _
a1(2) & chr(13) & _
a1(3)
End Sub
Romke
Re: [Calc] Split string into an array of single words
Posted: Fri Aug 13, 2010 2:37 pm
by B Marcelly
Hi,
I took this as a recreation challenge
Here is a general solution, it works with words separated by any length of space.
It tries to replace a long series of spaces by one space, then tries with a shorter series, etc.
See how powerful is split/join
Code: Select all
Dim phrase As String
phrase = "hello OOoBasic programmers, this is a solution"
Dim nbSpaces As Long, v As Variant
nbSpaces = 5 ' initial value, greater or equal to 2
Do While nbSpaces > 1
v = split(phrase, space(nbSpaces))
if UBound(v()) = 0 then
nbSpaces = nbSpaces -1
else
phrase = join(v, " ")
end if
Loop
v = split(phrase, " ")
' *** display of vector v() ***
MsgBox(join(v, chr(13)), 0, "List of words")
The initial value of nbSpaces can optimize the number of loops, if you know what length of spaces you may usually have. But even a start value of 2 can do the job.
Re: [Calc] Split string into an array of single words
Posted: Fri Aug 13, 2010 8:10 pm
by Charlie Young
OK, here's my attempt, which doesn't use the Basic Split function. I, um, borrowed Bernard's trick for displaying the results.
Code: Select all
Sub TestStrSplit
Dim a
Dim s As String
Dim i As Integer
s = "hello OOoBasic programmers, this is another solution"
a = StrSplit(s," ")
MsgBox(join(a,chr(13))
End Sub
Function StrSplit(s As String, SplitChar As String)
Dim i As Integer
Dim n As Integer
Dim m As String
Dim t As String
Dim a(0) As String
Dim InWord As Boolean
t =""
InWord = False
i = 1
n = 0
do while i <= len(s)
m = mid(s,i,1)
if m <> SplitChar then
InWord = True
t = t & m
else
if InWord then
Redim Preserve a(n) As String
a(n) = t
n = n + 1
t = ""
InWord = False
endif
endif
i = i + 1
loop
if InWord then
Redim Preserve a(n) As String
a(n) = t
endif
StrSplit = a
End Function
Re: [Calc] Split string into an array of single words
Posted: Sat Aug 14, 2010 6:07 am
by jari0471
Since the original data is in text file (and used line input to place it to a string line by line),
it could be also possible to read the original text file word by word and no split is required at all.
But I could not find how to do this either.
Re: [Calc] Split string into an array of single words
Posted: Tue May 30, 2017 8:46 am
by eddyparkinson
Bug fix - handle blank cells
Code: Select all
REM ***** BASIC *****
Sub TestStrSplit_clone
Dim a
Dim s As String
Dim i As Integer
s = "hello OOoBasic programmers, this is another solution"
a = StrSplit(s," ")
MsgBox(join(a,chr(13))
a = StrSplit(Nothing," ")
MsgBox(join(a,chr(13))
End Sub
Function StrSpliter(s, SplitChar As String)
StrSpliter = "a"
End Function
Function StrSplit(SplitThis, SplitChar As String)
Dim s As String
if IsNull(SplitThis) or IsEmpty(SplitThis) then s = "" else s = SplitThis
Dim i As Integer
Dim n As Integer
Dim m As String
Dim t As String
Dim a(0) As String
Dim InWord As Boolean
t =""
InWord = False
i = 1
n = 0
do while i <= len(s)
m = mid(s,i,1)
if m <> SplitChar then
InWord = True
t = t & m
else
if InWord then
Redim Preserve a(n) As String
a(n) = t
n = n + 1
t = ""
InWord = False
endif
endif
i = i + 1
loop
if InWord then
Redim Preserve a(n) As String
a(n) = t
endif
StrSplit = a
End Function
Re: [Calc] Split string into an array of single words
Posted: Wed Jun 14, 2017 10:47 am
by RPG
Hello
It seems to me that in the last post is not understand that B Marcelly and Charlie Young did play a little with some commands. I think it is also good to make a real short example of how to solve the problem.
In starBASIC there are two commands JOIN and SPLIT who are real powerful as also B Marcelly pointed out. Most of the time those two commands are used with a single character to join or split values. But it is also possible to use more characters then one character, B Marcelly did use it in his example in the split part.
I do prefer the command replace. In the time when this thread started the replace command was not in all version of OOo. It is now in all new version of OpenOffice. I am not sure if it works in all cases but I have test it as much I can. It is also only an example to learn to use REPLACE and SPLIT.
It is me not clear what people want do with the StrSplit function. I want only place here how to use replace and split. When you knew the function replace, split, join then it is more easy to make short and easy code.
Code: Select all
Function StrSplit(SplitThis, SplitChar As String)
dim tempory ' we define a tempory variable
' Be aware that we use the variabele first for the string value.
' We set it as an empty string
tempory="" ' Make sure we have an empty string.
if typename(SplitThis)="String" then
' The parameter is a string in this case we need a string
tempory=replace(SplitThis," ", " ") ' erase first all double spaces
' Be aware that that tempory is change from a single variabele to an array variabele
tempory =SPLIT(tempory,SplitChar) ' Split string to array
end if
StrSplit=tempory ' Give back to the function an empty string or an array as the original function was
End Function
Romke
Re: [Calc] Split string into an array of single words
Posted: Tue Nov 12, 2019 9:52 am
by jarkky
At the end (after 9 years) the below is the latest solution:
Code: Select all
Sub Test_string
Print parse_string("a b c")(2)
Print parse_string("")(0)
End Sub
Function parse_string(stri as String) as Array
lst=Len(stri)
Dim ssv(30) as String
Dim ssvr(0) as String
i=0 : n=0
Do While i<lst
Do While i<lst
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
i=i+1
End If
If f=0 Then
Exit Do
End If
Loop
If i=lst Then
Exit Do
End If
If Mid(stri,i+1,1)="'" Then
f=0 : i=i+1
ssv(n)=""
Do While i<lst
If Mid(stri,i+1,1)="'" Then
f=1 : i=i+1
Exit Do
Else
ssv(n)=ssv(n)+Mid(stri,i+1,1)
End If
i=i+1
Loop
If f=0 Then
Print "String has no closing quotation!"
parse_string="@closing_quotation_missing"
Exit Function
End If
If i<lst Then
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
End If
If f=0 Then
Print "After closing quotation at column "+Str(i)+" string separator is required!"
parse_string="@string_sepator_missing"
Exit Function
End If
i=i+1
End If
n=n+1
Else
ssv(n)=Mid(stri,i+1,1)
i=i+1
Do While i<lst
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
End If
If f=0 Then
If Mid(stri,i+1,1)="'" Then
Print "Quotation character at column "+Str(i)+" not allowed in middle of string!"
parse_string="@wrong_quotation_type"
Exit Function
End If
ssv(n)=ssv(n)+Mid(stri,i+1,1)
End If
i=i+1
If f=1 Then
Exit Do
End If
Loop
n=n+1
End If
Loop
If (n>0) Then
Redim ssvr(n-1) as String
End If
For j=0 To n-1
ssvr(j)=ssv(j)
Next j
parse_string=ssvr
End Function
It looks like the "parse_string" command as this is missing in the openoffice macro commands.
Re: [Calc] Split string into an array of single words
Posted: Tue Nov 12, 2019 10:26 am
by jarkky
Below is possibly the right method.
To test if string is empty it looks like requiring using ubound&lbound.
Not sure if there is any test command for empty arrays, such as "If v=[] Then ... End If"
Code: Select all
Sub Test_string
Print parse_string("a b c")(2)
ev=parse_string(" ")
Print lbound(ev),":",ubound(ev)
End Sub
Function parse_string(stri as String) as Array
lst=Len(stri)
Dim ssv(30) as String
Dim ssvr() as String
i=0 : n=0
Do While i<lst
Do While i<lst
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
i=i+1
End If
If f=0 Then
Exit Do
End If
Loop
If i=lst Then
Exit Do
End If
If Mid(stri,i+1,1)="'" Then
f=0 : i=i+1
ssv(n)=""
Do While i<lst
If Mid(stri,i+1,1)="'" Then
f=1 : i=i+1
Exit Do
Else
ssv(n)=ssv(n)+Mid(stri,i+1,1)
End If
i=i+1
Loop
If f=0 Then
Print "String has no closing quotation!"
parse_string="@closing_quotation_missing"
Exit Function
End If
If i<lst Then
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
End If
If f=0 Then
Print "After closing quotation at column "+Str(i)+" string separator is required!"
parse_string="@string_sepator_missing"
Exit Function
End If
i=i+1
End If
n=n+1
Else
ssv(n)=Mid(stri,i+1,1)
i=i+1
Do While i<lst
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
End If
If f=0 Then
If Mid(stri,i+1,1)="'" Then
Print "Quotation character at column "+Str(i)+" not allowed in middle of string!"
parse_string="@wrong_quotation_type"
Exit Function
End If
ssv(n)=ssv(n)+Mid(stri,i+1,1)
End If
i=i+1
If f=1 Then
Exit Do
End If
Loop
n=n+1
End If
Loop
If (n>0) Then
Redim ssvr(n-1) as String
End If
For j=0 To n-1
ssvr(j)=ssv(j)
Next j
parse_string=ssvr
End Function
Re: [Calc] Split string into an array of single words
Posted: Tue Nov 12, 2019 10:54 am
by jarkky
Using "Redim preserve" at the last part will then help avoid to copy the vector contents into another vector:
Code: Select all
Sub Test_string
Print parse_string("a b c")(2)
ev=parse_string(" ")
Print lbound(ev),":",ubound(ev)
End Sub
Function parse_string(stri as String) as Array
lst=Len(stri)
Dim ssv(30) as String
Dim ssvr() as String
i=0 : n=0
Do While i<lst
Do While i<lst
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
i=i+1
End If
If f=0 Then
Exit Do
End If
Loop
If i=lst Then
Exit Do
End If
If Mid(stri,i+1,1)="'" Then
f=0 : i=i+1
ssv(n)=""
Do While i<lst
If Mid(stri,i+1,1)="'" Then
f=1 : i=i+1
Exit Do
Else
ssv(n)=ssv(n)+Mid(stri,i+1,1)
End If
i=i+1
Loop
If f=0 Then
Print "String has no closing quotation!"
parse_string="@closing_quotation_missing"
Exit Function
End If
If i<lst Then
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
End If
If f=0 Then
Print "After closing quotation at column "+Str(i)+" string separator is required!"
parse_string="@string_sepator_missing"
Exit Function
End If
i=i+1
End If
n=n+1
Else
ssv(n)=Mid(stri,i+1,1)
i=i+1
Do While i<lst
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
End If
If f=0 Then
If Mid(stri,i+1,1)="'" Then
Print "Quotation character at column "+Str(i)+" not allowed in middle of string!"
parse_string="@wrong_quotation_type"
Exit Function
End If
ssv(n)=ssv(n)+Mid(stri,i+1,1)
End If
i=i+1
If f=1 Then
Exit Do
End If
Loop
n=n+1
End If
Loop
If (n>0) Then
Redim preserve ssv(n-1) as String
parse_string=ssv
Else
parse_string=ssvr
End If
End Function
Re: [Calc] Split string into an array of single words
Posted: Tue Nov 12, 2019 12:25 pm
by jarkky
I have tried to check more and thinking if function error return should occur with global variables (to return full error descriptions).
This case if openoffice had global global variable for error return for each command I could use those same variable here. But this may require more checking...
Also if the Mid()-function index starts at index=1 rather than at index=0 can be problem as well as that redim can not make empty strings.
Code: Select all
Sub Test_string
Print parse_string("a b c")(2)
Print erf
ev=parse_string(" ")
Print lbound(ev),":",ubound(ev)
Print erf
ev=parse_string("a b 'c")
Print erf
If erf=True Then
Print ers
End If
End Sub
REM Function error return is defined at the global variable level.
REM This could be possible method for all the functions.
REM erf : Flag TRUE/FALSE (error occur or not)
REM ers : Error name as string
REM erm : Error message as string
Global erf as Boolean
Global ers as String
Global erm as String
Function parse_string(stri as String) as Array
lst=Len(stri)
Dim ssv(30) as String
Dim ssvr() as String
i=0 : n=0
Do While i<lst
Do While i<lst
f=0
REM The Mid()-command requires i+1 argument when i starts from zero. Can this be problem?
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
i=i+1
End If
If f=0 Then
Exit Do
End If
Loop
If i=lst Then
Exit Do
End If
If Mid(stri,i+1,1)="'" Then
f=0 : i=i+1
ssv(n)=""
Do While i<lst
If Mid(stri,i+1,1)="'" Then
f=1 : i=i+1
Exit Do
Else
ssv(n)=ssv(n)+Mid(stri,i+1,1)
End If
i=i+1
Loop
If f=0 Then
erm="String has no closing quotation!"
ers="@closing_quotation_missing"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
If i<lst Then
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
End If
If f=0 Then
erm="After closing quotation at column "+Str(i)+" string separator is required!"
ers="@string_sepator_missing"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
i=i+1
End If
n=n+1
Else
ssv(n)=Mid(stri,i+1,1)
i=i+1
Do While i<lst
f=0
If Mid(stri,i+1,1)=" " Or Mid(stri,i+1,1)=Chr$(9) Then
f=1
End If
If f=0 Then
If Mid(stri,i+1,1)="'" Then
erm="Quotation character at column "+Str(i)+" not allowed in middle of string!"
ers="@wrong_quotation_placement"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
ssv(n)=ssv(n)+Mid(stri,i+1,1)
End If
i=i+1
If f=1 Then
Exit Do
End If
Loop
n=n+1
End If
Loop
REM redim does not seem to operate with negative value (to make empty string)!
If (n>-1) Then
Redim preserve ssv(n-1) as String
parse_string=ssv
Else
parse_string=ssvr
End If
erm=""
ers=""
erf=FALSE
End Function
Re: [Calc] Split string into an array of single words
Posted: Tue Nov 12, 2019 1:30 pm
by jarkky
The error handling in the user functions would be probably using different variables that as "erl" & "err" by openoffice own functions:
Code: Select all
Sub Test_String
Print parse_string("a b c")(2)
Print erf
ev=parse_string(" ")
Print lbound(ev),":",ubound(ev)
Print erf
ev=parse_string("a b 'c")
Print erf
If erf=True Then
Print ers
End If
End Sub
REM Function error return is defined at the global variable level.
REM This could be possible method for all the functions.
REM The user function can not stop into the error automatically or detect error line number.
REM Errors must be checked from the erf-flag. erf is operated by all commands, while
REM erc, ers and erm are filled in the function only in case erf=TRUE
REM If erc, ers and erm would be arrays, then all the errors in the program could be logged
REM into memory or a file for later inspection. Also I can not fill the "erl" and "err"
REM variables for user defined functions. That is if I have my own error message outside
REM the openoffice errors the program will not halt for that case.
REM Error handling should be available for the user defined functions also...
REM erf : Flag TRUE/FALSE (error occurred in last command or not)
REM erc : Command that produced the error
REM ers : Error name as string
REM erm : Error message as string
Global erf as Boolean
Global erc as String
Global ers as String
Global erm as String
REM The above parameters could be initialized globally, but how?
Function parse_string(stri as String) as Array
lst=Len(stri)
Dim ssv(30) as String
Dim ssvr() as String
i=1 : n=0
Do While i<=lst
Do While i<=lst
f=0
If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
f=1
i=i+1
End If
If f=0 Then
Exit Do
End If
Loop
If i>lst Then
Exit Do
End If
If Mid(stri,i,1)="'" Then
f=0 : i=i+1
ssv(n)=""
Do While i<=lst
If Mid(stri,i,1)="'" Then
f=1 : i=i+1
Exit Do
Else
ssv(n)=ssv(n)+Mid(stri,i,1)
End If
i=i+1
Loop
If f=0 Then
erc="parse_string()"
erm="String has no closing quotation!"
ers="@closing_quotation_missing"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
If i<=lst Then
f=0
If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
f=1
End If
If f=0 Then
erc="parse_string()"
erm="After closing quotation at column "+Str(i)+" string separator is required!"
ers="@string_sepator_missing"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
i=i+1
End If
n=n+1
Else
ssv(n)=Mid(stri,i,1)
i=i+1
Do While i<=lst
f=0
If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
f=1
End If
If f=0 Then
If Mid(stri,i,1)="'" Then
erc="parse_string()"
erm="Quotation character at column "+Str(i)+" not allowed in middle of string!"
ers="@wrong_quotation_placement"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
ssv(n)=ssv(n)+Mid(stri,i,1)
End If
i=i+1
If f=1 Then
Exit Do
End If
Loop
n=n+1
End If
Loop
REM redim does not seem to operate with negative value (to make empty string)!
If (n>0) Then
Redim preserve ssv(n-1) as String
parse_string=ssv
Else
parse_string=ssvr
End If
erf=FALSE
End Function
Re: [Calc] Split string into an array of single words
Posted: Tue Nov 12, 2019 3:38 pm
by jarkky
It looks like the "split string" could work as below:
Code: Select all
Sub Test_String
Print parse_string("a b c")(2)
Print erf
ev=parse_string(" ")
Print lbound(ev),":",ubound(ev)
Print erf
ev=parse_string("a b 'c")
Print erf
If erf=True Then
Print ers
End If
End Sub
REM Function error return is defined at the global variable level.
REM This could be possible method for all the functions.
REM The user function can not stop into the error automatically or detect error line number.
REM Errors must be checked from the erf-flag. erf is operated by all commands, while
REM erc, ers and erm are filled in the function only in case erf=TRUE
REM If erc, ers and erm would be arrays, then all the errors in the program could be logged
REM into memory or a file for later inspection. Also I can not fill the "erl" and "err"
REM variables for user defined functions. That is if I have my own error message outside
REM the openoffice errors the program will not halt for that case.
REM Error handling should be available for the user defined functions also...
REM erf : Flag TRUE/FALSE (error occurred in last command or not)
REM erc : Command that produced the error
REM ers : Error name as string
REM erm : Error message as string
Global erf as Boolean
Global erc as String
Global ers as String
Global erm as String
REM The above parameters could be initialized globally, but how?
Function parse_string(stri as String) as Array
lst=Len(stri)
Dim ssv(30) as String
Dim ssvr() as String
i=1 : n=0
Do While i<=lst
Do While (Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9))
i=i+1
If i>lst Then
Exit Do
End If
Loop
If i>lst Then
Exit Do
End If
If Mid(stri,i,1)="'" Then
f=0 : i=i+1
ssv(n)=""
Do While i<=lst
If Mid(stri,i,1)="'" Then
f=1 : i=i+1
Exit Do
Else
ssv(n)=ssv(n)+Mid(stri,i,1)
End If
i=i+1
Loop
If f=0 Then
erc="parse_string()"
erm="String has no closing quotation!"
ers="@closing_quotation_missing"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
If i<=lst Then
If Not(Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9)) Then
erc="parse_string()"
erm="After closing quotation at column "+Str(i)+" string separator is required!"
ers="@string_sepator_missing"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
i=i+1
End If
Else
ssv(n)=Mid(stri,i,1)
i=i+1
Do While i<=lst
If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
i=i+1
Exit Do
End If
If Mid(stri,i,1)="'" Then
erc="parse_string()"
erm="Quotation character at column "+Str(i)+" not allowed in middle of string!"
ers="@wrong_quotation_placement"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
ssv(n)=ssv(n)+Mid(stri,i,1)
i=i+1
Loop
End If
n=n+1
Loop
REM redim does not seem to operate with negative value (to make empty string)!
If (n>0) Then
Redim preserve ssv(n-1) as String
parse_string=ssv
Else
parse_string=ssvr
End If
erf=FALSE
End Function
Not sure of the function error handling and also it looks like the "Redim" can not make empty array...
Re: [Calc] Split string into an array of single words
Posted: Wed Nov 13, 2019 1:52 am
by JeJe
Getting rid of suplus spaces in the initial post can be done easily. Turning a string into a byte array *should* make things faster than using mid...
Code: Select all
Sub test
Dim p As String
p1="aa bb cc dd"
dim b() as byte, c as long,start as boolean
b() = p1: c =-2
for i = 0 to ubound(b) step 2
if b(i) = 32 and b(i+1) = 0 then
if start = true then
c= c+2
b(c) =b(i)
b(c+1) =0
start = false
end if
else
c= c+2
b(c) =b(i)
b(c+1) =b(i+1)
start = true
end if
next
redim preserve b(c)
p1= b
a1=Split(p1," ")
Print a1(0)
Print a1(1)
Print a1(2)
Print a1(3)
End Sub
Re: [Calc] Split string into an array of single words
Posted: Wed Nov 13, 2019 12:40 pm
by jarkky
That looks possible, but I extended to remove the "tabular" or "\t" also:
Code: Select all
Sub Test_String
Print parse_string("""a b c"" ""\"""" c")(1)
Print parse_string("a b c")(2)
Print erf
ev=parse_string(" ")
Print lbound(ev),":",ubound(ev)
Print erf
ev=parse_string("a b 'c")
Print erf
If erf=True Then
Print ers
End If
End Sub
REM Function error return is defined at the global variable level.
REM This could be possible method for all the functions.
REM The user function can not stop into the error automatically or detect error line number.
REM Errors must be checked from the erf-flag. erf is operated by all commands, while
REM erc, ers and erm are filled in the function only in case erf=TRUE
REM If erc, ers and erm would be arrays, then all the errors in the program could be logged
REM into memory or a file for later inspection. Also I can not fill the "erl" and "err"
REM variables for user defined functions. That is if I have my own error message outside
REM the openoffice errors the program will not halt for that case.
REM Error handling should be available for the user defined functions also...
REM erf : Flag TRUE/FALSE (error occurred in last command or not)
REM erc : Command that produced the error
REM ers : Error name as string
REM erm : Error message as string
Global erf as Boolean
Global erc as String
Global ers as String
Global erm as String
REM The above parameters could be initialized globally, but how?
Function parse_string(stri as String) as Array
If Not (TypeName(stri)="String") Then
erc="parse_string()"
erm="Input argument can be only string type!"
ers="@wrong_argument_type"
erf=TRUE
parse_string=""
Print erm
Exit Function
End If
lst=Len(stri)
Dim ssv(30) as String
Dim ssvr() as String
i=1 : n=0
Do While i<=lst
Do While (Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9))
i=i+1
If i>lst Then
Exit Do
End If
Loop
If i>lst Then
Exit Do
End If
If Mid(stri,i,1)="""" Then
f=0 : i=i+1
ssv(n)=""
Do While i<=lst
If Mid(stri,i,1)="""" Then
If Mid(stri,i-1,1)="\" Then
ssv(n)=Left(ssv(n),Len(ssv(n))-1)
Else
f=1 : i=i+1
Exit Do
End If
End If
ssv(n)=ssv(n)+Mid(stri,i,1)
i=i+1
Loop
If f=0 Then
erc="parse_string()"
erm="String has no closing quotation!"
ers="@closing_quotation_missing"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
If i<=lst Then
If Not(Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9)) Then
erc="parse_string()"
erm="After closing quotation at column "+Str(i)+" string separator is required!"
ers="@string_sepator_missing"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
i=i+1
End If
Else
ssv(n)=Mid(stri,i,1)
i=i+1
Do While i<=lst
If Mid(stri,i,1)=" " Or Mid(stri,i,1)=Chr$(9) Then
i=i+1
Exit Do
End If
If Mid(stri,i,1)="""" Then
If Mid(stri,i-1,1)="\" Then
ssv(n)=Left(ssv(n),Len(ssv(n))-1)
Else
erc="parse_string()"
erm="Quotation character at column "+Str(i)+" not allowed in middle of string!"
ers="@wrong_quotation_placement"
erf=TRUE
parse_string=ssvr
Print erm
Exit Function
End If
End If
ssv(n)=ssv(n)+Mid(stri,i,1)
i=i+1
Loop
End If
n=n+1
Loop
REM redim does not seem to operate with negative value (to make empty string)!
If (n>0) Then
Redim preserve ssv(n-1) as String
parse_string=ssv
Else
parse_string=ssvr
End If
erf=FALSE
End Function
Here also even input arguments are defined as "String" -type the User should check that they come as "string"-type into the function...
Edit: Please use the CODE tags around program code for readability and easy copying |
Re: [Calc] Split string into an array of single words
Posted: Tue Dec 03, 2019 7:03 pm
by KaduLeite
jari0471 wrote:Split string into an array of single words (similar to separation of "arguments" in shell command line)
Code: Select all
Sub test
Dim p As String
p1="aa bb cc dd"
a1=Split(p1," ")
Print a1(0)
Print a1(1)
Print a1(2)
Print a1(3)
End Sub
Displayed is "aa", "bb", " " and "cc".
Displayed should be "aa", "bb", "cc" and "dd".
There is 2 spaces between "bb" and "cc" and 3 spaces between "cc" and "dd" in p1.
Is there another command or different option for this command that can do separation of words (arguments)?
If a large file is read to calc sheet, it will take a long time if the splitting function is slow. Thus fast (and simple) operation is requested.
As an exercise, I think this solve the problem with a smaller and simpler code.
To ignore other elements, like points or commas, just add them to the test with AND.
Code: Select all
Function SpaceSplit(p1 as String) as Array
Dim i as Integer, a1(), a2()
a1=Split(Trim(p1)," ")
For i = 0 to Ubound(a1) 'For each element in a1
If a1(i)<>"" Then 'If element is different to ""
Redim Preserve a2(UBound(a2)+1) 'Increases the size of a2 in 1 element
a2(UBound(a2))=a1(i) 'Add the non "" element to a2
End If
Next
SpaceSplit = a2 ' Function returns a2, an array with each word
End Function
sub test
dim s1()
s1 = spacesplit("aa bb cc dd ee")
end sub
Kadu Leite
Re: [Calc] Split string into an array of single words
Posted: Wed Dec 04, 2019 12:43 pm
by JeJe
Redim Preserve is an expensive operation to have in a loop... that doesn't matter with just a few items of course...
Re: [Calc] Split string into an array of single words
Posted: Thu Dec 05, 2019 7:55 pm
by KaduLeite
JeJe wrote:Redim Preserve is an expensive operation to have in a loop... that doesn't matter with just a few items of course...
Je Je,
You're absolutely right.
What about that?
Code: Select all
Function SpaceSplit(p1 as String) as Array
Dim i as Integer, j as Integer, a1()
a1=Split(Trim(p1)," ")
Dim a2(Ubound(a1))
For i = 0 to Ubound(a1)
If a1(i)<>"" Then
a2(j)=a1(i)
j = j+1
End If
Next
Redim Preserve a2(j-1)
SpaceSplit = a2
End Function
It uses the same logic, but writes to an array with the same size as a1 and just when a2 is OK, it redim preserve with the correct size.
Kadu Leite
Re: [Calc] Split string into an array of single words
Posted: Thu Dec 05, 2019 10:43 pm
by JeJe
Yeah, its much better to redim outside the loop once at the end.
If anyone's interested in these things... OOBasic is very similar to Microsoft's now obselete Visual Basic 6 (in the non-office suite aspects)... and there is an old but very good (and still existing!) site which looked at the fastest ways to code various functions in Visual Basic...
http://xbeat.net/vbspeed/