Page 1 of 1

### [Calc BASIC]Function to convert attributes and links to html

Posted: Sun Sep 18, 2016 12:13 am
Considering a thread (https://forum.openoffice.org/en/forum/v ... =9&t=85034) from the Calc branch I wrote a user function cAttribs2Html in BASIC.
The function is designed to analyse a cell for its content or the result of a formula and to transform the findings to html if there exist applicable inline tags and the control parameter of the function is containing a respective letter.
The function can handle the attributes "bold", "italic", "strike", and "underline". In addition links that are assigned to parts of the text content (or all of it) can be transformed into the respective html representation. As compared with the formatting possible in Calc, there is, of course, coarsening.
The code:
Code: Select all   Expand viewCollapse view
`Option ExplicitFunction cAttribs2Html(ByVal pX As Long, ByVal pY As Long, ByVal pZ As Long, ByVal pAttribs As String, Optional ByVal pDummy) As String'COMMENTS WELCOME! Wolfgang Jäger (Lupp; jag@psilosoph.de)'The dummy parameter shall give the user a way to force recalculation on "any" event by placing NOW() or any volatile function there.'Any correct expressin in this place will cause recalculation if one of the precedents is changed.'The two remarks above only fully apply, of course, if 'AutoCalculate' is active.'To pass the cell to work on by position should be an easy and reliable way. Counting starts with 1 for the parameters.'The inline tags for striking through and for underscoring were deprecated with V5 of html, but are still offered here.'I suppose the rendering of most browsers still supports them.'pAttribs may contain any combination of the letters b, i, s, u, l in arbitrary order. Repetitions allowed (no difference). Empty allowed.'The attributes Bold, Italic, Strike, Underline and Link are transcribed to html if the respective letter is present.'In case of l the correct href is inserted when the <a ...> tag is prefixed.'The plain text is returned as part of the result anyway.'The function does not attempt to anlalyse formulae containing a call to the HYPERLINK function. Only the resulting string is processed.   Dim theDoc As ObjecttheDoc = ThisComponentIf   NOT theDoc.SupportsService("com.sun.star.sheet.SpreadsheetDocument") Then   cAttribs2Html = "Need Calc Document!"   Exit FunctionEnd IfDim theSheets As Object, sheetTally As LongtheSheets = theDoc.SheetssheetTally = theSheets.GetCount()If   (pZ < 1) OR (pZ > sheetTally) Then   cAttribs2Html = "Bad SheetNumber!"   Exit FunctionEnd If   Dim theSheet As ObjecttheSheet = theDoc.Sheets(pZ - 1)If   (pX < 1) OR (pY < 1) or (pX > theSheet.RangeAddress.EndColumn + 1) OR (pY > theSheet.RangeAddress.EndRow + 1) Then   cAttribs2Html = "Bad position!"   Exit FunctionEnd If   Dim theCell As Object, rText As StringtheCell = theSheet.GetCellByPosition(pX - 1, pY - 1)rText = ""'If we have a direct entry of type text it may consist of more than one TextPortions,'zero or more of them possessing an attribute value to be handled here and/or an URL property.'To get compact code also formula results are processed by the full-featured routine'despite the fact that there only one textportion is supported.   Dim theParEnum As Object, theParElement As Object   Dim theSubEnum As Object, theSubElement As Object   Dim textSlicetheParEnum = theCell.GetText().CreateEnumerationDo   While   theParEnum.HasMoreElements   theParElement = theParEnum.NextElement   theSubEnum = theParElement.CreateEnumeration   Do   While   theSubEnum.HasMoreElements      textSlice = ""      theSubElement = theSubEnum.NextElement'The alternative below shall ensure that in case of a display formatted according to the 'Numbers' format code'the displayed string is also used for the function result.'For directly entered text the 'Numbers' format will be ignored.      If   theCell.Type = 2 Then         textSlice = theSubElement.String      Else         textSlice = theCell.String      End If      If   Instr(pAttribs, "b") > 0 Then         If   theSubElement.CharWeight >= com.sun.star.awt.FontWeight.BOLD Then            textSlice = "<b>" & textSlice & "</b>"         End If      End If      If   Instr(pAttribs, "i") > 0 Then         If   theSubElement.CharPosture >= 1 Then            textSlice = "<i>" & textSlice & "</i>"         End If      End If      If   Instr(pAttribs, "s") > 0 Then         If   theSubElement.CharStrikeOut >= 1 Then            textSlice = "<strike>" & textSlice & "</strike>"         End If      End If      If   Instr(pAttribs, "u") > 0 Then         If   theSubElement.CharUnderline >= 1 Then            textSlice = "<u>" & textSlice & "</u>"         End If      End If      If   Instr(pAttribs, "l") > 0 Then         If   theSubElement.TextPortionType = "TextField" Then            If   theSubElement.TextField.SupportsService("com.sun.star.text.TextField.URL") Then               textSlice = "<a href=" & Chr(34) & theSubElement.TextField.URL & Chr(34) & ">" & textSlice & "</a>"            End If         End If      End If      rText = rText & textSlice   LoopLoopcAttribs2Html = rTextEnd Function`

The attached file is containing the function. There are also a few demonstrations.
Attribs2Html_1.ods

===== Edit 2017-03-02 =====
Induced by posts into a thread of the Calc branch (https://forum.openoffice.org/en/forum/v ... =9&t=85034) of the forum I would like to state that I provided this function for users probably knowing little about programming for Calc, but knowing everything they need to make use of the results. This is supposed to be mainly knowledge about html.

Concerning a "macro" preparing everything for export as a table into a correct html document I am surely not the one to be charged with the task.
But why should one want to do so. There is an 'Export' option for the purpose. I suppose any html editor will allow to paste the respective part into a target document.

This said.
As a rule the content of any single cell converted will be expected to be a paragraph in html (as it is treated in Calc itself). To get it as such you have to prefix <p> and to append </p> to the function result for each cell.

If the results are separated into an extra sheet, someone may want to export this sheet to a csv. The <p>...</p> pairs already created, this will require to choose NO column separator and NO text delimiter. I would also recommend to Copy/Paste Special... (results only) in advance.
A csv created this way, and just reextensioned as html, should be accepted and interpreted by one of the modern error-tolerant browsers despite of the fact that it probably lacks a lot of things specified as mandatory for correct html. However, you will not get a table structure this way. To know how to do that is your turn. I didn't claim to know much about html and the common editors. The plain html code I wrote to date consists of a few hundred lines, mostly done just for trying.

===== Edit 2018-06-24 =====
There was also a use-case where the conversion to html only helped to find the first character formatted to 'Italic' posture. The underlying intention was to split a column into two where the first part of any orginal cell content was text in default character style (without a markup) and the second part (a translation to a different language) was in 'Italic'.
`Function cAttribs2Html(pX As Long, pY As Long, pZ As Long, pAttribs As String, Optional pTrigger, Optional pSpecial As Object) As String'V 1.0.2 Wolfgang Jäger 2017-11-20 replacing preliminary versions'New: If the "p" attribute is missing in pAttribs, paragraph breaks (in Calc = line breaks) are replaced by an ordinary space each.'COMMENTS WELCOME! Wolfgang Jäger (Lupp; jag@psilosoph.de)'The optional parameter pTrigger shall give the user a way to force recalculation by a kind of listening or with the help of a volatile function.'Any correct expression in this place will cause recalculation if one of the precedents is changed or the volatile function recalculates.'The two remarks above only fully apply, of course, if 'AutoCalculate' is active. pTrigger is not accesed from the body of this function.'To pass the cell to work on by position should be an easy and reliable way. Counting starts with 1 for the parameters.'For calls from BASIC the function can also be used passing a cell object via pSpecial.'If this parameter is present in the calling expression, has the TypeName(pSpecial) = "Object", and is a SheetCell object, the parameters pX, pY, pZ are 'treated as if declared optional and will not be accessed. The positions must be present, of course, in the actual parameter list.'The inline tags for striking through and for underscoring were deprecated with V5 of html, but are still offered here.'I suppose the rendering of most browsers still supports them.'pAttribs may contain any combination of the letters b, i, s, u, l, p in arbitrary order. Repetitions allowed (no difference). Empty allowed.'The attributes Bold, Italic, Strike, Underline and Link are transcribed to html if the respective switching letter is present.'In case of l the correct href="..." part is inserted when the <a ...> tag is prefixed.'The handling of paragraphs was introduced later. Hard line breaks are treated as paragraph breaks. in Calc cells.'The switching letter p, if present, causes <p>...</p> pairs to be inserted. Ottherwise paragraph breaks are replaced by an ordinary space.'The plain text is returned as part of the result anyway.'The function does not attempt to analyse formulas containing a call to the HYPERLINK function. Only the resulting string is processed.'I do not know a way to get the URL linked in by HYPERLINK accessing a property of the cell or a service supported by it or by objects it may point to.'Actually I suppose the link is assigned to the used part of the cell's area of the cell grid on the screen view by means similar to those used for CF.   Dim theDoc As Object, theCell As ObjecttheDoc = ThisComponentIf   NOT theDoc.SupportsService("com.sun.star.sheet.SpreadsheetDocument") Then   cAttribs2Html = "Need Calc Document!"   Exit FunctionEnd IfIf NOT IsMissing(pSpecial) Then   If TypeName(pSpecial) = "Object" Then      If pSpecial.SupportsService("com.sun.star.sheet.SheetCell") Then         theCell = pSpecial         GoTo skippedGetCell      Else         cAttribs2Html = "pSpecial not valid"         Exit Function      End If   End IfEnd If   Dim theSheets As Object, sheetTally As LongtheSheets = theDoc.SheetssheetTally = theSheets.GetCount()If   (pZ < 1) OR (pZ > sheetTally) Then   cAttribs2Html = "Bad SheetNumber!"   Exit FunctionEnd If   Dim theSheet As ObjecttheSheet = theDoc.Sheets(pZ - 1)If   (pX < 1) OR (pY < 1) or (pX > theSheet.RangeAddress.EndColumn + 1) OR (pY > theSheet.RangeAddress.EndRow + 1) Then   cAttribs2Html = "Bad position!"   Exit FunctionEnd IftheCell = theSheet.GetCellByPosition(pX - 1, pY - 1)skippedGetCell:   Dim rText As String   Dim cB As Boolean, cI As Boolean, cS As Boolean, cU As Boolean, cL As Boolean, cP As Boolean, cR As Boolean ', simpleT As BooleancB = (Instr(pAttribs, "b")>0)cI = (Instr(pAttribs, "i")>0)cS = (Instr(pAttribs, "s")>0)cU = (Instr(pAttribs, "u")>0)cL = (Instr(pAttribs, "l")>0)cP = (Instr(pAttribs, "p")>0)rText = ""'To get compact code also formula results are processed by the full-featured routine'despite the fact that there no attributes as addressed by "bisul" are supported for text portions. Numerics will be treated based on the formatted strings.   Dim theParEnum As Object, theParElement As Object   Dim theSubEnum As Object, theSubElement As Object   Dim textSlicetheParEnum = theCell.GetText().CreateEnumeration'I think the outer loop will always prossess exactly once.'??However, there seems to be no direct acces to the sole paragraph contained in the cell.'I now think there is. Maybe this implementation cann survive a change to cells containing more than one paragraph, however.Do   While   theParEnum.HasMoreElements   theParElement = theParEnum.NextElement   If cP Then      rText = rText & "<p>"   End If   theSubEnum = theParElement.CreateEnumeration   Do   While   theSubEnum.HasMoreElements      textSlice = ""      theSubElement = theSubEnum.NextElement'The alternative below shall ensure that in case of a display formatted according to the 'Numbers' format code'the displayed string is also used for the result of this function.      If   theCell.FormulaResultType=1 Then         textSlice = theCell.String      Else         textSlice = theSubElement.String      End If      If   cB Then         If   theSubElement.CharWeight >= com.sun.star.awt.FontWeight.BOLD Then            textSlice = "<b>" & textSlice & "</b>"         End If      End If      If   cI Then         If   theSubElement.CharPosture >= 1 Then            textSlice = "<i>" & textSlice & "</i>"         End If      End If      If   cS Then         If   theSubElement.CharStrikeOut >= 1 Then            textSlice = "<strike>" & textSlice & "</strike>"         End If      End If      If   cU Then         If   theSubElement.CharUnderline >= 1 Then            textSlice = "<u>" & textSlice & "</u>"         End If      End If      If   cL Then         If   theSubElement.TextPortionType = "TextField" Then            If   theSubElement.TextField.SupportsService("com.sun.star.text.TextField.URL") Then               textSlice = "<a href=" & Chr(34) & theSubElement.TextField.URL & Chr(34) & ">" & textSlice & "</a>"            End If         End If      End If      rText = rText & textSlice   Loop   If cP Then      rText = rText & "</p>"   Else      If theParEnum.HasMoreElements Then rText = rText & " "   End IfLoopcAttribs2Html = rTextEnd Function`