How BBcodes are Processed

Posted: Fri Feb 22, 2008 8:27 pm
by TerryE
This post discusses the internals of phpBB, specifically how it handles post content and its display. Every post passes through three formats during its life:
  • Input Format. This is the format that you type into the text window on the posting.php screen.
  • DB Format. This is the format in which it is maintained within the forum database.
  • Display Format. This is the format in which it is rendered for visual presentation within the post.
All three are typically different. Why? A typical post may be viewed 20-30 times for each post action, so it makes sense to do as much pre-processing as possible at post time if this saves work (and therefore load on the forum) during display. However, because you are able to edit posts, any changes or additional markup applied during posting must be reversible.

A case in point is tag pairing. Many tags are supposed to be paired like bookends, but there is no guarantee that the poster will maintain correct pairing; if you have an unpaired [b], for example, the tag is displayed as is. So the message_parse function, which is responsible for converting the post from input format to DB format, scans the post structure and validates all tag pairs. A UID is created for each tag pair, so that a bold markup might be stored in the database in the format [b:36q9mv6g]Some Text to Be Bolded[/b:36q9mv6g]. The reason for this is that it makes the pairing of such tags and their transformation into HTML a straightforward substitution task for the viewposting.php code. Moreover, if the user decides to edit or quote an existing post, such hidden markup can easily be backed out for display in the text editing box for the post.
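
To make the pair validation concrete, here is a minimal sketch in Python (not phpBB's actual PHP code). The names to_db_format, make_uid and PAIRED_TAGS are my own inventions for illustration, only three simple tags are handled, and for simplicity one UID is shared by every valid pair in the post. Unpaired tags are left exactly as typed, which is why they later show up as literal text.

```python
import re
import secrets
import string

# Hypothetical subset of paired tags; a real board supports many more.
PAIRED_TAGS = ("b", "i", "u")


def make_uid(length=8):
    """Return a random lowercase alphanumeric marker such as '36q9mv6g'."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def to_db_format(text, uid=None):
    """Append ':uid' to every correctly paired tag; leave the rest untouched."""
    uid = uid or make_uid()
    token = re.compile(r"\[(/?)(%s)\]" % "|".join(PAIRED_TAGS))
    matches = list(token.finditer(text))

    open_stack = []  # indices of opening tags still waiting for a close
    paired = set()   # indices of tags that belong to a valid pair
    for i, m in enumerate(matches):
        closing, name = m.group(1), m.group(2)
        if not closing:
            open_stack.append(i)
        elif open_stack and matches[open_stack[-1]].group(2) == name:
            paired.add(open_stack.pop())
            paired.add(i)
        # A closing tag that does not match the innermost open tag is ignored.

    out, pos = [], 0
    for i, m in enumerate(matches):
        out.append(text[pos:m.start()])
        if i in paired:
            out.append("[%s%s:%s]" % (m.group(1), m.group(2), uid))
        else:
            out.append(m.group(0))
        pos = m.end()
    out.append(text[pos:])
    return "".join(out)
```

Calling to_db_format("[b]Some Text to Be Bolded[/b]", uid="36q9mv6g") reproduces the stored form quoted above, while to_db_format("[b]oops") returns the text unchanged because the tag has no partner.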

Hence each markup type within the BBCode needs to have functionality in three places:
  • to parse and mark-up raw BBCode text
  • to transform the marked up BBCode into HTML for insertion in the display post
  • to strip out the markings and restore the raw BBCode text for edit and quoting.
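
The display and edit steps then reduce to simple substitutions on the marked-up text. The following Python sketch shows the idea only; it is not phpBB's code. The function names, the HTML_FOR mapping and the choice of HTML elements are assumptions of mine, and HTML escaping of the post text is deliberately left out.

```python
import re

# Hypothetical mapping from BBCode tag to HTML element; real templates differ.
HTML_FOR = {"b": "strong", "i": "em"}

# Matches only tags carrying a ':uid' suffix, i.e. tags validated at post time.
MARKED_TAG = re.compile(r"\[(/?)(%s):[0-9a-z]+\]" % "|".join(HTML_FOR))


def to_display_format(db_text):
    """DB format -> HTML: each marked tag becomes its HTML counterpart."""
    def replace(m):
        closing, name = m.group(1), m.group(2)
        return "<%s%s>" % (closing, HTML_FOR[name])
    return MARKED_TAG.sub(replace, db_text)


def to_input_format(db_text):
    """DB format -> raw BBCode: drop the ':uid' suffix for edit or quote."""
    return MARKED_TAG.sub(r"[\1\2]", db_text)


stored = "[b:36q9mv6g]Some Text to Be Bolded[/b:36q9mv6g]"
html = to_display_format(stored)
raw = to_input_format(stored)
```

Because only marked tags match the pattern, an unpaired [b] that was never given a UID passes through both functions untouched, and no pairing check is needed at view time.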
All of this brings me to the smart [code] block markup functionality proposed in the bbGeshi mod. This doesn't follow the above practice. Minimal processing is done in the message_parse phase and most is left to the viewpost.php processing. This means that there are a number of problems with its handling of constructs such as URLs, which should be detected during message parse, and it also places quite a processing load on the server during views.

But at least getting to grips with this is helping me to work out how to add usable improvements to BBcode :-)

Re: How BBcodes are Processed

Posted: Sat Feb 23, 2008 7:07 am
by TerryE
Placeholder for Change Control and to remove topic from "View unanswered posts" list.