diff options
Diffstat (limited to 'kpilot/conduits/docconduit/bmkSpecification.txt')
-rw-r--r-- | kpilot/conduits/docconduit/bmkSpecification.txt | 199 |
1 files changed, 199 insertions, 0 deletions
diff --git a/kpilot/conduits/docconduit/bmkSpecification.txt b/kpilot/conduits/docconduit/bmkSpecification.txt new file mode 100644 index 000000000..f8a68d961 --- /dev/null +++ b/kpilot/conduits/docconduit/bmkSpecification.txt @@ -0,0 +1,199 @@ +KPilot PalmDoc Conduit bookmark Specification +============================================= + +(c) 2003 Reinhold Kainhofer, reinhold@kainhofer.com + +This document is licensed under the FDL (Free Documentation License) +as published by the FSF. Any version of the FDL can be applied +at your convenience. + + + + +The PalmDoc conduit has three ways to indicate bookmarks for a text: + -) Inline tags of the form <* bookmarkname *> + -) Endtags of the form <bookmarkname> at the end of the document + -) Regular expressions in a separate textname.bmk file + (textname.bmk ist the filename of the text with the .txt replaced by .bmk) + + +In the design of the .bmk file, I tried to stay close to the +syntac of MakeDocJ bookmark files, but it turned out that I +needed to extend the syntax a little. Also, MakeDocJ uses Java +RegExps, while the PalmDoc conduit uses the QRegExp, which have +some slight differences (especially concerning the ^ and $ +patterns as well as backreferences). So if you used MakeDocJ, +the .bmk file syntax will be quite familiar, but you will still +have to adapt your bookmark files for Qt regular expressions +instead of Java regular expressions + + + +1) INLINE TAGS + +Whenever a tag of the form <* someText *> appears in the text, +this sequence is removed from the text, and a bookmark is set +there with the bookmark name "someText" (the part between the +<* and the *>). + + +2) ENDTAGS + +If the text ends with tags of the form <someText>, the string +in braces is used as bookmark name, and wherever it appears in +the text, a bookmark is set. +After the > any number of whitespace is allowed, but no other +characters like letters, numbers, or punctuation. Also, inside +the braces no line break must occur. The conduit searches the +text from the end and if it finds a line break inside a <...> +sequence, the tag and everything before it is assumed to belong +to the text and doesn't form a bookmark tag. +Between endtags any number of whitespace (spaces, tabs, line +feeds etc.) is allowed. + +As an example, assume you have a text ending in: +... the bad guy was punished, and they lived happily +ever after! +<Tag with +line feed> + <bad guy> <princess> +<married> + +The conduit starts at the end, ignores all whitespace between +the tags, so it finds the tags "married", "princess", and "bad guy". +The "Tag with line feed" has a line feed, so it is assumed to belong +to the text. +Assume now you have a text ending in: +... the bad guy was punished, and they lived happily +ever after! +<bad guy> The End <princess> +<married> + +Here, only "married" and "princess" are found as bookmarks. Because +of the letters before the "princess" tags, the search for the +bookmarks ends at the letter "d" of "The End" (the conduit starts +from the end and moves backward until it finds some text which +cannot be seen as a endtag. + + + + +3) REGULAR EXPRESSIONS IN A SEPARATE FILE + +This is by far the most complex way to specify bookmarks, but +it is also the mose powerful. +If you have a text with filename "My fairy tale.txt", the +bookmarks will be specified in a file called "My fairy tale.bmk" +(just the text filename with the .txt replaced by .bmk). This +file contains the bookmark definitions, one in each line. Lines +starting with a # are seen as comments, and empty lines are also +ignored. + + +In the .bmk file, each bookmark line has one of the following syntaces +(I will explain all fields later on). Fields in [..] are optional: + +bmkName +bmkPosition, bmkName ++, bmkPatternRegExp[, bmkNameAsString[, firstIncludedBmk[, lastIncludedBmk]]] ++, bmkPatternRegExp[, bmkNameIndexOfSubexpression[, firstIncludedBmk[, lastIncludedBmk]]] +-, bmkPatternRegExp[, bmkNameAsString] +-, bmkPatternRegExp[, bmkNameIndexOfSubexpression] + + If the first field is a string, it is used as the bookmark name +and pattern to search for. + If the first field is a number, it means the position of the +bookmark, and the second field is the name of the bookmark. + If the first field is either + or -, the second field gives +a regular expression that is used to find the position of the +bookmark. If the first field is a -, the search is done only +once and only the first match will be added as bookmark. If +the first field is a +, the search is done until the regular +expression can no longer be found (the fourth and fifth fields +can be used to include only a certain range of hits). If there +is a third field, and it is a string, it gives the name of the +bookmark as a regular expression (i.e. \1 are replaced by the +first subexpression of the search, where subexpressions are +specified by round brackets in the regexp of the second field). +If there is a third field, and it is a number, it gives the index +of the subexpression of bmkPatternRegExp that is used as the +bookmark name. +If there is no third field, the whole matched text will be used +as bookmark name. +The optional fourth and fifth fields can be used to set bookmarks +only after the first few ocurrences of the regexp in the text, and +to stop the search after the expression has been found a certain +number of times. + + + +If the PDB->PC sync is set up to store the bookmarks in a bookmark file, +it will create a file "My fairy tale.bm" (no "k") with entries of the form +position,bmkName +The .bmk file will be used if it exists, but if no .bmk file exists, the .bm file +will be used. This way you can override the bookmark settings, while +at the same time the PDB->TXT sync does not destroy your possibly +existing .bmk file. + + + +Examples: + +1) Imagine you have a line like: +frog princess +In this case, the text is searched for "frog princess", and a +bookmark is set whenever "frog princess" occurs in the text. +The name of each of these bookmarks will be "frog princess". + +2) A bookmark line: +55, Bookmark at offset 55 +Here, a bookmark will be set at offset 55 (55th character of +the text), and it will have the name "Bookmark at offs" (truncated +to 16 characters) + +3) A bookmark line +-,Chapter \d+ +causes a bookmark to be set at the first ocurrence of "Chapter XXX", +where XXX denotes one or more digits. The bookmark name will be +"Chapter XXX" (XXX replaced by the actual digits). + +4) A bookmark line ++,Chapter \d+ +causes bookmarks to be set wherever "Chapter XXX" (XXX being one +or more digits) appears in the text. The bookmark name will again +be "Chapter XXX", but the search does not stop after the first hit. + +5) A bookmark line ++,\n\s*(Chapter \d+)\D+, 1 +causes a bookmark to be set whenever a new line starts with +"Chapter XXX" (whitespace is allowed before the "Chapter"), and +uses the first subexpression in (..) as the bookmark name. If you +have a passage + Chapter 15: here it starts +The regular expression will match, so a bookmark will be set there +and the subexpression "Chapter 15" (which matches the (Chapter \d+) ) +will be used as bookmark text. + +6) A bookmark line ++,\n\s*Part (\d+),\1\. part +sets a bookmark whenever a line starts with "Part XXX". The XXX +will be stored as the first matched subexpression. The third field +"\1\. part" is the regular expression for the bookmark name, where +\1 is replaced by the first matched subexpression of the search (XXX +in this case). So if a line starts with " Part 17: ", the bookmark +name will be "17. part". + +7) A bookmark line ++,Table (\d+): ,\1\. Tabelle,5,25 +will match whenever "Table XXX: " appears in the text, and the bookmark +name will be "XXX. Tabelle". However, the fourth field means that the +first four hits are ignored (the 5th hit is the first hit to be included +as a bookmark), and the fifth field means that all further hits after the +25th will be ignored, too. + +8) In law texts, I use a regular expression ++,\n *(§\.? *\d+[a-z]?\.?) +, 1 +to search for all paragraphs starting like "§. 15. " or " §23 ", and set +a bookmark there using only the part from the § to the last digit or the +full stop after the last digit (the pattern between the (), in our two +cases the bookmark names will be "§. 15." and "§23" ). |