author | Timothy Pearson <kb9vqf@pearsoncomputing.net> | 2011-12-03 11:05:10 -0600 |
---|---|---|
committer | Timothy Pearson <kb9vqf@pearsoncomputing.net> | 2011-12-03 11:05:10 -0600 |
commit | f7e7a923aca8be643f9ae6f7252f9fb27b3d2c3b (patch) | |
tree | 1f78ef53b206c6b4e4efc88c4849aa9f686a094d /tde-i18n-en_GB/docs/tdebase/kate/regular-expressions.docbook | |
parent | 85ca18776aa487b06b9d5ab7459b8f837ba637f3 (diff) | |
Second part of prior commit
Diffstat (limited to 'tde-i18n-en_GB/docs/tdebase/kate/regular-expressions.docbook')
-rw-r--r-- | tde-i18n-en_GB/docs/tdebase/kate/regular-expressions.docbook | 1219 |
1 files changed, 1219 insertions, 0 deletions
diff --git a/tde-i18n-en_GB/docs/tdebase/kate/regular-expressions.docbook b/tde-i18n-en_GB/docs/tdebase/kate/regular-expressions.docbook new file mode 100644 index 00000000000..c692da92cd5 --- /dev/null +++ b/tde-i18n-en_GB/docs/tdebase/kate/regular-expressions.docbook @@ -0,0 +1,1219 @@ +<appendix id="regular-expressions"> +<appendixinfo> +<authorgroup> +<author +>&Anders.Lund; &Anders.Lund.mail;</author> +<othercredit role="translator" +><firstname +>Malcolm</firstname +><surname +>Hunter</surname +><affiliation +><address +><email +>malcolm.hunter@gmx.co.uk</email +></address +></affiliation +><contrib +>Conversion to British English</contrib +></othercredit +> +</authorgroup> +</appendixinfo> + +<title +>Regular Expressions</title> + +<synopsis +>This Appendix contains a brief but hopefully sufficient and +covering introduction to the world of <emphasis +>regular +expressions</emphasis +>. It documents regular expressions in the form +available within &kate;, which is not compatible with the regular +expressions of perl, nor with those of for example +<command +>grep</command +>.</synopsis> + +<sect1> + +<title +>Introduction</title> + +<para +><emphasis +>Regular Expressions</emphasis +> provides us with a way to describe some possible contents of a text string in a way understood by a small piece of software, so that it can investigate if a text matches, and also in the case of advanced applications with the means of saving pieces or the matching text.</para> + +<para +>An example: Say you want to search a text for paragraphs that starts with either of the names <quote +>Henrik</quote +> or <quote +>Pernille</quote +> followed by some form of the verb <quote +>say</quote +>.</para> + +<para +>With a normal search, you would start out searching for the first name, <quote +>Henrik</quote +> maybe followed by <quote +>sa</quote +> like this: <userinput +>Henrik sa</userinput +>, and while looking for matches, you would have to discard those not being the 
beginning of a paragraph, as well as those in which the word starting with the letters <quote +>sa</quote +> was not <quote +>says</quote +>, <quote +>said</quote +> or similar. And then, of course, repeat all of that with the next name...</para> + +<para +>With Regular Expressions, that task could be accomplished with a single search, and with a greater degree of precision.</para> + +<para +>To achieve this, Regular Expressions define rules for expressing in detail a generalisation of a string to match. Our example, which we might literally express like this: <quote +>A line starting with either <quote +>Henrik</quote +> or <quote +>Pernille</quote +> (possibly following up to 4 blanks or tab characters) followed by a whitespace character followed by <quote +>sa</quote +> and then either <quote +>ys</quote +> or <quote +>id</quote +></quote +> could be expressed with the following regular expression:</para +> <para +><userinput +>^[ \t]{0,4}(Henrik|Pernille) sa(ys|id)</userinput +></para> + +<para +>The above example demonstrates all four major concepts of modern Regular Expressions, namely:</para> + +<itemizedlist> +<listitem +><para +>Patterns</para +></listitem> +<listitem +><para +>Assertions</para +></listitem> +<listitem +><para +>Quantifiers</para +></listitem> +<listitem +><para +>Back references</para +></listitem> +</itemizedlist> + +<para +>The caret (<literal +>^</literal +>) starting the expression is an assertion, being true only if the following matching string is at the start of a line.</para> + +<para +>The strings <literal +>[ \t]</literal +> and <literal +>(Henrik|Pernille) sa(ys|id)</literal +> are patterns.
The first one is a <emphasis +>character class</emphasis +> that matches either a blank or a (horizontal) tab character; the other pattern contains first a subpattern matching either <literal +>Henrik</literal +> <emphasis +>or</emphasis +> <literal +>Pernille</literal +>, then a piece matching the exact string <literal +> sa</literal +> and finally a subpattern matching either <literal +>ys</literal +> <emphasis +>or</emphasis +> <literal +>id</literal +>.</para> + +<para +>The string <literal +>{0,4}</literal +> is a quantifier saying <quote +>anywhere from 0 up to 4 of the previous</quote +>.</para> + +<para +>Because regular expression software supporting the concept of <emphasis +>back references</emphasis +> saves the entire matching part of the string as well as sub-patterns enclosed in parentheses, given some means of access to those references, we could get our hands on the whole match (when searching a text document in an editor with a regular expression, that is often marked as selected), the name found, or the last part of the verb.</para> + +<para +>All together, the expression will match where we wanted it to, and only there.</para> + +<para +>The following sections will describe in detail how to construct and use patterns, character classes, assertions, quantifiers and back references, and the final section will give a few useful examples.</para> + +</sect1> + +<sect1 id="regex-patterns"> + +<title +>Patterns</title> + +<para +>Patterns consist of literal strings and character classes. Patterns may contain sub-patterns, which are patterns enclosed in parentheses.</para> + +<sect2> +<title +>Escaping characters</title> + +<para +>In patterns as well as in character classes, some characters have a special meaning.
To literally match any of those characters, they must be marked or <emphasis +>escaped</emphasis +> to let the regular expression software know that it should interpret such characters in their literal meaning.</para> + +<para +>This is done by preceding the character with a backslash (<literal +>\</literal +>).</para> + + +<para +>The regular expression software will silently ignore escaping a character that does not have any special meaning in the context, so escaping for example a <quote +>j</quote +> (<userinput +>\j</userinput +>) is safe. If you are in doubt whether a character could have a special meaning, you can therefore escape it safely.</para> + +<para +>Escaping of course includes the backslash character itself; to match one literally, you would write <userinput +>\\</userinput +>.</para> + +</sect2> + +<sect2> +<title +>Character Classes and abbreviations</title> + +<para +>A <emphasis +>character class</emphasis +> is an expression that matches one of a defined set of characters. In Regular Expressions, character classes are defined by putting the legal characters for the class in square brackets, <literal +>[]</literal +>, or by using one of the abbreviated classes described below.</para> + +<para +>Simple character classes just contain one or more literal characters, for example <userinput +>[abc]</userinput +> (matching either of the letters <quote +>a</quote +>, <quote +>b</quote +> or <quote +>c</quote +>) or <userinput +>[0123456789]</userinput +> (matching any digit).</para> + +<para +>Because letters and digits have a logical order, you can abbreviate those by specifying ranges of them: <userinput +>[a-c]</userinput +> is equal to <userinput +>[abc]</userinput +> and <userinput +>[0-9]</userinput +> is equal to <userinput +>[0123456789]</userinput +>.
Combining these constructs, for example <userinput +>[a-fynot1-38]</userinput +>, is completely legal (that class would match, of course, any of <quote +>a</quote +>,<quote +>b</quote +>,<quote +>c</quote +>,<quote +>d</quote +>, <quote +>e</quote +>,<quote +>f</quote +>,<quote +>y</quote +>,<quote +>n</quote +>,<quote +>o</quote +>,<quote +>t</quote +>, <quote +>1</quote +>,<quote +>2</quote +>,<quote +>3</quote +> or <quote +>8</quote +>).</para> + +<para +>As capital letters are different characters from their non-capital equivalents, to create a caseless character class matching <quote +>a</quote +> or <quote +>b</quote +>, in any case, you need to write it <userinput +>[aAbB]</userinput +>.</para> + +<para +>It is of course possible to create a <quote +>negative</quote +> class matching <quote +>anything but</quote +>. To do so, put a caret (<literal +>^</literal +>) at the beginning of the class: </para> + +<para +><userinput +>[^abc]</userinput +> will match any character <emphasis +>but</emphasis +> <quote +>a</quote +>, <quote +>b</quote +> or <quote +>c</quote +>.</para> + +<para +>In addition to literal characters, some abbreviations are defined, making life still a bit easier: <variablelist> + +<varlistentry> +<term +><userinput +>\a</userinput +></term> +<listitem +><para +>This matches the <acronym +>ASCII</acronym +> bell character (BEL, 0x07).</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\f</userinput +></term> +<listitem +><para +>This matches the <acronym +>ASCII</acronym +> form feed character (FF, 0x0C).</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\n</userinput +></term> +<listitem +><para +>This matches the <acronym +>ASCII</acronym +> line feed character (LF, 0x0A, Unix newline).</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\r</userinput +></term> +<listitem +><para +>This matches the <acronym +>ASCII</acronym +> carriage return character (CR,
0x0D).</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\t</userinput +></term> +<listitem +><para +>This matches the <acronym +>ASCII</acronym +> horizontal tab character (HT, 0x09).</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\v</userinput +></term> +<listitem +><para +>This matches the <acronym +>ASCII</acronym +> vertical tab character (VT, 0x0B).</para +></listitem> +</varlistentry> +<varlistentry> +<term +><userinput +>\xhhhh</userinput +></term> + +<listitem +><para +>This matches the Unicode character corresponding to the hexadecimal number hhhh (between 0x0000 and 0xFFFF). \0ooo (&ie;, \zero ooo) matches the <acronym +>ASCII</acronym +>/Latin-1 character corresponding to the octal number ooo (between 0 and 0377).</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>.</userinput +> (dot)</term> +<listitem +><para +>This matches any character (including newline).</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\d</userinput +></term> +<listitem +><para +>This matches a digit. Equal to <literal +>[0-9]</literal +></para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\D</userinput +></term> +<listitem +><para +>This matches a non-digit. Equal to <literal +>[^0-9]</literal +> or <literal +>[^\d]</literal +></para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\s</userinput +></term> +<listitem +><para +>This matches a whitespace character. Practically equal to <literal +>[ \t\n\r]</literal +></para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\S</userinput +></term> +<listitem +><para +>This matches a non-whitespace. 
Practically equal to <literal +>[^ \t\r\n]</literal +>, and equal to <literal +>[^\s]</literal +></para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\w</userinput +></term> +<listitem +><para +>Matches any <quote +>word character</quote +> - in this case any letter or digit. Note that underscore (<literal +>_</literal +>) is not matched, unlike in perl regular expressions. Equal to <literal +>[a-zA-Z0-9]</literal +></para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\W</userinput +></term> +<listitem +><para +>Matches any non-word character - anything but letters or numbers. Equal to <literal +>[^a-zA-Z0-9]</literal +> or <literal +>[^\w]</literal +></para +></listitem> +</varlistentry> + + +</variablelist> + +</para> + +<para +>The abbreviated classes can be put inside a custom class, for example to match a word character, a blank or a dot, you could write <userinput +>[\w \.]</userinput +></para +> + +<note +> <para +>The POSIX notation of classes, <userinput +>[:<class name>:]</userinput +> is currently not supported.</para +> </note> + +<sect3> +<title +>Characters with special meanings inside character classes</title> + +<para +>The following characters have a special meaning inside the <quote +>[]</quote +> character class construct, and must be escaped to be literally included in a class:</para> + +<variablelist> +<varlistentry> +<term +><userinput +>]</userinput +></term> +<listitem +><para +>Ends the character class. Must be escaped unless it is the very first character in the class (may follow an unescaped caret)</para +></listitem> +</varlistentry> +<varlistentry> +<term +><userinput +>^</userinput +> (caret)</term> +<listitem +><para +>Denotes a negative class, if it is the first character.
Must be escaped to match literally if it is the first character in the class.</para +></listitem +> +</varlistentry> +<varlistentry> +<term +><userinput +>-</userinput +> (dash)</term> +<listitem +><para +>Denotes a logical range. Must always be escaped within a character class.</para +></listitem> +</varlistentry> +<varlistentry> +<term +><userinput +>\</userinput +> (backslash)</term> +<listitem +><para +>The escape character. Must always be escaped.</para +></listitem> +</varlistentry> + +</variablelist> + +</sect3> + +</sect2> + +<sect2> + +<title +>Alternatives: matching <quote +>one of</quote +></title> + +<para +>If you want to match one of a set of alternative patterns, you can separate those with <literal +>|</literal +> (vertical bar character).</para> + +<para +>For example to find either <quote +>John</quote +> or <quote +>Harry</quote +> you would use an expression <userinput +>John|Harry</userinput +>.</para> + +</sect2> + +<sect2> + +<title +>Sub Patterns</title> + +<para +><emphasis +>Sub patterns</emphasis +> are patterns enclosed in parentheses, and they have several uses in the world of regular expressions.</para> + +<sect3> + +<title +>Specifying alternatives</title> + +<para +>You may use a sub pattern to group a set of alternatives within a larger pattern. The alternatives are separated by the character <quote +>|</quote +> (vertical bar).</para> + +<para +>For example to match either of the words <quote +>int</quote +>, <quote +>float</quote +> or <quote +>double</quote +>, you could use the pattern <userinput +>int|float|double</userinput +>. 
If you only want to find one if it is followed by some whitespace and then some letters, put the alternatives inside a subpattern: <userinput +>(int|float|double)\s+\w+</userinput +>.</para> + +</sect3> + +<sect3> + +<title +>Capturing matching text (back references)</title> + +<para +>If you want to use a back reference, use a sub pattern to have the desired part of the pattern remembered.</para> + +<para +>For example, if you want to find two occurrences of the same word separated by a comma and possibly some whitespace, you could write <userinput +>(\w+),\s*\1</userinput +>. The sub pattern <literal +>\w+</literal +> would find a chunk of word characters, and the entire expression would match if those were followed by a comma, zero or more whitespace characters and then an equal chunk of word characters. (The string <literal +>\1</literal +> references <emphasis +>the first sub pattern enclosed in parentheses</emphasis +>.)</para> + +<!-- <para +>See also <link linkend="backreferences" +>Back references</link +>.</para +> --> + +</sect3> + +<sect3 id="lookahead-assertions"> +<title +>Lookahead Assertions</title> + +<para +>A lookahead assertion is a sub pattern, starting with either <literal +>?=</literal +> or <literal +>?!</literal +>.</para> + +<para +>For example to match the literal string <quote +>Bill</quote +> but only if not followed by <quote +> Gates</quote +>, you could use this expression: <userinput +>Bill(?! Gates)</userinput +>.
(This would find <quote +>Bill Clinton</quote +> as well as <quote +>Billy the kid</quote +>, but silently ignore the other matches.)</para> + +<para +>Sub patterns used for assertions are not captured.</para> + +<para +>See also <link linkend="assertions" +>Assertions</link +></para> + +</sect3> + +</sect2> + +<sect2 id="special-characters-in-patterns"> +<title +>Characters with a special meaning inside patterns</title> + +<para +>The following characters have meaning inside a pattern, and must be escaped if you want to literally match them: <variablelist> + +<varlistentry> +<term +><userinput +>\</userinput +> (backslash)</term> +<listitem +><para +>The escape character.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>^</userinput +> (caret)</term> +<listitem +><para +>Asserts the beginning of the string.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>$</userinput +></term> +<listitem +><para +>Asserts the end of string.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>()</userinput +> (left and right parentheses)</term> +<listitem +><para +>Denotes sub patterns.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>{}</userinput +> (left and right curly braces)</term> +<listitem +><para +>Denotes numeric quantifiers.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>[]</userinput +> (left and right square brackets)</term> +<listitem +><para +>Denotes character classes.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>|</userinput +> (vertical bar)</term> +<listitem +><para +>logical OR. 
Separates alternatives.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>+</userinput +> (plus sign)</term> +<listitem +><para +>Quantifier, 1 or more.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>*</userinput +> (asterisk)</term> +<listitem +><para +>Quantifier, 0 or more.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>?</userinput +> (question mark)</term> +<listitem +><para +>An optional character. Can be interpreted as a quantifier, 0 or 1.</para +></listitem> +</varlistentry> + +</variablelist> + +</para> + +</sect2> + +</sect1> + +<sect1 id="quantifiers"> +<title +>Quantifiers</title> + +<para +><emphasis +>Quantifiers</emphasis +> allow a regular expression to match a specified number or range of numbers of either a character, character class or sub pattern.</para> + +<para +>Quantifiers are enclosed in curly brackets (<literal +>{</literal +> and <literal +>}</literal +>) and have the general form <literal +>{[minimum-occurrences][,[maximum-occurrences]]}</literal +> </para> + +<para +>The usage is best explained by example: <variablelist> + +<varlistentry> +<term +><userinput +>{1}</userinput +></term> +<listitem +><para +>Exactly 1 occurrence</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>{0,1}</userinput +></term> +<listitem +><para +>Zero or 1 occurrences</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>{,1}</userinput +></term> +<listitem +><para +>The same, with less work;)</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>{5,10}</userinput +></term> +<listitem +><para +>At least 5 but at most 10 occurrences.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>{5,}</userinput +></term> +<listitem +><para +>At least 5 occurrences, no maximum.</para +></listitem> +</varlistentry> + +</variablelist> + +</para> + +<para +>Additionally, there are some
abbreviations: <variablelist> + +<varlistentry> +<term +><userinput +>*</userinput +> (asterisk)</term> +<listitem +><para +>similar to <literal +>{0,}</literal +>, find any number of occurrences.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>+</userinput +> (plus sign)</term> +<listitem +><para +>similar to <literal +>{1,}</literal +>, at least 1 occurrence.</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>?</userinput +> (question mark)</term> +<listitem +><para +>similar to <literal +>{0,1}</literal +>, zero or 1 occurrence.</para +></listitem> +</varlistentry> + +</variablelist> + +</para> + +<sect2> + +<title +>Greed</title> + +<para +>When using quantifiers with no maximum, regular expressions defaults to match as much of the searched string as possible, commonly known as <emphasis +>greedy</emphasis +> behaviour.</para> + +<para +>Modern regular expression software provides the means of <quote +>turning off greediness</quote +>, though in a graphical environment it is up to the interface to provide you with access to this feature. 
For example a search dialogue providing a regular expression search could have a check box labelled <quote +>Minimal matching</quote +> as well as it ought to indicate if greediness is the default behaviour.</para> + +</sect2> + +<sect2> +<title +>In context examples</title> + +<para +>Here are a few examples of using quantifiers</para> + +<variablelist> + +<varlistentry> +<term +><userinput +>^\d{4,5}\s</userinput +></term> +<listitem +><para +>Matches the digits in <quote +>1234 go</quote +> and <quote +>12345 now</quote +>, but neither in <quote +>567 eleven</quote +> nor in <quote +>223459 somewhere</quote +></para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\s+</userinput +></term> +<listitem +><para +>Matches one or more whitespace characters</para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>(bla){1,}</userinput +></term> +<listitem +><para +>Matches all of <quote +>blablabla</quote +> and the <quote +>bla</quote +> in <quote +>blackbird</quote +> or <quote +>tabla</quote +></para +></listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>/?></userinput +></term> +<listitem +><para +>Matches <quote +>/></quote +> in <quote +><closeditem/></quote +> as well as <quote +>></quote +> in <quote +><openitem></quote +>.</para +></listitem> +</varlistentry> + +</variablelist> + +</sect2> + +</sect1> + +<sect1 id="assertions"> +<title +>Assertions</title> + +<para +><emphasis +>Assertions</emphasis +> allows a regular expression to match only under certain controlled conditions.</para> + +<para +>An assertion does not need a character to match, it rather investigates the surroundings of a possible match before acknowledging it. For example the <emphasis +>word boundary</emphasis +> assertion does not try to find a non word character opposite a word one at its position, instead it makes sure that there is not a word character. 
This means that the assertion can match where there is no character, &ie; at the ends of a searched string.</para> + +<para +>Some assertions actually do have a pattern to match, but the part of the string that matches it is not included in the result of the full expression.</para> + +<para +>Regular Expressions as documented here support the following assertions: <variablelist> + +<varlistentry +> +<term +><userinput +>^</userinput +> (caret: beginning of string)</term +> +<listitem +><para +>Matches the beginning of the searched string.</para +> <para +>The expression <userinput +>^Peter</userinput +> will match at <quote +>Peter</quote +> in the string <quote +>Peter, hey!</quote +> but not in <quote +>Hey, Peter!</quote +> </para +> </listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>$</userinput +> (end of string)</term> +<listitem +><para +>Matches the end of the searched string.</para> + +<para +>The expression <userinput +>you\?$</userinput +> will match at the last <quote +>you</quote +> in the string <quote +>You didn't do that, did you?</quote +> but nowhere in <quote +>You didn't do that, right?</quote +></para> + +</listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>\b</userinput +> (word boundary)</term> +<listitem +><para +>Matches if there is a word character at one side and not a word character at the other.</para> +<para +>This is useful for finding word ends, for example both ends in order to find a whole word.
The expression <userinput +>\bin\b</userinput +> will match at the separate <quote +>in</quote +> in the string <quote +>He came in through the window</quote +>, but not at the <quote +>in</quote +> in <quote +>window</quote +>.</para +></listitem> + +</varlistentry> + +<varlistentry> +<term +><userinput +>\B</userinput +> (non word boundary)</term> +<listitem +><para +>Matches wherever <quote +>\b</quote +> does not.</para> +<para +>That means that it will match for example within words: The expression <userinput +>\Bin\B</userinput +> will match at in <quote +>window</quote +> but not in <quote +>integer</quote +> or <quote +>I'm in love</quote +>.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>(?=PATTERN)</userinput +> (Positive lookahead)</term> +<listitem +><para +>A lookahead assertion looks at the part of the string following a possible match. The positive lookahead will prevent the string from matching if the text following the possible match does not match the <emphasis +>PATTERN</emphasis +> of the assertion, but the text matched by that will not be included in the result.</para> +<para +>The expression <userinput +>handy(?=\w)</userinput +> will match at <quote +>handy</quote +> in <quote +>handyman</quote +> but not in <quote +>That came in handy!</quote +></para> +</listitem> +</varlistentry> + +<varlistentry> +<term +><userinput +>(?!PATTERN)</userinput +> (Negative lookahead)</term> + +<listitem +><para +>The negative lookahead prevents a possible match to be acknowledged if the following part of the searched string does match its <emphasis +>PATTERN</emphasis +>.</para> +<para +>The expression <userinput +>const \w+\b(?!\s*&)</userinput +> will match at <quote +>const char</quote +> in the string <quote +>const char* foo</quote +> while it can not match <quote +>const QString</quote +> in <quote +>const QString& bar</quote +> because the <quote +>&</quote +> matches the negative lookahead assertion pattern.</para> 
+</listitem> +</varlistentry> + +</variablelist> + +</para> + +</sect1> + +<!-- TODO sect1 id="backreferences"> + +<title +>Back References</title> + +<para +></para> + +</sect1 --> + +</appendix> |
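The appendix above explains its example patterns only in prose. As a rough illustration, the sketch below tries three of them (the introductory Henrik/Pernille pattern, the back reference example and the negative lookahead example) in Python's `re` module. This is an assumption made for demonstration only: Python's engine is not the one Kate uses, and the appendix itself warns that Kate's dialect differs from others (for instance, Python's `\w` also matches an underscore). These three patterns only use syntax that both dialects share. The helper names `find_speakers`, `repeated_word` and `count_bill` are hypothetical.

```python
import re

# Introductory pattern from the appendix. re.MULTILINE makes ^ match at the
# start of every line, which is how a line-based editor search behaves.
SPEAKER = re.compile(r"^[ \t]{0,4}(Henrik|Pernille) sa(ys|id)", re.MULTILINE)

# Back reference example: the same word twice, separated by a comma.
REPEATED = re.compile(r"(\w+),\s*\1")

# Negative lookahead example: "Bill" unless followed by " Gates".
BILL = re.compile(r"Bill(?! Gates)")


def find_speakers(text):
    """Return (name, verb ending) pairs, one per matching line."""
    return [(m.group(1), m.group(2)) for m in SPEAKER.finditer(text)]


def repeated_word(text):
    """Return the first 'word, word' match, or None if there is none."""
    m = REPEATED.search(text)
    return m.group(0) if m else None


def count_bill(text):
    """Count occurrences of 'Bill' that are not followed by ' Gates'."""
    return len(BILL.findall(text))


sample = "Henrik says hello.\n  Pernille said no.\nThen Henrik said yes.\nHenrik sang."
print(find_speakers(sample))
print(repeated_word("hello, hello world"))
print(count_bill("Bill Gates, Bill Clinton, Billy the kid"))
```

The third line of `sample` is rejected by the `^` assertion and the fourth by the `(ys|id)` sub pattern, which mirrors the "and only there" claim in the introduction.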