class Regexp
The core class of the regular expression engine.
Description
Regular expressions provide a very powerful method of matching, modifying and extracting information from strings. Using special syntax, code that would usually require line after line of special matching code can be summarised within a one-line regular expression (from here on referred to as a regex). ferite's regexes are provided by means of PCRE (Perl Compatible Regular Expressions, a C library that can be found at http://www.pcre.org) and as a result are almost identical in use to Perl's. Regexes look like this:
Example:
Regexp.replace('1(2)3', input, '456');
This one will match the first occurrence of the string "123" and swap it with "456" (Regexp.replaceAll swaps every occurrence).
Regexp.replace('W(or(l))d', input, 'Ch\1ris\2');
This is more complicated and will match an occurrence of "World" and swap it with "Chorlrisl". This is due to backticks, which are discussed below.
There are three types of regular expression support: match, swap and split. They are used as follows:
Regexp.match( 'expression to match', input )
Regexp.replace( 'expression to match', input, 'replacement' )
Regexp.split( 'expression to split with', input )
You can either use the quick methods mentioned above or create an object from the Regexp class.
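The three operations can be sketched outside ferite as well. The block below is an illustrative analogue using Python's standard `re` module, not ferite's API; the sample string and variable names are assumptions made for the example.

```python
import re

text = "a1,b22,c333"  # sample input, chosen only for illustration

# match: find the first run of digits in the string
first = re.search(r"\d+", text)
first_digits = first.group(0)

# replace: swap only the first match (count=1)
replaced = re.sub(r"\d+", "#", text, count=1)

# split: use the regex as the delimiter
parts = re.split(r",", text)

# A pre-compiled pattern, loosely comparable to creating a Regexp object
digits = re.compile(r"\d+")
all_digits = digits.findall(text)

print(first_digits, replaced, parts, all_digits)
```

Compiling the pattern once is mainly useful when the same expression is applied to many strings.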
Options
There are a number of options that can be used to modify how a regular expression is executed and processed. These can only be passed in when you create a Regexp object manually. The options are:
- x - This option allows the regular expression to be multi line, and also allows comments using the # character. This is useful for long regular expressions where it is important to remember what each individual part performs.
object r = new Regexp('^(.*?) # Match Dots
chris # Match Chris
$', 'x' );
- s - This allows the . (dot meta character) matching character to match newlines (\n's).
- m - This gets the ^ and $ meta characters to match at newlines within the source string.
object r = new Regexp('^(.*)$', 'm' );
r.replaceAll( "Hello\nWorld\nFrom\nChris", "Foo" );
The above call will return "Foo\nFoo\nFoo\nFoo".
- i - This causes the regex engine to match without regard to the case of the characters being processed. Therefore an expression 'abc' will match 'ABC'.
- g - This forces all matches along a line to be matched. Normally it is only the first occurrence that is matched. The *All functions use this option.
- A - The pattern will only match if it matches at the beginning of the string being searched. This is equivalent to specifying ^ at the beginning of the regular expression.
- D - With this option, a $ at the end of the regular expression ties only to the very end of the string being searched, rather than also matching before a trailing newline.
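The effect of the x, s, m and i options can be tried out with the following sketch. It uses Python's `re` module as a stand-in, on the assumption that its VERBOSE, DOTALL, MULTILINE and IGNORECASE flags behave like the corresponding PCRE options; it does not use ferite's flag strings.

```python
import re

text = "Hello\nWorld\nFrom\nChris"

# m: ^ and $ also match at line boundaries inside the string
per_line = re.sub(r"^(.*)$", "Foo", text, flags=re.MULTILINE)

# s: without it the dot stops at a newline, with it the dot matches "\n"
without_s = re.search(r"Hello.World", text)
with_s = re.search(r"Hello.World", text, flags=re.DOTALL)

# i: case is ignored while matching
case_insensitive = re.search(r"abc", "xxABCxx", flags=re.IGNORECASE)

# x: whitespace and # comments inside the pattern are ignored
verbose = re.compile(r"""
    ^(.*?)   # lazily capture everything before the name
    chris    # the literal name
    $""", re.VERBOSE)
name_match = verbose.search("hello chris")

print(repr(per_line), without_s, with_s.group(0) == "Hello\nWorld")
```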
Backticks and Capturing
When brackets are used within a regular expression they capture the values when the expression matches. These can be used for two main purposes: either fetching the values through the match object function capture() or for use within the replace pattern. To use them within the replacement expression you need to use backticks. Backticks are used within the swap mode of regular expressions. They allow you to use captured strings within the string that replaces the matched expression. They are written as a '\' (backslash) followed by the number of the capture that you want to use. The example below will match all 3-digit numbers and reverse the order of their digits.
Regexp.replaceAll( '(\d)(\d)(\d)', input, '\3\2\1' );
To work out what capture number to use, you need to count from left to right within the expression with each opening bracket being an increment. For a match object the counting starts at 0, with backticks it starts at 1.
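The left-to-right counting rule can be checked with a small sketch. It uses Python's `re` module rather than ferite, and the input strings are assumptions. In Python, group 0 is the whole match and captures start at 1; the indexing of ferite's match object is as described above and may differ.

```python
import re

# Captures are numbered by their opening bracket, left to right:
# group 1 is "(or(l))", group 2 is the nested "(l)".
nested = re.search(r"W(or(l))d", "Hello World")
outer, inner = nested.group(1), nested.group(2)

# \1 and \2 in the replacement insert those captures
swapped = re.sub(r"W(or(l))d", r"Ch\1ris\2", "Hello World", count=1)

# Reverse the digits of every 3-digit number
reversed_digits = re.sub(r"(\d)(\d)(\d)", r"\3\2\1", "123 and 456")

print(outer, inner, swapped, reversed_digits)
```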
More
This is only a brief insight into regular expressions. A suggested read is "Mastering Regular Expressions" by Jeffrey E. F. Friedl (published by O'Reilly), which covers regular expressions in depth. The libpcre documentation at http://www.pcre.org is also worth reading.
| class contents [NB. Highlighted attributes are static members] |
| Functions |
constructor(string) - Create a regular expression using the string passed into the class |
constructor(string,string) - Create a regular expression using the string passed into the class |
getRegexp() - Get the regular expression as a string |
lastMatch() - Get the last successful match |
match(string) - Run a match on a string |
match(string,string) - A quick method of running a regular expression match on a string |
matchAll(string,string) - A quick method of running a regular expression match that returns all matches on a string |
matchAll(string) - Run a match on a string, unlike match this function will return all matches |
replace(string,string) - Replace the first match in a string |
replace(string,string,string) - A quick method of running a regular expression replace on the first match on a string |
replace(string) - Replace the first match in a string |
replaceAll(string) - Replace all of the matches in a string |
replaceAll(string,string) - Replace all of the matches in a string |
replaceAll(string,string,string) - A quick method of running a regular expression replace on all matches on a string |
split(string,string) - A quick method of splitting a string up using a regexp as the delimiter |
|
Functions
function constructor  |
| Create a regular expression using the string passed into the class |
| Declaration: |
| function constructor( string regexp ) |
| Parameters: |
| Parameter #1: string regexp - The regular expression to compile |
| Returns: |
| The regular expression object |
|
function constructor  |
| Create a regular expression using the string passed into the class |
| Declaration: |
| function constructor( string regexp, string flags ) |
| Parameters: |
| Parameter #1: string regexp - The regular expression to compile |
| Parameter #2: string flags - The flags to use when compiling and executing an expression |
| Returns: |
| The regular expression object |
|
function getRegexp  |
| Get the regular expression as a string |
| Declaration: |
| function getRegexp() |
| Returns: |
| The expression as a string |
|
function lastMatch  |
| Get the last successful match |
| Declaration: |
| function lastMatch() |
| Returns: |
| The last match or null otherwise |
|
function match  |
| Run a match on a string |
| Declaration: |
| function match( string str ) |
| Description: |
| It is possible to give this call a closure to handle the match. |
| Parameters: |
| Parameter #1: string str - The string to execute the regular expression on |
| Returns: |
| A match object or null if no match |
| Example: |
object o = new Regexp( '([0-9]+)' );
o.match( "123 456 789 345" ) using ( match ) {
Console.println( "Got match: '${match.match()}' in range '${match.span()}'" );
Console.println( " captures: ${match.captures()}\n" );
};
The above example will only match 123. |
|
static function match  |
| A quick method of running a regular expression match on a string |
| Declaration: |
| static function match( string regexp, string container ) |
| Parameters: |
| Parameter #1: string regexp - The regular expression to compile and use |
| Parameter #2: string container - The string to match against |
| Returns: |
| A match object, or null otherwise |
|
static function matchAll  |
| A quick method of running a regular expression match that returns all matches on a string |
| Declaration: |
| static function matchAll( string regexp, string container ) |
| Parameters: |
| Parameter #1: string regexp - The regular expression to compile and use |
| Parameter #2: string container - The string to match against |
| Returns: |
| An array of match objects |
|
function matchAll  |
| Run a match on a string, unlike match this function will return all matches |
| Declaration: |
| function matchAll( string str ) |
| Description: |
| It is possible to give this call a closure to handle each match. |
| Parameters: |
| Parameter #1: string str - The string to execute the regular expression on |
| Returns: |
| An array of match objects or null if no match |
| Example: |
object o = new Regexp( '([0-9]+)' );
o.matchAll( "123 456 789 345" ) using ( match ) {
Console.println( "Got match: '${match.match()}' in range '${match.span()}'" );
Console.println( " captures: ${match.captures()}\n" );
};
The above example will match '123', '456', '789' and '345'. |
|
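The difference between match and matchAll can be sketched as follows. This is a Python `re` analogue written for illustration, not ferite code: `search` stands in for match, and a loop over `finditer` stands in for the closure passed to matchAll.

```python
import re

pattern = re.compile(r"([0-9]+)")
text = "123 456 789 345"

# First match only: matched text, its range, and its captures
first = pattern.search(text)
first_info = (first.group(0), first.span(), first.groups())

# Every match: visit each one in turn, as a closure would
all_info = [(m.group(0), m.span()) for m in pattern.finditer(text)]

for matched, span in all_info:
    print(f"Got match: '{matched}' in range {span}")
```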
function replace  |
| Replace the first match in a string |
| Declaration: |
| function replace( string str, string replacement ) |
| Description: |
| This function can't take a closure. |
| Parameters: |
| Parameter #1: string str - The string to run the replace on |
| Parameter #2: string replacement - The string to replace matches with, if empty, and a closure is provided, it'll use the closure's return value for the replacement. |
| Returns: |
| A string combining the original with the replacements |
| Example: |
object o = new Regexp( "([0-9]+)" );
string replaced = o.replace( "1234 is the 123456", "LargeNumbers" );
The result of the above code will be 'LargeNumbers is the 123456' |
|
static function replace  |
| A quick method of running a regular expression replace on the first match on a string |
| Declaration: |
| static function replace( string regexp, string container, string replacement ) |
| Parameters: |
| Parameter #1: string regexp - The regular expression to compile and use |
| Parameter #2: string container - The string to match against |
| Parameter #3: string replacement - The string to swap for the match. |
| Returns: |
| The string container with the first match swapped with replacement |
|
function replace  |
| Replace the first match in a string |
| Declaration: |
| function replace( string str ) |
| Description: |
| This function can take a closure; the return value of the closure is used as the replacement. |
| Parameters: |
| Parameter #1: string str - The string to run the replace on |
| Returns: |
| A string combining the original with the replacements |
| Example: |
object o = new Regexp( "([0-9]+)" );
string replaced = o.replace( "1234 is the 123456" ) using ( match ) {
return "${match.match()}.feriteRocks";
};
The result of the above code will be '1234.feriteRocks is the 123456' |
|
function replaceAll  |
| Replace all of the matches in a string |
| Declaration: |
| function replaceAll( string str ) |
| Description: |
| This function can take a closure; the return value of the closure is used as the replacement. |
| Parameters: |
| Parameter #1: string str - The string to run the replace on |
| Returns: |
| A string combining the original with the replacements |
| Example: |
object o = new Regexp( "([0-9]+)" );
string replaced = o.replaceAll( "1234 is the 123456" ) using ( match ) {
return "${match.match()}.feriteRocks";
};
The result of the above code will be '1234.feriteRocks is the 123456.feriteRocks' |
|
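Replacement through a closure can be sketched in the same way. The block below is a Python `re` analogue rather than ferite code: a function passed to `sub` plays the role of the closure, and `count=1` limits the replacement to the first match. The function name `tag` is hypothetical.

```python
import re

pattern = re.compile(r"([0-9]+)")
text = "1234 is the 123456"

def tag(match):
    # The return value becomes the replacement for this match
    return f"{match.group(0)}.feriteRocks"

first_only = pattern.sub(tag, text, count=1)  # like replace
every = pattern.sub(tag, text)                # like replaceAll

print(first_only)
print(every)
```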
function replaceAll  |
| Replace all of the matches in a string |
| Declaration: |
| function replaceAll( string str, string replacement ) |
| Description: |
| This function can't take a closure. |
| Parameters: |
| Parameter #1: string str - The string to run the replace on |
| Parameter #2: string replacement - The string to replace matches with, if empty, and a closure is provided, it'll use the closure's return value for the replacement. |
| Returns: |
| A string combining the original with the replacements |
| Example: |
object o = new Regexp( "([0-9]+)" );
string replaced = o.replaceAll( "1234 is the 123456", "LargeNumbers" );
The result of the above code will be 'LargeNumbers is the LargeNumbers' |
|
static function replaceAll  |
| A quick method of running a regular expression replace on all matches on a string |
| Declaration: |
| static function replaceAll( string regexp, string container, string replacement ) |
| Parameters: |
| Parameter #1: string regexp - The regular expression to compile and use |
| Parameter #2: string container - The string to match against |
| Parameter #3: string replacement - The string to swap for the match. |
| Returns: |
| The string container with all matches swapped with replacement |
|
static function split  |
| A quick method of splitting a string up using a regexp as the delimiter |
| Declaration: |
| static function split( string regexp, string line ) |
| Parameters: |
| Parameter #1: string regexp - The regular expression to compile and use |
| Parameter #2: string line - The string to split |
| Returns: |
| An array of strings |
| Example: |
uses "console", "regexp";
Console.println( Regexp.split( ",", "x,x,,;t t" ) ); » [ "x", "x", "", ";t t" ]
|
|
Automatically generated at 12:07PM, Wednesday 25 May 2005 by feritedoc.