Extracting substrings
I want to see if my string contains a digit.
$mystring = "[2004/04/13] The date of this article.";
if($mystring =~ m/\d/) { print "Yes"; }
Prints “Yes”. The pattern \d matches any single digit. In this case, the search finishes as soon as it reads the “2”. Searching always goes left to right.
Huh? Why doesn’t “\d” match the exact characters ‘\’ and ‘d’?
This is because Perl also uses letters of the alphabet to match things with a special meaning, such as digits. To distinguish a letter that matches itself from one with a special meaning, the special one is immediately preceded by a backslash. Therefore, whenever you read ‘\’ followed by any character, treat the two together as one symbol. For example, ‘\d’ means a digit, ‘\w’ means a word character (a letter, a digit, or ‘_’), ‘\/’ means a forward slash, and ‘\\’ means a single backslash. Preceding a character with a ‘\’ is called escaping, and the ‘\’ together with its character is called an escape sequence.
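As a rough illustration of the escape sequences above, the following sketch tests each one against a small sample. The sample strings and variable names are made up for this example and are not part of the original article.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only to exercise each escape sequence.
my $path = 'C:\temp';       # single quotes keep the backslash as-is
my $date = '2004/04/13';

# \\ matches one literal backslash.
my $has_backslash = ($path =~ m/\\/) ? 1 : 0;
# \/ matches a literal forward slash (escaped because / delimits the pattern).
my $has_slash = ($date =~ m/\//) ? 1 : 0;
# \w matches a letter, a digit, or an underscore, but not a hyphen.
my $underscore_is_word = ('_' =~ m/\w/) ? 1 : 0;
my $hyphen_is_word     = ('-' =~ m/\w/) ? 1 : 0;
# An unescaped d is just the letter "d", which the date does not contain.
my $has_letter_d = ($date =~ m/d/) ? 1 : 0;

print "$has_backslash $has_slash $underscore_is_word $hyphen_is_word $has_letter_d\n";
# prints "1 1 1 0 0"
```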
Okay, how do I return the first matching digit from my string?
$mystring = "[2004/04/13] The date of this article.";
if($mystring =~ m/(\d)/) {
print "The first digit is $1.";
}
Prints “The first digit is 2.” To designate part of a pattern for extraction, place parentheses around it. If the pattern matches, the text matched inside the parentheses is stored in the Perl special variable $1. If there are multiple parenthesized expressions, their matches are stored in $1, $2, $3, etc., numbered by the order of their opening parentheses.
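To see several capture variables at once, here is a minimal sketch that uses three parenthesized groups on the same string. The variables $first, $second, and $third are names invented for this example.

```perl
use strict;
use warnings;

my $mystring = "[2004/04/13] The date of this article.";
my ($first, $second, $third) = ('', '', '');

# Three groups in a row match the first run of three consecutive digits.
if ($mystring =~ m/(\d)(\d)(\d)/) {
    # Copy the captures right away; $1, $2, $3 are only meaningful
    # after a successful match.
    ($first, $second, $third) = ($1, $2, $3);
    print "Digits: $first, $second, $third.\n";   # prints "Digits: 2, 0, 0."
}
```

Copying $1, $2, $3 into named variables inside the if block is a common precaution, since the next successful match overwrites them.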
Huh? Why don’t ‘(’ and ‘)’ match the parenthesis characters exactly?
This is because the designers of regular expressions felt that some constructs are so common that they should be written with unescaped characters. Besides parentheses, a number of other characters have special meanings when unescaped; these are called metacharacters. To match a parenthesis or another metacharacter literally, you have to escape it, as in ‘\(’ and ‘\)’. The syntax was designed for the convenience of its users, not to be easy to learn.
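The difference between a metacharacter and its escaped form can be sketched as follows. The second sample string, $note, is made up for this example. The square bracket is another metacharacter, so it is escaped here in the same way.

```perl
use strict;
use warnings;

my $mystring = "[2004/04/13] The date of this article.";
my $note     = "see (3) below";   # hypothetical sample text

# \[ matches a literal "["; the unescaped parentheses capture the digit after it.
my $after_bracket = ($mystring =~ m/\[(\d)/) ? $1 : '';

# \( and \) match literal parentheses; the inner unescaped pair captures.
my $in_parens = ($note =~ m/\((\d)\)/) ? $1 : '';

print "$after_bracket $in_parens\n";   # prints "2 3"
```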
Okay, how do I extract a complete number, like the year?
$mystring = "[2004/04/13] The date of this article.";
if($mystring =~ m/(\d+)/) {
print "The first number is $1.";
}
Prints “The first number is 2004.” First, when one says “complete number”, one really means a group of one or more digits. The pattern quantifier + matches one or more of the pattern that immediately precedes it, in this case the \d. The search finishes as soon as it has read the “2004”.
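Combining the + quantifier with several groups and the escaped slash, one possible way to pull the whole date apart in a single match is sketched below. The variables $year, $month, and $day are names chosen for this example.

```perl
use strict;
use warnings;

my $mystring = "[2004/04/13] The date of this article.";
my ($year, $month, $day) = ('', '', '');

# Each (\d+) captures one run of digits; \/ matches the slashes between them.
if ($mystring =~ m/(\d+)\/(\d+)\/(\d+)/) {
    ($year, $month, $day) = ($1, $2, $3);
    print "Year $year, month $month, day $day.\n";
    # prints "Year 2004, month 04, day 13."
}
```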
How do I print all the numbers from the string?
$mystring = "[2004/04/13] The date of this article.";
while($mystring =~ m/(\d+)/g) {
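print "Found number $1. ";
}
Prints “Found number 2004. Found number 04. Found number 13. ”. This introduces the pattern modifier g, which tells Perl to do a global search on the string. In other words, search the whole string from left to right. Each pass through the while loop resumes the search where the previous match ended, and the loop stops when no further match is found.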
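How the global search steps through the string can be made visible with Perl's built-in pos function, which reports the offset where the last match on a string ended. This is only a sketch; the @log array is a name invented for this example.

```perl
use strict;
use warnings;

my $mystring = "[2004/04/13] The date of this article.";
my @log;

# With the g modifier, each iteration continues from the end of the last match.
while ($mystring =~ m/(\d+)/g) {
    # pos() gives the zero-based offset just past the current match.
    push @log, "$1 ends at " . pos($mystring);
}

print join("; ", @log), "\n";
# prints "2004 ends at 5; 04 ends at 8; 13 ends at 11"
```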
How do I get all the numbers from the string into an array instead?
$mystring = "[2004/04/13] The date of this article.";
@myarray = ($mystring =~ m/(\d+)/g);
print join(",", @myarray);
Prints “2004,04,13”. This does the same search as before, but because the match is evaluated in list context, it returns all of the captured values at once, and they are assigned to @myarray.
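As a small variation on the array form, the sketch below leaves out the parentheses inside the pattern. In list context with the g modifier, Perl then returns each whole match instead of the captured groups, which gives the same list here. The names @numbers, $count, and $last are assumptions made for this example.

```perl
use strict;
use warnings;

my $mystring = "[2004/04/13] The date of this article.";

# No capture group in the pattern: each whole match is returned.
my @numbers = ($mystring =~ m/\d+/g);

my $count = scalar @numbers;   # how many numbers were found
my $last  = $numbers[-1];      # index -1 is the last element

print "$count numbers, the last one is $last\n";
# prints "3 numbers, the last one is 13"
```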
Courtesy : http://www.somacon.com/p127.php




