Wednesday, July 2, 2008

Regex to match same consecutive characters



If you code in Python, Perl, Ruby, or PHP, regular expressions can speed up your work a lot. Sometimes you need to find every character that appears two or more times in a row in a given string.

For example, say you have the string "22 333 4444 55555 666666 7777777 88888888 999999999" and you need to find all the characters that occur two or more times consecutively in it. The expected output is "2 3 4 5 6 7 8 9".

You can use a simple regular expression to find the answer. Here is the example code block.

my $string = "22 333 4444 55555 666666 7777777 88888888 999999999";

## matching the required chars and put them into array
my @array = $string =~ /(.)\1+/g;
print "@array\n";
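To see how long each run is, and not only which character repeats, the same regex can be applied in a loop. This is a minimal sketch; the helper name repeated_runs is made up for illustration. It relies on Perl's built-in @- and @+ arrays, which hold the start and end offsets of the last successful match.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this example): returns a list of
# [character, run_length] pairs for every run of two or more identical chars.
sub repeated_runs {
    my ($string) = @_;
    my @runs;
    while ($string =~ /(.)\1+/g) {
        # $-[0] and $+[0] are the start and end offsets of the whole match,
        # so their difference is the length of the run.
        push @runs, [ $1, $+[0] - $-[0] ];
    }
    return @runs;
}

my $string = "22 333 4444 55555 666666 7777777 88888888 999999999";
printf "%s appears %d times in a row\n", @$_ for repeated_runs($string);
```

The first line printed is "2 appears 2 times in a row". The whole match covers the full run, while the capture group holds just one character.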

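The regex above is /(.)\1+/. Here (.) captures any single character (other than a newline) into the first group, and \1 is a backreference to whatever that group captured. It is the same value you would read from $1 after the match, but inside the pattern itself it is written as \1. So (.)\1+ means "a character, followed by one or more copies of that same character". Because of the /g flag in list context, the match returns the captured character from every run.

You can add another \1 for each extra repetition you want to require. For example, the code below outputs "5 6 7 8 9", that is, only the characters that occur 5 or more times consecutively in the string.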
my $string = "22 333 4444 55555 666666 7777777 88888888 999999999";

## matching the required chars and put them into array
my @array = $string =~ /(.)\1\1\1\1+/g;
print "@array\n";


However, I suggest using the quantifier form below instead of repeating \1 several times, since it is shorter and easier to read.
## Matching five or more consecutive chars
my @array = $string =~ /(.)\1{4,}/g;
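The quantifier form also makes the threshold easy to turn into a parameter. The following is a minimal sketch, not a definitive implementation; the helper name repeated_chars and its $min argument are made up for illustration. Since (.) already matches the first character, the backreference has to repeat at least $min - 1 more times.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this example): returns each character
# that occurs at least $min times in a row in $string.
sub repeated_chars {
    my ($string, $min) = @_;
    die "min must be at least 2\n" if $min < 2;
    my $extra = $min - 1;    # repeats required after the captured character
    # In list context, /g returns the captured character of every match.
    return $string =~ /(.)\1{$extra,}/g;
}

my $string = "22 333 4444 55555 666666 7777777 88888888 999999999";
print join(" ", repeated_chars($string, 5)), "\n";    # prints "5 6 7 8 9"
```

Calling repeated_chars($string, 2) gives the same result as the original /(.)\1+/ pattern, because \1{1,} and \1+ are equivalent.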

That's all for now. If you are interested, you can read my other programming-related posts here: http://icfun.blogspot.com/search/label/perl



