Monday, April 14, 2008

Ruby: Regular Expression handling.



There are several ways to work with regular expressions in Ruby. Its regular expression syntax is similar to Perl's and to PHP's preg_match. Here are a few ways to run regular expressions inside your Ruby script.

This is a very basic example using the match() method, called here on the regular expression itself (the String class has a match() method too). The expression below captures the text inside the title tags. Remember that the captured groups are available in the $1, $2, $3... variables, just like in Perl.

html = "This is a simple html with <title>Ruby Regex</title> Handling."
/<title>(.*?)<\/title>/.match(html)
puts $1 # Print the first captured group from the html string
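As a small sketch of the same idea, match() also returns a MatchData object (or nil when nothing matches), so you can read the captures from it instead of from $1. The variable names below are just illustrative.

```ruby
html = "This is a simple html with <title>Ruby Regex</title> Handling."

# match() returns a MatchData object here because the pattern matches.
m = /<title>(.*?)<\/title>/.match(html)

whole  = m[0]         # the entire matched text, tags included
title  = m[1]         # the first capture group, same value as $1
before = m.pre_match  # everything in the string before the match

puts title   # prints "Ruby Regex"
```

Keeping the result in a variable avoids relying on the global $1, which is overwritten by the next successful match.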

You can wrap the match in an if condition. That way, when no match is found, the body is skipped and you don't end up working with a nil value.

if /<title>(.*?)<\/title>/.match(html)
  puts $1 # Print the first captured group from the html string
end
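To illustrate why the guard matters, here is a minimal sketch with a hypothetical helper, extract_title, that returns a fallback string when the input has no title tag. The helper name and the "(untitled)" default are assumptions made for this example.

```ruby
# Hypothetical helper: returns the title text, or a fallback
# when the string contains no <title>...</title> pair.
def extract_title(html, fallback = "(untitled)")
  m = /<title>(.*?)<\/title>/.match(html)
  # m is nil when there is no match, so m[1] would raise NoMethodError.
  m ? m[1] : fallback
end

puts extract_title("<title>Ruby Regex</title>")  # prints "Ruby Regex"
puts extract_title("No title tag here.")         # prints "(untitled)"
```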

This example works exactly the way Perl does it, using the =~ operator. It does the same thing as the code above.

if html =~ /<title>(.*?)<\/title>/
  puts $1 # Print the first captured group from the html string
end
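One detail worth seeing in code: =~ returns the index where the match starts, or nil when there is no match, which is why it works as an if condition. A short sketch (the variable names are illustrative):

```ruby
html = "This is a simple html with <title>Ruby Regex</title> Handling."

pos   = (html =~ /<title>/)  # index where the match starts
miss  = (html =~ /<h1>/)     # nil, because there is no <h1> tag
no_h1 = (html !~ /<h1>/)     # true: !~ is the negated form of =~

puts "title tag starts at index #{pos}" if pos
puts "no h1 tag found" if miss.nil?
```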

You can use variables inside your regex. In the code below, the tag name title is stored in the tags variable and used in the regular expression as #{tags}. When Ruby finds this interpolation syntax inside a regex, it substitutes the variable's value.

tags = "title"
if html =~ /<#{tags}>(.*?)<\/#{tags}>/
  puts $1 # Print the first captured group from the html string
end
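One caveat with interpolation: the variable's value is treated as regex syntax, not as literal text. If the value may contain characters such as "." or "*", Regexp.escape can be used to match it literally. A small sketch, with made-up sample strings:

```ruby
needle = "3.50"

loose  = /#{needle}/                 # "." still means "any character"
strict = /#{Regexp.escape(needle)}/  # "." is escaped and matches only a dot

loose_hit  = ("3x50" =~ loose)   # 0, because "." matched the "x"
strict_hit = ("3x50" =~ strict)  # nil, no literal "3.50" in the string
exact_hit  = ("3.50" =~ strict)  # 0, the literal text is found

puts "loose matched 3x50"        if loose_hit
puts "strict did not match 3x50" if strict_hit.nil?
```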

You can also use a regular expression to replace text in a string. For example, the code below removes the first match of the regex (here, the first tag) by replacing it with an empty string.

html = html.sub(/<.*?>/, "")
puts html

If you want to replace every match in the string, use the gsub method instead. It replaces all matches of the regex with the given string.

html = html.gsub(/<.*?>/, "")
puts html
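To see the difference between sub and gsub side by side, here is a short sketch that applies both to the same fresh string (the variable names are illustrative):

```ruby
html = "This is a simple html with <title>Ruby Regex</title> Handling."

first_only = html.sub(/<.*?>/, "")   # removes only the opening <title> tag
all_tags   = html.gsub(/<.*?>/, "")  # removes every tag

puts first_only  # This is a simple html with Ruby Regex</title> Handling.
puts all_tags    # This is a simple html with Ruby Regex Handling.
```

Both methods return a new string and leave html unchanged, which is why the earlier snippets assign the result back to html.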

You can use a regular expression to split a string too, with the regex acting as the delimiter. For example, the code below splits the string on whitespace.

split_arr = html.split(/\s+/)
split_arr.each do |token|
  puts token
end
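As a further sketch, a regex delimiter is useful when the separators are not uniform. The sample strings below are made up for illustration:

```ruby
# Mixed runs of spaces, a tab and a newline all count as one delimiter.
line   = "one  two\tthree\nfour"
tokens = line.split(/\s+/)

# A comma with optional whitespace on either side.
fields = "a, b ,c".split(/\s*,\s*/)

p tokens  # ["one", "two", "three", "four"]
p fields  # ["a", "b", "c"]
```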

Hope you enjoyed this session on regular expressions in Ruby.



