Friday, April 3, 2009

Perl: Remove duplicate elements from a file



Let's learn how to remove duplicate elements from an input file. I talked about a similar issue a few months back in my post http://icfun.blogspot.com/2008/06/perl-remove-duplicate-elements-from.html, which was about removing duplicate elements from a list.

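OK, the example below uses a hash named %duplicate_hash whose keys are the lines of the file. It counts how many times each line occurs, and a line is printed only the first time it is seen, so the original order of the lines is kept. Because the file is edited in place, whatever is printed is written back into the file itself.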

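#!/usr/bin/perl
use strict;
use warnings;

my $input_file = 'input.txt';
my %duplicate_hash = ();

# Edit the file in place; the original is kept as input.txt.bak
local @ARGV = ($input_file);
local $^I = '.bak';
while (<>) {
    # Count each line and print it only the first time it is seen
    $duplicate_hash{$_}++;
    next if $duplicate_hash{$_} > 1;
    print;
}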


Remember, the above code also creates a backup file named input.txt.bak, so you don't mess anything up. Cheers!!!
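
One caveat when comparing raw lines: if the last line of the file has no trailing newline, it will not match an otherwise identical earlier line, because the two strings differ by that newline. Here is a minimal sketch of the same hash technique that compares lines without their line endings. The unique_lines helper and the sample data are hypothetical, made up for illustration, and it works on a list in memory instead of editing a file in place:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the first occurrence of each line,
# in original order, ignoring the line ending when comparing.
sub unique_lines {
    my @lines = @_;
    my %seen;
    my @unique;
    for my $line (@lines) {
        (my $key = $line) =~ s/\r?\n\z//;   # compare without the line ending
        next if $seen{$key}++;              # skip lines already seen
        push @unique, $key;
    }
    return @unique;
}

# Sample data (made up): the last "apple" has no trailing newline
my @input = ("apple\n", "banana\n", "apple\n", "cherry\n", "apple");
print "$_\n" for unique_lines(@input);
```

With this sample data it prints apple, banana and cherry, one per line. Whether you want to ignore line endings like this depends on your data, so treat it as one possible variation.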




1 comment:

BhawnaSingh said...

Thanks for your comments. Yes I agree there are something which is best done in Perl. You sure of good set of tips and guides for perl.

I am new to blogging world, but please do visit my blog or email me if you need to discuss anything related to Java, c# or Perl.
http://everydaydeveloper.blogspot.com/