
Re: Extracting data with Perl



On Tue, 2003-04-29 at 07:49, Ossama Khayaat wrote:
> Salam,
> I'm using Active State's Perl 5.8 on my Win2K AS machine.
> I have a file that has a list of countries as an option list which I
> want to extract, and put each option in a line and add a closing
> </option> tag.
> After trying for hours I got this script, which worked fine except that
> it repeats any line that has an ending \n (newline).
> ---- Begin Perl script ----
> #!/usr/bin/perl
> while (<STDIN>) {
>   while (m/(<option[\w\s=\"]*>[\w\s\d&#;]+)/gi){
>     chomp();
>     print "$1</option>\n";
>   }
> }
> ---- End Perl script ----

Will this work?

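while (my $line = <STDIN>) {
    chomp($line);    # strip the newline once, before the match loop starts
    # keep \s, & # ; in the class so names like "Saudi Arabia" and entities survive
    while ($line =~ /(<option[\w\s="]*>[\w\s&#;]+)/gi) {
        print "$1</option>\n";
    }
}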

I hate Perl, by the way: the further I stay from it, the more cryptic it
gets. The above can also be done several ways as a one-liner in Perl, awk,
or sed, and for that matter in Python.

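What I did was tidy your code a bit and chomp each line before running
the match. I haven't tested it, but that's where I think you're going
wrong: calling chomp() inside the m//g loop modifies $_, and modifying the
string resets the position where the next m//g match starts, so the same
option gets matched again.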
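
To make the "chomp first, then match" idea concrete, here is a minimal sketch. The sub name extract_options and the sample line are made up for illustration, and I am assuming each option's text contains only word characters, whitespace, and the characters & # ; as in your original pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one "<option ...>Name</option>" string per match.
sub extract_options {
    my ($line) = @_;
    chomp($line);    # modify the string BEFORE the //g loop, never inside it
    my @found;
    while ($line =~ /(<option[\w\s="]*>[\w\s&#;]+)/gi) {
        push @found, "$1</option>";
    }
    return @found;
}

# Sample input is invented; the real data would come from countries.txt.
my $sample = qq{<option value="EG">Egypt<option value="SA">Saudi Arabia\n};
print "$_\n" for extract_options($sample);
```

Because nothing touches the string once the loop has started, each m//g match should pick up where the previous one ended instead of starting over.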


> I'm running the script as:
> #perl extract.pl < countries.txt
> 
> Can any one please help, and also explain some things:
> * How can I use \G (explained as: matches where the previous m//g left
> off)?
> * What is the difference between using chop() and chomp()?
chomp removes only a trailing end-of-line character, and does nothing if
there isn't one, whereas chop deletes the last character whatever it is,
and will keep eating the tail of the string each time you call it again.
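
A small sketch of the difference, calling each function twice on the same made-up string (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $with_newline = "Egypt\n";
chomp($with_newline);    # removes the trailing newline: "Egypt"
chomp($with_newline);    # no newline left, so nothing happens: "Egypt"

my $chopped = "Egypt\n";
chop($chopped);          # removes the last character, the newline: "Egypt"
chop($chopped);          # removes the last character again: "Egyp"

print "chomp twice: [$with_newline]\n";    # chomp twice: [Egypt]
print "chop twice:  [$chopped]\n";         # chop twice:  [Egyp]
```

So chomp is the safe one to call on input lines: calling it more often than needed costs you nothing, while an extra chop costs you a character.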
> 
> I read through the manual but just couldn't figure it out.
Try writing a test script :)
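
For the \G question, a test script could look something like the sketch below. The sample string is invented, and this is only one way to use \G: two patterns take turns consuming the string, each one anchored with \G at the spot where the previous m//g match stopped. The /c flag keeps that position from being reset when a match fails.

```perl
use strict;
use warnings;

# Made-up sample line for illustration.
my $text = '<option value="EG">Egypt<option value="SA">Saudi Arabia';
my @names;

# In scalar context every //g match resumes at pos($text). \G pins the
# match to exactly that position, so the string is consumed piece by
# piece: first an opening tag, then the text that follows it.
while ($text =~ /\G<option[^>]*>/gci) {
    if ($text =~ /\G([^<]+)/gc) {
        push @names, $1;
    }
}

print "$_\n" for @names;
```

Without \G the second pattern would be free to skip ahead and match anywhere after the current position; with it, the match has to start right where the last one ended.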
> 
> Thanks in advance,
> Ossama Khayat
-- 
Walid Shaari <shaari at arabeyes dot org>
www.arabeyes.org