Question: perl script: extracting mir-target information based on a list
biolab wrote (5.2 years ago):

Dear all,

I have a file with mir-target information and a list file, both shown below. I wrote a script to extract the mir-target records named in the list file.

mir-target info file:

miR156 AT1G19920.1:1068 5' AUGUUC-CUCUUGA-UGUUA 3'     ||o|| ||||o   |||o| 3' CACGAGUGAGAGAAGACAGU 5'
miR390 AT1G19920.1:1247 5' GGUCGUGAUCCUGCAGGAAUGGGCCA 3'    || ||o ||||o       ||o||   3' CC-GCGAUAGGGAGG----ACUCGAA 5'
miR172 AT1G19940.1:899 5' GGGCAGCUUCAU---GGUUGU 3'      ||||| ||||   |o|| | 3' UACGUCGUAGUAGUUCUAAGA 5'

 

list file:

miR156 AT1G19920.1:1068

miR172 AT1G19940.1:899

 

my perl script:

#!/usr/bin/perl -w
use strict;
if (@ARGV<1){print "perl sta.pl LIST INPUT"; exit;}

my %h;
open LIST, '<', $ARGV[0];
while(my $line = <LIST>){
    chomp $line;
    $h{$line} = 1;
}
close LIST;

open FH, '<', $ARGV[1];
while (my $line2 = <FH>){
    chomp $line2;
    my @a = split/\t/, $line2;
    my $mirsite = "$a[0]"."\t"."$a[1]";


     if (exists $h{$mirsite} ){
            print "$line2\n";
     }
}
close FH;

 

My problem: when I run perl extr.pl list input, I get only the miR172 line; the miR156 line is not printed.

Any help with my script would be much appreciated. Thanks!

Prakki Rama wrote (5.2 years ago):

Assuming the only tab is between the target ID and the alignment (the first two columns being separated by a space), as in:

miR156 AT1G19920.1:1068 \t 5' AUGUUC-CUCUUGA-UGUUA 3'     ||o|| ||||o   |||o| 3' CACGAGUGAGAGAAGACAGU 5'

Try this:

#!/usr/bin/perl -w
use strict;
# Both the list file and the input file are required.
if (@ARGV < 2) { print "perl sta.pl LIST INPUT\n"; exit; }
my %h;
open LIST, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
while (my $line = <LIST>) {
    chomp $line;
    $h{$line} = 1;
}
close LIST;

open FH, '<', $ARGV[1] or die "Cannot open $ARGV[1]: $!";
while (my $line2 = <FH>) {
    chomp $line2;
    my @a = split /\t/, $line2;
    # Use only the first tab-separated field as the lookup key.
    my $mirsite = "$a[0]";#."\t"."$a[1]";
    #print "$mirsite\n";
    if (exists $h{$mirsite}) {
        print "$line2\n";
    }
}
close FH;


Hi Prakki, thanks a lot. Adding a "#" didn't work. I am sure my list file and input file are tab-separated. Something is weird. Thanks anyway.

(reply by biolab, 5.2 years ago)
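When two lines look identical but a hash lookup still fails, it can help to print them with the invisible characters spelled out. The sketch below is only an illustration: show_invisibles is a hypothetical helper name, and the sample string assumes a tab-separated line with a Windows (CRLF) line ending.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the string in which tab,
# carriage return and line feed are shown as the literal text \t, \r and \n.
sub show_invisibles {
    my ($text) = @_;
    my %name = ("\t" => '\t', "\r" => '\r', "\n" => '\n');
    $text =~ s/([\t\r\n])/$name{$1}/g;
    return $text;
}

# Assumed sample line: tab between the two columns, CRLF at the end.
print show_invisibles("miR156\tAT1G19920.1:1068\r\n"), "\n";
```

Running every line of the list and input files through such a helper shows whether the separators are tabs or spaces and whether a stray \r precedes the newline.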

I finally sorted out the problem: after deleting all \r characters from the list and input files, the script works well.

Sorry for the non-biology-related post, but I hope it is of some help to others.

(reply by biolab, 5.2 years ago)
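Instead of editing the files, the \r characters can be removed while reading. By default chomp removes only the trailing "\n", so on a Unix-like system a CRLF-terminated list line keeps its "\r" and becomes a hash key that never matches. One explanation consistent with the symptom above, assuming the list file had CRLF endings and its last line had no terminator, is that only the last list entry was stored cleanly. The following is a minimal sketch, not the original script: strip_eol and select_records are hypothetical names, and the sample data (with shortened, made-up alignment columns) is held in memory so the example runs on its own.

```perl
use strict;
use warnings;

# Remove any trailing CR/LF characters; plain chomp would leave "\r" behind.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;
    return $line;
}

# Hypothetical helper: return the input records whose first two
# tab-separated fields, joined by a tab, appear in the list.
sub select_records {
    my ($list_fh, $input_fh) = @_;
    my %wanted;
    while (my $line = <$list_fh>) {
        $line = strip_eol($line);
        next if $line eq '';            # skip blank lines in the list
        $wanted{$line} = 1;
    }
    my @hits;
    while (my $line = <$input_fh>) {
        $line = strip_eol($line);
        my @f = split /\t/, $line;
        next if @f < 2;                 # skip lines without two fields
        push @hits, $line if exists $wanted{"$f[0]\t$f[1]"};
    }
    return @hits;
}

# Assumed sample data with Windows line endings; the last list line
# deliberately has no terminator.
my $list  = "miR156\tAT1G19920.1:1068\r\nmiR172\tAT1G19940.1:899";
my $input = "miR156\tAT1G19920.1:1068\tAAA\r\n"
          . "miR390\tAT1G19920.1:1247\tCCC\r\n"
          . "miR172\tAT1G19940.1:899\tGGG\r\n";

open my $list_fh,  '<', \$list  or die "Cannot open list: $!";
open my $input_fh, '<', \$input or die "Cannot open input: $!";
print "$_\n" for select_records($list_fh, $input_fh);
```

With real files, the two in-memory handles would be replaced by handles opened on the list and input file names.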

But the above script with the # worked for me! Anyway, good that you could sort it out. :)

(reply by Prakki Rama, 5.2 years ago)

Kenosis wrote (5.2 years ago):

Here's another option that lets Perl's <> operator handle the file I/O (no need to explicitly open and close files) and uses a regex capture to grab the first part of each line in your mir-target info file:

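use strict;
use warnings;

@ARGV == 2 or die 'perl sta.pl LIST INPUT';

my %h;

# Read the first file (LIST) and store each line as a hash key.
while (<>) {
    chomp;
    $h{$_} = undef;
    last if eof;    # end of LIST reached; the next <> continues with INPUT
}

# Print each INPUT line whose text before the 5' marker is in the list.
while (<>) {
    print if /(.+?)\s+\d+'/ and exists $h{$1};
}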

Hope this helps!
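To see what the capture in that regex picks up, it can be tried on a single record. This is a small sketch under assumptions: capture_key is a hypothetical wrapper name, and the sample line assumes a space between the first two columns and a tab before the alignment.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the regex from the answer: (.+?) lazily
# captures everything before the first run of whitespace that is
# followed by digits and a single quote (such as 5').
sub capture_key {
    my ($line) = @_;
    return $line =~ /(.+?)\s+\d+'/ ? $1 : undef;
}

# Assumed sample record, shortened from the question.
my $record = "miR156 AT1G19920.1:1068\t5' AUGUUC-CUCUUGA-UGUUA 3'";
print capture_key($record), "\n";    # prints: miR156 AT1G19920.1:1068
```

The captured key keeps whatever separator sits between the miRNA name and the target ID, so the list file has to use the same separator, and any leftover \r in the list lines would still prevent a match.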


Thank you Kenosis, I really appreciate your answer.

(reply by biolab, 5.2 years ago)