perl script: extracting mir-target information based on a list
9.9 years ago
biolab ★ 1.4k

Dear all,

I have a file with mir-target information and a list file, both shown below. I wrote a script to extract the mir-target entries named in the list file.

mir-target info file:

miR156 AT1G19920.1:1068 5' AUGUUC-CUCUUGA-UGUUA 3'     ||o|| ||||o   |||o| 3' CACGAGUGAGAGAAGACAGU 5'
miR390 AT1G19920.1:1247 5' GGUCGUGAUCCUGCAGGAAUGGGCCA 3'    || ||o ||||o       ||o||   3' CC-GCGAUAGGGAGG----ACUCGAA 5'
miR172 AT1G19940.1:899 5' GGGCAGCUUCAU---GGUUGU 3'      ||||| ||||   |o|| | 3' UACGUCGUAGUAGUUCUAAGA 5'

list file:

miR156 AT1G19920.1:1068
miR172 AT1G19940.1:899

my perl script:

#!/usr/bin/perl -w
use strict;
if (@ARGV<1){print "perl sta.pl LIST INPUT"; exit;}

my %h;
open LIST, '<', $ARGV[0];
while(my $line = <LIST>){
    chomp $line;
    $h{$line} = 1;
}
close LIST;

open FH, '<', $ARGV[1];
while (my $line2 = <FH>){
    chomp $line2;
    my @a = split/\t/, $line2;
    my $mirsite = "$a[0]"."\t"."$a[1]";
     if (exists $h{$mirsite} ){
            print "$line2\n";
     }
}
close FH;

My PROBLEM is that when I run perl extr.pl list input, I only get the miR172 line; the miR156 line is not output.

Your help on my script will be much appreciated! THANKS!!

9.9 years ago
Prakki Rama ★ 2.7k

Assuming the tab sits between the target ID and the alignment, as marked here:

miR156 AT1G19920.1:1068 \t 5' AUGUUC-CUCUUGA-UGUUA 3'     ||o|| ||||o   |||o| 3' CACGAGUGAGAGAAGACAGU 5'
________________________^^

Try this:

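#!/usr/bin/perl
use strict;
use warnings;

# Both the list file and the input file are required.
if (@ARGV < 2) { print "perl sta.pl LIST INPUT\n"; exit; }

# Load every line of the list file as a hash key.
my %h;
open my $list, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
while (my $line = <$list>) {
    chomp $line;
    $h{$line} = 1;
}
close $list;

# Print input lines whose first tab-separated field is in the list.
open my $fh, '<', $ARGV[1] or die "Cannot open $ARGV[1]: $!";
while (my $line2 = <$fh>) {
    chomp $line2;
    my @a = split /\t/, $line2;
    my $mirsite = "$a[0]"; #."\t"."$a[1]"; #Note this line
    #print "$mirsite\n";
    if (exists $h{$mirsite}) {
        print "$line2\n";
    }
}
close $fh;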

Hi Prakki, thanks a lot. Adding a "#" didn't work. I am sure my list file and input file are tab separated. Something is weird. Thanks anyway.


I finally sorted out the problem. After deleting all \r (carriage return) characters from the list and input files, the script works well.

Sorry for my non-biology-relevant post, but I do hope it will be of some help to others.
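
For anyone hitting the same thing: on a Unix-like system chomp removes only the trailing \n, so a list file saved with Windows line endings (\r\n) leaves a \r on every hash key, and only an unterminated last line can still match. That would explain why only the miR172 line came out. Below is a minimal sketch of doing the cleanup inside the script instead of editing the files. The helper name strip_eol and the sample strings are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): remove any
# trailing carriage returns and newlines, so both Unix (\n) and
# Windows (\r\n) line endings are handled.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;
    return $line;
}

# Illustrative list lines: the first has a Windows line ending,
# the second is a last line with no terminator at all.
my %h;
for my $raw ("miR156\tAT1G19920.1:1068\r\n", "miR172\tAT1G19940.1:899") {
    $h{ strip_eol($raw) } = 1;
}

# With plain chomp the first key would still end in \r and this lookup
# would fail; with strip_eol it succeeds.
print exists $h{"miR156\tAT1G19920.1:1068"} ? "found\n" : "missing\n";
```

In the script from the question, this would mean calling strip_eol on $line and $line2 in place of chomp.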


But the above script with the # worked for me! Anyway, good that you could sort it out. :)

9.9 years ago
Kenosis ★ 1.3k

Here's another option that lets Perl handle the file I/O (no need to explicitly open and close files), and uses a capture to grab the first part of each line in your mir-target info file:

use strict;
use warnings;

@ARGV == 2 or die 'perl sta.pl LIST INPUT';

my %h;

while (<>) {
    chomp;
    $h{$_} = undef;
    last if eof;
}

while (<>) {
    print if /(.+?)\s+\d+'/ and exists $h{$1};
}

Hope this helps!


Thank you Kenosis, I really appreciate your answer.
