Question: making grep method specific in perl
0
gravatar for bioinfo355
2.2 years ago by
bioinfo3550
Pakistan
bioinfo3550 wrote:

The following code is working fine with the file 'testID' but when this program gets the input from 'piSNPs' file instead of 'testID' then it produces wrong input. The following program is suppose to get IDs from 'testID' file and then search for those IDs in testSNPs file. 'testID' has some IDs that get match and some do not get match and the program gives correct output and shows those IDs which get match. but when I use 'piSNPs' as an input file which has all those IDs none of which are present in 'testSNPs' then program gives wrong output. it shows all the data of testSNPs while it was suppose to give no output as none of the ID got matched. I have shown a line from my testSNPs file to give the idea as the file is large.

!/usr/bin/perl

use warnings;
use strict;

my $rs_id;

open (my $id_file, '<testID') or die "Could not open the file: $!\n";

while ($rs_id = <$id_file>) {

chomp($rs_id);

findVariants($rs_id);

}

close $id_file;

print "Done\n";

sub findVariants {

my ($rs_num) = @_;

open (my $in_file, '<testSNPs') or die "Could not open the file: $!\n";

my @snp = <$in_file>;

my @result= grep /$rs_num/ , @snp;

open (my $out_file, '>>testSNPs.out');

print $out_file @result;

close $out_file;

close $in_file; }

<h6>###contents of testID</h6>

rs559388654

rs528510043

rs551517067

rs201326893_ Y152OCH

rs147422954

rs568103159

rs683

rs1805008

rs1805009

<h6>###contents of piSNPs</h6>

N29insA

rs1110400

rs11547464

rs1805005

rs1805006

rs1805007

rs1805008

rs1805009

rs201326893_ Y152OCH

rs2228479

rs885479

rs2378249

<h6>###contents of testSNPs</h6>

CHROM POS ID REF ALT QUAL FILTER INFO

1 10505 rs548419688 A T 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9632;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS _AF=0;AA=.|||;VT=SNP

finding variants grep perl • 629 views
ADD COMMENTlink modified 2.2 years ago by mittu1602170 • written 2.2 years ago by bioinfo3550
0
gravatar for mittu1602
2.2 years ago by
mittu1602170
India
mittu1602170 wrote:

I am not sure of this perl script, but if it is not necessary to follow a perl script you can follow one liner awk command which will match rs_ids column from file1 in file2, as follows:

file1

rs1110400

rs11547464

rs1805005

rs1805006

rs1805007

rs1805008

rs201326893_ Y152OCH

file2

rs559388654

rs528510043

rs551517067

rs201326893_ Y152OCH

rs147422954

rs568103159

awk 'NR==FNR {h[$1] = $1; next} {print $1,h[$1]}' file1.txt file2.txt > output.txt

ADD COMMENTlink modified 2.2 years ago • written 2.2 years ago by mittu1602170

i do not have to make such comparison. i have to search $rs_id from testID file in 'testSNPs' file.

ADD REPLYlink written 2.2 years ago by bioinfo3550
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 805 users visited in the last hour