!/usr/bin/perl

Question

making grep method specific in perl

0

Entering edit mode

7.6 years ago

bioinfo355 • 0

The following code is working fine with the file 'testID' but when this program gets the input from 'piSNPs' file instead of 'testID' then it produces wrong input. The following program is suppose to get IDs from 'testID' file and then search for those IDs in testSNPs file. 'testID' has some IDs that get match and some do not get match and the program gives correct output and shows those IDs which get match. but when I use 'piSNPs' as an input file which has all those IDs none of which are present in 'testSNPs' then program gives wrong output. it shows all the data of testSNPs while it was suppose to give no output as none of the ID got matched. I have shown a line from my testSNPs file to give the idea as the file is large.

!/usr/bin/perl

use warnings;
use strict;

my $rs_id;

open (my $id_file, '<testID') or die "Could not open the file: $!\n";

while ($rs_id = <$id_file>) {

chomp($rs_id);

findVariants($rs_id);

}

close $id_file;

print "Done\n";

sub findVariants {

my ($rs_num) = @_;

open (my $in_file, '<testSNPs') or die "Could not open the file: $!\n";

my @snp = <$in_file>;

my @result= grep /$rs_num/ , @snp;

open (my $out_file, '>>testSNPs.out');

print $out_file @result;

close $out_file;

close $in_file; }

<h6>###contents of testID</h6>

rs559388654

rs528510043

rs551517067

rs201326893_ Y152OCH

rs147422954

rs568103159

rs683

rs1805008

rs1805009

<h6>###contents of piSNPs</h6>

N29insA

rs1110400

rs11547464

rs1805005

rs1805006

rs1805007

rs1805008

rs1805009

rs201326893_ Y152OCH

rs2228479

rs885479

rs2378249

<h6>###contents of testSNPs</h6>

CHROM POS ID REF ALT QUAL FILTER INFO

1 10505 rs548419688 A T 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9632;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS _AF=0;AA=.|||;VT=SNP

perl finding variants grep • 1.7k views

ADD COMMENT • link updated 7.6 years ago by mittu1602 ▴ 200 • written 7.6 years ago by bioinfo355 • 0

score 0 · Answer 1 · 2017-12-14

0

Entering edit mode

7.6 years ago

mittu1602 ▴ 200

I am not sure of this perl script, but if it is not necessary to follow a perl script you can follow one liner awk command which will match rs_ids column from file1 in file2, as follows:

file1

rs1110400

rs11547464

rs1805005

rs1805006

rs1805007

rs1805008

rs201326893_ Y152OCH

file2

rs559388654

rs528510043

rs551517067

rs201326893_ Y152OCH

rs147422954

rs568103159

awk 'NR==FNR {h[$1] = $1; next} {print $1,h[$1]}' file1.txt file2.txt > output.txt

ADD COMMENT • link 7.6 years ago by mittu1602 ▴ 200

0

Entering edit mode

i do not have to make such comparison. i have to search $rs_id from testID file in 'testSNPs' file.

ADD REPLY • link 7.6 years ago by bioinfo355 • 0