Need A Script That Finds Whether A String In One Column Matches In Other Columns Of The Same Row
9.2 years ago
biolab ★ 1.4k

Dear all, I have a file and want to know whether the word in column 2 contains the word in column 1. If so, output "yes" in column 3; otherwise output "no". For example,

abc abcde
abc abddd
abc dabcc


I want the output to look like this:

abc abcde yes
abc abddd no
abc dabcc yes


The script below doesn't work. I am a Perl beginner; could you please briefly point out the errors? Any comments are welcome. Thanks!

my $file= @ARGV;
open IN, <$file>;
my @file=<IN>;
my $i=0;
if ($i < $#file){
    if ($file[1]=~/.*$file[0].*/) { print "$file[0]\t$file[1]\tyes\n"; }
    else { print "$file[0]\t$file[1]\tno\n"; }
    $i +=1;
}
close IN;

perl • 4.4k views

What is the relevance to bioinformatics? Also, you could just do that with an awk one-liner: cat foo.txt | awk '{if(match($2,$1)) {print $1,$2,"yes"} else { print $1,$2,"no"}}'


Your one-liner can be shortened to: cat File.txt | awk '{print $1,$2,match($2,$1)?"yes":"no"}'


You can skip cat and shorten this to: awk '{print $1,$2,match($2,$1)?"yes":"no"}' File.txt


awk is powerful. Actually, I am learning Perl now and am eager to pick up some rules for programming. Anyway, thanks a lot!


Hi dpryan79, it is relevant to bioinformatics; I only gave a simplified example there. My column 1 lists mature miRNA sequences, while column 2 lists predicted miRNA precursor sequences. I need to find the mature miRNAs that are located precisely within the miRNA precursors. So you can see how this command is used in bioinformatics. Thanks.


In the future, you might want to state that in advance. Some editors tend to close questions like this on reading them because they seem to lack relevance.


Keep in mind that they can be encoded on the minus (-) strand.
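A minimal Perl sketch of that idea, in case the precursor may be given on the opposite strand: check the mature sequence first, then its reverse complement. The helper names (normalize, revcomp, strand_of_match) are made up for illustration and are not from this thread. It assumes plain A/C/G/T/U sequences with no ambiguity codes, and treats U as T so RNA and DNA spellings compare equal.

```perl
use strict;
use warnings;

# Hypothetical helper: upper-case the sequence and write U as T,
# so RNA and DNA spellings of the same sequence compare equal.
sub normalize {
    my ($seq) = @_;
    $seq = uc $seq;
    $seq =~ tr/U/T/;
    return $seq;
}

# Hypothetical helper: reverse complement in the DNA alphabet.
# Assumes only A, C, G, T, U appear (no ambiguity codes).
sub revcomp {
    my ($seq) = @_;
    my $rc = scalar reverse(uc $seq);   # reverse the characters
    $rc =~ tr/ACGTU/TGCAA/;             # complement; U is treated like T
    return $rc;
}

# Returns '+' if the mature sequence occurs in the precursor as given,
# '-' if only its reverse complement occurs, and 'none' otherwise.
sub strand_of_match {
    my ($mature, $precursor) = @_;
    my ($m, $p) = (normalize($mature), normalize($precursor));
    return '+' if index($p, $m) >= 0;
    return '-' if index($p, revcomp($m)) >= 0;
    return 'none';
}

# Small demonstration, run only when called with the argument "demo".
if (@ARGV && $ARGV[0] eq 'demo') {
    print strand_of_match('AAC', 'CCGTTCC'), "\n";
}
```

index() does a literal substring search, so nothing in the sequence is interpreted as a regular expression.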


I used it in bioinformatics!! thank you!!


Stack Overflow is a question and answer site for professional and enthusiast programmers.


As others said: it is important to phrase your question in terms of a research problem in bioinformatics. Otherwise, it appears to be a "straight programming" question and we will direct you to StackOverflow. Although to be honest, this is a "straight Perl" problem even with the bioinformatics content.


Here's a Perl one-liner for the task: perl -lane 'print "@F ",(index$F[1],$F[0])>-1?"yes":"no"' foo.txt

In addition to StackOverflow, PerlMonks is another site for Perl questions.
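Since the question asked what is wrong with the original script, here is a hedged sketch of a longer version with the likely problems addressed. As far as I can tell: `my $file = @ARGV` stores the number of arguments rather than the file name; `open IN, <$file>` uses the readline operator where a file name is expected; `if` runs its body at most once, so a loop is needed; and `$file[0]` and `$file[1]` are the first and second lines of the file, not the two columns of a row, so each line has to be split. The sub names (contains, annotate_line) are invented for this example, and whitespace-separated columns are assumed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: "yes" if $needle occurs literally inside
# $haystack, "no" otherwise. index() avoids treating column 1 as a regex.
sub contains {
    my ($needle, $haystack) = @_;
    return index($haystack, $needle) >= 0 ? 'yes' : 'no';
}

# Hypothetical helper: turn one input line into "col1<TAB>col2<TAB>yes|no".
# Assumes the two columns are separated by whitespace.
sub annotate_line {
    my ($line) = @_;
    chomp $line;
    my ($col1, $col2) = split ' ', $line;    # split the row into its columns
    return join("\t", $col1, $col2, contains($col1, $col2));
}

# Runs only when a file name is given on the command line.
if (@ARGV) {
    my $file = $ARGV[0];                     # first argument, not the argument count
    open(my $in, '<', $file) or die "Cannot open $file: $!";
    while (my $line = <$in>) {               # a loop, so every row is visited
        next unless $line =~ /\S/;           # skip blank lines
        print annotate_line($line), "\n";
    }
    close $in;
}
```

Saved under a name of your choice (say, check_cols.pl, a made-up name), it would be run as: perl check_cols.pl foo.txt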


Thanks for all the answers!
