Searching for genes included in the file name
6.1 years ago
lessismore ★ 1.3k

Dear all,

I have a list of files:

TMCS09g1008676.gene_ancestors.txt
TMCS09g1008677.gene_ancestors.txt
TMCS09g1008677.gene_ancestors.txt
TMCS09g1008679.gene_ancestors.txt
TMCS09g1008680.gene_ancestors.txt
TMCS09g100868.gene_ancestors.txt

What I want to do is search each of these files for the specific ID in its own name, in order to filter them. It should be something simple, e.g. grep TMCS09g1008676 TMCS09g1008676.gene_ancestors.txt and so on, but I want to automate the whole thing. I tried creating a list and then iterating, e.g. echo TMCS09g1008676.gene_ancestors.txt | cut -d '.' -f1 | grep - TMCS09g1008676.gene_ancestors.txt, but grep cannot read its pattern from standard input that way. Any help would be highly appreciated. Thanks.

bash • 1.3k views
6.1 years ago

What about a loop like:

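for file in *.gene_ancestors.txt
do
    # Everything before the first dot in the file name is the gene ID
    gene=$(echo "$file" | cut -f1 -d'.')
    # Quote both variables so unusual file names do not break the command
    grep "$gene" "$file"
done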

My thought exactly.

However, you may need to be a bit stricter with grep: for example, searching for TMCS09g100868 would also match TMCS09g1008680. This of course also depends on the content of the files. If no such confusion is possible, then WouterDeCoster's answer will do perfectly.


Perfectly right. grep -w, which only matches whole words, would be even better.
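
Putting the loop and the -w suggestion together, a minimal sketch could look like the one below. The demo files, their tab-separated contents, and the .filtered.txt output names are assumptions made up for illustration; only the *.gene_ancestors.txt naming comes from the question. It uses the shell's own ${file%%.*} expansion in place of echo | cut, and adds -F so the ID is treated as a fixed string and not as a regular expression.

```shell
#!/usr/bin/env bash
# Hypothetical demo data so the sketch can be run as-is.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'TMCS09g100868\tancestorA\nTMCS09g1008680\tancestorB\n' > TMCS09g100868.gene_ancestors.txt
printf 'TMCS09g1008680\tancestorB\nTMCS09g100868\tancestorA\n' > TMCS09g1008680.gene_ancestors.txt

for file in *.gene_ancestors.txt
do
    # Remove everything from the first dot onward to get the gene ID.
    gene=${file%%.*}
    # -w: match whole words only, so TMCS09g100868 does not hit TMCS09g1008680.
    # -F: treat the ID as a fixed string. --: stop option parsing.
    grep -w -F -- "$gene" "$file" > "$gene.filtered.txt"
done
```

Each input file then gets a matching .filtered.txt file containing only the lines that mention its own ID as a whole word.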

6.1 years ago
Paul ★ 1.5k

Or you can use a while loop in bash, although if your files are big this is not a very fast solution.

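# gene_list.txt holds one gene ID per line
while read -r genes
do
    echo "----SEARCHING: $genes"
    # Search every file for the current ID and append the hits to "result"
    grep -w "$genes" *.gene_ancestors.txt >> result
done < gene_list.txt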
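
If speed on big files is the concern, one possible alternative is to hand the whole list to a single grep call with -f, so each file is read once and not once per gene. This is only a sketch: the demo files and their contents are invented for illustration, and gene_list.txt is assumed to hold one ID per line as in the loop above. Like the loop, it reports any listed ID found in any file, with grep prefixing each hit with the file name.

```shell
#!/usr/bin/env bash
# Hypothetical demo data so the sketch can be run as-is.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'TMCS09g1008676\nTMCS09g1008679\n' > gene_list.txt
printf 'TMCS09g1008676\tancestorA\nTMCS09g9999999\tancestorX\n' > TMCS09g1008676.gene_ancestors.txt
printf 'TMCS09g1008679\tancestorB\n' > TMCS09g1008679.gene_ancestors.txt

# -f gene_list.txt: read all patterns (one per line) from the list file.
# -w: whole-word matches only. -F: treat the IDs as fixed strings.
grep -w -F -f gene_list.txt -- *.gene_ancestors.txt > result
```

The output is not grouped by gene as in the while loop, so sort or split the result file afterwards if that grouping matters.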