Off topic: IDs in a text file for sequence retrieval from a multi-FASTA file
8.8 years ago
tcf.hcdg ▴ 70

Dear all

I have a text file containing the IDs of all the sequences that I want to extract from a FASTA file. As a test, I tried this on a few sample sequences using grep, and it worked perfectly.

With the original file, however, I get an empty output file. I tried different approaches and found that the problem lies with the IDs in the text file.

As far as I understand it so far, the IDs in the text file have a space character at the end, and because of this space the output file is empty.

I took some of the IDs from the ID file, removed the trailing space, and ran the grep command again. This time it gave me results.

I would like to know:

  1. Does a space at the end of an ID in the text file matter?
  2. How can I remove the trailing spaces from the ID file with a command?
  3. Is there a way to tell grep to ignore the space at the end?

My code is:

grep -A1 -wFf ID.txt input.fasta > result.fa

Thanks
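
One possible way to handle the second question is sketched below. It is only an illustration under some assumptions: the FASTA records and IDs (seq1, seq2, seq3) are made up, the file name ID.clean.txt is hypothetical, and each sequence is assumed to sit on a single line, which is what -A1 relies on. With -F, grep treats every character of a pattern literally, so a trailing space or a Windows carriage return has to be present in the FASTA header as well for the line to match. The sketch strips that trailing whitespace with sed before running the same grep options as above.

```shell
# Work in a scratch directory so no real file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up example data: a three-record FASTA file and an ID list in which
# one line ends with a space and the other with a carriage return.
printf '>seq1\nACGT\n>seq2\nGGCC\n>seq3\nTTAA\n' > input.fasta
printf 'seq1 \nseq3\r\n' > ID.txt

# Remove trailing whitespace (spaces, tabs, carriage returns) from each ID.
sed 's/[[:space:]]*$//' ID.txt > ID.clean.txt

# Same grep options as in the post, using the cleaned ID list.
# grep -A prints a "--" line between non-adjacent matches; the second
# grep drops those lines so that only FASTA records remain.
grep -A1 -wFf ID.clean.txt input.fasta | grep -v '^--$' > result.fa

cat result.fa
```

To check whether an ID file contains such invisible characters in the first place, GNU `cat -A ID.txt` marks each line end with `$` and shows carriage returns as `^M`.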

fasta grep • 2.4k views