Extract Unmapped Reads From Blast
12.4 years ago
Lorddoskias ▴ 160

Is there a way to tell BLAST to write the unmapped reads to a separate file, or at least to list the names of the unmapped sequences?


How do you map your reads with BLAST? To what? I don't understand your question.

12.4 years ago
Torst ▴ 980

I assume you have used BLAST to align some reads ("reads.fa") onto a reference ("ref.fa"), and have a BLAST report in "out.bls"?

% formatdb -i ref.fa -p F
% blastall -p blastn -i reads.fa -d ./ref.fa -o out.bls

As Rm answered, you can (hackily) get the IDs of the reads that got no hits. I have added extra code so that only a clean list of IDs ends up in "nohits.ids":

% grep -F -B5 '***** No hits' out.bls | grep '^Query=' | sed -e 's/^Query= *//' -e 's/[[:space:]].*$//' > nohits.ids
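The grep above depends on the layout of the pairwise report (the "Query=" line has to sit within 5 lines of the "No hits" line). A rough alternative sketch, assuming you can rerun the search with tabular output (`-m 8` in legacy blastall, `-outfmt 6` in BLAST+) into a file I am calling "out.tsv": any read ID present in reads.fa but absent from column 1 of the table had no hit. The function name `list_nohit_ids` and the file name "out.tsv" are made up for this example.

```shell
# Sketch: list reads that have no row in a tabular BLAST report.
# Assumes column 1 of the report is the query ID (true for -m 8 / -outfmt 6).
# list_nohit_ids is a hypothetical helper name, not part of BLAST.
list_nohit_ids() {
    reads_fasta=$1
    hits_tsv=$2
    all_ids=$(mktemp)
    hit_ids=$(mktemp)
    # Read ID = first whitespace-delimited token after '>' in each header.
    grep '^>' "$reads_fasta" | sed 's/^>//' | awk '{ print $1 }' | sort -u > "$all_ids"
    # Query IDs that produced at least one hit.
    cut -f1 "$hits_tsv" | sort -u > "$hit_ids"
    # Lines found only in the first list: reads with no hit.
    comm -23 "$all_ids" "$hit_ids"
    rm -f "$all_ids" "$hit_ids"
}
```

Usage would be something like `list_nohit_ids reads.fa out.tsv > nohits.ids`.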

Now if you want to get back the sequences listed in nohits.ids from reads.fa, you can use this trick:

% formatdb -i reads.fa -p F -o T
% fastacmd -d ./reads.fa -i nohits.ids -D 1 -o nohits.fa
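If you would rather not build a database of the reads just to pull them back out, plain awk can do the same job. This is only a sketch: it assumes the read ID is the first word of each FASTA header and that nohits.ids has one ID per line. The function name `extract_fasta_by_id` is made up for this example.

```shell
# Sketch: print the FASTA records whose ID appears in an ID list.
# extract_fasta_by_id is a hypothetical helper name.
extract_fasta_by_id() {
    ids_file=$1
    fasta_file=$2
    awk '
        # First file: remember every non-empty ID.
        NR == FNR { if ($1 != "") want[$1] = 1; next }
        # Header line: decide whether this record is wanted.
        /^>/      { id = substr($1, 2); keep = (id in want) }
        # Print header and sequence lines of wanted records.
        keep
    ' "$ids_file" "$fasta_file"
}
```

Usage would be something like `extract_fasta_by_id nohits.ids reads.fa > nohits.fa`. Multi-line sequences are kept intact because every line after a wanted header is printed until the next header.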

Good luck!

12.4 years ago
Rm 8.3k

If I understand your question correctly, this will list the queries without hits (the unmapped sequences) from BLAST (BLASTN 2.2.25+) output:

grep -B5 "***** No hits found *****" blast.output.txt | grep Query=
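The fixed `-B5` window can miss the "Query=" line if the report puts more lines between it and the "No hits found" message (for example a long, wrapped description). A sketch that avoids counting lines, assuming only that each query block starts with a "Query=" line and that the no-hit message contains the text "No hits found"; the function name `nohit_ids_from_report` is made up for this example.

```shell
# Sketch: print the ID of every query whose block contains "No hits found".
# nohit_ids_from_report is a hypothetical helper name.
nohit_ids_from_report() {
    awk '
        /^Query=/       { id = $2 }     # remember the most recent query ID
        /No hits found/ { print id }    # that query had no hits
    ' "$1"
}
```

Usage would be something like `nohit_ids_from_report blast.output.txt > nohits.ids`. Only the first word after "Query=" is kept, so descriptions are dropped.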