Extracting the aligned sequence from a BLAST output in a multi-fasta file
5.0 years ago
Ming ▴ 110

Dear All,

I am trying to extract the aligned sequences for my query from a BLAST output and write them to a single multi-FASTA file. How do I go about doing so?

Thank you in advance.

blast • 4.7k views
5.0 years ago
flogin ▴ 280

What is your BLAST output format? And which sequences do you want to extract: queries or subjects?

If your output is in tabular format (-outfmt 6), you can use the query/subject names and the query/subject alignment coordinates.

For example, if you need to extract the regions of the subjects that show a match, you can cut the columns for subject name (2), subject start (9) and subject end (10) in the default outfmt 6 column layout, and pass this information to bedtools getfasta (https://bedtools.readthedocs.io/en/latest/content/tools/getfasta.html).
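A minimal sketch of that column-cutting step, assuming the default 12-column outfmt 6 layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function name `blast6_to_bed` is hypothetical. BLAST coordinates are 1-based and inclusive while BED is 0-based and half-open, so the start is shifted by one; hits on the minus strand are reported with sstart greater than send, so the two are swapped and the strand recorded.

```python
def blast6_to_bed(lines):
    """Convert BLAST outfmt 6 rows into BED6 lines for the subject regions.

    Assumes the default 12-column layout; a custom -outfmt "6 ..." column
    list would need different indices.
    """
    bed = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines (e.g. from outfmt 7).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        # Minus-strand hits have sstart > send in BLAST tabular output.
        strand = "+" if sstart <= send else "-"
        # BED start is 0-based, BED end is exclusive.
        start = min(sstart, send) - 1
        end = max(sstart, send)
        bed.append(f"{sseqid}\t{start}\t{end}\t{qseqid}\t0\t{strand}")
    return bed
```

Writing the returned lines to a file (say `hits.bed`, a placeholder name) should then allow something like `bedtools getfasta -fi subjects.fasta -bed hits.bed -s -fo aligned.fasta`, where `-s` reverse-complements minus-strand hits. For the aligned part of the queries instead, the same idea would apply with columns 1, 7 and 8.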

If you need the whole sequences (regardless of the aligned region), you can retrieve the names of the sequences and use the seqtk tool (Seqtk subseq: structure of file name.lst).
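A minimal sketch of building that name list, again assuming default outfmt 6 columns. The function name `unique_subject_ids` and the file names in the usage note are hypothetical. It collects each subject ID once, in order of first appearance, giving one name per line as `seqtk subseq` expects.

```python
def unique_subject_ids(lines):
    """Return subject IDs (column 2) from outfmt 6 rows, without duplicates."""
    seen = set()
    ids = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        sseqid = line.split("\t")[1]
        if sseqid not in seen:
            seen.add(sseqid)
            ids.append(sseqid)
    return ids
```

After writing the IDs to a file, one per line (for example `name.lst`), the full sequences could be pulled with `seqtk subseq subjects.fasta name.lst > hits.fasta`. The IDs have to match the FASTA header names exactly for this to work.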


There are probably more efficient solutions, but I'm still a beginner in bioinformatics.


@flogin, thank you very much for pushing me in the correct direction!


You're welcome !!! :D
