Question: Full header in blastn outfmt 6 output?
Picasa wrote (4.8 years ago):

Hello,

Is it possible to get the full subject header in blastn tabular output (-outfmt 6)?

For example, my output looks like:

seq1 GBSQ01012201.1.2324 100.0 297 0 0 1 297 1 2324 -1 0

My database:

GBSQ01012201.1.2324 Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Penicillium janthinellum AAUCAAGUGCCUUAAGGGUGU ...

I would like an output like this:

seq1 GBSQ01012201.1.2324 Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Penicillium janthinellum 100.0 297 0 0 1 297 1 2324 -1 0

Damian Kao wrote (4.8 years ago):

You will have to define a custom tab-delimited output format that includes the "stitle" column.

The default columns for outfmt 6 are:

'qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore'

Just add 'stitle' to that list, for example:

-outfmt '6 qseqid sseqid stitle pident length mismatch gapopen qstart qend sstart send evalue bitscore'
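For context, here is a minimal sketch of how that option might sit in a complete call. The wrapper name run_blastn_with_titles and the file names query.fasta, mydb and hits.tsv are placeholders I made up, not anything from the question; only the column list comes from the answer above.

```shell
# Column list from the answer above: the 12 default columns plus stitle in third position.
OUTFMT='6 qseqid sseqid stitle pident length mismatch gapopen qstart qend sstart send evalue bitscore'

# Hypothetical wrapper: $1 = query FASTA, $2 = BLAST database name, $3 = output file.
run_blastn_with_titles() {
    blastn -query "$1" -db "$2" -outfmt "$OUTFMT" -out "$3"
}

# Usage (needs BLAST+ on PATH and an existing database; names are placeholders):
#   run_blastn_with_titles query.fasta mydb hits.tsv
# stitle can contain spaces, so split on tabs when reading the result:
#   cut -f3 hits.tsv
```

Whether stitle carries the lineage text depends on how the database was built, so it is worth checking a few output lines before relying on it.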

5heikki replied (4.8 years ago):

As far as I recall, this only works with (1) NCBI's preformatted BLAST DBs and (2) DBs created from sequence files retrieved from NCBI, provided that -parse_seqids was used in makeblastdb and the sequence headers follow the gi|acc format. So it does not work with, e.g., sequences from GenBank assembly directories.
5heikki wrote (4.8 years ago):

If you replace the spaces in your database headers with e.g. underscores, you will see the full headers in the output. Alternatively, you can create a map file (id TAB lineage), which is easy to parse from your headers, like this:

GBSQ01012201.1.2324    Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Penicillium janthinellum
...
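One way such a map file could be generated is sketched below with awk. It assumes a plain FASTA file in which the ID is the first whitespace-delimited token of each ">" line; the function name make_mapfile and the file name db.fasta are made up for illustration.

```shell
# make_mapfile FASTA: print "id<TAB>rest of header" for every header line.
make_mapfile() {
    awk '/^>/ {
        id = substr($1, 2)                 # first token without the leading ">"
        rest = $0
        sub(/^[^ \t]+[ \t]*/, "", rest)    # drop the ID and the whitespace after it
        print id "\t" rest
    }' "$1"
}

# Usage (db.fasta is a placeholder name):
#   make_mapfile db.fasta > mapfile
```

Headers without a description end up with an empty second column, which you may want to filter out before joining.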

You can then join the rest of the header to the tabular blast output with join, e.g.:

join -1 2 -2 1 -t $'\t' -o 1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,1.10,1.11,1.12,2.2 <(sort -k2,2 -t $'\t' blastOutput) <(sort -k1,1 -t $'\t' mapfile) > blastOutputWithLineageAtCol13
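If you would rather not depend on bash process substitution, the same join step can be wrapped in a small function that uses temporary files. This is only a sketch: the name add_lineage is made up, and it assumes a 12-column tabular blast file plus a two-column map file, both tab-delimited.

```shell
# add_lineage BLAST_TSV MAPFILE: print each hit with the map file's second
# column appended as column 13. Hits with no map entry are dropped.
add_lineage() {
    tab=$(printf '\t')
    tmpdir=$(mktemp -d) || return 1
    # join needs both inputs sorted on the join field under the same collation
    LC_ALL=C sort -t "$tab" -k2,2 "$1" > "$tmpdir/hits.sorted"
    LC_ALL=C sort -t "$tab" -k1,1 "$2" > "$tmpdir/map.sorted"
    LC_ALL=C join -t "$tab" -1 2 -2 1 \
        -o 1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,1.10,1.11,1.12,2.2 \
        "$tmpdir/hits.sorted" "$tmpdir/map.sorted"
    rm -rf "$tmpdir"
}

# Usage (file names as in the command above):
#   add_lineage blastOutput mapfile > blastOutputWithLineageAtCol13
```

Forcing LC_ALL=C on both sort and join keeps the two tools agreeing on the sort order, which is the usual cause of "input is not in sorted order" complaints from join.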

Picasa replied (4.8 years ago):

Thanks, I'll look at it.