Trouble obtaining chromosome number using standalone blastn
9.6 years ago
DVA ▴ 630

Hello,

I'm having trouble obtaining the exact chromosome number when using standalone BLAST. My current command is:

/...BLAST/blast-2.2.29+/bin/blastn \
  -db GPIPE/9606/105/GCF_000001405.25_top_level \
  -query test.fasta \
  -out test.out \
  -task blastn -outfmt 6 \
  -remote \
  -word_size=15

where the database is GRCh37.

The result looks like:

sample1    gi|441431411|ref|NW_004166863.1|    100.00    41    0    0    38    78    527422    527382    1e-012    77.0

As you can see, the output contains the sequence identifier (GI number and accession) and the position on the reference sequence, but no chromosome number. I couldn't find an option for it among the format specifiers BLAST provides either. I know I could get what I want with standalone BLAT, but I need to make sure BLAST works too, because it has a Windows version. I also understand that with the XML or a query-anchored format I could dig out the chromosome information the hard way. I'm just disappointed that I can't find it in the tabular format. Could someone help me, please?

Thank you!

Helene

9.6 years ago

Is the chromosome number in the full title of the subject sequence 'gi|441431411|ref|NW_004166863.1|'?

If it is, you can customize the BLAST+ output; the BLAST user manual lists all the options here: http://www.ncbi.nlm.nih.gov/books/NBK1763/#_CmdLineAppsManual_Appendix_C_Options_for_

In your case, you might want to try

/...BLAST/blast-2.2.29+/bin/blastn \
  -db GPIPE/9606/105/GCF_000001405.25_top_level \
  -query test.fasta \
  -out test.out \
  -task blastn \
  -outfmt "6 qseqid salltitles pident length mismatch gapopen qstart qend sstart send evalue bitscore" \
  -remote \
  -word_size=15

Most of this is the standard tabular format, except that I have replaced 'sseqid' with 'salltitles', which gives you all titles of the subject sequence instead of just its ID. Maybe that will print your chromosome name?

The manual linked above lists all possible format specifiers, so you can trim the column list down further depending on what you need.

There is a longer explanation in this blog post: http://blastedbio.blogspot.com.au/2012/05/blast-tabular-missing-descriptions.html. The interesting parts start after 'Update (1 April 2013)'.
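If the title does turn out to contain the chromosome, a small script can pull it out of each tabular line. The following is only a sketch: it assumes the column order of the -outfmt string above, and it assumes the title contains the word "chromosome" followed by the chromosome name. The function name `chromosome_from_hit` and the example title are made up for illustration; check the titles in your own output before relying on the pattern.

```python
import re

# Column order assumed to match the custom -outfmt string suggested above.
COLUMNS = [
    "qseqid", "salltitles", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

# Assumption: titles mention the chromosome as "chromosome <name>",
# e.g. "chromosome 10" or "chromosome X".
CHROM_RE = re.compile(r"chromosome\s+(\w+)", re.IGNORECASE)


def chromosome_from_hit(line):
    """Return the chromosome name found in the salltitles column, or None."""
    fields = dict(zip(COLUMNS, line.rstrip("\n").split("\t")))
    match = CHROM_RE.search(fields.get("salltitles", ""))
    return match.group(1) if match else None


# Hypothetical hit line: the title text is invented for illustration and is
# not the real title of NW_004166863.1.
example = "\t".join([
    "sample1",
    "Homo sapiens chromosome 10 genomic scaffold (hypothetical title)",
    "100.00", "41", "0", "0", "38", "78", "527422", "527382", "1e-012", "77.0",
])

print(chromosome_from_hit(example))
```

Splitting on tabs rather than on whitespace matters here, because the title column itself contains spaces.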


Philipp, you are amazing!! Salltitles works perfectly for me. I really appreciate your thorough answer.
