Question: Trouble obtaining chromosome number using standalone blastn
1
gravatar for DVA
4.7 years ago by
DVA510
United States
DVA510 wrote:

Hello,

I'm having trouble obtaining the exact chromosome number when using standalone BLAST. My current command is:

/...BLAST/blast-2.2.29+/bin/blastn -db GPIPE/9606/105/GCF_000001405.25_top_level -query test.fasta -out test.out -task blastn -outfmt 6 -remote -word_size=15

where the database is GRCh37.

The result looks like:

sample1    gi|441431411|ref|NW_004166863.1|    100.00    41    0    0    38    78    527422    527382    1e-012    77.0
 

As you can see, the output contains the ID of the reference sequence (GI number and accession) and the alignment positions, but no chromosome number. I couldn't find such an option among the format specifiers BLAST provides either. I know I could get what I want from standalone BLAT, but I need to make sure BLAST works too, because it has a Windows version. I also understand that with XML or a query-anchored format I could dig out the chromosome information the hard way. However, I'm disappointed that I can't find it in the tabular format... Could someone help me, please?

Thank you!

Helene

Philipp Bayer (Australia/Perth/UWA) answered, 4.7 years ago:

Is the chromosome number in the full title of the subject sequence 'gi|441431411|ref|NW_004166863.1|'?

If it is, you can customize the output of BLAST+. The BLAST user manual lists all the options here: http://www.ncbi.nlm.nih.gov/books/NBK1763/#_CmdLineAppsManual_Appendix_C_Options_for_

In your case, you might want to try

/...BLAST/blast-2.2.29+/bin/blastn -db GPIPE/9606/105/GCF_000001405.25_top_level -query test.fasta -out test.out -task blastn -outfmt "6 qseqid salltitles pident length mismatch gapopen qstart qend sstart send evalue bitscore"  -remote -word_size=15

Most of this is the standard format, except that I have replaced 'sseqid' with 'salltitles', which gives you all titles of the subject sequence instead of just its ID. Maybe that will print your chromosome name?
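If the titles do turn out to contain the chromosome, something like the following Python sketch could pull it out of the tabular output. It assumes the column order from the command above and that each title contains the word "chromosome" followed by its name; the example line uses a made-up title for illustration, and the helper names (parse_hit, chromosome_of) are hypothetical. Note that salltitles can hold several titles joined by "<>", so this only reports the first match.

```python
import re

# Column order taken from the -outfmt string in the command above.
COLUMNS = ["qseqid", "salltitles", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Assumption: the title contains "chromosome" followed by the chromosome name.
CHROM_RE = re.compile(r"chromosome\s+(\w+)", re.IGNORECASE)

def parse_hit(line):
    """Split one tab-separated hit line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    return dict(zip(COLUMNS, fields))

def chromosome_of(hit):
    """Return the first chromosome name found in salltitles, or None."""
    match = CHROM_RE.search(hit["salltitles"])
    return match.group(1) if match else None

# Made-up title for illustration only; real titles come from the database.
example = ("sample1\tExample species chromosome 7 genomic scaffold\t100.00\t41"
           "\t0\t0\t38\t78\t527422\t527382\t1e-012\t77.0")
hit = parse_hit(example)
print(hit["qseqid"], chromosome_of(hit))
```

Hits on unplaced scaffolds may have no chromosome in the title at all, which is why chromosome_of returns None instead of raising an error.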

The manual linked above lists all the available format specifiers, so you can trim the output down further depending on what you need.
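When trimming the columns, it helps to keep a single list of specifiers and use it both to build the -outfmt value and to label the fields when reading the results. A minimal Python sketch, assuming a reduced column set chosen only for illustration (build_outfmt and read_hits are hypothetical helper names, and the sample row uses a placeholder title):

```python
import csv
import io

def build_outfmt(columns):
    """Return the -outfmt value: tabular format 6 followed by the chosen columns."""
    if not columns:
        raise ValueError("at least one column is required")
    return "6 " + " ".join(columns)

def read_hits(handle, columns):
    """Read tab-separated hits into dicts keyed by the same column names."""
    reader = csv.DictReader(handle, fieldnames=columns, delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return list(reader)

# Illustrative subset of the specifiers used in the command above.
columns = ["qseqid", "salltitles", "sstart", "send", "evalue"]
outfmt = build_outfmt(columns)
print(outfmt)

# Placeholder row standing in for a line of test.out.
sample = io.StringIO("sample1\tplaceholder title\t527422\t527382\t1e-012\n")
hits = read_hits(sample, columns)
print(hits[0]["sstart"], hits[0]["send"])
```

The resulting string has to reach blastn as one quoted argument, as in the command above, because it contains spaces.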

There is a longer explanation in this blog post: http://blastedbio.blogspot.com.au/2012/05/blast-tabular-missing-descriptions.html
The interesting parts start after 'Update (1 April 2013)'.


DVA replied, 4.7 years ago:

Philipp, you are amazing!! salltitles works perfectly for me. I really appreciate your thorough answer.