Question: convert blastn output file to sam/bam
federica.r299 (Italy) wrote 4.6 years ago:

I am trying to convert the output of BLAST into a .sam or .bam file. I tried the blast2sam tool, but it prints many warnings and the output file is incomplete.

Is there another tool to make the conversion or another alignment tool for which it is possible to specify the output format as .sam or .bam?

Thanks

blast2bam blast blast2sam • 3.6k views

Take a look at Peter Cock's post: http://blastedbio.blogspot.com/2015/07/ncbi-working-on-sam-output-from-blast.html

written 4.6 years ago by a.zielezinski

Maybe there is something wrong with your input rather than with the tool you use? You should post the exact command you executed and some or all of the warnings you see, and give details of the input file, such as its format and a few example lines.

There is a blast2bam converter, is it the one you used? (You mention blast2sam).

written 4.6 years ago by dariober

The alignment of the reads was done with the command:

blastn -query 130205_UNC11-SN627_0280_AC1NEKACXX_TTAGGC_L004_1.fasta \
    -db blast_ref -word_size 15 \
    -outfmt "6 qseqid sseqid pident nident length mismatch positive gapopen gaps ppos qframe sframe sstrand qcovs qstart qend qseq sstart send sseq evalue bitscore score" \
    -out blast_tab

This is the first line of the output blast_tab:

UNC11-SN627:280:C1NEKACXX:4:1101:11031:1976     sequenzadifusione       93.62   44      3       44      0       0       93.62   1       1       plus    98      2       48      TGAACCCGGGAGGTGGAGGTTGCAGTGAGCCGAGATTGCGCCACTGC 24710   24756   TGAACCCGGGAGGTGGAGGCTGCAGTGAGCTGAGATAGCGCCACTGC 6e-16   71.3    38

The conversion was then done with blast2sam (not blast2bam):

blast2sam.pl blast_tab > blast.sam

For the conversion we did not use the default BLAST output format but the tabular format.

The conversion reports no errors, but the output file blast.sam is empty.

Where could the error be?

modified 3 months ago by RamRS • written 4.6 years ago by federica.r299

I think blast2sam.pl has not been updated for some time. As Heng Li said:

BLAST support will be dropped unless someone want to maintain it. I realize that it would be better to have fewer functionality to avoid letting others blame me for having too many bugs. I just thought this script may be useful to someone occasionally, but it is now causing more troubles than good. Sorry.

That is part of why I wrote Blast2Bam.

If you want to use it, blast output will have to be in XML format (-outfmt 5).
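
If re-running BLAST with XML output is not convenient, a tabular report that includes the aligned strings (qseq and sseq) holds enough information to build rough SAM records directly. The following is only a minimal sketch under several assumptions, and it is neither blast2sam.pl nor Blast2Bam. The function name blast_tab_to_sam is made up. The input is assumed to come from -outfmt "6 qseqid sseqid sstart send sstrand qseq sseq" (seven tab-separated columns, not the 23-column layout used above). Only plus-strand hits are converted, every hit gets FLAG 0 and MAPQ 255, and SEQ holds just the aligned part of the read with no clipping.

```shell
#!/bin/sh
# blast_tab_to_sam (hypothetical helper): turn plus-strand hits from
#   blastn -outfmt "6 qseqid sseqid sstart send sstrand qseq sseq"
# into headerless SAM lines. Assumed columns:
#   $1 qseqid  $2 sseqid  $3 sstart  $4 send  $5 sstrand  $6 qseq  $7 sseq
blast_tab_to_sam() {
    awk -F '\t' -v OFS='\t' '
    # Minus-strand hits would need reverse complementing; this sketch skips them.
    $5 != "plus" { next }
    {
        qseq = $6; sseq = $7
        cigar = ""; run = 0; prev = ""
        n = length(qseq)
        for (i = 1; i <= n; i++) {
            q = substr(qseq, i, 1)
            s = substr(sseq, i, 1)
            if (q == "-")      op = "D"   # gap in the read: deletion from the reference
            else if (s == "-") op = "I"   # gap in the reference: insertion in the read
            else               op = "M"   # aligned column (match or mismatch)
            if (op == prev) {
                run++
            } else {
                if (run > 0) cigar = cigar run prev
                prev = op; run = 1
            }
        }
        if (run > 0) cigar = cigar run prev
        gsub(/-/, "", qseq)               # SEQ must not contain gap characters
        # QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
        print $1, 0, $2, $3, 255, cigar, "*", 0, 0, qseq, "*"
    }' "$@"
}

# Usage (file names are placeholders):
#   blast_tab_to_sam blast_tab7 > hits.sam
```

The output has no header, so samtools would still need the @SQ lines (for example those in the .dict file) before it can turn the result into BAM. Because every hit is written as a primary record, a read with several hits appears several times.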

modified 3 months ago by RamRS • written 4.6 years ago by Aurélien Guy-Duché

I downloaded the code, but I am not able to create the ref.dict. How can I do it?
Also, the "src" folder contains two files (blastSam.c and blastSam.h), so which one should I use?
Thanks

written 4.6 years ago by federica.r299

The .dict file is created with picard-tools. In your case:

picard-tools CreateSequenceDictionary R=blast_ref O=blast_ref.dict
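
A sequence dictionary is essentially a SAM header: one @SQ line per reference sequence with its name (SN) and length (LN), to which Picard adds further fields such as M5 and UR. As a sanity check on the .dict file, the names and lengths can be recomputed from the reference FASTA. This is a rough sketch: fasta_to_sq is a made-up helper name, ref.fasta is a placeholder for the FASTA file the BLAST database was built from, and the output is not meant as a replacement for the Picard dictionary.

```shell
#!/bin/sh
# fasta_to_sq (hypothetical helper): print one "@SQ SN:<name> LN:<length>"
# line per FASTA record, for comparison with the @SQ lines of a .dict file.
fasta_to_sq() {
    awk -v OFS='\t' '
    /^>/ {
        # A new record starts: report the previous one, if any.
        if (name != "") print "@SQ", "SN:" name, "LN:" len
        name = substr($1, 2)      # text after ">" up to the first whitespace
        len = 0
        next
    }
    {
        sub(/\r$/, "")            # tolerate Windows line endings
        len += length($0)
    }
    END { if (name != "") print "@SQ", "SN:" name, "LN:" len }
    ' "$@"
}

# Usage (placeholder file name):
#   fasta_to_sq ref.fasta
```

If the names or lengths printed here differ from the SN and LN values in blast_ref.dict, the dictionary was probably built from a different file than the one BLAST searched.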

The src folder contains the source code, not the program.

You need to compile the code first by running "make" from the main folder or from the src folder.

The program will then be in the bin folder.

You can then pipe the output of blastn into the program:

blastn -query 130205_UNC11-SN627_0280_AC1NEKACXX_TTAGGC_L004_1.fasta -db blast_ref -word_size 15 -outfmt 5 | blast2bam - blast_ref.dict 130205_UNC11-SN627_0280_AC1NEKACXX_TTAGGC_L004_1.fasta > out.sam
written 4.6 years ago by Aurélien Guy-Duché

Typing make in the src folder gives an error:

xsltproc --output parseXML.c --stringparam fileType c schema2c.xsl schema.xml
make: xsltproc: Command not found
make: *** [parseXML.c] Error 127
modified 3 months ago by RamRS • written 4.6 years ago by federica.r299

As Pierre told you on seqanswers.com, you got this error because xsltproc wasn't installed on your computer.

In order to compile Blast2Bam, you will also need libxml2, zlib and of course gcc.
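
Before running make again, it can save time to check which of these tools are actually on the PATH. The snippet below is a small sketch, not part of Blast2Bam: missing_tools is a made-up helper, and xml2-config is used only as a proxy for the libxml2 development files, which is an assumption about how libxml2 is packaged on your system. zlib has no command of its own, so it is not checked here.

```shell
#!/bin/sh
# missing_tools (hypothetical helper): print each named command
# that cannot be found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools make gcc xsltproc xml2-config)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names are printed on one line.
    echo "Missing before running make:" $missing
else
    echo "All checked build tools were found."
fi

# On Debian/Ubuntu the packages are typically installed with:
#   sudo apt-get install make gcc xsltproc libxml2-dev zlib1g-dev
```

An empty list does not guarantee that the build succeeds, since the zlib headers are not tested, but it rules out the "Command not found" error shown above.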

If your XML is big, you should pipe the blast output into Blast2Bam, like I've shown you in my previous comment.

Again, if you're afraid the SAM file will be too big, you should pipe the output of Blast2Bam into SAMtools to make a BAM file:

blastn -query 130205_UNC11-SN627_0280_AC1NEKACXX_TTAGGC_L004_1.fasta -db blast_ref -word_size 15 -outfmt 5 | blast2bam - blast_ref.dict 130205_UNC11-SN627_0280_AC1NEKACXX_TTAGGC_L004_1.fasta | samtools view -Sb -F 0xF00 - > out.bam

-F 0xF00 is used to filter the results in order to keep only the primary alignments. You may or may not want to use this option depending on what you want to do with the results.
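
To see what the mask does, it helps to split 0xF00 into the individual SAM FLAG bits: 0x100 (secondary alignment), 0x200 (not passing quality controls), 0x400 (duplicate) and 0x800 (supplementary alignment). So -F 0xF00 drops a little more than non-primary records, although the two middle bits are presumably not set in converter output (an assumption, not something checked here). The sketch below only does the bit arithmetic in the shell; is_dropped is a made-up helper and samtools is not called.

```shell
#!/bin/sh
# 0xF00 is the union of four SAM FLAG bits.
mask=$(( 0x100 | 0x200 | 0x400 | 0x800 ))
printf 'mask = %d (0x%X)\n' "$mask" "$mask"    # prints: mask = 3840 (0xF00)

# is_dropped FLAG (hypothetical helper): succeeds when a record with this
# FLAG has at least one bit of the mask set, i.e. when
# "samtools view -F 0xF00" would discard it.
is_dropped() {
    [ $(( $1 & 0xF00 )) -ne 0 ]
}

for flag in 0 16 256 272 2048; do
    if is_dropped "$flag"; then
        echo "FLAG $flag: dropped"
    else
        echo "FLAG $flag: kept"
    fi
done
# FLAG 0 and 16 (forward and reverse primary hits) are kept;
# 256, 272 (secondary) and 2048 (supplementary) are dropped.
```

If you want to keep every hit of a read, leave the -F option out.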

written 4.6 years ago by Aurélien Guy-Duché

I got it working. Thank you.

modified 2.8 years ago • written 2.8 years ago by viv3kanand