Question: BAM/SAM to FASTA conversion
3
gravatar for biolab
5.7 years ago by
biolab1.2k
biolab1.2k wrote:

Hi everone,

I searched Biostars for BAM/SAM to FASTA conversion method, and found the tools EMBOSS Picard could do this (Convert Bam File To Fasta File)   I am wondering is there any perl script to complish this work?   I should note that my seq data is strand-specific seq, so the command samtools view filename.bam | awk '{OFS="\t"; print ">"$1"\n"$10}' may not be very good.

I appreciate any of your solutions. Thanks a lot!

sam bam • 42k views
ADD COMMENTlink modified 9 weeks ago by kristina.mahan100 • written 5.7 years ago by biolab1.2k
17
gravatar for Carambakaracho
2.7 years ago by
Carambakaracho2.2k
Germany/Cologne
Carambakaracho2.2k wrote:

For future record: samtools versions >1.3 can convert bam to fasta directly via samtools fasta

samtools --help

Program: samtools (Tools for alignments in the SAM format)
Version: 1.4 (using htslib 1.4)

Usage:   samtools <command> [options]

Commands:
  [...]
  -- File operations
     [...]
     fastq          converts a BAM to a FASTQ
     fasta          converts a BAM to a FASTA
  [...]
ADD COMMENTlink written 2.7 years ago by Carambakaracho2.2k
12
gravatar for Istvan Albert
5.7 years ago by
Istvan Albert ♦♦ 85k
University Park, USA
Istvan Albert ♦♦ 85k wrote:

The advice in the post you link to is greatly outdated. One of the reasons we should revisit these posts. There are far easier/faster ways to accomplish the same task.

The latest versions of samtools have the bam2fq command. Chain that to seqtk seq like so:

samtools bam2fq input.bam | seqtk seq -A > output.fa
ADD COMMENTlink modified 13 months ago by RamRS30k • written 5.7 years ago by Istvan Albert ♦♦ 85k

Hi Istvan, thank you very much for your answer.  It's much helpful for my work.  I just have two more minor questions: (1) my samtools version is Version: 0.1.8 (r613).  I did not see samtools bam2fq function, so I need to update samtools?  (2) Additionally is seqtk inside samtools? If not, where to get it?  Thanks a lot!

ADD REPLYlink written 5.7 years ago by biolab1.2k
2

You need to update samtools to at least 1.1 http://www.htslib.org/

Get seqtk from https://github.com/lh3/seqtk see also  How To Obtain And Install Seqtk

ADD REPLYlink written 5.7 years ago by Istvan Albert ♦♦ 85k

Hi Istvan, your answer is really helpful.

ADD REPLYlink written 5.7 years ago by biolab1.2k
4
gravatar for Chadi Saad
3.2 years ago by
Chadi Saad80
France
Chadi Saad80 wrote:

You need to add '-' for seqtk in order to read from stdin :

samtools bam2fq input.bam | seqtk seq -A - > output.fa
ADD COMMENTlink modified 8 months ago by ATpoint39k • written 3.2 years ago by Chadi Saad80

You can add - but it is not necessary.

ADD REPLYlink written 8 months ago by ATpoint39k
3
gravatar for genomax
2.7 years ago by
genomax90k
United States
genomax90k wrote:

reformat.sh from BBMap suite will do the conversion efficiently.

reformat.sh in=your.bam out=seq.fa
reformat.sh in=your.sam out=seq.fa
ADD COMMENTlink modified 2.7 years ago • written 2.7 years ago by genomax90k
1
gravatar for Alex Reynolds
2.7 years ago by
Alex Reynolds30k
Seattle, WA USA
Alex Reynolds30k wrote:

Via bam2bed and bed2faidxsta.pl:

$ bam2bed < reads.bam | bed2faidxsta.pl > reads.fa

Or sam2bed:

$ sam2bed < reads.sam | bed2faidxsta.pl > reads.fa
ADD COMMENTlink modified 2.7 years ago • written 2.7 years ago by Alex Reynolds30k

I followed your instruction and even copied the unzipped bed2faidxstal.pl to the target folder, I got error "(base) bieniaszs-ipro:K20 bieniaszlab$ sam2bed < NL43_5562_filterredreads_header.sam | bed2faidxsta.pl > NL43_5562_filterredreads_header.fa -bash: bed2faidxsta.pl: command not found".

Then I add "perl" before the bed2faidxstal.pl, I still got error as below:

(base) bieniaszs-ipro:K20 bieniaszlab$ sam2bed < NL43_5562_filterredreads_header.sam | perl bed2faidxsta.pl > NL43_5562_filterredreads_header.fa
[E::fai_build3_core] Failed to open the file /Users/bieniaszlab/miniconda3/data/K_SRF11_U/K_20/K20/NL43_WT_CLIP.fa.gz
[faidx] Could not load fai index of /Users/bieniaszlab/miniconda3/data/K_SRF11_U/K_20/K20/NL43_WT_CLIP.fa.gz
[E::fai_build3_core] Failed to open the file /Users/bieniaszlab/miniconda3/data/K_SRF11_U/K_20/K20/NL43_WT_CLIP.fa.gz
[faidx] Could not load fai index of /Users/bieniaszlab/miniconda3/data/K_SRF11_U/K_20/K20/NL43_WT_CLIP.fa.gz
[E::fai_build3_core] Failed to open the file /Users/bieniaszlab/miniconda3/data/K_SRF11_U/K_20/K20/NL43_WT_CLIP.fa.gz
[faidx] Could not load fai index of /Users/bieniaszlab/miniconda3/data/K_SRF11_U/K_20/K20/NL43_WT_CLIP.fa.gz
[E::fai_build3_core] Failed to open the file /Users/bieniaszlab/miniconda3/data/K_SRF11_U/K_20/K20/NL43_WT_CLIP.fa.gz
ADD REPLYlink modified 14 months ago by genomax90k • written 14 months ago by xiaoleiusc60

Put the directory containing the Perl script into your environment PATH, or run it with Perl. Make sure your FASTA is indexed with samtools.

ADD REPLYlink written 14 months ago by Alex Reynolds30k

Thanks. I need FASTA output from SAM input, you mean to make sure my SAM file (not FASTA file) is indexed right?

ADD REPLYlink written 14 months ago by xiaoleiusc60
1

No, I mean that your reference genome FASTA must be indexed, in order to get out FASTA based on the BED coords that come out of the SAM conversion step. (Sorry for brevity, I was on a cell phone. Feel free to follow up if this isn't clear.)

ADD REPLYlink modified 14 months ago • written 14 months ago by Alex Reynolds30k
0
gravatar for kristina.mahan
9 weeks ago by
kristina.mahan100 wrote:

samtools fasta input.bam > output.fasta

ADD COMMENTlink written 9 weeks ago by kristina.mahan100
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1656 users visited in the last hour