Question: BAM format file to FASTA alignment file
0
gravatar for ben.ward
2.8 years ago by
ben.ward30
United Kingdom
ben.ward30 wrote:

Hi,

I've been given a reference genome and BAM file(s) which are reads aligned to that reference. This will be the first time I've done anything with BAM/SAM formats. I'd like to know how I would go about generating for each BAM file, the full sequence, using the reads in the BAM file and the reference they are aligned to, essentially resulting in a FASTA format alignment of the reference, and then each sequence represented by each BAM file. What are the considerations? For example at the top of my head, whether reads are paired end and their strandedness matters.

I'd like to know if this can be done in R / Bioconductor as I have some familiarity with it - mostly using Biostrings.

Thanks,

Ben.

alignment sam bam R fasta • 3.4k views
ADD COMMENTlink modified 2.8 years ago by Devon Ryan73k • written 2.8 years ago by ben.ward30

I re-read your question and I think you actually want a fasta format alignment.  You actually probably don't want that.  You are new to BAM/SAM and so you might not realize yet that it would be pretty awful having it in a different alignment format.  Take a look at the actual file or the first several lines and just see what the file means before moving onto something more familiar like fasta.

e.g., samtools view file.bam | head -n 999 | column -t | less -S

ADD REPLYlink written 2.8 years ago by Lee Katz2.8k
3
gravatar for Devon Ryan
2.8 years ago by
Devon Ryan73k
Freiburg, Germany
Devon Ryan73k wrote:

The simplest method for generating the consensus sequence is to simply use bcftools consensus. An example is given in the documentation. Obviously, you'll need to have called the variants. GATK has a similar tool.

ADD COMMENTlink written 2.8 years ago by Devon Ryan73k

In samtools v1.1, you can use samtools bam2fq.

Usage:   samtools bam2fq [-nO] [-s <outSE.fq>] <in.bam>

Options: -n        don't append /1 and /2 to the read name
         -O        output quality in the OQ tag if present
         -s FILE   write singleton reads to FILE [assume single-end]
ADD REPLYlink written 2.8 years ago by Lee Katz2.8k

I've interpreted OP as requesting the consensus sequence rather than reproducing fastq files.

ADD REPLYlink written 2.8 years ago by Devon Ryan73k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 797 users visited in the last hour