Entering edit mode
3.8 years ago
rijithraj
•
0
Can anyone suggest the best way to convert bam to fasta in an aligned format?
I have a data from ddRad sequencing. Initially, I want to count the number of restriction sites obtained. So, I already got suggestions to get it mapped to the genome and count the sites using available python code.
Following are the commands I have used for fasta conversion.
samtools mpileup -uf ref.fa aln.bam | bcftools call -mv -Oz -o calls.vcf.gz
tabix calls.vcf.gz
cat ref.fa | bcftools consensus calls.vcf.gz > cns.fa
Could you please have look at this and please tell me know if this is the right format where I could get a reliable number of restriction sites?
Other pipeline suggestions are also most welcome.
Since you have special type of data (ddRAD) just wanted to make sure that you are aware of
stacks
(LINK).Thank you for this information. The pipeline looks good to me. Let me have a check.
You can convert your bam to bed then use bedtools getfasta to convert the bed to fasta.
Thanks for your reply.
Would you mind sharing the commands for the same?
Get bed with: bamTobed -i <bamfile> -o <bed_output_name>
Get fasta with: bedtools getfasta -fi <genome_file.fasta> -bed bed_output_name -o your_desired.fasta
copy/pasted from https://www.htslib.org/doc/samtools-fasta.html:
**SYNOPSIS
samtools fastq [options] in.bam
samtools fasta [options] in.bam
DESCRIPTION
Converts a BAM or CRAM into either FASTQ or FASTA format depending on the command invoked. The files will be automatically compressed if the file names have a .gz or .bgzf extension.**