Question

How to get a FASTA file from BAM, SAM, or VCF files

4.4 years ago

as2779 • 0

Hello!

I'm trying to get a FASTA file from a BAM file. Essentially, I want to get the entire organism's genome into a FASTA sequence that reads like:

>one_line

ACCGCGG.... (only nucleotides; no more >)

When I asked a similar question, I was told that I had to make a consensus sequence. Basically, I just need the organism's genome organized as shown above.

I used a command that I found on this website. It was:

samtools bam2fq number1.bam | seqtk seq -A - > pop1.fa

It did successfully convert the file to a FASTA file, but the output had multiple description lines, like:

>NB551191:275:HMT7LBGX7:1:11101:1614:1054 1:N:0:ATCACG

TAAATNAGATCATTTTTGTAGAGAAAAANGANGGCTTNCGAATGGTATGAAAATCTCTGTGATCCGTCAAAAACTGACTGAGTTCTGATAAAAAATGTATTGGCAGAAAATACCACTTGGACCAAATCTCAAAAATTGACGGAAATGTCAC

>NB551191:275:HMT7LBGX7:1:11101:18472:1054 1:N:0:ATCACG

TTTCCNGAAAACGCATCCAGCATTGTTTNACNTCATTNGAGAGCTGAAAATTTTCAAACCTGTATTTTCCAATCGCATAATAACTCGTGTCTCCTTCTCCATAATCCGTGGGAAGCTTTCAACTCAATAAATTTTAGGAAAAAAGTTTATT etc....

I only want one description line, with the rest of the file being the nucleotide sequence. Alternatively, it can be organized by chromosome, but the most important part is that there should not be a new description line every few lines.

If anybody has any advice on how I should do this, please let me know! Thank you in advance for your help!
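
For reference, the layout described above can be produced with standard text tools once a consensus FASTA with one record per chromosome exists. This is only a minimal sketch: `consensus.fa` and its contents are made-up stand-ins, and running the same commands on the per-read output shown earlier would merely glue reads together in file order, not reconstruct a genome.

```shell
# Tiny stand-in for a consensus FASTA (hypothetical file name and contents).
printf '>chr1\nACGT\nAC\n>chr2\nGGTT\n' > consensus.fa

# Single-record layout: one header, then every sequence line joined into one line.
{
  echo '>one_line'
  grep -v '^>' consensus.fa | tr -d '\n'
  echo
} > genome_one_line.fa

# Per-chromosome layout: keep each header, unwrap only the sequence lines.
awk '/^>/ { if (seq) print seq; print; seq = ""; next }
     { seq = seq $0 }
     END { if (seq) print seq }' consensus.fa > genome_per_chrom.fa
```

The single-record version discards chromosome boundaries, so the per-chromosome version is usually the safer choice when a downstream tool accepts it.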

FASTA FASTQ RepeatModeler sequencing conversion • 1.1k views


I also think this link can help: Generating consensus sequence from bam file

But I don't know how to call the variants. I do have GATK on the computer that I am using, but I do not know how to use it.

4.4 years ago by as2779

Use this tutorial by @Finswimmer: Generating consensus sequence from bam file. It walks you through the entire process step by step. (This is a different Biostars post from the one you linked above, even though it has the same title.)
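
As a rough illustration of what a consensus workflow can look like, here is one common route using bcftools. It is a sketch and not necessarily what the linked tutorial does. It assumes bcftools is installed, that the BAM is coordinate-sorted, and that you have the reference FASTA the reads were aligned to (the question does not mention one). The function name `make_consensus` and the intermediate file name `calls.vcf.gz` are made up for this example.

```shell
# Sketch: build a consensus FASTA from a sorted BAM and its reference.
# Arguments (hypothetical names): reference FASTA, BAM file, output FASTA.
make_consensus() {
  ref=$1
  bam=$2
  out=$3
  # 1. Pile up the reads against the reference and call variants.
  bcftools mpileup -Ou -f "$ref" "$bam" | bcftools call -mv -Oz -o calls.vcf.gz &&
  # 2. Index the compressed VCF so it can be queried by position.
  bcftools index calls.vcf.gz &&
  # 3. Apply the called variants to the reference to get the consensus.
  bcftools consensus -f "$ref" calls.vcf.gz > "$out"
}
```

It could then be called as `make_consensus reference.fa number1.bam consensus.fa`, where `reference.fa` is a placeholder. The output has one record per reference sequence, which matches the "organized by chromosome" layout. Note that positions with no read coverage simply keep the reference base unless they are masked separately.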