Samtools consensus for WES files
1
0
Entering edit mode
4.9 years ago
sidjain • 0

Hi all,

I am new to bioinformatics analysis. I have a sorted BAM file which is WES. In order to get chromosome 1 from the given BAM file in fasta format, I use the following set of commands

samtools view -b file_name.bam chr1>chr1.bam

samtools mpileup -E -uf ~/hg38.fa chr1.bam > chr1.mpileup

bcftools call -mv -Oz chr1.mpileup>chr1.vcf.gz

tabix -p vcf chr1.vcf.gz

samtools faidx ~/hg38.fa chr1|bcftools consensus chr1.vcf.gz -o chr1.fa

In the obtained chr1.fa file, it seems that the non-exome parts of the sequence are also there which doesn't make sense. Further, given the human genome is diploid, shouldn't I be getting 2 fasta files??

Assembly Genome sequence alignment • 937 views
ADD COMMENT
0
Entering edit mode
4.9 years ago

In the obtained chr1.fa file, it seems that the non-exome parts of the sequence are also there which doesn't make sense.

I don't see a place where you filtered anything by exomic position or coverage, so I'm not sure why you would expect non-exomic regions to be filtered away.

Further, given the human genome is diploid, shouldn't I be getting 2 fasta files??

Ahh, no. That's not how it works. For a lot of reasons. bcftools consensus might be putting ambiguous letters in where your SNPs are, you'll have to look.

ADD COMMENT
0
Entering edit mode

Thanks. I see. So if the bam file is WES, the vcf file obtained will only have variants from the exome regions, however for the consensus command, we need to put a filter to remove non-exome regions. Am I correct?

ADD REPLY
1
Entering edit mode

You never get 100% reads on target, you will always have some reads that align to non-exome loci, and your software will try to call variants there. You can filter your the bam to only have reads that cross exons, of you could filter your vcf to only have exomic variants. I don't know if you can filter in the consensus command.

ADD REPLY

Login before adding your answer.

Traffic: 2092 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6