Variant calling using samtools
1
0
Entering edit mode
11 months ago
Chris ▴ 260

Hi all,

I modified my output as a vcf file but not bcf as the instruction. Is that OK or the output is not correct? Thank you so much!

bcftools mpileup -f Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa finalBamFile2.bam | bcftools call -mv -Ob -o variants.vcf.gz
bcftools mpileup -f reference.fa alignments.bam | bcftools call -mv -Ob -o calls.bcf

https://samtools.github.io/bcftools/howtos/variant-calling.html

bcftools • 1.2k views
ADD COMMENT
2
Entering edit mode
11 months ago

as a vcf file but not bcf as the instruction. Is that OK or the output is not correct?

yeah, it's correct. The processing of the vcf.gz will only be sightly slower compared to bcf

ADD COMMENT
0
Entering edit mode

Thank you! So I don't have to run again which took me about half a day. I got a suggestion that uploads bam file to IGV to verify the result. I am checking mutation in a gene of interest. By looking at the location of this gene on IGV using the bam file, I did not find any variant. Would you please have a suggestion to find mutations in a gene of interest? enter image description here

ADD REPLY
1
Entering edit mode

poor qualities at this location.

ADD REPLY
0
Entering edit mode

enter image description here

Thanks for the comment! Are these good qualities?

ADD REPLY
1
Entering edit mode

no. blank reads == poor quality.

ADD REPLY
1
Entering edit mode

Is this RNA or DNA? It looks to me as if what Pierre is interpreting as mapping quality 0 reads may be split reads. (the .junctions file also may indicate that this is the case).

You should be able to load the VCF into IGV as well (after using tabix to index the .vcf.gz)

ADD REPLY
0
Entering edit mode

This is the whole genome so DNA. When I load the vcf into IGV, I got this:

n index file for /variants2_unzip.vcf could not be located. An index is recommended to view files of this size. Click "Go" to create one now or "Cancel to proceed without an index. Maybe I need to use tabix as your suggestion.

ADD REPLY
0
Entering edit mode

Thank you! So could you give me a sample of good qualities? So white space means blank reads?

ADD REPLY
1
Entering edit mode

White-space blank reads mean mapping quality score of 0, which means there are other region(s) of the genome where this read maps equally well so there is no confidence that the mapping is correct.

ADD REPLY
0
Entering edit mode

So you mean grey is good quality and white blank between read in the same row means mapping quality score from tool such STAR, BWA is 0? A good quality sample should have all grey?

ADD REPLY
1
Entering edit mode

Grey does not necessarily mean quality. Grey means mapping quality (MQ) > 0, which for most aligners mean the mapping is unique. For DNAseq, in most data sets over 90% of reads have maximum mapping quality score (e.g., bwa MQ of 60), so a MQ of 10 is still poor. For a region with lots of MQ0 reads, the chances are many of those grey reads have low MQ as well.

ADD REPLY

Login before adding your answer.

Traffic: 2562 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6