Can't find a variant that should be present in a VCF file
8 months ago
Chris ▴ 260

Hi all,

I got a VCF file after running an nf-core pipeline, then annotated the ID column using bcftools. But when I search for a variant that I know must be present, because the sample comes from a person who has this mutation, I can't find it. Could you please suggest what might be wrong in this case? Thank you so much!

variant-calling bcftools nf-core • 1.6k views

There are a whole lot of reasons this could be happening, including the fact that you could be misdiagnosing it. What are you comparing to decide that your expected variant locus is indeed being called a hom-ref locus?

Does "hom-ref locus" mean homozygous reference locus? Sorry for my limited understanding. I am not sure I follow your question.

Yes, it means the locus is REF-REF (if you're dealing with a diploid organism).
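To make the hom-ref idea concrete, here is a minimal Python sketch that classifies the GT (genotype) string from a VCF sample column. The function name `classify_genotype` is made up for illustration and is not part of bcftools or sarek. Keep in mind that many callers do not write hom-ref sites to a plain VCF at all, so a missing record usually just means no variant was called there.

```python
import re

def classify_genotype(gt: str) -> str:
    """Classify a VCF GT string such as '0/0', '0|1', or './.'."""
    # Alleles are separated by '/' (unphased) or '|' (phased).
    alleles = re.split(r"[/|]", gt)
    if any(a == "." for a in alleles):
        return "no-call"
    indices = {int(a) for a in alleles}
    if indices == {0}:
        return "hom-ref"   # every allele matches the reference
    if len(indices) == 1:
        return "hom-alt"   # every allele is the same ALT allele
    return "het"           # at least two different alleles

for gt in ("0/0", "0/1", "1|1", "./."):
    print(gt, classify_genotype(gt))
```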

Just because a pipeline ran does not mean it produced correct results. If you are sure about the variant you are looking for, then you should backtrack and check the intermediate result files. If you have not done this already, start by looking at the alignment to make sure the expected base changes are there, then check whether they made it to the VCF file.
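For the VCF side of that check, it may be safer to search by chromosome and position than by the ID column, since the ID can be "." if annotation did not match, and indels are often reported at a slightly different POS than expected. Below is a rough Python sketch under those assumptions. The helper `find_records` and the example coordinates are invented for illustration, and it assumes an uncompressed, well-formed VCF.

```python
def find_records(vcf_lines, chrom, pos):
    """Return (chrom, pos, id, ref, alt, filter) for records whose REF span covers 1-based pos."""
    def norm(name):
        # Treat 'chr1' and '1' as the same contig; naming mismatches are a common pitfall.
        return name[3:] if name.lower().startswith("chr") else name

    hits = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        c, p, vid, ref, alt, _qual, flt = line.rstrip("\n").split("\t")[:7]
        start = int(p)
        end = start + len(ref) - 1  # REF may span several bases (e.g. deletions)
        if norm(c) == norm(chrom) and start <= pos <= end:
            hits.append((c, start, vid, ref, alt, flt))
    return hits

# Made-up records, only to show the call.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tATCT\tA\t50\tPASS\t.",
    "chr1\t5000\t.\tA\tG\t12\tLowQual\t.",
]
for hit in find_records(example, "1", 1002):
    print(hit)
```

If a record turns up with a FILTER value other than PASS, the variant was seen but flagged, which is a different situation from not being called at all.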

Thanks GenoMax! I know the chromosome and the gene that has this mutation, so how can I check the alignment?

You must have started with fastq data? There should be BAM alignment files that you can start checking in a genome browser like IGV.

It seems I don't have a BAM file with nf-core sarek. Thanks, GenoMax, for your instructions.

You should have "cram" files, which are the equivalent alignment files. I see them in the example results for sarek.

https://nf-co.re/sarek/results#sarek/results-ed1cc8499366dcefea216fe37e36c6189537d57b/germline_test/preprocessing/recalibrated/NA12878/

Yes, I have 2 cram files:

./preprocessing/recalibrated/sample_1/sample_1.recal.cram
./preprocessing/markduplicates/sample_1/sample_1.md.cram

Not sure which file I should view in IGV.

View the one with the more recent time stamp - it's from later in the pipeline.
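If you would rather not compare time stamps by eye, a small Python sketch like the one below picks the most recently modified file. The helper name `most_recent` is made up; the two paths are the ones quoted above. This assumes the time stamps were not changed by copying the files around. In sarek, the recalibration step runs after duplicate marking, so the `.recal.cram` would normally be the later one.

```python
import os

def most_recent(paths):
    """Return the path with the latest modification time."""
    return max(paths, key=os.path.getmtime)

candidates = [
    "./preprocessing/recalibrated/sample_1/sample_1.recal.cram",
    "./preprocessing/markduplicates/sample_1/sample_1.md.cram",
]
# Only consider files that exist in the current working directory.
existing = [p for p in candidates if os.path.exists(p)]
if existing:
    print(most_recent(existing))
```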

Thanks @RAM! With a gene about 50k bases long, I am not sure how to identify the variant. Is that an A -> G variant? Would you explain why the other reads are grey but this one is red?

Yes, that seems to be a variant.

The color of reads is explained in: Meaning of read color on IGV

Thanks Max! So red means a large insertion, not a point mutation, in this case?

I also used other tools such as samtools. Some variants show up in both VCF files, but some appear in only one. How can I verify which tool produces the correct VCF?

Look at the BAM with, e.g., IGV.
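To decide which positions are worth opening in IGV, it can help to first list the calls the two files disagree on. The sketch below is one way to do that in plain Python. The function names `variant_keys` and `compare_calls` are invented for illustration, and it assumes both files are uncompressed VCFs called against the same reference. The same indel can be written differently by different callers, so normalizing both files first (for example with `bcftools norm`) would likely make the comparison fairer.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys, one per ALT allele."""
    keys = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):  # multiallelic sites list ALTs comma-separated
            keys.add((chrom, int(pos), ref, alt))
    return keys

def compare_calls(vcf_a, vcf_b):
    """Split calls into those shared by both files and those unique to each."""
    a, b = variant_keys(vcf_a), variant_keys(vcf_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}

# Made-up records, only to show the call.
calls_a = ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t100\t.\tA\tG", "chr1\t200\t.\tC\tT,G"]
calls_b = ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t100\t.\tA\tG", "chr1\t300\t.\tG\tA"]
result = compare_calls(calls_a, calls_b)
for label in ("shared", "only_a", "only_b"):
    print(label, sorted(result[label]))
```

The positions in `only_a` and `only_b` are the ones to inspect in the alignment; neither tool is automatically the correct one.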
