Hi everyone
A range of positions is annotated as no-call in the whole-genome var-GSxx-ASM.tsv.bz2 file among the cgatools output files.
I am interested in a variant in this region that doesn't show up in the variants file, and I need to check the coverage.
Does no-call mean there are no reads here? I read the definition in cgatools but did not completely get it.
Thank you
Would have been good to mention this is data from Complete Genomics (right?).
I think so, yes. It is old data in the lab; I have to post-process it and only have the output files from cgatools.
In the Complete Genomics (now BGI) pipeline, a 'no-call' indicates that they are "uncertain as to whether the genome contains this variant". This is taken from documentation that I had when I was last analysing Complete Genomics data.
Your particular call is 'no-call-rc', which is 'no call, reference consistent':
[source: http://www.completegenomics.com/documents/Small+Variants+FAQ.pdf]
The coverage information should be found in one of the many files that cgatools produces, if not the var-GSxx-ASM.tsv file.
Yes, I read this, Kevin, thank you :) I just want to understand better what "uncertain" means. Are there no reads covering this region at all, or are there reads but with low confidence? "Uncertain" is too open-ended to know what is going on here.
The reason is that I am checking a variant. It was found in the exome data of the affected. For the proband, however, we have the whole genome. I really want to validate whether the variant is truly missing here or whether there are no reads, etc.
The idea that I have in my head is that these no-call tags are more related to low base qualities and inconsistent base calls than to low depth of coverage. In this sense, my feeling is that they cannot be trusted. It has been quite some time since I last looked at CG data, though. We were interacting with them long before they were even purchased by BGI in China.
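To see exactly how the var file annotates the position in question, something like the following Python sketch could help. It is only a rough illustration under stated assumptions: that the var file has a column header line starting with '>', that it includes columns named chromosome, begin, end and varType, and that begin/end are zero-based, half-open coordinates. The function names open_var and rows_overlapping are made up for this example and are not part of cgatools.

```python
import bz2


def open_var(path):
    """Open a var file as text, handling .bz2 compression if present."""
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    return open(path, "rt")


def rows_overlapping(lines, chrom, pos_1based):
    """Yield var-file rows (as dicts) whose interval covers a 1-based position.

    Assumptions (not verified against your files): the column header line
    starts with '>', metadata lines start with '#', and begin/end are
    zero-based, half-open coordinates.
    """
    header = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and metadata lines
        if line.startswith(">"):
            header = line[1:].split("\t")  # column names
            continue
        if header is None:
            continue  # no column header seen yet
        row = dict(zip(header, line.split("\t")))
        if row.get("chromosome") != chrom:
            continue
        begin, end = int(row["begin"]), int(row["end"])
        # 1-based position p is zero-based p - 1, so begin <= p - 1 < end
        if begin < pos_1based <= end:
            yield row


if __name__ == "__main__":
    # Hypothetical file name and position; replace with your own.
    with open_var("var-GSxx-ASM.tsv.bz2") as handle:
        for row in rows_overlapping(handle, "chr1", 1234567):
            print(row.get("allele"), row["begin"], row["end"], row["varType"])
```

This only reports what the var file says about each allele at that position (for example no-call versus ref). It does not give read depth, which would still have to come from whichever cgatools output file carries the coverage.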