Hi all, I am analyzing WGA data and want to interpret the ALT information produced by this command:
bcftoolsCommand=mpileup --threads 10 -o /data/data/WGA_analysis/wga_19.1.vcf -Ov -f /data/ref_and_beds/reference_human_hg19/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome.fa chr1_marked_wga_19.1.bam
The output looks like this (see figure):
I found in the manual https://samtools.github.io/hts-specs/VCFv4.2.pdf, "The ‘*’ allele is reserved to indicate that the allele is missing due to a upstream deletion".
At the beginning of the file, all positions except for a few have the star allele "<*>" in ALT (almost 20 thousand against a couple of non-stars).
It is not clear to me, then, how to determine the boundaries of the deletion. In the screenshot, * follows * directly, from one coordinate to the next. How, then, is the deletion itself marked? Also with *?
I understand that looking at low-coverage positions is usually pointless (DP = 2 is nothing), and it seems to me that a cluster of these stars may appear simply because of the shallow depth and/or low quality in some region. I suppose some regions amplify worse in WGA and some better. Am I right? Nevertheless, the question above about deletions still stands.
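For reference, the star versus non-star counts I quoted above can be reproduced with something like the following. It is only a minimal Python sketch, assuming an uncompressed VCF (as written with -Ov) with tab-separated columns and ALT in the fifth column. The function name tally_alt and the example records are made up for illustration and are not taken from my file.

```python
from collections import Counter


def tally_alt(vcf_lines):
    """Count how often each ALT string occurs among VCF data lines."""
    counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and header lines ("##..." meta lines and "#CHROM ...").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # VCF columns: CHROM, POS, ID, REF, ALT, ... so ALT is index 4.
        counts[fields[4]] += 1
    return counts


# Hypothetical records, only to show the shape of the input.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t10001\t.\tT\t<*>\t0\t.\tDP=2\n",
    "chr1\t10002\t.\tA\t<*>\t0\t.\tDP=2\n",
    "chr1\t10003\t.\tA\tC,<*>\t0\t.\tDP=2\n",
]

print(tally_alt(example))

# On the real file the same function can read the open file handle directly:
# with open("/data/data/WGA_analysis/wga_19.1.vcf") as fh:
#     print(tally_alt(fh).most_common(5))
```

Records whose ALT is exactly "<*>" are the "stars". Anything else, such as "C,<*>" in the made-up example, counts as a non-star.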
Thanks a lot