I have a simple question regarding homozygous and heterozygous variation.
We use a reference sequence such as hg19 from UCSC or NCBI, and thanks to the development of next-generation sequencing we can easily view the whole human sequence (chr1 to chr22, plus chrX and chrY) at nucleotide-level resolution.
This is where my question comes in, and I hope somebody can answer it.
We know that chromosomes come in pairs, with one copy inherited from each parent. Homozygous means having identical alleles of a gene at the same location on both copies (e.g. the genotype is AA or aa), whereas heterozygous means having different alleles at that location, such as Aa or aA.
At this point, as I mentioned above, the reference sequence lets us look up a location on only one chromosome of each pair. For example, suppose you are using the hg19 reference and want to go to a specific position, say chr2, position 5500. You just click the chromosome icon, drag to chr2, and go to position 5500. In this process you do not need to select which copy of the pair you want to look at. There is only one copy of each chromosome in the reference sequence, NOT a pair. Why does the reference sequence contain only one chromosome of each pair? Or am I misunderstanding something?
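As a minimal sketch of the situation described above (an illustration only, not how any real variant caller works): reads from both chromosome copies are aligned to the same single reference coordinate, and zygosity is then inferred from the mix of bases seen there. The function name `call_genotype` and the `min_fraction` cutoff are hypothetical and chosen only for this example.

```python
from collections import Counter

def call_genotype(read_bases, min_fraction=0.2):
    """Toy genotype call at ONE reference coordinate (e.g. chr2, position 5500).

    read_bases: the bases observed at that coordinate in the aligned reads.
    Reads from both chromosome copies pile up on the same coordinate,
    so zygosity is inferred from the mix of bases, not from the reference.
    min_fraction is an arbitrary illustrative cutoff, not a standard value.
    """
    counts = Counter(read_bases)
    total = sum(counts.values())
    if total == 0:
        return None  # no coverage, nothing to call
    # Keep at most the two most frequent bases that pass the cutoff.
    alleles = [b for b, n in counts.most_common(2) if n / total >= min_fraction]
    if not alleles:
        return None
    if len(alleles) == 1:
        return f"{alleles[0]}/{alleles[0]}"  # homozygous
    return "/".join(sorted(alleles))         # heterozygous

print(call_genotype("AAAAAAAAAA"))  # A/A
print(call_genotype("AAAAAGGGGG"))  # A/G
```

In this picture the reference only provides the coordinate system; whether a site is homozygous or heterozygous is a property of the sequenced sample.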
I also have another question about a similar concept.
Why do homozygous deletion events matter for how the reads are grouped?
To explain, here is part of the paper "CREST maps somatic structural variation in cancer genomes with base-pair resolution".
The passage below is taken from it, and my second question is about the part on homozygous deletion events.
Many putative SV breakpoints have soft-clipped reads as well as wild-type reads because SVs usually occur either in a subset of tumors owing to tumor heterogeneity and/or are heterozygous event. Therefore, with the exception for homozygous deletion events, there are usually two groups of reads at a putative breakpoint.
(1) Many putative SV breakpoints have soft-clipped reads as well as wild-type reads because SVs usually occur either in a subset of tumors owing to tumor heterogeneity and/or are heterozygous event.
-> Because the event is heterozygous, one chromosome of the pair is likely to be normal whereas the other carries the structural variation. So the aligned reads can be divided into two subgroups: those that map to the normal sequence and those that map across the structural variation. Is that right?
(2) Therefore, with the exception for homozygous deletion events, there are usually two groups of reads at a putative breakpoint.
-> From this point of view, I can understand why there are usually two groups of reads: the event is heterozygous. However, I cannot figure out the first part of the sentence. Why are only homozygous deletions the exception?
There are various kinds of homozygous variation, from insertions to translocations, and I think that in all of them the reads should form one group, because the event is homozygous: both chromosomes of the pair are identical. So why do the authors single out homozygous deletions only? Is there some other aspect that I have missed?
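To make the "two groups of reads" in (1) concrete, here is a minimal Python sketch that splits reads at a putative breakpoint into soft-clipped and wild-type groups by looking at their CIGAR strings. It is not CREST's actual implementation. The function names, the `min_clip` threshold, and the assumption that the reads have already been selected as overlapping the breakpoint are all hypothetical and used only for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def is_soft_clipped(cigar, min_clip=5):
    """Return True if the read has a soft clip (S) of at least min_clip bases
    at either end. Hard clips (H) are ignored because they can sit outside S."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return False
    return any(op == "S" and n >= min_clip for n, op in (ops[0], ops[-1]))

def group_reads(cigars, min_clip=5):
    """Split reads (assumed to overlap one putative breakpoint) into two groups."""
    groups = {"soft_clipped": [], "wild_type": []}
    for cigar in cigars:
        key = "soft_clipped" if is_soft_clipped(cigar, min_clip) else "wild_type"
        groups[key].append(cigar)
    return groups

reads = ["100M", "60M40S", "35S65M", "100M", "3S97M"]
print(group_reads(reads))
```

With a heterozygous event, or a tumor subclone mixed with normal cells, both lists would be expected to be non-empty; the question is why the paper names only homozygous deletions as the case where that does not hold.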