Hi.
I have been reading the paper "a practical method for the discovery of genomic rearrangements at breakpoint resolution", and there is a passage that I can't understand. It is about the 5' and 3' coordinates. The authors may simply not have described the process in detail, but I still want to understand it, so I would appreciate it if you could take a look. The passage is below.
In our example, the closest exon of each discordant read is used to cluster discordant reads into distinct gene-gene groups. For every group, a genomic region Ri is defined for each gene by taking the minimum of all 3' coordinates in the cluster and the maximum of all 5' coordinates in the same.
I roughly understand why the minimum of the 3' coordinates and the maximum of the 5' coordinates are taken, but I don't fully grasp the concept. I can't quite picture it.
Can anybody give me a more intuitive explanation?
Thank you for your advice.
Another question also just occurred to me.
Why not take the minimum of the 5' coordinates and the maximum of the 3' coordinates instead?
Is it just to minimize the gene fusion range?
So it looks like the figure below.
Well, if we have reads covering a region of the 5'FPG that is closer to its 5' end, it means the gene sequence is (most likely) intact there, while we're interested in getting as close as possible to the fusion junction.
Note that we're studying a cluster of reads, and we want to simply determine coverage boundaries. It is highly unlikely that gene sequence is lost/deleted to 5' of the aforementioned read.
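To make the bookkeeping concrete, here is a minimal sketch that takes the quoted sentence literally. It is not the paper's code: the record layout, the field names (`gene_a`, `five_prime_a`, and so on), the gene names and the coordinates are all made up for illustration, and how 5'/3' coordinates relate to strand is left out entirely.

```python
from collections import defaultdict

def cluster_regions(reads):
    """Group discordant reads by gene pair, then report for each gene the
    (min of 3' coordinates, max of 5' coordinates) named in the quoted text.

    The read layout is hypothetical: one dict per discordant pair, with one
    mate assigned to gene_a and the other to gene_b.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[(read["gene_a"], read["gene_b"])].append(read)

    regions = {}
    for pair, members in groups.items():
        regions[pair] = {}
        for side, gene in zip(("a", "b"), pair):
            min_3p = min(r[f"three_prime_{side}"] for r in members)
            max_5p = max(r[f"five_prime_{side}"] for r in members)
            regions[pair][gene] = (min_3p, max_5p)
    return regions

# Made-up coordinates, purely for illustration.
reads = [
    {"gene_a": "GENE1", "gene_b": "GENE2",
     "five_prime_a": 100, "three_prime_a": 150,
     "five_prime_b": 900, "three_prime_b": 950},
    {"gene_a": "GENE1", "gene_b": "GENE2",
     "five_prime_a": 120, "three_prime_a": 170,
     "five_prime_b": 880, "three_prime_b": 930},
    {"gene_a": "GENE3", "gene_b": "GENE4",
     "five_prime_a": 10, "three_prime_a": 60,
     "five_prime_b": 500, "three_prime_b": 550},
]

regions = cluster_regions(reads)
for pair, per_gene in regions.items():
    print(pair, per_gene)
```

Each gene in a group ends up with just two numbers summarizing where the cluster's reads sit; which of the two lies nearer the junction depends on the gene's orientation, which this sketch does not model.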
It is highly unlikely that gene sequence is lost/deleted to 5' of the aforementioned read.
Why?
If you check out fusion examples from RNA-Seq, or even aCGH data, you can see that the gene coverage profile (in the case of the 5'FPG) shows roughly uniform coverage of the first exons, with a drop after the fused exon. From a biological point of view, the 5'FPG commonly retains its promoter, while the 3'FPG retains its functional domains.