Question: Question about homologous concept in sequencing.
5.7 years ago by
Korea, Republic Of
mangfu100 wrote:


While reading a paper titled "An Algorithm for Gene Fusion Discovery in Tumor RNA-Seq Data", which is about gene fusion discovery, I had trouble understanding part of it.

Below is the paragraph from the paper that I find difficult, and I need someone to help me understand the fuzzy concept in it.

I have put the parts I cannot understand in bold.


Previous work on gene fusion detection from RNA-Seq

FusionSeq has been used to identify fusions in prostate tumor samples and cell lines [10,11]. While the methods used for these studies are capable of identifying genuine gene fusions, many challenges and limitations remain in the analysis of RNA-Seq data. For example, the aforementioned studies only considered reads that align uniquely to the genome. However, errors in next generation sequencing together with homologous and repetitive sequences shared between genes often produce ambiguous alignments of the short reads generated in RNA-Seq experiments. While resolving the ‘correct’ placement of these reads is often not possible, we propose that ambiguously aligning reads provide important evidence of real gene fusions, and therefore should be leveraged by analysis methods.


First, I am confused about the word "homologous".

I already know the concept of homology, but I can't connect it to the sequencing field.

Also, why "homo" instead of "hetero"? I thought homologous always refers to the same DNA in a chromosome pair.

Second, what does "shared between genes" mean?

I hope you understand, because I am very much a beginner in this field.

I am looking forward to your reply.

Thank you!

5.7 years ago by
cdsouthan wrote:

The clunky sentence that causes the confusion would be simpler if "homologous" were replaced with "paralogous gene families with high sequence similarity" (quasi-repeats, if you like).
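To make "shared between genes" concrete, here is a minimal sketch. The two gene sequences, the shared stretch, and the helper names are all made up for illustration; real paralogs are much longer and usually similar rather than identical. It only shows that when two genes contain the same stretch of sequence, a short read taken from that stretch is compatible with both of them.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


# Hypothetical toy paralogs: different flanks around one identical stretch.
SHARED_DOMAIN = "GATTACAGATTACC"
genes = {
    "geneA": "ATGGCC" + SHARED_DOMAIN + "TTTGA",
    "geneB": "ATGAAA" + SHARED_DOMAIN + "CGCTAA",
}


def candidate_genes(read: str) -> list[str]:
    """Names of all toy genes that contain the read as an exact substring."""
    return sorted(name for name, seq in genes.items() if read in seq)


# 8-mers that occur in both genes: these are the "shared" sequences.
shared = kmers(genes["geneA"], 8) & kmers(genes["geneB"], 8)
print(len(shared), "8-mers occur in both genes")

# A read from the shared stretch cannot be assigned to a single gene.
print(candidate_genes("TACAGATT"))
# A read that overlaps a gene-specific flank can.
print(candidate_genes("ATGGCCGA"))
```

A read that falls entirely inside the shared stretch is the kind of read the paper calls ambiguously aligning.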

The issue is the technical ability to distinguish real chimeric transcripts, arising from chromosomal rearrangements in vivo, from artifacts generated in silico when contigs are assembled from RNA-seq data. As said, aligning against the tumor genome may discriminate one from the other (on a good day).

5.7 years ago by
linus wrote:

The main problem is that you cannot map every read uniquely, and some reads cannot be mapped at all. "Uniquely" means at exactly one position in the genome.

Here they mention two different reasons: the first is homologous genes and the second is repetitive sequences. The second case is fairly obvious: there are regions in the genome that are highly repetitive. If you try to map a read against these, you will get multiple matches, so you cannot easily decide which one is correct. And if the repeats are shared between genes, it gets even worse.
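As a rough illustration of unique versus multiple matches, the sketch below looks up reads in a tiny made-up reference by exact string search. The reference, the repeat, and the function names are assumptions for the example only; real aligners use indexes and tolerate mismatches, which this does not.

```python
def find_all(reference: str, read: str) -> list[int]:
    """Return every 0-based position where read matches reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)  # step by 1 to allow overlapping hits
    return positions


def classify(reference: str, read: str) -> str:
    """Label a read by how many exact placements it has in the reference."""
    hits = find_all(reference, read)
    if not hits:
        return "unmapped"
    return "unique" if len(hits) == 1 else "ambiguous"


# Hypothetical toy genome: the same repeat occurs twice between unique stretches.
REPEAT = "GATTACAGATT"
reference = "TTGCA" + REPEAT + "CCGTGAC" + REPEAT + "AAGCT"

for read in ("CCGTGAC", "TTACAGA", "GGGGGGG"):
    print(read, find_all(reference, read), classify(reference, read))
```

The read taken from the repeat gets two equally good placements, and nothing in the read itself says which one it really came from.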

About the homologous gene problem I am not sure, but two genes are homologous to each other if they have the same ancestor, although their sequences can vary. If you map against your reference genome, a read might not map at all, because it comes from an altered homologous gene.
