Question: How to distinguish between insertion and assembly error in transcripts?
5 days ago by
wxh81 wrote:

I have some coding sequences of the ppc gene family from plant species. I translated the sequences into amino acids and aligned them with Clustal. In one region of the amino acid alignment, only one sequence appears to carry an insertion, but I'm not sure whether it reflects a real insertion or an assembly error (for example, a chimera). The sequence otherwise looks fine: the CDS has both a start and a stop codon.

I have an image of part of the alignment:

Does anybody have an idea how to distinguish between a real insertion and an assembly error? Thanks!

Tags: rna-seq, sequence, assembly
5 days ago by
VIB, Ghent, Belgium
lieven.sterck wrote:

One way to check this is to align your input reads back to the assembled transcripts and inspect the coverage. If the coverage of the 'inserted' part deviates strongly from that of the surrounding sequence, it might indeed be an artefact.

However, you will most likely never be 100% sure, as it could just as well be a rare isoform that you're picking up. But it will at least give you some indication of what's going on.
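The coverage comparison can be sketched roughly as follows. This is only an illustration, assuming the reads have already been mapped back to the transcripts with an aligner of your choice and that per-base depth was exported as tab-separated text (transcript name, 1-based position, depth), which is the layout `samtools depth -a` writes. The function names, the flank size, and the transcript name are hypothetical, and the insertion coordinates must be nucleotide positions on the transcript (so amino acid positions from the alignment have to be converted first).

```python
from statistics import median


def parse_depth(lines, transcript_id):
    """Read 'name<TAB>position<TAB>depth' lines into {position: depth}.

    Only lines belonging to transcript_id are kept; positions are 1-based.
    """
    depths = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != transcript_id:
            continue
        depths[int(fields[1])] = int(fields[2])
    return depths


def coverage_ratio(depths, start, end, flank=100):
    """Median depth inside [start, end] divided by the median depth of the flanks.

    flank is the number of bases taken on each side (an arbitrary choice).
    Returns None when the flanks have a median depth of zero.
    """
    if not depths:
        raise ValueError("no depth values for this transcript")
    last = max(depths)
    inside = [depths.get(p, 0) for p in range(start, end + 1)]
    left = [depths.get(p, 0) for p in range(max(1, start - flank), start)]
    right = [depths.get(p, 0) for p in range(end + 1, min(last, end + flank) + 1)]
    flanks = left + right
    if not inside or not flanks:
        raise ValueError("region or flanks are empty; check the coordinates")
    flank_median = median(flanks)
    if flank_median == 0:
        return None
    return median(inside) / flank_median


if __name__ == "__main__":
    # Hypothetical file and transcript name, for illustration only.
    with open("depth.tsv") as handle:
        d = parse_depth(handle, "my_transcript")
    print(coverage_ratio(d, start=301, end=390))
```

A ratio close to 1 means the candidate insertion is supported about as well as its neighbourhood; a ratio far from 1 is a reason for suspicion, with the caveat above that a rare isoform could give the same picture.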


I think I understand your point. That's a useful suggestion, thank you!

Reply written 5 days ago by wxh81