How to distinguish between insertion and assembly error in transcripts?
3.9 years ago
wxh81 ▴ 10

I have some coding sequences of the ppc gene family from plant species. I translated the sequences into amino acids and aligned them with Clustal. In one region of the amino acid alignment, only one sequence appears to have an insertion, but I'm not sure whether it reflects a real insertion or an assembly error (for example, a chimera). The sequence otherwise looks fine: the CDS has both a start and a stop codon.

I have an image of part of the alignment: https://drive.google.com/open?id=1gAgweM_1conDUa3vKAwKE0if2Ow6s8gp

Does anybody have an idea how to distinguish between a real insertion and an assembly error? Thanks!

RNA-Seq Assembly sequence • 531 views
3.9 years ago

One way to check this is to align your input reads back to the assembled transcripts and inspect the coverage. If the coverage of the 'inserted' part deviates strongly from that of the surrounding sequence, it might indeed be an artefact.

However, you will most likely never be 100% sure, as it could just as well be a rare isoform you're picking up. But it will at least give you some indication of what's going on.
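
As a rough illustration of the coverage check described above, here is a minimal Python sketch. It assumes you have already mapped the reads back to the transcripts and written per-base depth with `samtools depth -a` (tab-separated: reference name, 1-based position, depth, with every position of the transcript present). The function names, the 100-base flank size, and the insertion coordinates are made up for the example, and what counts as "deviating much" is left for you to judge.

```python
from statistics import median


def parse_depth(lines, transcript_id):
    """Collect per-base depths for one transcript from `samtools depth -a` text.

    Assumes every position of the transcript is present, in order, so that
    list index i corresponds to 1-based position i + 1.
    """
    depths = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != transcript_id:
            continue
        depths.append(int(fields[2]))
    return depths


def coverage_ratio(depths, start, end, flank=100):
    """Median depth inside the candidate insertion divided by the flank median.

    start/end are 0-based, half-open coordinates on the transcript.
    flank is the number of bases taken on each side (an arbitrary choice).
    """
    inside = depths[start:end]
    flanks = depths[max(0, start - flank):start] + depths[end:end + flank]
    if not inside or not flanks:
        raise ValueError("insertion region or flanks are empty")
    flank_median = median(flanks)
    if flank_median == 0:
        raise ValueError("flanking region has no coverage")
    return median(inside) / flank_median


if __name__ == "__main__":
    # Hypothetical file name, transcript ID, and insertion coordinates.
    with open("depth.txt") as handle:
        depths = parse_depth(handle, "ppc_transcript_1")
    print(coverage_ratio(depths, start=900, end=990))
```

A ratio close to 1 means the 'inserted' part is covered about as well as its surroundings; a ratio far from 1 is the kind of deviation that hints at an artefact, although a rare isoform could produce the same pattern.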


I think I understand your point. That's a useful suggestion, thank you!

