Question

De Novo RNA Seq Assembly, Spurious Isoforms?

3.4 years ago

jer364 • 0

Hi all,

I'm mining viruses in RNA-Seq data, and I often see that the nucleotide gaps (~50-100 nucleotides) that differentiate isoforms lie outside the predicted coding regions. I've been working under the assumption that isoforms of this type represent alternatively spliced variants, but that doesn't seem consistent with my data if the variation is outside the coding regions. One possible explanation is that the predicted coding regions are wrong, although for some transcripts I have > 90% amino acid similarity to references and near-complete viral genomes, so I think that is less likely in those cases. I suppose they could also be different viruses altogether. Or could these be artifacts of the assembly process? Any information that sheds light on why this happens would be greatly appreciated! Thanks!

RNA-Seq • 487 views

Updated 3.4 years ago by lieven.sterck 15k • written 3.4 years ago by jer364 • 0

I'm not an expert in viruses, but in other organisms this does not have to be the case. Isoforms, in the sense of alternative transcripts (which can arise via splicing differences, exon skipping, intron retention, ...), are defined at the transcript level (i.e. the mRNA) and as such have nothing to do with the coding region. You can have different isoforms that encode identical proteins, where the difference lies outside the coding region (for instance in a UTR).

I should add that most attention goes to variants where the variation affects the translated protein, but strictly speaking it does not have to.
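
To make the distinction concrete, here is a minimal sketch of how one might check where an isoform-distinguishing gap sits relative to a predicted coding region. Everything in it is an assumption for illustration: the names `Interval`, `classify_gap` and `changes_reading_frame` are hypothetical, coordinates are taken to be 0-based and half-open on the transcript, and the transcript is assumed to be in sense orientation. It does not represent the output of any particular assembler or ORF predictor.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A region on a transcript (hypothetical helper type)."""
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def classify_gap(gap: Interval, cds: Interval) -> str:
    """Report where a gap lies relative to the predicted CDS."""
    if gap.end <= cds.start:
        return "5' UTR"   # entirely upstream of the coding region
    if gap.start >= cds.end:
        return "3' UTR"   # entirely downstream of the coding region
    if gap.start >= cds.start and gap.end <= cds.end:
        return "CDS"      # entirely inside the coding region
    return "overlaps CDS boundary"


def changes_reading_frame(gap: Interval) -> bool:
    """A gap inside the CDS shifts the frame unless its length is a multiple of 3."""
    return (gap.end - gap.start) % 3 != 0


# Illustrative coordinates only, not taken from real data.
cds = Interval(120, 1620)
for gap in (Interval(30, 105), Interval(500, 575), Interval(1700, 1775)):
    print(gap, classify_gap(gap, cds), changes_reading_frame(gap))
```

Gaps classified as UTR leave the encoded protein unchanged, which fits isoforms that differ only at the transcript level. Gaps inside the CDS are the ones worth checking for a frame shift.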

3.4 years ago by lieven.sterck 15k

Thank you! This was very helpful!

3.4 years ago by jer364 • 0

"mining viruses in RNA Seq data"

Can you clarify how you are doing that?

3.4 years ago by GenoMax 141k