I'm mining RNA-Seq data for viruses, and I often see that the nucleotide gaps (~50-100 nucleotides) that differentiate isoforms lie outside the predicted coding regions. I've been working under the assumption that isoforms of this type represent alternatively spliced variants, but that isn't consistent with my data if the variation sits outside the coding regions. One possible explanation is that the predicted coding regions are wrong, although for some transcripts I have > 90% amino acid similarity to references and near-complete viral genomes, so I think that is less likely in those cases. I suppose they could also be different viruses altogether. Or could these be artifacts of the assembly process? Any information that sheds light on why this happens would be greatly appreciated! Thanks!