Exon read distribution from single-cell data
3 months ago
Cheng Wei • 0

I was looking at the read counts from my single-cell sequencing data in IGV and noticed that very few reads map to the last few exons of Ezh2, while many reads map to the first few exons (even after normalizing for exon length). I am using 3' end sequencing, so I was expecting many more reads on the last few exons, since that is the 3' end. Is there a reason why this is the case, or are there any papers that describe how reads are distributed across exons?

3 months ago
benformatics

There should be a bias towards the 3' end if you did use 3' end sequencing. If there are poly-A tracts within the transcripts then you might get some internal transcript priming - which could explain your observations. Other than that, a screenshot would probably be more helpful to determine if it's some kind of artifact.
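If you want to check for internal poly-A tracts yourself, here is a minimal sketch of one way to scan a sequence for runs of A. The function name `find_a_tracts` and the threshold of 5 are just illustrative choices, and it assumes the sequence is given in transcript (mRNA sense) orientation; for a gene on the minus strand, the reference genome would show runs of T instead.

```python
import re

def find_a_tracts(seq, min_len=5):
    """Return (start, end) pairs (0-based, end-exclusive) for every run of
    at least `min_len` consecutive A's in `seq`.

    Assumption: `seq` is in transcript orientation (mRNA sense strand).
    """
    pattern = re.compile("A{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]

# Toy sequence (hypothetical, not from Ezh2): one run of 5 A's, one run of 3
example = "ccaaaaaGGAAAT"
tracts = find_a_tracts(example, min_len=5)
```

Keep in mind that short runs like AAAAA are common by chance, so finding one inside an exon only makes internal priming plausible; it does not prove it.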

The rightmost side is the 3' side (exon 20). There are a lot of reads at exon 20 (UTR), as expected. However, you can see that there are almost no reads near the last few exons, but a lot of reads in exons 5-12, which is the part that is unexpected to me. Is this some kind of artifact?

I mean, it's not 100% convincing, but a lot of your bigger peaks seem to be upstream of AAAAA, which might be a sign that they come from priming at internal A-rich sites rather than at the poly-A tail. Obviously, this is very speculative, but your coverage is highest at the 3' end, as expected. And at least 3 of the 5 tallest exon peaks in your data have a clear AAAAA motif within the exon itself (the first large exon peak to the left of the gene name in your screenshot, and then 2 of the 3 exons furthest to the right).
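To make the "peaks upstream of AAAAA" check a bit less eyeball-based, something like the sketch below could flag which peaks have an A-run shortly downstream. Everything here is an assumption for illustration: the function name `peaks_near_a_tract`, the 200 bp window, the 5 A threshold, and the idea that you already have peak positions in transcript coordinates along with the transcript sequence in sense orientation.

```python
import re

def peaks_near_a_tract(seq, peak_positions, window=200, min_len=5):
    """For each peak position (0-based, transcript coordinates), report
    whether a run of >= `min_len` A's starts within `window` bases
    downstream (3') of the peak.

    Assumptions: `seq` is the transcript in mRNA sense orientation, and
    `window` and `min_len` are arbitrary illustrative thresholds.
    """
    tract_starts = [m.start()
                    for m in re.finditer("A{%d,}" % min_len, seq.upper())]
    return [any(p <= t <= p + window for t in tract_starts)
            for p in peak_positions]

# Toy transcript (hypothetical): an A-run starting at position 50
toy_seq = "C" * 50 + "AAAAAA" + "G" * 300
flags = peaks_near_a_tract(toy_seq, [10, 100], window=100)
```

A peak that is flagged is only consistent with internal priming. Comparing the flagged fraction against randomly placed positions on the same transcript would give a rough sense of whether the association is more than chance.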