Question

Minimum reads for splicing event analysis?

3.9 years ago

Omics data mining ▴ 260

Hello

I am working on the human genome. It is always recommended to choose the sequencing depth based on the purpose of the experiment. In my project, I need to estimate isoform expression (Ψ values, for “Percent Spliced In” or “Percent Spliced Isoform”) and to identify splicing events across multiple samples, so I need an RNA-Seq depth that is sufficient for isoform quantification. What is the minimum number of reads required for splicing analysis from RNA-Seq data?
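
For reference, Ψ for a single event is usually computed as the share of reads supporting inclusion among all reads informative for that event. The snippet below is only a minimal sketch of that ratio: the function name `psi`, the `min_total` coverage cutoff and the read counts are hypothetical, and real tools additionally normalise for the number of junctions and for effective length.

```python
def psi(inclusion_reads, exclusion_reads, min_total=10):
    """Percent Spliced In for one event, from junction read counts.

    min_total is an assumed coverage cutoff: with too few informative
    reads the ratio is too noisy to report, so None is returned.
    """
    total = inclusion_reads + exclusion_reads
    if total < min_total:
        return None  # not enough reads over this event at this depth
    return inclusion_reads / total


# Hypothetical counts for one cassette exon in one sample
print(psi(inclusion_reads=30, exclusion_reads=10))
# The same event at much lower coverage is not reported
print(psi(inclusion_reads=2, exclusion_reads=1))
```

The cutoff is where sequencing depth matters: the fewer reads per sample, the more events fall below it.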

I will appreciate all suggestions.

A

alternative splicing isoform expression • 1.2k views

ADD COMMENT • link updated 3.9 years ago by lieven.sterck 15k • written 3.9 years ago by Omics data mining ▴ 260

score 1 · Answer 1 · 2020-06-15

3.9 years ago

lieven.sterck 15k

As you might have figured out, this also depends on the kind of isoform you wish to analyse. If they are very rare splice variants, you will need to sequence deeper. Conversely, a moderate depth will let you pick up the fairly 'common' isoforms, and the deeper you sequence, the more rare isoforms you can potentially pick up.
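
To put rough numbers on this reasoning, here is a back-of-the-envelope sketch. It assumes that the reads landing on an isoform-specific junction follow a Poisson distribution. The function name `prob_detect` and all parameter values (fraction of reads from the gene, isoform share, fraction of reads spanning the junction, minimum read support) are made up for illustration and are not taken from any tool.

```python
import math


def prob_detect(total_reads, gene_fraction, isoform_fraction,
                junction_fraction=0.05, min_reads=5):
    """Chance of seeing >= min_reads reads on an isoform-specific junction.

    Assumed simple Poisson model; every parameter is a hypothetical input:
      gene_fraction     - fraction of all reads coming from the gene
      isoform_fraction  - share of the gene's transcripts that are this isoform
      junction_fraction - fraction of those reads spanning the junction
    """
    # Expected number of reads on the diagnostic junction
    lam = total_reads * gene_fraction * isoform_fraction * junction_fraction
    # P(X < min_reads) for X ~ Poisson(lam)
    p_below = sum(math.exp(-lam) * lam ** k / math.factorial(k)
                  for k in range(min_reads))
    return 1.0 - p_below


# Compare a common (50%) and a rare (2%) isoform at several depths
for depth in (10e6, 40e6, 100e6):
    common = prob_detect(depth, gene_fraction=1e-5, isoform_fraction=0.5)
    rare = prob_detect(depth, gene_fraction=1e-5, isoform_fraction=0.02)
    print(f"{depth / 1e6:.0f}M reads: common={common:.3f}, rare={rare:.3f}")
```

Under these assumptions the common isoform is picked up at moderate depth, while the detection probability for the rare one only climbs as depth increases.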

I assume you are talking about Illumina (short) reads here. If so, it might be good to also have a look at the long-read technologies (PacBio, ONT): for truly correct full-length isoforms you will only have strong evidence if the isoform is obtained from long reads, as these give you a better view of the full transcripts than assembling them from short reads does (assembly potentially introduces many false positives).

ADD COMMENT • link 3.9 years ago by lieven.sterck 15k

Thanks, lieven, for the quick response.

I am talking about short reads from RNA-Seq data. There are different ways to perform the splicing analysis: reference-based and de novo. I managed to get paired-end RNA-Seq data with 44.3 million reads (QC filtered) per sample. Can I use it for the discovery of 'common' isoforms?

Including long-read technologies (PacBio, ONT) would surely increase the power to pick up the correct isoforms.

A

ADD REPLY • link 3.9 years ago by Omics data mining ▴ 260

Yes, if we are talking about short reads, then paired-end ones are indeed the most usable. With that number of reads you might even get more than only the common ones.

Personally, I would start with the reference-based approaches (given that a genome is available for this species) and only in a second phase go the de novo route, as that one will give you a much noisier view of things.

ADD REPLY • link 3.9 years ago by lieven.sterck 15k