Count reads from isoforms in RNA-seq data
3.8 years ago
nicanicagm • 0

Hi,

This may be a basic question. I have RNA-Seq data from a tissue sampled at different time points. The reads are in the SRA database (for example, SRR6314256).

For a gene with many alternative splice variants, I want to get the relative read ratios between them. Say, I want to know whether splice variant A is more highly expressed than splice variant B or C.

How do I do this?

I have tried using magic-blast against an artificial transcriptome database that contains only the gene of interest, with all its splice variants, to count the number of reads mapping to each splice variant. But I don't know how to infer the contribution of each variant, especially when most exons are shared by at least 2 variants.

Thanks!

RNA-Seq next-gen isoforms splice variants • 1.1k views
3.8 years ago
h.mon 33k

Use Salmon or kallisto. Do not use only the gene of interest as reference; use the whole transcriptome, so that reads originating from other genes are not wrongly counted towards your isoforms (false-positive counts). Both tools are really fast, and quantifying each sample should take only a few minutes. To perform differential transcript expression analysis, you may use sleuth.
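
On the question of how reads from shared exons get split between variants: both tools assign ambiguous reads probabilistically, with an expectation-maximization (EM) style procedure. Below is a toy sketch of that idea in Python, not the actual implementation of either tool. The isoform names, read counts and effective lengths are made up for illustration, and real tools also model fragment lengths, biases and so on.

```python
def em_isoform_abundance(class_counts, eff_lengths, n_iter=200):
    """Toy EM for isoform quantification.

    class_counts: dict mapping a tuple of isoform names to the number of
                  reads compatible with exactly that set of isoforms.
    eff_lengths:  dict mapping isoform name to its effective length.
    Returns a dict of length-normalized relative abundances (sums to 1).
    """
    names = sorted(eff_lengths)
    total_reads = sum(class_counts.values())
    # alpha[t] = estimated fraction of reads coming from isoform t
    alpha = {t: 1.0 / len(names) for t in names}

    for _ in range(n_iter):
        assigned = {t: 0.0 for t in names}
        # E-step: split each group of ambiguous reads among its compatible
        # isoforms, in proportion to alpha[t] / effective length
        for group, count in class_counts.items():
            weights = {t: alpha[t] / eff_lengths[t] for t in group}
            norm = sum(weights.values())
            if norm == 0:
                continue
            for t, w in weights.items():
                assigned[t] += count * w / norm
        # M-step: re-estimate the read fractions from the soft assignments
        alpha = {t: assigned[t] / total_reads for t in names}

    # Convert read fractions to per-transcript relative abundance
    rho = {t: alpha[t] / eff_lengths[t] for t in names}
    rho_sum = sum(rho.values())
    return {t: rho[t] / rho_sum for t in names}


# Hypothetical gene with three isoforms of equal effective length
eff_lengths = {"A": 1000, "B": 1000, "C": 1000}
class_counts = {
    ("A", "B", "C"): 300,  # reads in exons shared by all three isoforms
    ("A",): 60,            # reads unique to A
    ("B",): 30,            # reads unique to B
    ("C",): 10,            # reads unique to C
}
abundance = em_isoform_abundance(class_counts, eff_lengths)
print({t: round(v, 3) for t, v in abundance.items()})
```

With these made-up numbers the shared reads end up split in the same proportions as the unique reads, so the estimates settle at about 0.6, 0.3 and 0.1 for A, B and C.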


Thanks h.mon. Do you know if Salmon or kallisto require downloading all the SRA reads? The total size of the whole experiment is 100Gb. I used magicblast because it can work with SRA accession numbers without downloading the complete set.

thanks


Meaning you used remote magicBLAST (sorry, I never used magicBLAST)?

There are tricks to stream the data from SRA (or ENA) directly into kallisto / Salmon, which means you don't have to save the data on disk, but it still has to be transferred to the local machine. The tricks:

http://www.nxn.se/valent/streaming-rna-seq-data-from-ena

https://standage.github.io/streaming-data-from-the-sra-with-fastq-dump.html

http://genomespot.blogspot.com.br/2015/01/sra-toolkit-tips-and-workarounds.html

fastq-dump allows streaming, so you may be able to feed Salmon / kallisto directly with it.

https://github.com/ncbi/sra-tools/issues/57
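
A rough, untested sketch of the fastq-dump streaming idea, written in Python so it can be looped over accessions. It only builds a bash command that uses process substitution to hand the output of `fastq-dump -Z` (FASTQ written to stdout) to `salmon quant`. Assumptions: single-end reads, a Salmon index already built from the whole transcriptome in a directory called `transcriptome_index` (a made-up name), automatic library type detection (`-l A`), and both tools on the PATH. Paired-end data would need a different setup.

```python
import shlex
import subprocess


def build_stream_command(accession, index_dir, out_dir, lib_type="A"):
    """Return a bash command that streams reads from fastq-dump into Salmon.

    Assumes single-end reads; index_dir and out_dir are hypothetical paths.
    """
    # fastq-dump -Z writes FASTQ records to stdout instead of to a file;
    # bash process substitution <( ... ) presents that stream as a file path
    reads = f"<(fastq-dump -Z {shlex.quote(accession)})"
    return (
        f"salmon quant -i {shlex.quote(index_dir)} -l {shlex.quote(lib_type)} "
        f"-r {reads} -o {shlex.quote(out_dir)}"
    )


def run_stream_quant(accession, index_dir, out_dir):
    """Run the streaming quantification (needs bash, sra-tools and Salmon)."""
    cmd = build_stream_command(accession, index_dir, out_dir)
    # Process substitution is a bash feature, so call bash explicitly
    subprocess.run(["bash", "-c", cmd], check=True)


# Only print the command here; call run_stream_quant(...) to execute it
cmd = build_stream_command("SRR6314256", "transcriptome_index", "quant/SRR6314256")
print(cmd)
```

Nothing is written to disk except the Salmon output directory, but the reads are still transferred over the network once per run.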

Some packages to facilitate SRA data streaming (I never tested them):

https://github.com/jdidion/ngstream