Count reads from isoforms in RNA-seq data
4.6 years ago
nicanicagm • 0

Hi,

This may be a basic question. I have RNA-Seq data from a tissue sampled at different time points. The reads are in the SRA database (for example, SRR6314256).

For a gene with many alternatively spliced variants, I want to get the relative read ratios between them. Say, I want to know whether splice variant A is more highly expressed than splice variant B or C.

How do I do this?

I have tried Magic-BLAST against an artificial transcriptome database containing only the gene of interest, with all its splice variants, to count the number of reads mapping to each variant. But I don't know how to infer the contribution of each variant, especially when most exons are shared by at least two variants.

Thanks!

RNA-Seq next-gen isoforms splice variants • 1.3k views
4.6 years ago
h.mon 34k

Use Salmon or kallisto. Do not use only the gene of interest as the reference; use the whole transcriptome, so that reads originating from other genes are not wrongly assigned to your variants (false-positive counts). Both tools are really fast, and quantifying each sample should take only a few minutes. To perform differential transcript expression analysis, you may use sleuth.
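Once a sample is quantified, the relative ratios between variants can be read off the per-transcript abundances. Below is a minimal Python sketch, assuming Salmon's `quant.sf` output (tab-separated, with `Name`, `Length`, `EffectiveLength`, `TPM` and `NumReads` columns). The function name `isoform_fractions`, the transcript IDs and all the numbers are made up for illustration; they are not from the original data.

```python
import csv
import io


def isoform_fractions(quant_sf_text, transcript_ids):
    """Return each listed transcript's share of the summed TPM of the listed set.

    quant_sf_text: contents of a Salmon quant.sf file (tab-separated text).
    transcript_ids: IDs of the splice variants of the gene of interest.
    """
    wanted = set(transcript_ids)
    reader = csv.DictReader(io.StringIO(quant_sf_text), delimiter="\t")
    tpm = {row["Name"]: float(row["TPM"]) for row in reader if row["Name"] in wanted}
    total = sum(tpm.values())
    if total == 0:
        # None of the variants is expressed; avoid dividing by zero.
        return {name: 0.0 for name in tpm}
    return {name: value / total for name, value in tpm.items()}


# Hypothetical quant.sf content: three variants of one gene plus an unrelated transcript.
EXAMPLE_QUANT_SF = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "variant_A\t2000\t1800.0\t30.0\t540.0\n"
    "variant_B\t1500\t1300.0\t10.0\t130.0\n"
    "variant_C\t2500\t2300.0\t60.0\t1380.0\n"
    "other_gene_tx\t1000\t800.0\t900.0\t7200.0\n"
)

fractions = isoform_fractions(EXAMPLE_QUANT_SF, ["variant_A", "variant_B", "variant_C"])
for name, share in sorted(fractions.items()):
    print(f"{name}\t{share:.2f}")
```

TPM is used here rather than raw read counts because it is already normalised for transcript length, so a long variant does not look more abundant simply because it collects more reads. For a real file, pass `open("quant.sf").read()` as the first argument.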


Thanks, h.mon. Do you know whether Salmon or kallisto require downloading all the SRA reads? The total size of the whole experiment is 100 Gb. I used Magic-BLAST because it can work with SRA accession numbers without downloading the complete set.

Thanks


"I used magicblast because it can work with SRA accession numbers without downloading the complete set."

Do you mean you ran Magic-BLAST remotely? (Sorry, I have never used Magic-BLAST.)

There are tricks to stream the data from SRA (or ENA) directly into kallisto / Salmon, which means you don't have to save the data on disk, but it still has to be transferred to the local machine. The tricks:

http://www.nxn.se/valent/streaming-rna-seq-data-from-ena

https://standage.github.io/streaming-data-from-the-sra-with-fastq-dump.html

http://genomespot.blogspot.com.br/2015/01/sra-toolkit-tips-and-workarounds.html

fastq-dump allows streaming, so you may be able to feed Salmon / kallisto directly with it.

https://github.com/ncbi/sra-tools/issues/57
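As a rough, untested sketch of that idea in Python: `fastq-dump -Z` writes FASTQ to standard output, and that stream is piped into `salmon quant`. Several things here are assumptions: that the run is single-end, that Salmon will read from `/dev/stdin` on your system (it is documented to read from named pipes and process substitution, so this should be equivalent, but check), and that a whole-transcriptome index already exists. The function names and the index and output paths are placeholders.

```python
import subprocess


def streaming_commands(accession, index_dir, out_dir):
    """Build the two commands; nothing is executed here."""
    # -Z sends the FASTQ records to stdout instead of writing a file.
    dump_cmd = ["fastq-dump", "-Z", accession]
    # Assumption: single-end reads (-r), library type auto-detected (-l A),
    # and Salmon reading the piped FASTQ through /dev/stdin.
    quant_cmd = [
        "salmon", "quant",
        "-i", index_dir,
        "-l", "A",
        "-r", "/dev/stdin",
        "-o", out_dir,
    ]
    return dump_cmd, quant_cmd


def run_streaming(accession, index_dir, out_dir):
    """Pipe fastq-dump into salmon so the reads are never stored on disk."""
    dump_cmd, quant_cmd = streaming_commands(accession, index_dir, out_dir)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    quant = subprocess.Popen(quant_cmd, stdin=dump.stdout)
    # Close our copy of the pipe so fastq-dump sees a broken pipe if salmon exits early.
    dump.stdout.close()
    quant_rc = quant.wait()
    dump_rc = dump.wait()
    if dump_rc != 0 or quant_rc != 0:
        raise RuntimeError(f"fastq-dump exited {dump_rc}, salmon exited {quant_rc}")


# Example (placeholder paths): show the commands without running them.
dump_cmd, quant_cmd = streaming_commands("SRR6314256", "transcriptome_index", "SRR6314256_quant")
print(" ".join(dump_cmd), "|", " ".join(quant_cmd))
```

Calling `run_streaming("SRR6314256", "transcriptome_index", "SRR6314256_quant")` would start the actual transfer. The reads are still downloaded in full, they are just not written to disk.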

A package to facilitate SRA data streaming (I have never tested it):

https://github.com/jdidion/ngstream
