Question: How Do I Get Started Working With RNA-Seq Data
4.4 years ago
HNK110 wrote:

Hi everyone, I have to start working on RNA-seq and I am totally new to this approach. I have to work on data provided by the neurological department. The data set has 96 samples (reads in FASTQ files), derived from formalin-fixed, paraffin-embedded (FFPE) tissue. I have to determine somatic variation, gene expression, SNVs, and fusion genes between subgroups from the RNA-seq data. Can anyone help me work out how I should start? How do I analyse the RNA-seq and cancer genome data?

gene-expression rna-seq • 12k views
Welcome to Biostar! This is not a great question, as there are many guides to RNA sequence analysis online just a search away. I'd recommend finding one, starting to follow it, and if you get stuck, then come back and ask specific questions. You're more likely to get useful responses that way. Look at Section 6 here for more details. (Do your homework before posting)

I recently found this book, which I think does a good job of giving an overview of the RNA-Seq tools available and what they do. It also gives code snippets to explain how to execute them.

As slides:

on amazon:


3.3 years ago
Michele Busby1.8k
United States
Michele Busby1.8k wrote:

We have a blog post here that goes over basic concepts in RNA Seq:

I have to edit it (I've been told by complaining readers) to add something on normalization and I also want to add stuff on biases and complexity.

Since those are FFPE samples, some are likely to be of poor quality, so you will have a lot of biases, etc. That means just running the data through an existing pipeline may not be optimal, though it may be a good first step. For example, you may need to do something like principal component analysis to see what your confounders are. It's not trivial, but others have done it.
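To make the PCA suggestion a bit more concrete, here is a minimal sketch in Python. It assumes you already have a genes-by-samples matrix of raw read counts. The function name `pca_scores`, the log2 counts-per-million transform, and the simulated toy data are all illustrative assumptions, not part of any particular pipeline. The idea is simply to project samples onto the first few principal components and then check whether they group by something other than the biology you care about (batch, block age, RNA quality, and so on).

```python
import numpy as np


def pca_scores(counts, n_components=2):
    """Project samples onto principal components.

    counts: genes x samples array of raw read counts (assumed layout).
    Returns (scores, explained), where scores is samples x n_components
    and explained is the fraction of variance carried by each component.
    """
    counts = np.asarray(counts, dtype=float)
    # Crude library-size normalization: log2 counts per million (+1 pseudocount).
    lib_sizes = counts.sum(axis=0)
    log_cpm = np.log2(counts / lib_sizes * 1e6 + 1.0)
    # Center each gene across samples so PCA captures variation, not mean level.
    centered = log_cpm - log_cpm.mean(axis=1, keepdims=True)
    # SVD of the samples x genes matrix gives the principal components.
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Toy data (simulated, purely hypothetical): 200 genes, 6 samples,
# where the last 3 samples carry a strong "batch" shift in half the genes.
rng = np.random.default_rng(0)
base = rng.uniform(50, 500, size=(200, 1))
means = np.repeat(base, 6, axis=1)
means[:100, 3:] *= 4.0
toy_counts = rng.poisson(means)

toy_scores, toy_explained = pca_scores(toy_counts)
for i, (pc1, pc2) in enumerate(toy_scores):
    print(f"sample {i}: PC1={pc1:.2f} PC2={pc2:.2f}")
print("variance explained:", np.round(toy_explained, 3))
```

With real data you would colour or label the samples by each known covariate in turn; if PC1 or PC2 lines up with a technical covariate rather than your subgroups, that covariate is a candidate confounder to account for in the downstream model.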


4.4 years ago
Carlos Borroto1.7k
Washington Metropolitan Area
Carlos Borroto1.7k wrote:

I would start by reading this paper from the authors of the Tuxedo pipeline.

Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks.

This is a really good point; I always tell people the same thing. There is probably no better way to get started: just work through a paper or two.

Thank you so much. I have started reading this paper.

2.9 years ago
Washington University School of Medicine, St. Louis, USA
Malachi Griffith16k wrote:

We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at

This material was released alongside this publication:

Malachi Griffith*, Jason R. Walker, Nicholas C. Spies, Benjamin J. Ainscough, Obi L. Griffith*. 2015. Informatics for RNA-seq: A web resource for analysis on the cloud. PLoS Computational Biology 11(8):e1004393.

The Supplementary Information for this publication includes an extensive review of RNA-seq wet lab and analysis concepts, existing tools, common questions, etc.

All materials associated with this publication, including high resolution and original figure files, supplementary tables, etc. are available here:

This publication was inspired by workshops that we have taught at CBW, CSHL, and NYGC over the last few years. These workshops are ongoing and we hope to maintain and expand the content in the coming years.

2.5 years ago
United Kingdom
dnaseiseq190 wrote:


Just published:

A survey of best practices for RNA-seq data analysis

Genome Biology 2016, 17:13  doi:10.1186/s13059-016-0881-8

4.4 years ago
Charles Warden5.0k
Duarte, CA
Charles Warden5.0k wrote:

Somatic variant calling is really meant for DNA-Seq data. Although you can look for RNA-editing events with paired DNA-Seq and RNA-Seq data, I think you will have a hard time distinguishing true variants from tumor-specific RNA-editing events if you are comparing two RNA-Seq samples (or calling SNVs in an RNA-Seq sample against a reference genome).

For gene expression, I've included some benchmarks here (which I ran using paired tumor-normal RNA-Seq data):

I don't think there is a gold standard for gene fusion events, but I've liked chimerascan the best. TopHat-fusion is probably the most popular option.

3.3 years ago
Czh3180 wrote:

RNA-seq pipeline:

