Question: A question about RNASeq experiment design
3.8 years ago by
Iran, Islamic Republic Of
nazaninhoseinkhan410 wrote:

Dear all,

I want to design an RNASeq project on cancer. The major goal of this project is to find a list of differentially expressed genes between cancer and normal tissues.

In most data sets, the deposited data consists of cancerous tissues along with adjacent normal tissues belonging to the same patient.

Now my question is: is it possible to compare the cancer tissue of a number of patients with normal tissues belonging to completely normal individuals? Is this comparison biologically meaningful?

I would appreciate any help. Thank you in advance.


rna-seq experiment design • 1.1k views
modified 3.8 years ago by Michele Busby 2.1k • written 3.8 years ago by nazaninhoseinkhan 410

Dear Nazanin, Hi.

I guess in this case you need multiple biological replicates from normal individuals to reduce the bias caused by individual variation in gene expression (and maybe your next question would be about "to pool or not to pool?").

The candidate genes that you are aiming for may influence your design, too. If you are searching for new up-regulated or down-regulated gene(s), minimizing individual variation is more important.

You can also check the pipelines of some papers or databases to see whether their “primary tumor” and “solid tissue normal” samples were from the same individuals or not.

~ Best

modified 3.8 years ago • written 3.8 years ago by Farbod 3.3k

Dear Farbod, Hi,

Thank you for your help



written 3.7 years ago by nazaninhoseinkhan 410
3.8 years ago by
Michele Busby2.1k
United States
Michele Busby2.1k wrote:

I don't know if a straight up differential expression analysis would work well here (e.g. some version of a t-test) but you may be able to do a more sophisticated analysis where you can compare your tumor data to your normal data by clustering it. There are many papers where people look for signatures of cancer this way. They usually require a lot of samples.
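As a rough illustration of the "straight up" per-gene comparison mentioned above, here is a minimal sketch of Welch's t-test for a single gene, treating the tumor and normal samples as two unrelated groups. The function name `welch_t` and the example values are hypothetical. It assumes log2-transformed, already normalized expression values, at least two samples per group, and nonzero variance. Dedicated packages such as DESeq2, edgeR, or limma model count data more carefully and would normally be preferred.

```python
import math
from statistics import mean, variance


def welch_t(tumor, normal):
    """Welch's t statistic and approximate degrees of freedom for one gene.

    tumor, normal: lists of log2 expression values from unrelated individuals
    (hypothetical inputs; each list needs at least two values).
    """
    n1, n2 = len(tumor), len(normal)
    # Squared standard error contributed by each group.
    se1 = variance(tumor) / n1
    se2 = variance(normal) / n2
    t = (mean(tumor) - mean(normal)) / math.sqrt(se1 + se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df


# Made-up values for a single gene, for illustration only.
t_stat, dof = welch_t([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
print(t_stat, dof)
```

With thousands of genes, the resulting p-values would also need a multiple-testing correction, which this sketch leaves out.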

It is important, if you are going to be directly analyzing the data, to make sure that libraries of the cancer and the comparison normal data are all prepared in the same way. You would need to do a lot of extra normalizing if you want to mix data from a poly A TruSeq protocol with data from, e.g. a protocol that uses RiboZero.
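One way to catch this problem at the design stage is to check whether library protocol is confounded with condition in the sample sheet. The sketch below is only an illustration: the function name `confounded_protocols`, the dictionary keys, and the sample records are all hypothetical.

```python
from collections import defaultdict


def confounded_protocols(samples):
    """Return protocols that were used for only one condition.

    samples: list of dicts with hypothetical keys "condition" and "protocol".
    """
    conditions_by_protocol = defaultdict(set)
    for sample in samples:
        conditions_by_protocol[sample["protocol"]].add(sample["condition"])
    # A protocol seen in a single condition cannot be separated from it.
    return sorted(
        protocol
        for protocol, conditions in conditions_by_protocol.items()
        if len(conditions) < 2
    )


# Made-up sample sheet: every tumor is poly A, every normal is RiboZero.
sheet = [
    {"condition": "tumor", "protocol": "polyA"},
    {"condition": "tumor", "protocol": "polyA"},
    {"condition": "normal", "protocol": "RiboZero"},
]
print(confounded_protocols(sheet))
```

If every protocol shows up in the result, as in this made-up sheet, protocol differences and tumor-versus-normal differences cannot be told apart without extra normalization.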

Also, it is important to know that differences in sample handling can introduce big artifacts into the data. Fresh frozen tissue will usually be in better shape than FFPE samples, but even then things like how long it took to process the tissue will affect the data. Without good handling the RNA will break up into alphabet soup. Then if you use a poly A protocol you will have a huge 3' bias, because the 5' end is no longer joined to the poly A tail. This will show up in the data as length bias when you compare the samples and needs to be normalized out before analysis. This is an issue because the normal tissue is often from deceased donors, and obviously it is difficult to just go in and take the tissue.
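To make the 3' bias concrete, here is a minimal sketch of one possible per-transcript measure: the mean coverage over the 3'-most part of the transcript divided by the mean coverage over the whole transcript. The function name `three_prime_bias`, the 20% tail fraction, and the coverage values are assumptions chosen for illustration, not a standard metric from any particular tool.

```python
def three_prime_bias(coverage, tail_fraction=0.2):
    """Ratio of mean coverage in the 3' tail to mean coverage overall.

    coverage: per-base read depth along one transcript, ordered 5' to 3'
    (hypothetical input). A value near 1 suggests even coverage; larger
    values suggest reads piling up at the 3' end.
    """
    if not coverage or sum(coverage) == 0:
        raise ValueError("transcript has no coverage")
    tail_len = max(1, round(len(coverage) * tail_fraction))
    tail_mean = sum(coverage[-tail_len:]) / tail_len
    overall_mean = sum(coverage) / len(coverage)
    return tail_mean / overall_mean


# Made-up coverage profiles: an intact sample and a degraded one.
intact = [10] * 10
degraded = [0] * 8 + [50] * 2
print(three_prime_bias(intact), three_prime_bias(degraded))
```

Comparing the distribution of such a ratio between the tumor and normal samples is one quick way to see whether degradation differs between the groups.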

There are computational ways to smooth out these differences and get meaningful results. There are some in the GTEx papers. But it is better to consider these things at the design phase so you can minimize them if possible.

Finally, big numbers are your friend.

I assume that you have already looked through existing RNA-Seq datasets to see if the data you need to answer your question already exists. You may also want to look at Oncomine. It also includes a lot of microarray studies and the data is pretty easy to interrogate. The cancer you are looking at might be in there. Existing datasets are also good for telling you how many replicates you are going to need.
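As a back-of-the-envelope illustration of how an existing dataset can inform the number of replicates, the sketch below uses the textbook normal-approximation formula for a two-group comparison of a single gene. The function name `samples_per_group` and all numbers are hypothetical; `sd` stands for the between-individual standard deviation of log2 expression estimated from existing data, and `delta` for the smallest log2 fold change you want to detect.

```python
import math
from statistics import NormalDist


def samples_per_group(sd, delta, alpha=0.05, power=0.8):
    """Approximate samples needed per group for a two-sided two-sample test.

    sd: standard deviation of log2 expression within a group (assumed equal
        in both groups and estimated from existing data).
    delta: smallest difference in mean log2 expression worth detecting.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Made-up inputs: sd of 1 log2 unit, aiming to detect a 2-fold change.
print(samples_per_group(sd=1.0, delta=1.0))
```

This is for one gene at an uncorrected significance level. Testing thousands of genes calls for a much smaller `alpha`, which pushes the required numbers up, so treat the result as a lower bound.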

written 3.8 years ago by Michele Busby 2.1k

Thank you for your comprehensive explanation


written 3.7 years ago by nazaninhoseinkhan 410