Question: What does a zero expression level mean in the ENCODE RNA-Seq data?
pengcui1989 (China) wrote 6.4 years ago:

Dear all,

I'm new to bioinformatics and not very familiar with RNA-Seq data, so I have a very simple question about it. I have downloaded the long polyA+ RNA-Seq data from ENCODE (http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeCshlLongRnaSeq). It is in the Genes Gencode V7 view and contains expression levels (RPKM) for more than 50000 genes. I find that a great many genes have an expression level of 0, and I'm confused about whether they are expressed or not. They may indeed not be expressed. Or they may be highly expressed as polyA- RNA or microRNA, which a polyA+ extraction does not capture. I think the same problem applies to the transcript-level data. So I don't know how to use the polyA+ RNA-Seq data to determine a gene's expression level when the reported level is 0.

Can anyone help? Thank you very much!

encode rna-seq • 2.8k views

Are you calculating R/FPKM yourself from aligned reads (BAM files), or are you downloading some summarized version of the data? If it's the latter, could you share the link so that we can see what you're talking about?

written 6.4 years ago by Steve Lianoglou

OK, it is the latter. The link is 'http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeCshlLongRnaSeq'. I have also edited my question.

written 6.4 years ago by pengcui1989
Mikael Huss (Stockholm) wrote 6.4 years ago:

As you point out, this protocol picks up mostly polyadenylated transcripts, so it is not surprising that many Gencode genes show no expression: you would not expect any signal from non-poly-A transcripts or from microRNAs (the latter need a different protocol designed for short rather than long fragments). If you checked against RefSeq instead, you would probably see a higher proportion of non-zero RPKMs. Many transcripts are probably also genuinely unexpressed, since tissues don't express all transcripts available to them.

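For readers unfamiliar with the unit, here is a minimal sketch of how an RPKM value is computed from read counts, using the standard definition (reads per kilobase of transcript per million mapped reads). The function name and the example numbers are illustrative assumptions, not taken from the ENCODE files. The point is that an RPKM of exactly 0 simply means no reads were assigned to that gene in that library, whatever its length or the sequencing depth.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = 1e9 * reads_on_gene / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Illustrative numbers only: a 2 kb gene in a library of 30 million mapped reads.
expressed = rpkm(600, 2000, 30_000_000)
silent = rpkm(0, 2000, 30_000_000)   # zero reads always gives zero RPKM
print(expressed, silent)
```

A zero therefore says "no reads observed here", which is not by itself the same as "this gene is not transcribed".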

So how can we use RNA-Seq data if we can't figure out whether a gene is expressed or not?

written 6.4 years ago by pengcui1989

We can, but it isn't a perfect experiment that will magically give us all of the answers to life's mysteries :) If we are talking about poly-A transcripts, it means we are interested in protein-coding genes and not microRNAs. You can also do total RNA experiments and deplete the rRNA transcripts to look at mRNA and other RNA populations. So, for the gene whose expression level you want to ask about, first ask whether you expect to see it in your data at all. If not, you are looking at the wrong data. If you do expect it and see no reads, then it wasn't expressed in that experiment. Either it isn't normally expressed in that tissue/cell type, or it was down-regulated/turned off/lost.

written 6.4 years ago by Dan Gaston
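The reasoning in the reply above (first ask whether the gene is expected in this kind of library, then interpret the zero) can be sketched as a small triage function. This is only an illustration under stated assumptions: the biotype labels and the set `EXPECTED_IN_POLYA_PLUS` are hypothetical placeholders that you would replace with the gene types from your own annotation, and the function name is made up for this example.

```python
# ASSUMPTION: biotypes we expect a long polyA+ protocol to capture.
# Hypothetical, deliberately small set; adjust to your annotation.
EXPECTED_IN_POLYA_PLUS = {"protein_coding"}


def interpret_rpkm(rpkm_value, biotype, expected=EXPECTED_IN_POLYA_PLUS):
    """Classify one gene's RPKM following the 'do I expect it here?' logic."""
    if rpkm_value > 0:
        return "detected"
    if biotype in expected:
        # Expected in this library but no reads: treat as not expressed
        # in this experiment (or below the detection limit).
        return "not detected in this experiment"
    # Not expected in a polyA+ long-RNA library (e.g. microRNA):
    # a zero here says nothing about real expression.
    return "uninformative for this protocol"


# Illustrative records: (gene label, biotype, RPKM); values are made up.
examples = [
    ("geneA", "protein_coding", 12.3),
    ("geneB", "protein_coding", 0.0),
    ("geneC", "miRNA", 0.0),
]
for name, biotype, value in examples:
    print(name, interpret_rpkm(value, biotype))
```

Genes that land in the last category need a different dataset (for example short RNA-Seq or total RNA) before anything can be said about their expression.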