Question: How to determine 'how complete the RNA-seq data is'?
Ken wrote:

Hi all, just wondering if anyone has an idea of how to judge how complete an RNA-seq data set is? Of course this should depend on which genome it is from. Thanks in advance.

rna • 2.3k views
written 7.5 years ago by Ken

Perhaps you could give a definition of what you mean by 'complete'. All known loci covered by some number of reads? All splice variants represented? If that's your definition, you'll require billions of reads for a mammalian genome. What are you actually looking for?

written 7.5 years ago by seidel

This question makes no sense as is. Please clarify.

written 7.5 years ago by Neilfws

Hi seidel and neilfws, by 'complete' I mean 'all known loci covered by some number of reads', as seidel pointed out. Thanks.
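
For illustration, that definition can be turned into a number: the fraction of annotated loci that have at least some minimum read count. The sketch below is only one possible reading of it. The function name, the `min_reads` threshold, and the toy gene names are hypothetical, and it assumes you already have per-locus read counts from whatever counting tool you use.

```python
def fraction_loci_covered(read_counts, known_loci, min_reads=1):
    """Return the fraction of known loci with at least `min_reads` reads.

    read_counts: dict mapping locus ID -> number of reads assigned to it
    known_loci:  iterable of all annotated locus IDs (the denominator)
    """
    known_loci = list(known_loci)
    if not known_loci:
        raise ValueError("known_loci must not be empty")
    # Loci missing from read_counts are treated as having zero reads.
    covered = sum(1 for locus in known_loci
                  if read_counts.get(locus, 0) >= min_reads)
    return covered / len(known_loci)


# Toy example with made-up locus names and counts.
counts = {"geneA": 120, "geneB": 3, "geneC": 0}
loci = ["geneA", "geneB", "geneC", "geneD"]
print(fraction_loci_covered(counts, loci, min_reads=1))
print(fraction_loci_covered(counts, loci, min_reads=5))
```

Note that a locus that is simply not expressed in the sampled tissue will never be covered, so this fraction is not expected to reach 1 in practice.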

written 7.5 years ago by Ken

Larry_Parnell wrote:

You can look at gene representation in some fraction of your reads (say 50%) and compare how coverage changes as you add another 10% or 25% of the reads, for example. You can do this in terms of the total number of genes or mRNA isoforms observed, as well as the representation of a few select genes that are expressed at high, moderate and low levels. Basically, you would do this to see where discovery (of expressed genes) starts to plateau.

I have seen this approach presented at genome conferences.

Edit (6 Oct 2011): I don't recall seeing data from the group who authored the paper Istvan mentioned, but the results are indeed similar to those I have heard and observed others discuss. I suggest taking a good look at their figure 1, showing saturation curves. There is, however, much more to this paper that should be explored for those facing similar issues of gene coverage and saturation.
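
A minimal sketch of the subsampling idea described above, not the method from the paper: shuffle the reads once, take growing prefixes, and count how many genes are detected at each depth. It assumes each aligned read has already been assigned to a gene ID; the function name, the fractions, and the `min_reads` threshold are illustrative choices.

```python
import random


def saturation_curve(read_genes, fractions=(0.1, 0.25, 0.5, 0.75, 1.0),
                     min_reads=1, seed=0):
    """Return a list of (fraction, genes_detected) pairs.

    read_genes: list with one gene ID per aligned read
    fractions:  fractions of the reads to subsample
    min_reads:  reads required before a gene counts as detected
    """
    rng = random.Random(seed)
    shuffled = list(read_genes)
    rng.shuffle(shuffled)  # one shuffle, so the subsamples are nested
    curve = []
    for frac in sorted(fractions):
        n = int(round(frac * len(shuffled)))
        counts = {}
        for gene in shuffled[:n]:
            counts[gene] = counts.get(gene, 0) + 1
        detected = sum(1 for c in counts.values() if c >= min_reads)
        curve.append((frac, detected))
    return curve


# Toy data: one highly, one moderately and one lowly expressed gene.
reads = ["high"] * 900 + ["mid"] * 90 + ["low"] * 10
for frac, detected in saturation_curve(reads):
    print(f"{frac:.2f} of reads -> {detected} genes detected")
```

If the number of detected genes is still climbing between the last two fractions, more depth would probably reveal more genes; if it has flattened, the data are close to saturated for that threshold.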

modified 7.5 years ago • written 7.5 years ago by Larry_Parnell

Do you have a link to an example presentation that uses this? It would be nice to see what it looks like.

written 7.5 years ago by Gww

No, I don't have anything on hand. If I can find the time, I will try to redraw someone else's data - but that is risky...

written 7.5 years ago by Larry_Parnell

Oh, no worries then. Thank you, though.

written 7.5 years ago by Gww

@GWW, I think the paper Istvan suggests is the one.

written 7.5 years ago by Ken

Istvan Albert wrote:

For some ideas, consult the paper titled "Differential expression in RNA-seq: A matter of depth".

written 7.5 years ago by Istvan Albert

Michael Reich wrote:

Ken, the GenePattern software has a tool that can help you determine coverage by gene, locus, transcript, etc. It is called RNAseqMetrics and is available on the GenePattern server at http://genepattern.broadinstitute.org. A publication on this tool is in preparation. For general information you can go to http://www.genepattern.org.

Best, Michael

written 7.4 years ago by Michael Reich

Thank you! It is always great when an expert is available to give advice. Welcome to BioStar, Michael!

written 7.4 years ago by Larry_Parnell