Question: How Does ENCODE Data Change Design Of NGS Experiments?
3
6.4 years ago by
Rochester, NY USA
Alex Paciorkowski3.3k wrote:

Now that we've had a week or so to digest the ENCODE publications (nice summary here), this is a question for those groups engaged in next-gen sequencing projects for gene discovery in human disorders. Most of you have probably focused on whole exome.

What elements of the ENCODE data set are ready or near-ready to include in future experiments that capture the "exome-plus"? Are groups designing targets for some of these regions for capture? Which ones? Enhancers? Promoters? Other long-range functional elements? Or do you suspect it's more efficient to just target the whole genome, so that data can be re-analyzed as functional annotations of the non-coding regions continue to improve? Interested in your responses.

encode • 2.3k views
written 6.4 years ago by Alex Paciorkowski3.3k
5
6.4 years ago by
Istvan Albert ♦♦ 79k
University Park, USA
Istvan Albert ♦♦ 79k wrote:

Allow me to demonstrate

enter image description here

modified 6.4 years ago • written 6.4 years ago by Istvan Albert ♦♦ 79k

so, are we going down from the big peak?

written 6.4 years ago by JC7.4k
1

oh I think we are at the technology trigger state

written 6.4 years ago by Istvan Albert ♦♦ 79k

:) Point taken. What're the values on the time axis? Minutes? Hours? Days?

written 6.4 years ago by Alex Paciorkowski3.3k

I've noticed that even though your proposed time steps span orders of magnitude, they are all of lengths that a person could easily tolerate. ;-) I have no idea.

I do think that the closer to release, the more of a race it is to find that low-hanging fruit. I have already heard a few talks by people who are interested in reverse engineering the data to find patterns, with little concern for the origins or meaning of it all. It is all binding, baby!

written 6.4 years ago by Istvan Albert ♦♦ 79k
0
6.4 years ago by
JC7.4k
Mexico
JC7.4k wrote:

I consider whole genome sequencing more reliable than exome capture techniques, because of bias and other missing factors. Besides, as you mentioned, it leaves you able to reanalyze regions later.

written 6.4 years ago by JC7.4k

But how would you use the ENCODE data?

written 6.4 years ago by Pierre Lindenbaum116k

I probably wouldn't. Many of the sequences come from immortalized cell lines, and I don't have a direct application in mind for that. For my current projects I prefer the 1000 Genomes data set.

modified 6.4 years ago • written 6.4 years ago by JC7.4k

Thanks, JC. Sure, whole genome seq provides more consistent coverage of the exome, but at what trade-off? In my neighborhood, WGS of 1 sample is about 4x the cost of whole exome of 1 sample, so you can exome a whole trio for less than WGS of 1 sample. Unfortunately funding influences experimental design, especially when data analysis has traditionally focused on the coding regions. My question was more about what elements of the ENCODE data can be incorporated into current analysis workflows.
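
The trio-versus-single-genome arithmetic above can be sketched in a few lines. This is only an illustration: the roughly 4x ratio is the figure quoted in this post, while the unit price and the function names are hypothetical placeholders, not real quotes from any provider.

```python
# Illustrative cost comparison. Only the ~4x ratio comes from the post;
# the unit price and all function names are hypothetical.
EXOME_COST = 1.0          # cost of one whole-exome sample, arbitrary unit
WGS_TO_EXOME_RATIO = 4.0  # "WGS of 1 sample is about 4x the cost of whole exome"


def trio_exome_cost(exome_cost: float = EXOME_COST) -> float:
    """Cost of exome sequencing a trio (proband plus both parents)."""
    return 3 * exome_cost


def single_wgs_cost(exome_cost: float = EXOME_COST,
                    ratio: float = WGS_TO_EXOME_RATIO) -> float:
    """Cost of whole genome sequencing for one sample."""
    return ratio * exome_cost


def trio_is_cheaper(ratio: float) -> bool:
    """True when an exome trio costs less than one WGS sample."""
    return trio_exome_cost() < single_wgs_cost(ratio=ratio)


if __name__ == "__main__":
    print("exome trio:", trio_exome_cost())
    print("single WGS:", single_wgs_cost())
    print("trio cheaper at 4x:", trio_is_cheaper(4.0))
    print("trio cheaper at 2.5x:", trio_is_cheaper(2.5))
```

At a 4x ratio the trio costs 3 units against 4 for one genome. The comparison stops favoring the trio once the WGS-to-exome ratio falls to 3x or below.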

written 6.4 years ago by Alex Paciorkowski3.3k

I agree about the money limitation, but I don't think people will expand their exome capture probes with ENCODE data right now. The problem is how much it can cost to design and produce specific probes for your regions; at some point, whole genome sequencing will be cheaper.

written 6.4 years ago by JC7.4k

True, changing the target capture can be expensive. So let's modify the question -- for those who focus on exome capture, at what point will the possible incorporation of ENCODE data into analysis justify the switch to WGS?

written 6.4 years ago by Alex Paciorkowski3.3k
0
6.4 years ago by
swbarnes24.8k
United States
swbarnes24.8k wrote:

Wasn't ENCODE highly permissive in what they were labeling as biologically active? I think people will have to validate that this stuff is biologically relevant. And maybe some of it will be, but probably not all of it.

As someone on another blog pointed out, the % of non-coding DNA differs widely among species. If so much of our non-coding DNA was important, how are some species getting on with so much less of it?

For instance, there are two closely related onions, and one has a genome 5x as large as the other. Does it make sense to think that one onion really has 5x more going on in its genome than another onion in the same genus?

One group took a mouse and deleted a 1 Mb region of intergenic DNA, and the mouse was phenotypically indistinguishable from wild-type. So if there was active stuff in that region, it wasn't doing much, at least in a lab setting.

http://www.nrcresearchpress.com/doi/abs/10.1139/g05-017

http://www.ncbi.nlm.nih.gov/pubmed/15496924

written 6.4 years ago by swbarnes24.8k