Question: Where Can I Find FASTQ Data (NGS Raw Data) And Published Results?
10
9.8 years ago, Orca140 wrote:

I would like to reproduce some published results with my own analysis pipeline in order to validate it, so I need the corresponding datasets to be downloadable. If anyone has another idea for validating a pipeline, let me know!

Orc@

modified 9.6 years ago by Pablo Marin-Garcia1.8k • written 9.8 years ago by Orca140

What kind of analysis are you trying to perform?

written 9.8 years ago by Jts1.3k

Same question here: I'd like to find a set of FASTQ files related to a given article to show my students how to process this kind of data.

written 9.8 years ago by Pierre Lindenbaum129k

Thanks for your answers, but I'm looking for an article (published results) in which the raw data are available.

written 9.7 years ago by Orca140

I would like to analyze indels and SNPs in the human genome.

written 9.7 years ago by Orca140

I have a question. Since it is closely related, I chose to post it here.

Q: When you say that you want to validate your analysis pipeline with published or publicly available data, do you get any rights to publish results based on your analysis of somebody else's data? What are the norms for using publicly available NGS data?

modified 7 months ago by RamRS28k • written 6.2 years ago by rohan100
6
9.8 years ago, iw9oel_ad6.1k wrote:

From the NCBI Sequence Read Archive (SRA). To obtain FASTQ format, see the relevant section in the SRA handbook; you will probably need the SRA Toolkit for the conversion.
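For what it's worth, here is a minimal Python sketch of driving the SRA Toolkit's `fastq-dump` from a script. It assumes the toolkit is installed and on your `PATH`; the helper names (`fastq_dump_command`, `dump_run`) and the accession are only illustrative, and you should check the options against the toolkit version you have.

```python
import shlex
import shutil
import subprocess


def fastq_dump_command(run_accession, outdir="."):
    """Build a fastq-dump command line for one SRA run accession."""
    return [
        "fastq-dump",
        "--split-files",    # write paired reads to separate _1/_2 files
        "--gzip",           # compress the FASTQ output
        "--outdir", outdir,
        run_accession,
    ]


def dump_run(run_accession, outdir="."):
    """Run fastq-dump if the SRA Toolkit is installed, else raise."""
    if shutil.which("fastq-dump") is None:
        raise RuntimeError("fastq-dump not found; install the SRA Toolkit first")
    subprocess.run(fastq_dump_command(run_accession, outdir), check=True)


if __name__ == "__main__":
    # SRR000001 is used purely as an example accession; substitute the
    # run accession listed in the paper you want to reproduce.
    print(shlex.join(fastq_dump_command("SRR000001", "fastq")))
```

Printing the command first (as above) lets you inspect it before calling `dump_run`, which is handy because the downloads can be large.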

6
9.8 years ago, Daniel Swan13k (Aberdeen, UK) wrote:

You could also look into the European Nucleotide Archive at the EBI.
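A rough sketch of how one might look up the FASTQ files for an accession at ENA from Python. It assumes the ENA Portal API `filereport` endpoint and its `fastq_ftp` field behave as I understand them (semicolon-separated paths without a scheme), so please check the current ENA documentation; the function names are my own.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint of the ENA Portal API file report.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build a file-report URL for a study or run accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp,fastq_md5",
        "format": "tsv",
    }
    return ENA_FILEREPORT + "?" + urlencode(params)


def parse_filereport(tsv_text):
    """Return {run_accession: [FASTQ URLs]} from a file-report TSV."""
    runs = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # Paired-end runs list several files separated by ';'.
        paths = [p for p in row["fastq_ftp"].split(";") if p]
        runs[row["run_accession"]] = ["ftp://" + p for p in paths]
    return runs
```

You would fetch `filereport_url(...)` with whatever HTTP client you prefer (for example `urllib.request.urlopen`) and pass the decoded text to `parse_filereport`.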

5
9.8 years ago, Sean Davis26k (National Institutes of Health, Bethesda, MD) wrote:

NOTE: Shameless plug for our software....

You could have a look at the SRAdb R/Bioconductor package. We pull down all the metadata from the sequence read archives at EBI, NCBI, and DDBJ and consolidate that into a SQLite file that can be used from R or any other language that has a SQLite interface.
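As an illustration of the "any other language" part, here is a small Python sketch using the standard `sqlite3` module. The table name `sra` and the columns `run_accession` and `study_accession` are assumptions about the metadata file's schema, so list the tables first and adapt the query to what you actually find; the demo below runs against a toy in-memory database, not the real file.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def runs_for_study(conn, study_accession, table="sra"):
    """Return run accessions for a study.

    The table and column names are assumed, not confirmed; `table` must be
    a trusted identifier because it is interpolated into the SQL text.
    """
    query = (
        f"SELECT run_accession FROM {table} "
        "WHERE study_accession = ? ORDER BY run_accession"
    )
    return [run for (run,) in conn.execute(query, (study_accession,))]


if __name__ == "__main__":
    # Toy stand-in for the downloaded metadata file; for the real thing,
    # pass the path of the SQLite file to sqlite3.connect() instead.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sra (run_accession TEXT, study_accession TEXT)")
    conn.executemany(
        "INSERT INTO sra VALUES (?, ?)",
        [("RUN2", "STUDY1"), ("RUN1", "STUDY1"), ("RUN3", "STUDY2")],
    )
    print(list_tables(conn))
    print(runs_for_study(conn, "STUDY1"))
```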


A great idea! I will use it as soon as possible. Thanks.

written 9.8 years ago by Puthier250

Is there a SQLite interface for Perl?

written 6.2 years ago by rohan100
5
9.2 years ago, Pablo Marin-Garcia1.8k (Spain) wrote:

If you are looking for non-human sequences, you can use the European Nucleotide Archive (ENA) at EBI. But if the papers you are looking at use human data, you need to go to the European Genome-phenome Archive (EGA) at EBI or the database of Genotypes and Phenotypes (dbGaP) at NCBI. Be aware that, despite being public, almost all human data from research studies are covered by consent agreements, so you need to request access before you can download or use the data. Unless you specifically want to replicate a study, it would be easier to use data from the 1000 Genomes Project or similar projects, which you can download directly from the 1kg web site.

As a side note: for downloading these BIG data sets, it is better to use Aspera than FTP when that option is provided (see 1kg data access).

modified 9.1 years ago • written 9.2 years ago by Pablo Marin-Garcia1.8k