Question: Where can I find downloadable paired solid tumor WES raw data (tumor/tumor margin)?
field654 (China/HangZhou/TigerTCR) wrote, 7 days ago:

I've been trying to analyze a pair of tumor/margin tissues but got very few meaningful somatic mutations. I'm not sure I picked the correct pipeline or parameters.

I wonder if there are any publications based on similar samples where the authors provided raw data that I can download and run through my pipeline. I could then compare my results to the published ones.
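The comparison described above could be sketched roughly as follows. This is a minimal illustration, not a validated benchmarking method: it assumes both call sets are available as plain VCF text on the same reference build with identically normalized variants, and the function names `read_variants` and `concordance` are made up for this example.

```python
from typing import Dict, Iterable, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def read_variants(lines: Iterable[str]) -> Set[Variant]:
    """Collect variant keys from VCF text lines, skipping headers and blanks."""
    variants: Set[Variant] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alts = fields[:5]
        # Multi-allelic records list several ALT alleles separated by commas.
        for alt in alts.split(","):
            variants.add((chrom, int(pos), ref, alt))
    return variants


def concordance(mine: Set[Variant], published: Set[Variant]) -> Dict[str, float]:
    """Summarize the overlap between my calls and the published calls."""
    shared = mine & published
    return {
        "shared": len(shared),
        "only_mine": len(mine - published),
        "only_published": len(published - mine),
        # Fraction of my calls that also appear in the published set.
        "precision": len(shared) / len(mine) if mine else 0.0,
        # Fraction of published calls that my pipeline recovered.
        "recall": len(shared) / len(published) if published else 0.0,
    }
```

With two uncompressed VCF files (the file names are hypothetical), one would call `concordance(read_variants(open("my_calls.vcf")), read_variants(open("published_calls.vcf")))`. A low recall against the published set would point to a pipeline or parameter problem rather than to the samples themselves.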

Thank you. Field

sequencing snp • 95 views
modified 5 days ago by colindaven • written 7 days ago by field654

To understand what "similar" means, you'd need to say what you're working on. Regardless, you will need to do the unpleasant work of searching PubMed for studies that have done WES in your tumor entity, and then download the data. Most of the time the data are available for download. If these are human data, they might be access-controlled at dbGaP or similar databases, so you'd need to apply for access.

If you are in doubt about the pipeline then use something established, e.g. the GATK workflow for WES.

https://github.com/gatk-workflows/gatk4-exome-analysis-pipeline

modified 7 days ago • written 7 days ago by ATpoint

Hi ATpoint,

Thank you so much for the reply. I've been working on a pair of breast cancer tissues.

Unfortunately, I can't put a whole lot of time into literature searching. I did a limited search but found no studies that provide raw data. Maybe the raw data are too large to download. I also emailed to ask how I can gain access to the NCI Genomic Data Commons, and I was told to follow these steps: https://gdc.cancer.gov/access-data/obtaining-access-controlled-data

I work at a startup company in Hangzhou, China. We currently don't have the manpower to prepare the materials for an NIH data access application and handle the annual renewal. That's why I've been looking for a shortcut by asking around on the forum.

Thank you so much.

Field

written 6 days ago by field654

Sorry, I forgot to thank you for recommending the GATK pipeline. As a beginner in bioinformatics, I struggle to choose among the several programs available for each step of the analysis. Papers do compare them but rarely give a definitive pick. I will definitely go over the GATK pipeline and see what I get.

written 6 days ago by field654
colindaven (Hannover Medical School) wrote, 5 days ago:

This paper might be interesting. I believe I have downloaded the data from there before to use for training:

https://www.nature.com/articles/sdata201610

written 5 days ago by colindaven

Hi Colin,

Thank you for providing the resource. Although I had some glitches when registering for the TCRB account, I have finally gotten access to the data. That's exactly what I've been searching for.

written 3 days ago by field654

Hi Colin,

Thank you so much for the resource. I didn't even know there was a journal just about data. I will try it out ASAP and mark the question as answered.

Best, Field

written 5 days ago by field654