Question: Can We Compare The Total Reads Between A Tumor And Its Wild Type Before Calling Variants To Check The Global Landscape Of Tumor/Wild Type/iPS?
6.0 years ago by
Seattle,WA, USA
ivivek_ngs4.9k wrote:

Dear All,

I have a question and would appreciate any ideas or suggestions. I am analyzing exome data for normal samples, the corresponding tumor, and the IPS derived from that tumor. Even before calling variants, is there a way to do a global, read-level check on my samples to see whether the IPS resembles the wild type or the tumor more closely? Is there any read-based parameter I could use to compare the tumor, the wild type, and the IPS globally?

I am not sure how to achieve this at the read level, but I would like to give it a try. One idea is to use the exome BED file provided by the company to find which exonic coordinates have reads in each of the tumor, wild-type, and IPS samples, and then compare the overlap of those coordinates for tumor vs. wild type and tumor vs. IPS, to see whether the tumor is closer to the wild type or to the IPS. Is this feasible? If so, I would like suggestions on how to do it. Is there a publicly available script for this?


ADD COMMENTlink modified 6.0 years ago by Alex Paciorkowski3.4k • written 6.0 years ago by ivivek_ngs4.9k

By IPS cell, do you mean induced pluripotent stem cells? I'd be surprised if the process used to induce pluripotency had much of an effect on the underlying sequence (that would kind of defeat the purpose, in fact).

ADD REPLYlink written 6.0 years ago by Devon Ryan94k

Yes, IPS means induced pluripotent stem cells here. The IPS cells are tumor derived. I have already done the variant calling to check the mutational landscape of the tumor and its IPS, subtracting the normal variants common to both. But I would also like to do this at the read level, even before calling variants, just to understand globally whether the IPS more closely resembles the wild-type or the tumor exome. Can that be done with a script on the aligned BAM files, using the exome BED file for target enrichment provided by the company, and then matching the regions that have similar reads across the three samples?

ADD REPLYlink written 6.0 years ago by ivivek_ngs4.9k

If you already have variant calls, which are the key piece of information, I'm not sure what you hope to accomplish here. As I state below, you can take a close look at copy number events, but those are just another type of variant call.

ADD REPLYlink written 6.0 years ago by Chris Miller21k

Yes, looking at CNVs and indels would be another level of variant calling, to check whether the variants of the tumor and the IPS are similar. I have done this at the variant level, but I am not sure whether what I asked is feasible. I want to see whether checking the reads for each sample over the corresponding exome regions can tell me whether the IPS is closer to the tumor or to the wild type. As far as I know, the tumor is polyclonal and the IPS is monoclonal, so the match will not be very high, but at least I could make the point that my IPS is derived from the tumor. What I am interested in now is whether this can be inferred even before variant detection, by considering the reads in each exome region and matching the regions between the tumor, wild type, and IPS.

ADD REPLYlink written 6.0 years ago by ivivek_ngs4.9k

I suppose you could do a comparison of normalized read depth in the capture regions, which might give you an idea about CNVs at least (i.e., clustering the samples according to this should result in the tumor and iPS samples clustering together). You might also try to look at how the mapping quality varies between the samples (since the tumor and iPS will have more mutations, the mapping quality distribution may differ between them, though this will be aligner dependent). This last possibility has a number of problems, of course.
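The read-depth comparison suggested above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the per-region read counts are taken to be already extracted from the BAM files over the capture BED (for example with a tool such as `samtools bedcov` or `bedtools multicov`), and the function names, the pseudocount, and the toy numbers are all made up for the example.

```python
import math

def log_cpm(counts, pseudocount=0.5):
    """Scale raw per-region read counts to log2 counts-per-million.

    The pseudocount (an arbitrary choice here) avoids log(0) for empty regions.
    """
    total = sum(counts)
    return [math.log2((c + pseudocount) / total * 1e6) for c in counts]

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def depth_similarity(samples):
    """Return {(name_a, name_b): correlation} for every pair of samples."""
    norm = {name: log_cpm(counts) for name, counts in samples.items()}
    names = sorted(norm)
    return {
        (a, b): pearson(norm[a], norm[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Toy counts for six capture regions (invented numbers, illustration only).
# Region 2 is amplified and region 4 is lost in both tumor and iPS.
samples = {
    "normal": [100, 120, 90, 110, 105, 95],
    "tumor": [100, 240, 90, 55, 105, 95],
    "ips": [98, 250, 85, 50, 110, 100],
}

sim = depth_similarity(samples)
for pair, r in sorted(sim.items()):
    print(pair, round(r, 3))
```

On real exome data, region-to-region differences in capture efficiency tend to dominate raw depth, so all samples may look highly correlated. Comparing each sample's log ratio against the matched normal, or using a dedicated CNV caller, is likely to be more informative than this raw correlation.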

ADD REPLYlink written 6.0 years ago by Devon Ryan94k
6.0 years ago by
Chris Miller21k
Washington University in St. Louis, MO
Chris Miller21k wrote:

Sure. I'd start by looking at the somatic mutation landscape by comparing each to the normal sample (SNVs, indels, CNVs). You expect the tumor and iPS to be very similar; otherwise it's a poor model system for testing. One thing to watch out for: is the iPS derived from the founding clone of the tumor, or from a subclonal population? A deep dive into the somatic mutation frequencies can help answer that question. If you used a viral vector to induce pluripotency, you can also map reads to that along with the reference, and you may be able to resolve breakpoints of the integration sites.

ADD COMMENTlink written 6.0 years ago by Chris Miller21k
6.0 years ago by
United States
karl.stamm3.6k wrote:

If you have a BED file of where the reads are, see bedtools intersect (intersectBed). It's a simple and common way to subtract one list of intervals from another. Beware of entries with very low read counts: I don't know how your BED file was made, but you wouldn't be interested in a region supported by a single read. It's a bad idea to rely on subtle differences in read depth, because there are many sources of bias in your data generation. See CoNIFER for robust read-depth-based copy number variation calls. Finally, to see where your samples differ, you should do the variant calling.
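As a rough illustration of the overlap idea with a read-count cutoff, here is a small Python sketch. It assumes each sample has already been reduced to a mapping from target region to read count; the names `covered_regions` and `jaccard`, the `min_reads=10` cutoff, and the toy data are hypothetical choices for the example, not part of bedtools.

```python
def covered_regions(region_counts, min_reads=10):
    """Keep only regions supported by at least `min_reads` reads."""
    return {region for region, n in region_counts.items() if n >= min_reads}

def jaccard(a, b):
    """Shared regions divided by all regions seen in either sample."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Toy data keyed by (chrom, start, end); the counts are invented.
normal = {("chr1", 100, 200): 45, ("chr1", 300, 400): 30, ("chr2", 100, 200): 38}
tumor = {("chr1", 100, 200): 50, ("chr1", 300, 400): 2, ("chr2", 100, 200): 40}
ips = {("chr1", 100, 200): 60, ("chr1", 300, 400): 1, ("chr2", 100, 200): 42}

n_cov = covered_regions(normal)
t_cov = covered_regions(tumor)
i_cov = covered_regions(ips)

tumor_vs_ips = jaccard(t_cov, i_cov)
normal_vs_ips = jaccard(n_cov, i_cov)
print("tumor vs iPS:", tumor_vs_ips)
print("normal vs iPS:", normal_vs_ips)
```

Because exome capture targets the same regions in every sample, this presence/absence overlap will usually be close to 1 for all pairs and will mostly flag large deletions or capture failures, which is one reason variant calling remains the more informative comparison.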

FastQC can tell you whether there are gross differences in read quality or total read count, but I'd expect those to reflect sample-handling screwups rather than the condition. Still worth looking.

Finally for structural variation... I don't know any good tools.

ADD COMMENTlink written 6.0 years ago by karl.stamm3.6k

Structural variation is going to be dicey from exome sequencing anyway. The odds of getting reads that span breakpoints are slim.

ADD REPLYlink written 6.0 years ago by Chris Miller21k

I have done the variant calling and the SNP calls to check the mutational landscape, but that is a downstream process in which I only check the stringent SNPs. Even before calling variants, I am interested in whether I can make this kind of global comparison based on the reads in the exome data, to understand whether the IPS is closer to the tumor or to the wild type.

ADD REPLYlink written 6.0 years ago by ivivek_ngs4.9k