Question

Can We Compare Total Reads Between A Tumor And Its Wild Type Before Calling Variants To Check The Global Landscape Of Tumor/Wild Type/iPS?

10.2 years ago

ivivek_ngs ★ 5.2k

Dear All,

I have a question and would appreciate any ideas or suggestions. I am analyzing exome data for normal samples, the corresponding tumor, and the IPS derived from that tumor. Even before calling variants, is there a way to do a global check on my samples at the read level, to understand whether the IPS resembles the wild type or the tumor more closely? Is there any read-based parameter I could compare between tumor, wild type, and IPS that would show this globally?

I am not sure how to achieve this at the read level, but I would like to give it a try. My idea is to use the exome BED files provided by the company to find which exonic coordinates are covered by reads in each of the tumor, wild-type, and IPS samples, and then compare the overlap of those covered coordinates (tumor vs. wild type and tumor vs. IPS) to see which sample is closest to which. Is this feasible? If so, I would like suggestions on how to do it. Is there a publicly available script for this?

Thanks

exome-sequencing reads variant-calling • 3.0k views

ADD COMMENT • link updated 10.2 years ago by Alex Paciorkowski 3.5k • written 10.2 years ago by ivivek_ngs ★ 5.2k

By IPS cell, do you mean induced pluripotent stem cells? I'd be surprised if the process used to induce pluripotency had much of an effect on the underlying sequence (that would kind of defeat the purpose, in fact).

ADD REPLY • link 10.2 years ago by Devon Ryan 104k

Yes, IPS means induced pluripotent stem cells here. The IPS cells are tumor-derived. I have already done the variant calling to compare the mutational landscape of the tumor and its IPS, after subtracting the normal variants common to both. But I would also like to do this at the read level, before calling variants, just to understand globally whether the IPS resembles the wild type or the tumor exome more closely. Can that be done with a script on the aligned BAM files, using the exome BED file for target enrichment provided by the company, and then matching the regions that have similar reads across the three samples?

ADD REPLY • link 10.2 years ago by ivivek_ngs ★ 5.2k

If you already have variant calls, which are the key piece of information, I'm not sure what you hope to accomplish here. As I state below, you can take a close look at copy number events, but those are just another type of variant call.

ADD REPLY • link 10.2 years ago by Chris Miller 22k

Yes, looking at CNVs and indels would be another level of variant calling, just to be sure whether the variants in the tumor and the IPS are similar. I have done this at the variant level. But I am not sure whether what I asked is feasible. I want to try checking the reads for each sample over the corresponding exome regions, to see whether that lets me infer if the IPS is closer to the tumor or to the wild type. As far as I know, the tumor is polyclonal and the IPS is monoclonal, so the resemblance will not be very high, but at least I could make the point that my IPS is derived from the tumor. I am interested in whether this can be shown even before variant detection, as I posted, by considering the reads in each exome region and matching the regions between tumor, wild type, and IPS.

ADD REPLY • link 10.2 years ago by ivivek_ngs ★ 5.2k

I suppose you could do a comparison of normalized read depth in the capture regions, which might give you an idea about CNVs at least (i.e., clustering the samples according to this should result in the tumor and iPS samples clustering together). You might also try to look at how the mapping quality varies between the samples (since the tumor and iPS will have more mutations, the mapping quality distribution may differ between them, though this will be aligner dependent). This last possibility has a number of problems, of course.
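A minimal sketch of the normalized read-depth comparison, assuming you already have per-target read counts for each sample over the same capture regions, in the same order. The counts, function names, and pseudocount below are made up for illustration. Each sample is scaled to counts per million, expressed as a log2 ratio against the matched normal, and the tumor and iPS ratio profiles are then correlated. A correlation near 1 would suggest shared copy-number deviations; it says nothing about point mutations.

```python
import math

def cpm(counts):
    """Scale raw per-target read counts to counts per million."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]

def log2_ratio(sample, normal, pseudocount=1.0):
    """Per-target log2 ratio of library-size-normalized depth, sample vs. normal."""
    s, n = cpm(sample), cpm(normal)
    return [math.log2((a + pseudocount) / (b + pseudocount)) for a, b in zip(s, n)]

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical read counts over six capture targets (same order in every sample).
normal = [100, 200, 150, 120, 180, 90]
tumor = [210, 400, 900, 120, 350, 200]   # deeper library; target 3 gained, target 4 lost
ips = [105, 190, 460, 55, 170, 95]

tumor_lr = log2_ratio(tumor, normal)
ips_lr = log2_ratio(ips, normal)
r = pearson(tumor_lr, ips_lr)
print(f"tumor vs iPS log2-ratio correlation: {r:.3f}")
```

With real data the ratios are noisy, so targets with very low depth in the normal are usually dropped before correlating.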

ADD REPLY • link 10.2 years ago by Devon Ryan 104k

score 1 · Answer 1 · 2014-02-21

Sure. I'd start by looking at the somatic mutation landscape by comparing each to the normal sample (SNVs, indels, CNVs). You expect the tumor and iPS to be very similar, otherwise it's a poor model system for testing. One thing to watch out for: is the iPS derived from the founding clone of the tumor, or from a subclonal population? A deep dive into the somatic mutation frequencies can help answer that question. If you used a viral vector to induce pluripotency, you can also map reads to that along with the reference and may be able to resolve breakpoints of the integration sites.
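One way to picture the "deep dive into the somatic mutation frequencies" is sketched here, under strong simplifying assumptions: heterozygous variants in diploid regions, no purity correction, and arbitrary cutoffs (`clonal_min`, `present_min`) that are hypothetical, not recommended values. The variant names and read counts are invented. The idea is to compute the variant allele fraction (VAF) in each sample, label each variant as clonal or subclonal in the tumor, and check whether the iPS carries it.

```python
from collections import Counter

def vaf(ref_reads, alt_reads):
    """Variant allele fraction: alt reads over total reads at the site."""
    depth = ref_reads + alt_reads
    return alt_reads / depth if depth else 0.0

def classify(variants, clonal_min=0.35, present_min=0.2):
    """Label each somatic variant by tumor clonality and presence in the iPS.

    Both thresholds are illustrative assumptions, not validated cutoffs.
    """
    labels = {}
    for name, (t_ref, t_alt, i_ref, i_alt) in variants.items():
        tumor_class = "clonal" if vaf(t_ref, t_alt) >= clonal_min else "subclonal"
        ips_status = "in_iPS" if vaf(i_ref, i_alt) >= present_min else "absent_in_iPS"
        labels[name] = (tumor_class, ips_status)
    return labels

# Hypothetical somatic variants: (tumor ref, tumor alt, iPS ref, iPS alt) read counts.
variants = {
    "chr1:1000 A>T": (52, 48, 49, 51),
    "chr2:2000 C>G": (55, 45, 50, 50),
    "chr3:3000 G>A": (85, 15, 48, 52),
    "chr4:4000 T>C": (88, 12, 100, 0),
}

labels = classify(variants)
summary = Counter(labels.values())
for key, n in sorted(summary.items()):
    print(key, n)
```

If the iPS carries variants that are subclonal in the tumor at a VAF near 0.5, that would be consistent with derivation from a subclone rather than the founding clone.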

score 0 · Answer 2 · 2014-02-21

10.2 years ago

karl.stamm 4.1k

If you have a BED file of where the reads are, see bedtools intersect (intersectBed). Subtracting one list of regions from another is a simple and common operation. Beware of entries with read counts too low to be meaningful: I don't know how your BED file was made, but you wouldn't be interested in a region covered by a single read. It's a bad idea to rely on subtle differences in read depth, because there are many sources of bias in data generation. See CoNIFER for robust read-depth-based copy number variation calls. Finally, to see where your samples differ, you should do the variant calling.
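As a rough illustration of the region comparison, here is a small sketch that assumes you already have a read count per capture target for each sample (for example from a coverage tool run against the same exome BED file). The targets, counts, and the `min_reads` cutoff are hypothetical. Targets below the cutoff are treated as uncovered, which addresses the single-read concern, and the covered sets are then compared by set difference and Jaccard index.

```python
def covered_targets(counts, min_reads=10):
    """Return the set of targets whose read count reaches min_reads."""
    return {target for target, n in counts.items() if n >= min_reads}

def jaccard(a, b):
    """Jaccard index of two sets; defined as 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical per-target read counts keyed by (chrom, start, end).
t1, t2, t3, t4 = ("chr1", 1000, 1200), ("chr1", 5000, 5200), ("chr2", 300, 480), ("chr3", 900, 1100)
samples = {
    "normal": {t1: 120, t2: 80, t3: 1, t4: 95},
    "tumor": {t1: 110, t2: 0, t3: 2, t4: 130},
    "ips": {t1: 60, t2: 1, t3: 0, t4: 70},
}

cov = {name: covered_targets(counts) for name, counts in samples.items()}

# Targets covered in the normal but not in the tumor (candidate losses).
lost_in_tumor = cov["normal"] - cov["tumor"]

print("tumor vs iPS:", jaccard(cov["tumor"], cov["ips"]))
print("normal vs iPS:", jaccard(cov["normal"], cov["ips"]))
print("covered in normal only:", sorted(lost_in_tumor))
```

Presence or absence of coverage only picks up gross events such as a fully lost target; in a well-captured exome nearly every target is covered in every sample, so the sets will mostly be identical.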

FastQC can tell you whether there are gross differences in read quality or total read count, but I'd expect those to reflect sample-handling problems rather than the condition. Still worth looking.

Finally for structural variation... I don't know any good tools.

ADD COMMENT • link 10.2 years ago by karl.stamm 4.1k

Structural variation is going to be dicey from exome sequencing anyway. The odds of getting reads that span breakpoints are slim.

ADD REPLY • link 10.2 years ago by Chris Miller 22k

I have done the variant calling and already made the SNP calls to check the mutational landscape, but that is a downstream process in which I only look at stringent SNPs. Even before calling variants, I am interested in whether I can make this kind of global comparison from the exome reads, to understand whether the IPS is closer to the tumor or to the wild type.

ADD REPLY • link 10.2 years ago by ivivek_ngs ★ 5.2k