Question: Full Exome Sequencing Of Xenografted Tumor
9.0 years ago by
Orca140 wrote:

I have to analyze NGS data from a xenografted tumor after targeted enrichment (Agilent SureSelect). We know that the sample is contaminated with murine stroma, at around 20%. How can I manage this issue so that I can be sure the annotated mutations are human-specific? Thanks,


exome next-gen sequencing • 2.6k views
modified 9.0 years ago by Lythimus200 • written 9.0 years ago by Orca140
9.0 years ago by
Bergen, Norway
Michael Dondrup47k wrote:

Let me rephrase your question a bit to make it easier to follow: you took human tumor cells, implanted them into lab mice, let the tumors grow, harvested them, and performed next-gen exome sequencing to detect variants. Now your reads are, understandably, contaminated with mouse DNA. What to do? I haven't had the 'luck' of getting such contaminated data myself, so here is what I would do if I were asked to analyse such a data set and wanted to publish the results:

The best way of removing contamination is to avoid it in the first place (if possible). I don't believe there is any reliable way to remove contamination, especially when the sequences are highly similar. To salvage this case I would try to apply rigorous filtering:

  • Align the reads against both the mouse and the human reference genome
  • Remove reads that align to mouse better than, or as well as, to human
  • Check the alignment positions and discard all reads that align to non-exonic or intergenic regions; they should not be there anyway
  • Run SNP detection; I don't think copy number variation detection is feasible
  • After detecting a SNP, align the genomic sequence flanking its position against mouse using e.g. FASTA or SSEARCH. If the mouse sequence is highly similar, don't report it.

That way you will probably be quite specific; the question is whether you will have many reads left.
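
The read-filtering step in the list above (keep a read only if it aligns strictly better to human than to mouse) can be sketched as follows. This is a minimal illustration, not a tested pipeline: it assumes you have already extracted one best alignment score per read from each alignment (for example an aligner-reported score where higher is better), and the function name `keep_human_reads` and the toy read names and scores are made up for the example.

```python
def keep_human_reads(human_scores, mouse_scores):
    """Return names of reads that align strictly better to human than to mouse.

    Both arguments map read name -> best alignment score (higher is better).
    A read missing from a dict is treated as unaligned to that genome.
    """
    kept = set()
    for read, h_score in human_scores.items():
        m_score = mouse_scores.get(read)
        # Ties are dropped: a read that aligns "as well to mouse as to human"
        # is discarded, as in the filtering rule above.
        if m_score is None or h_score > m_score:
            kept.add(read)
    return kept


# Hypothetical scores, for illustration only.
human = {"r1": 100, "r2": 95, "r3": 80}
mouse = {"r2": 95, "r3": 98, "r4": 100}

print(sorted(keep_human_reads(human, mouse)))  # ['r1']
```

Dropping ties is the conservative choice here: it costs reads in conserved regions, which is exactly the trade-off between specificity and remaining coverage mentioned above.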

9.0 years ago by
Barcelona, Spain
Darked894.2k wrote:

Human and mouse are quite different at the nucleotide level (just blastn NM_000546.4, Homo sapiens tumor protein p53 (TP53), transcript variant 1, mRNA, against mouse RefSeq). Even with 80% similarity there are hardly any identical fragments 60 bp long. So if your reads are long enough, there is practically no chance that the tumor will mutate into an exact mouse sequence. And here we are talking about exons only, but (correct me if I am wrong) you should also get some "dangling" intronic sequence flanking the exons. Unless you land in some very peculiar part of the genome, similarity drops there, so such reads should not cross-map.
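
One way to check the "hardly any long identical fragments" argument for a gene of interest is to measure the longest exactly shared stretch between the human and mouse sequences and compare it with your read length. The sketch below is a plain dynamic-programming longest-common-substring; the function name and the two short sequences are invented for illustration and are not real TP53 data. For real transcripts, blastn as suggested above is the practical tool.

```python
def longest_identical_run(a, b):
    """Length of the longest substring shared exactly by sequences a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the identical run that ended at the previous bases.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Toy fragments, made up for this example.
human_frag = "ACGTTGCAAGGCT"
mouse_frag = "ACGTAGCAAGGTT"

print(longest_identical_run(human_frag, mouse_frag))  # 6
```

If the longest identical run is well below the read length, a read from that region cannot match both genomes perfectly, which is the point being made above.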

8.9 years ago by
Lythimus200 wrote:

I've been working on this problem and am looking for data sets like yours, with a known degree of contamination, to test my approach on. Take a look and contact me if you are interested in pursuing this:
