Question: CNVkit: gene results not in output
14 months ago, Lauren wrote:

I am looking at a few Illumina WES experiments on the same cell line, trying to find a CNVkit segment or bin call for a gene that has reads and is captured in the library. In all but one sample I find no values for it at all (not even a neutral call; the gene is simply omitted). The gene's location is represented in the target file.

Variants were detected in this gene using GATK. The reads aren't low quality, and there appear to be enough reads for CNVkit to make a call (compared with the neutral calls for this gene in other samples).

Does this happen because there aren't enough reads in or around the gene to make an inference in this particular sequencing run, or perhaps it was too noisy to make a conclusion? Thought I'd ask someone who may have a better grasp of the code than I do.

Thanks in advance.


Hi Lauren, Eric T. should be able to provide help when he next logs in.

written 14 months ago by Kevin Blighe

13 months ago, Eric T. (San Francisco, CA) wrote:

I'm not quite sure what's happening in your data, but to clarify:

  • How did you run CNVkit, and which version did you use?
  • Does the gene name appear in the .cns file? Is it in a short segment by itself, or included in a segment with other genes on it?
  • Does the gene appear in the reference .cnn file? How many bins? If you used a pooled normal reference, are the log2 and spread values in those bins within the expected ranges (log2 between -5 and +5, spread below 1.0)?
  • Is the gene in a tricky genomic region, e.g. in or near HLA, PARs, centromeres, telomeres? If you used the 'access.bed' file in the source repo, is the gene in a sequencing-accessible region?
  • Did you use the segmentation option --drop-low-coverage?
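
The reference check in the list above could be scripted roughly as follows. This is only a sketch: it assumes the reference .cnn is a tab-separated table with chromosome, start, end, gene, log2 and spread columns, and the function name check_reference_bins, the example rows, the gene OTHER1 and all coordinates are made up for illustration rather than taken from real data.

```python
import csv
import io


def check_reference_bins(cnn_text, gene, log2_range=(-5.0, 5.0), max_spread=1.0):
    """Count a gene's bins in a reference .cnn table and flag out-of-range bins.

    Hypothetical helper, not part of CNVkit. Returns (n_bins, suspicious),
    where suspicious lists (chromosome, start, end, log2, spread) for bins whose
    log2 falls outside log2_range or whose spread is not below max_spread.
    """
    n_bins = 0
    suspicious = []
    for row in csv.DictReader(io.StringIO(cnn_text), delimiter="\t"):
        # A bin's gene field may hold several comma-separated names.
        if gene not in row["gene"].split(","):
            continue
        n_bins += 1
        log2 = float(row["log2"])
        spread = float(row["spread"])
        in_range = log2_range[0] <= log2 <= log2_range[1]
        if not in_range or spread >= max_spread:
            suspicious.append(
                (row["chromosome"], int(row["start"]), int(row["end"]), log2, spread)
            )
    return n_bins, suspicious


# Made-up example table (toy coordinates and values, for illustration only).
EXAMPLE_CNN = "\n".join("\t".join(fields) for fields in [
    ("chromosome", "start", "end", "gene", "log2", "depth", "gc", "rmask", "spread"),
    ("chr8", "1000", "2000", "CCNE2", "0.12", "150.0", "0.45", "0.0", "0.20"),
    ("chr8", "2000", "3000", "CCNE2", "-6.2", "2.0", "0.50", "0.0", "0.30"),
    ("chr8", "3000", "4000", "OTHER1", "0.01", "140.0", "0.40", "0.0", "0.15"),
])

n_bins, suspicious = check_reference_bins(EXAMPLE_CNN, "CCNE2")
print(n_bins, suspicious)
```

For a real file, the same function can be given the file's contents, e.g. open("reference.cnn").read(), where the file name is a placeholder for your own reference.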

Thanks so much for getting back, Eric!

-I ran it with very basic options:

cnvkit.py coverage $bamFile $target -o $outTgtCnn
cnvkit.py coverage $bamFile $antitarget -o $outATgtCnn
cnvkit.py fix $outTgtCnn $outATgtCnn $reference -o $outRatioCnr
cnvkit.py segment $outRatioCnr -o $outSegmentCns

-The gene does appear in the reference .cnn file, which has ten bins for it. I did use a pooled normal reference. The log2 values in those bins of the reference .cnn file were between -0.955 and +0.44, and the spread was always below 1.0.

-The gene (CCNE2) is in a sequencing-accessible, non-tricky region.

-I did not use the --drop-low-coverage option. The coverage looks pretty good in the BAMs.

-I am not sure which version I am using (I'm sorry!). It was last downloaded on Dec 7, 2016.

written 13 months ago by Lauren

OK, so far I see no reason for it to disappear. Can you track down the step in the pipeline where the gene disappears?

  • Are those bins actually labeled as "CCNE2" in your reference.cnn file?
  • After fix, is CCNE2 in the output .cnr file? Are those bins (same coordinates as in the .cnn) present, and do they have the CCNE2 label?
  • After segment, in the output .cns file, can you find the segment that covers that genomic region on chr8? Does that segment's gene list include CCNE2, or any other gene names?
  • Are there any similarly-named genes, i.e. a string containing or contained by the string "CCNE2", nearby?
  • Did you filter the .cns file with the call command? Did you run gainloss/genemetrics?
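
One way to track down the step where the label disappears is to look for it in each intermediate file in turn. The sketch below assumes the .cnn, .cnr and .cns files are tab-separated tables with a "gene" column holding comma-separated names; the function name trace_gene, the step labels, and the example rows (including the names CCNE2-AS, OTHER1 and OTHER2) are invented for illustration.

```python
import csv
import io


def trace_gene(tables, gene):
    """Report, per pipeline step, how many rows carry the gene label.

    Hypothetical helper, not part of CNVkit. `tables` maps a step name to the
    text of that step's output file. Returns {step: (n_rows, lookalikes)},
    where lookalikes are other names that contain, or are contained in, `gene`.
    """
    report = {}
    for step, text in tables.items():
        n_rows = 0
        lookalikes = set()
        for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
            names = row["gene"].split(",")
            if gene in names:
                n_rows += 1
            for name in names:
                if name and name != gene and (gene in name or name in gene):
                    lookalikes.add(name)
        report[step] = (n_rows, sorted(lookalikes))
    return report


def make_table(rows):
    """Build a tab-separated table from tuples (toy data only)."""
    return "\n".join("\t".join(fields) for fields in rows)


# Made-up example files: the label is present after coverage and fix,
# but missing from the segment's gene list.
EXAMPLE_TABLES = {
    "reference.cnn": make_table([
        ("chromosome", "start", "end", "gene", "log2"),
        ("chr8", "1000", "2000", "CCNE2", "0.10"),
        ("chr8", "2000", "3000", "CCNE2", "0.05"),
    ]),
    "sample.cnr": make_table([
        ("chromosome", "start", "end", "gene", "log2"),
        ("chr8", "1000", "2000", "CCNE2", "0.20"),
        ("chr8", "2000", "3000", "CCNE2", "0.15"),
    ]),
    "sample.cns": make_table([
        ("chromosome", "start", "end", "gene", "log2"),
        ("chr8", "500", "9000", "OTHER1,CCNE2-AS,OTHER2", "0.18"),
    ]),
}

for step, (n_rows, lookalikes) in trace_gene(EXAMPLE_TABLES, "CCNE2").items():
    print(step, n_rows, lookalikes)
```

The first step that reports zero rows is where the label was lost, and any lookalike names listed there point to the similarly-named-gene case.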
written 13 months ago by Eric T.