Question: CNVkit: gene results not in output
Lauren (70) wrote, 2.0 years ago:

I am looking at a few Illumina WES experiments on the same cell line, trying to find a CNVkit segment or bin hit on a gene that has reads and is captured in the library. In all but one sample I find no values for this gene at all (not neutral calls; the gene is simply omitted). The location is represented in the target file.

Variants were detected in this gene using GATK, the reads aren't low quality, and there appear to be enough reads for CNVkit to make a call (judging by the neutral calls for this gene in other samples).

Does this happen because there aren't enough reads in or around the gene to make an inference in this particular sequencing run, or perhaps it was too noisy to make a conclusion? Thought I'd ask someone who may have a better grasp of the code than I do.

Thanks in advance.

modified 2.0 years ago by Eric T. (2.6k) • written 2.0 years ago by Lauren (70)

Hi Lauren, Eric T. should be able to provide help when he next logs in.

written 2.0 years ago by Kevin Blighe (61k)
Eric T. (2.6k), San Francisco, CA, wrote, 2.0 years ago:

I'm not quite sure what's happening in your data, but to clarify:

  • How did you run CNVkit, and which version did you use?
  • Does the gene name appear in the .cns file? Is it in a short segment by itself, or included in a segment with other genes on it?
  • Does the gene appear in the reference .cnn file? How many bins? If you used a pooled normal reference, are the log2 and spread values in those bins within the expected ranges (log2 between -5 and +5, spread below 1.0)?
  • Is the gene in a tricky genomic region, e.g. in or near HLA, PARs, centromeres, telomeres? If you used the 'access.bed' file in the source repo, is the gene in a sequencing-accessible region?
  • Did you use the segmentation option --drop-low-coverage?
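
The reference check in the list above can be sketched in a few lines of Python. This is only an illustration, not part of CNVkit: it assumes the reference .cnn is a tab-separated file with `chromosome`, `start`, `end`, `gene`, `log2` and `spread` columns, and that a bin's `gene` field may hold several comma-separated names. The function name `check_reference_bins` and the sample rows (coordinates, `GENE_B`, the 1.3 spread value) are made up for the example.

```python
import csv
import io

# Made-up rows in the assumed reference .cnn layout (tab-separated).
SAMPLE_CNN = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tgc\trmask\tspread\n"
    "chr8\t1000\t1200\tCCNE2\t-0.955\t120.0\t0.45\t0.0\t0.21\n"
    "chr8\t1500\t1700\tCCNE2\t0.44\t150.0\t0.50\t0.0\t1.3\n"
    "chr8\t9000\t9200\tGENE_B\t0.10\t140.0\t0.48\t0.0\t0.15\n"
)


def check_reference_bins(lines, gene, log2_range=(-5.0, 5.0), max_spread=1.0):
    """Count bins labeled `gene` and collect those outside the expected ranges.

    Returns (bin_count, flagged), where flagged is a list of
    (chromosome, start, end, log2, spread) tuples.
    """
    bin_count = 0
    flagged = []
    for row in csv.DictReader(lines, delimiter="\t"):
        # Exact match against each comma-separated label, so that a
        # longer name containing the same substring is not counted.
        if gene not in row["gene"].split(","):
            continue
        bin_count += 1
        log2 = float(row["log2"])
        spread_text = row.get("spread")
        # Some references may lack a spread column; skip that check then.
        spread = float(spread_text) if spread_text not in (None, "") else None
        bad_log2 = not (log2_range[0] <= log2 <= log2_range[1])
        bad_spread = spread is not None and spread >= max_spread
        if bad_log2 or bad_spread:
            flagged.append(
                (row["chromosome"], int(row["start"]), int(row["end"]), log2, spread)
            )
    return bin_count, flagged


bin_count, flagged = check_reference_bins(io.StringIO(SAMPLE_CNN), "CCNE2")
print(bin_count, flagged)
```

To run it on a real file, pass an open file handle (for example `open("reference.cnn", newline="")`) instead of the `io.StringIO` sample.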
written 2.0 years ago by Eric T. (2.6k)

Thanks so much for getting back, Eric!

-I ran it with really basic options, see below:

cnvkit.py coverage $bamFile $target -o $outTgtCnn
cnvkit.py coverage $bamFile $antitarget -o $outATgtCnn
cnvkit.py fix $outTgtCnn $outATgtCnn $reference -o $outRatioCnr
cnvkit.py segment $outRatioCnr -o $outSegmentCns

-The gene does appear in the reference .cnn file. The file has ten bins for this gene. I did use a pooled normal reference. The log2 values in the bins from the reference .cnn file for these samples were between -0.955 and +0.44, and the spread was always below +1.

-The gene is in a sequencing-accessible, non-tricky region (CCNE2)

-I did not use the option of dropping low coverage. The coverage looks pretty good in the BAMs.

-I am not sure which version I am using (I'm sorry!). The last time it was downloaded was Dec 7 2016.

modified 24 months ago • written 24 months ago by Lauren (70)

OK, so far I see no reason for it to disappear. Can you track down the step in the pipeline where the gene disappears?

  • Are those bins actually labeled as "CCNE2" in your reference.cnn file?
  • After fix, is CCNE2 in the output .cnr file? Are those bins (same coordinates as in the .cnn) present, and do they have the CCNE2 label?
  • After segment, in the output .cns file, can you find the segment that covers that genomic region on chr8? Does that segment's gene list include CCNE2, or any other gene names?
  • Are there any similarly-named genes, i.e. a string containing or contained by the string "CCNE2", nearby?
  • Did you filter the .cns file with the call command? Did you run gainloss/genemetrics?
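
One way to walk through the checks above is a small script that looks for the gene label at each pipeline stage and reports the first file where it is missing, along with any similarly named labels. This is a rough sketch, not CNVkit functionality: it assumes the .cnn, .cnr and .cns files are tab-separated with `chromosome`, `start`, `end` and `gene` columns, and that `gene` can hold comma-separated names. The helper names `gene_hits` and `first_missing_stage`, the coordinates, and the label `CCNE2_LIKE` are invented for the example.

```python
import csv
import io

# Made-up stand-ins for the three pipeline outputs (tab-separated).
SAMPLE_REF = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tspread\n"
    "chr8\t1000\t1200\tCCNE2\t0.10\t120.0\t0.2\n"
)
SAMPLE_CNR = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tweight\n"
    "chr8\t1000\t1200\tCCNE2\t0.05\t118.0\t0.9\n"
)
SAMPLE_CNS = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tprobes\tweight\n"
    "chr8\t500\t9500\tCCNE2_LIKE,GENE_B\t0.02\t119.0\t40\t35.0\n"
)


def gene_hits(lines, gene):
    """Return (exact, similar) for one file.

    exact:   regions whose gene list contains `gene` exactly.
    similar: (label, region) pairs where a different label contains,
             or is contained by, `gene`.
    """
    exact, similar = [], []
    for row in csv.DictReader(lines, delimiter="\t"):
        region = (row["chromosome"], int(row["start"]), int(row["end"]))
        names = row["gene"].split(",")
        if gene in names:
            exact.append(region)
        for name in names:
            if name and name != gene and (gene in name or name in gene):
                similar.append((name, region))
    return exact, similar


def first_missing_stage(stages, gene):
    """Return the name of the first stage with no exact hit, or None."""
    for stage_name, lines in stages:
        exact, _ = gene_hits(lines, gene)
        if not exact:
            return stage_name
    return None


stages = [
    ("reference.cnn", io.StringIO(SAMPLE_REF)),
    ("sample.cnr", io.StringIO(SAMPLE_CNR)),
    ("sample.cns", io.StringIO(SAMPLE_CNS)),
]
missing_at = first_missing_stage(stages, "CCNE2")
cns_exact, cns_similar = gene_hits(io.StringIO(SAMPLE_CNS), "CCNE2")
print(missing_at, cns_similar)
```

With real data, replace each `io.StringIO(...)` with an open handle to the corresponding file, keeping the stages in pipeline order.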
modified 23 months ago • written 23 months ago by Eric T. (2.6k)