CNVkit assuming male for female samples
5.9 years ago

Hi.

I am using the CNVkit batch command to process a number of WES tumor samples from breast cancer. I know that all my samples are from female patients, yet the CNVkit log tells me that some of the samples are assumed to be male. As a result, these samples show a loss of chrX.

How can I fix that? I am worried that this indicates a larger problem with the data. The reads were aligned with BWA, then deduplicated and recalibrated as per the Broad Institute best-practices recommendations.

Many thanks,

CNVkit • 3.0k views
5.9 years ago
Eric T. ★ 2.7k

You can see the unshifted log-ratio values in your cohort with the heatmap command -- this can help you quickly determine if there's a broader problem in your data or in the way you processed your samples with CNVkit.
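As a complement to the heatmap, the same unshifted values can be summarised per sample. The sketch below is not part of CNVkit; it assumes a tab-separated .cnr file with `chromosome` and `log2` columns and compares the median log2 ratio on chrX with the autosomal median. The helper name `median_log2_by_group` and the demo rows are made up for illustration. A chrX median close to the autosomal one is what two X copies would be expected to give, while a shift of roughly -1 is what CNVkit would read as a single X.

```python
import csv
import io
import statistics


def median_log2_by_group(cnr_handle, x_names=("chrX", "X")):
    """Return (autosomal median, chrX median) of the log2 column.

    Assumes a tab-separated table with 'chromosome' and 'log2' columns.
    Anything that is neither chrX nor a numbered autosome (chrY, chrM,
    unplaced contigs) is ignored.
    """
    autosome, chrx = [], []
    for row in csv.DictReader(cnr_handle, delimiter="\t"):
        chrom = row["chromosome"]
        bare = chrom[3:] if chrom.startswith("chr") else chrom
        if chrom in x_names:
            chrx.append(float(row["log2"]))
        elif bare.isdigit():
            autosome.append(float(row["log2"]))
    return statistics.median(autosome), statistics.median(chrx)


# Made-up bins for illustration only; a real .cnr file has many thousands of rows.
demo = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tweight\n"
    "chr1\t100\t200\tA\t0.0\t100\t1\n"
    "chr1\t300\t400\tB\t0.1\t105\t1\n"
    "chr2\t100\t200\tC\t-0.1\t95\t1\n"
    "chrX\t100\t200\tD\t-1.0\t50\t1\n"
    "chrX\t300\t400\tE\t-0.9\t52\t1\n"
    "chrY\t100\t200\tF\t-5.0\t1\t1\n"
)
auto_med, x_med = median_log2_by_group(io.StringIO(demo))
print(f"autosomes: {auto_med:+.2f}  chrX: {x_med:+.2f}  shift: {x_med - auto_med:+.2f}")
```

For a real sample, replace the `io.StringIO(demo)` handle with `open("sample.cnr")` (a hypothetical file name). In tumor samples a genuine chrX loss gives the same picture as a male sample, so this only shows what the caller is looking at, not which explanation is correct.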

The log2-scaled copy ratios in the segmented .cns files are not dependent on sample gender, so it might not be a problem for you that CNVkit detected the gender incorrectly for some of them. In the call and export functions there is a -g option you can use to directly specify the sample's gender, rather than rely on CNVkit's guess. Will that do what you need?

The reference command does detect and adjust for sample gender automatically, without a -g option for specifying these values manually. Since the inputs are normal samples, the sample gender should not be difficult to detect automatically, and any samples that were incorrectly detected should probably not be included in the reference.

The next release of CNVkit (due soon) will have an improved method for detecting chromosomal sex, which should reduce these problems. It also includes the --gender option in every command that depends on sample gender (other than reference).


Thank you for the clarification. The heatmap function is very useful and if it doesn't rely on sample gender that is good enough for our purposes. I will implement the -g option for the export function.

5.9 years ago

The heatmap command produced a plot showing a clear loss of chrX in 9 out of 16 samples. Others show some loss that may be real. I did not mention that I ran CNVkit in tumor-only mode (we don't have normals, as these are archived FFPE samples). Running the call command with the -g option set to female did not change the heatmap.

What could be wrong?


It sounds like there are a number of low-coverage bins that are causing the segmentation algorithm to report low copy number. You can repeat segmentation with the --drop-low-coverage option to clean up your data. If the scatter plots look very noisy, you may have better results with a larger antitarget bin size (--antitarget-avg-size option in batch).
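One way to check whether low-coverage bins are concentrated on chrX is to count them per chromosome before re-running segmentation. This is a rough sketch, not CNVkit's own filter: the `cutoff` of -5.0 is an arbitrary illustrative value rather than the threshold --drop-low-coverage uses internally, and the function name and demo rows are hypothetical.

```python
import csv
import io
from collections import Counter


def low_bin_fraction(cnr_handle, cutoff=-5.0):
    """Fraction of bins per chromosome whose log2 ratio is below `cutoff`.

    `cutoff` is an illustrative assumption, not CNVkit's internal value.
    """
    total, low = Counter(), Counter()
    for row in csv.DictReader(cnr_handle, delimiter="\t"):
        chrom = row["chromosome"]
        total[chrom] += 1
        if float(row["log2"]) < cutoff:
            low[chrom] += 1
    return {chrom: low[chrom] / n for chrom, n in total.items()}


# Made-up bins for illustration only.
demo = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tweight\n"
    "chr1\t100\t200\tA\t0.0\t100\t1\n"
    "chr1\t300\t400\tB\t0.2\t110\t1\n"
    "chr1\t500\t600\tC\t-0.1\t95\t1\n"
    "chr1\t700\t800\tD\t0.1\t102\t1\n"
    "chrX\t100\t200\tE\t-0.1\t90\t1\n"
    "chrX\t300\t400\tF\t-12.0\t0.1\t1\n"
    "chrX\t500\t600\tG\t0.0\t98\t1\n"
    "chrX\t700\t800\tH\t-9.5\t0.3\t1\n"
)
fractions = low_bin_fraction(io.StringIO(demo))
for chrom, frac in fractions.items():
    print(f"{chrom}: {frac:.0%} of bins below cutoff")
```

If chrX carries a much larger share of such bins than the autosomes, that is consistent with the explanation above that a few near-empty bins are dragging the segment means down.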


Thanks again for your help. I am having a difficult time finding the default --antitarget-avg-size, and I am wondering if you can recommend an optimal figure.


Try --antitarget-avg-size 200000.
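For reference, a tumor-only batch run with both suggestions applied might be assembled as in the sketch below. It only builds and prints the command line; nothing is executed. All file and directory names are hypothetical placeholders, and the flags should be checked against `cnvkit.py batch -h` for the installed version, since option availability has changed between releases.

```python
import shlex


def batch_command(bams, targets, fasta, out_dir, antitarget_avg_size=200000):
    """Build (but do not run) a tumor-only `cnvkit.py batch` command line."""
    return [
        "cnvkit.py", "batch", *bams,
        "-n",                     # no normal samples given: build a flat reference
        "-t", targets,            # capture targets (BED)
        "-f", fasta,              # reference genome (FASTA)
        "--antitarget-avg-size", str(antitarget_avg_size),
        "--drop-low-coverage",    # skip very-low-coverage bins when segmenting
        "-d", out_dir,            # output directory
    ]


# Hypothetical inputs, for illustration only.
cmd = batch_command(
    bams=["tumor1.bam", "tumor2.bam"],
    targets="targets.bed",
    fasta="genome.fa",
    out_dir="cnvkit_out",
)
command_line = shlex.join(cmd)
print(command_line)
```

Keeping the command as a list makes it easy to pass to `subprocess.run(cmd, check=True)` once the paths are real.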


I have implemented your suggestions and indeed the chrX loss is no longer visible in the diagram. However, the heatmap does not show the change. Does heatmap also use the .cnr files? As a side note, the diagram command now exports a diagram with labels when called with a .cns file only (it never did that in previous versions), and I cannot find anything about this in the documentation. Thanks again.


The heatmap plot does not adjust for gender, only diagram does. If the diagrams look correct (and don't show spurious gain or loss of chrX) then CNVkit is detecting chromosomal sex correctly.

The heatmap command can take .cnr files as well as .cns, even both in the same plot. It just takes longer to render the .cnr rows because there is more data in those files.

diagram uses gene names from the input .cns file in the latest version of CNVkit. Is that a problem? They can be easily edited or deleted in a vector graphics editor like Inkscape or Adobe Illustrator.


Thank you, your swift reply is very helpful and you have answered all my questions. Regarding the diagram, I can edit it in Illustrator; it was just surprising, as previously it did not show the gene names and the ideogram was clean and easy to present.

Many thanks


Good to hear. In diagram, you can reduce the number of labeled regions by using a larger value for the -t/--threshold option, e.g. -t 1.0 to only label deep deletions and amplifications, and -t 99 to not label anything.
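To see why a larger threshold thins out the labels, here is a minimal sketch of the idea, assuming a segment is labeled when the absolute value of its log2 ratio reaches the threshold. This is an illustration of the principle, not CNVkit's actual code; the default of 0.5, the function name, and the segment values are assumptions made for the example.

```python
def labeled_segments(segments, threshold=0.5):
    """Return the segments whose |log2| is at least `threshold`.

    Assumed rule for illustration; CNVkit's exact comparison may differ.
    """
    return [seg for seg in segments if abs(seg["log2"]) >= threshold]


# Made-up segments with placeholder gene names.
segments = [
    {"gene": "GENE_A", "log2": 0.6},    # modest gain
    {"gene": "GENE_B", "log2": -0.7},   # modest loss
    {"gene": "GENE_C", "log2": 2.3},    # amplification
    {"gene": "GENE_D", "log2": -1.8},   # deep deletion
    {"gene": "GENE_E", "log2": 0.1},    # neutral
]

for t in (0.5, 1.0, 99):
    names = [seg["gene"] for seg in labeled_segments(segments, t)]
    print(f"-t {t}: {names}")
```

With a threshold of 99 no realistic log2 ratio qualifies, which is why that value leaves the ideogram unlabeled.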
