Merging segment CNV files
8 weeks ago
avelarbio46 ▴ 30

Hello all

I have called somatic CNVs (sCNVs) with both CNVkit (short-read) and SAVANA (long-read) for paired tumor samples. This means I have segment data from both short reads and long reads for each sample.

But I'm having trouble finding a good way to merge those segment files so that I keep only the sCNVs present in both the short-read and long-read calls for the same sample.

My idea was to follow this simple procedure: Merge short-read with long-read sCNVs -> Select sCNVs present in both techniques -> Use segment file in GISTIC2 to find enriched sCNVs in the whole cohort
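A minimal sketch of the "select sCNVs present in both techniques" step, working directly on segments rather than VCFs. Everything here is an assumption for illustration: the `GAIN`/`LOSS` log2 thresholds, the tuple layout, the choice to keep only the intersected coordinates, and averaging the two log2 values. It is not how CNVkit, SAVANA, or GISTIC2 do anything internally.

```python
from collections import defaultdict

# Hypothetical log2 thresholds for labelling a segment; tune for your data.
GAIN, LOSS = 0.2, -0.2


def call_type(seg_mean):
    """Label a segment as gain, loss, or neutral from its log2 ratio."""
    if seg_mean >= GAIN:
        return "gain"
    if seg_mean <= LOSS:
        return "loss"
    return "neutral"


def consensus_segments(short_segs, long_segs):
    """Keep the regions where both callers report the same non-neutral call.

    Each input segment is a tuple (chrom, start, end, seg_mean) with
    1-based inclusive coordinates. Output tuples are
    (chrom, start, end, mean_of_both_log2, call_type).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, mean in long_segs:
        by_chrom[chrom].append((start, end, mean))

    out = []
    for chrom, s_start, s_end, s_mean in short_segs:
        s_type = call_type(s_mean)
        if s_type == "neutral":
            continue
        for l_start, l_end, l_mean in by_chrom[chrom]:
            if call_type(l_mean) != s_type:
                continue
            # Intersection of the two intervals; empty if start > end.
            start, end = max(s_start, l_start), min(s_end, l_end)
            if start <= end:
                out.append((chrom, start, end, (s_mean + l_mean) / 2, s_type))
    return sorted(out)


# Made-up segments for one sample.
short_read = [("chr1", 100, 1000, 1.0), ("chr1", 2000, 3000, -0.9), ("chr2", 1, 500, 0.05)]
long_read = [("chr1", 400, 1500, 0.5), ("chr1", 2500, 2800, 0.7), ("chr2", 1, 500, 0.9)]
result = consensus_segments(short_read, long_read)
```

With these made-up inputs only the chr1 gain survives, trimmed to the shared interval 400-1000. The nested loop is quadratic per chromosome, which should be fine for segment-level data but would need an interval index for very large inputs.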

The main problem is that some programs (for example SAVANA and GISTIC2) don't output VCF files. In fact, most CNV calling programs either don't write VCF at all or use an outdated VCF version (older than v4.2), which is not great for CNVs specifically.

Another problem is deciding how close two CNVs of the same type must be before I merge them (total overlap, at least 1 bp of overlap, or within 1000 bp of each other).
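To make those options concrete, here is a small sketch of three ways to score a pair of intervals: shared bases, the gap between them, and reciprocal overlap (the smaller of the two overlap fractions). The function names and the example coordinates are made up, and which cutoff is appropriate is a judgment call; a reciprocal overlap of around 50% is a cutoff often seen when comparing CNV or SV calls, but treat that as a starting point rather than a rule.

```python
def overlap_bp(a, b):
    """Shared bases between two 1-based inclusive intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def gap_bp(a, b):
    """Bases lying strictly between two intervals; 0 if they touch or overlap."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]) - 1)


def reciprocal_overlap(a, b):
    """Smaller of (overlap / length of a) and (overlap / length of b)."""
    ov = overlap_bp(a, b)
    return min(ov / (a[1] - a[0] + 1), ov / (b[1] - b[0] + 1))


# Made-up intervals: a and b overlap slightly, a and c are separated by a gap.
a, b, c = (1000, 2000), (1900, 5000), (2500, 3000)

one_bp_match = overlap_bp(a, b) >= 1            # passes the "1 bp overlap" rule
within_1kb = gap_bp(a, c) <= 1000               # passes the "1000 bp distance" rule
half_reciprocal = reciprocal_overlap(a, b) >= 0.5  # fails a 50% reciprocal rule
```

The example shows why the choice matters: `a` and `b` count as the same event under a 1 bp rule but not under a 50% reciprocal overlap rule, because `b` is much longer than `a`.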

I know the program SURVIVOR can merge VCFs for structural variants, but it does not accept other file formats.

Any ideas would be appreciated.

Example segment files:

IGV description of segment file

From CNVkit manual:

The SEG format is the tabular output of DNAcopy, the reference implementation of Circular Binary Segmentation (CBS). It is a tab-separated table with the following 5 or 6 columns:

ID – sample name
chrom – chromosome name or ID
loc.start – segment’s genomic start position, 1-indexed
loc.end – segment end position
num.mark – (optional) number of probes or bins covered by the segment
seg.mean – segment mean value, usually in log2 scale
The column names in the first line are not enforced, and can vary across implementations.
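Because the header names are not enforced, one option is to read SEG files by column position rather than by name. The sketch below assumes a single header line, tab separation, and the 5- or 6-column layout described above; the `read_seg` helper and the example row are made up for illustration.

```python
import csv
import io


def read_seg(handle):
    """Parse a tab-separated SEG table into a list of dicts, by column position."""
    rows = csv.reader(handle, delimiter="\t")
    next(rows, None)  # skip the header line; its names vary between tools
    segments = []
    for row in rows:
        if not row:
            continue  # tolerate blank lines
        segments.append({
            "sample": row[0],
            "chrom": row[1],
            "start": int(row[2]),
            "end": int(row[3]),
            # num.mark is optional, so it is only present in the 6-column form
            "num_mark": int(row[4]) if len(row) == 6 else None,
            # seg.mean is the last column in both layouts
            "seg_mean": float(row[-1]),
        })
    return segments


# Made-up one-row SEG table, parsed from memory instead of a file.
example = (
    "ID\tchrom\tloc.start\tloc.end\tnum.mark\tseg.mean\n"
    "T1\tchr1\t1000\t5000\t12\t0.58\n"
)
segs = read_seg(io.StringIO(example))
```

For a real file, the same function can be called as `read_seg(open(path, newline=""))`, where `path` is whatever SEG file you exported.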

SEG files can be used with a number of other programs that operate on segmented log2 copy ratios – including GISTIC 2.0, IGV, the GenePattern server, and many R packages.

To convert CNVkit’s .cns files to SEG, use the command export seg, and to convert SEG files produced outside of CNVkit into CNVkit’s own segmented format (.cns), use import-seg.

Why do you need VCFs? Are you not operating at seg level?

8 weeks ago

I know the program SURVIVOR can merge VCFs for structural variants, but it does not accept other file formats.

As far as I can see, CNVkit itself can export its format to VCF: https://manpages.ubuntu.com/manpages/jammy/man1/cnvkit-export.1.html


CNVkit does output a VCF file, but SAVANA (long-read CNV) doesn't. I tried another long-read program, WAKHAN, but it is currently not working properly with my samples. In theory, I could use CNVkit or other programs made specifically for short reads, but none were created with long reads in mind, including the problems we have with B-allele and lower SNP accuracy. Also, I would have to manually convert the VCF back to a SEG file to run GISTIC2 afterwards (which I also haven't found a way to do yet).
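For the VCF-to-SEG step, a rough sketch could look like the following. It assumes each CNV record carries an `END` tag and a log2 value in the INFO column; the key name `FOLD_CHANGE_LOG` is an assumption that must be checked against the header of your own VCF, and the `vcf_to_seg_rows` helper and example record are made up. Whether `POS` needs a +1 shift to become a segment start depends on how the caller writes symbolic alleles, so that is left untouched here.

```python
def vcf_to_seg_rows(vcf_lines, sample, log2_key="FOLD_CHANGE_LOG"):
    """Turn CNV records from a VCF into (sample, chrom, start, end, log2) rows.

    log2_key is an assumed INFO tag name; check your VCF header for the real one.
    Records without END or without the log2 tag are skipped.
    """
    rows = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # header lines and blanks
        fields = line.rstrip("\n").split("\t")
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        # INFO is a ';'-separated list; flags such as IMPRECISE have no '='.
        tags = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
        if "END" not in tags or log2_key not in tags:
            continue
        rows.append((sample, chrom, pos, int(tags["END"]), float(tags[log2_key])))
    return rows


# Made-up VCF fragment with one duplication record.
example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t1000\t.\tN\t<DUP>\t.\t.\tIMPRECISE;SVTYPE=DUP;END=5000;FOLD_CHANGE_LOG=0.58\n",
]
seg_rows = vcf_to_seg_rows(example_vcf, sample="T1")
```

The rows can then be written out tab-separated under a SEG-style header. Note that these rows have no marker-count column, so check whether your GISTIC2 setup expects one before feeding the file in.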
