Question

cnvkit segmetrics on multiple files


6.5 years ago

adoak ▴ 10

Is there a way to run segmetrics on multiple files at once, or to incorporate it into batch? Additionally, certain segments or chromosomes keep showing up as having no variation. Is this a problem with mapping?

cnvkit cnv • 1.7k views

6.4 years ago by adoak ▴ 10

score 0 · Answer 1 · 2017-11-10

It's not currently part of batch, so I'd recommend running it in a loop in Bash or a Makefile.
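As one way to script the loop suggested above, here is a minimal Python sketch (a Bash for-loop would work equally well). It assumes each sample's .cnr and .cns files sit side by side in one directory with matching base names, as batch writes them. The function names, the ".segmetrics.cns" output suffix, and the choice of --ci and --pi are illustrative assumptions, not something taken from this thread.

```python
import subprocess
from pathlib import Path


def segmetrics_commands(results_dir, stats=("--ci", "--pi")):
    """Build one `cnvkit.py segmetrics` command per .cnr/.cns pair.

    Assumes Sample.cnr and Sample.cns share a base name in results_dir.
    """
    commands = []
    for cnr in sorted(Path(results_dir).glob("*.cnr")):
        cns = cnr.with_suffix(".cns")
        if not cns.exists():
            continue  # no segment file for this sample, so skip it
        # Hypothetical output name: Sample.segmetrics.cns
        out = cnr.with_name(cnr.stem + ".segmetrics.cns")
        commands.append(
            ["cnvkit.py", "segmetrics", str(cnr),
             "-s", str(cns), *stats, "-o", str(out)]
        )
    return commands


def run_all(results_dir):
    """Run segmetrics for every sample, stopping at the first failure."""
    for cmd in segmetrics_commands(results_dir):
        subprocess.run(cmd, check=True)
```

Calling run_all("results/") (a hypothetical directory) would then run segmetrics once per sample. Building the command list separately makes it easy to print and inspect the commands before anything is executed.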

No variation, or no reads? The latter would be an issue with mapping; the former, maybe not. Which chromosomes or regions are causing trouble? I'd expect regions close to centromeres and telomeres, as well as chr6 and chrY, to be noisy. Using more samples in your pooled reference can help both to reduce noise in these regions and to improve the sensitivity of calling in the rest of the genome.

score 0 · Answer 2 · 2017-12-01


6.4 years ago

adoak ▴ 10

In the .cns data it looks like very low variation, but when mapped to a diagram the gene names do not show up and the sections are blank. I'm missing sections of chromosomes 14, 15, and X. The sample size is currently around 30; maybe increasing it would help. The missing sections are consistent across samples.

6.4 years ago by adoak ▴ 10


Are those missing regions the centromeres?

Some portions of each chromosome are unmappable, i.e. difficult to sequence reliably. In the reference genome sequence (FASTA) these are masked out with 'N' characters, and in your BAM you will see that there is no sequencing coverage in these regions. CNVkit excludes them from its output, so you can expect to see holes in your data around these regions.
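To check whether the blank stretches line up with masked sequence, one could list the runs of 'N' in the reference FASTA and compare their coordinates with the gaps. The sketch below is only an illustration in plain Python. The function names and the 1000-base minimum length are assumptions, and it holds one chromosome in memory at a time. CNVkit's own access command is the usual way to derive the sequence-accessible regions.

```python
import re


def n_runs(sequence, min_length=1):
    """Return 0-based, half-open (start, end) intervals of consecutive N bases."""
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", sequence)
        if m.end() - m.start() >= min_length
    ]


def masked_regions(fasta_path, min_length=1000):
    """Yield (chrom, start, end) for each N run in a FASTA file.

    min_length is an arbitrary illustrative cutoff to ignore short runs.
    """
    chrom, chunks = None, []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                # New record: report the N runs of the previous chromosome
                if chrom is not None:
                    for start, end in n_runs("".join(chunks), min_length):
                        yield chrom, start, end
                chrom, chunks = line[1:].split()[0], []
            else:
                chunks.append(line.strip())
    # Report the final chromosome in the file
    if chrom is not None:
        for start, end in n_runs("".join(chunks), min_length):
            yield chrom, start, end
```

If the intervals reported for chr14, chr15, and chrX cover the blank sections in the diagram, the holes are the expected masked regions rather than a mapping problem.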

6.4 years ago by Eric T. ★ 2.8k