Question: CNVkit output: is "log2" the same as "Seg_mean"? If not, how can I get "Seg_mean" from "log2"?
6 months ago by
Laven90
Laven90 wrote:

I have just generated my CNV files with CNVkit. I am wondering whether the "log2" column in the CNVkit output (after call) is the same as "Seg_mean". If not, how can I get "Seg_mean" from "log2"? Any advice would be appreciated, thanks!

cnv cnvkit seg_mean • 389 views
modified 4 months ago by Eric T.2.5k • written 6 months ago by Laven90

Please read: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002202 and add more relevant details to your question. What have you tried? Have you read the CNVkit paper?

written 6 months ago by RamRS23k

Here are two lines of what I get.

chromosome  start    end      log2         probes
chr1        826717   2410579  -0.00659771  487
chr1        2410780  2787772  -0.372291    70

Yes, I have read the CNVkit paper; here is the link.
The answer I found is this: Segment_Mean is the arithmetic mean of those probes' log2 copy ratio values.
But I am still unsure how to obtain "Segment_Mean". I need it as input to ABSOLUTE.

modified 4 months ago by RamRS23k • written 6 months ago by Laven90

I have also generated a CNV file with VarScan, but its "Segment_mean" values are far too large.

written 6 months ago by Laven90

I've moved this to a comment - please do not add an answer unless you're answering the top-level question. Plus, edit your question and add this information in there. Please read posts under /t/how-to for more information.

modified 6 months ago • written 6 months ago by RamRS23k
4 months ago by
Eric T.2.5k
San Francisco, CA
Eric T.2.5k wrote:

In the .cns files, yes, log2 is the segment mean in log2 scale. Details here: https://cnvkit.readthedocs.io/en/stable/fileformats.html

written 4 months ago by Eric T.2.5k
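
Since the log2 column of a .cns file is already the segment mean, preparing segment-level input for a downstream tool is mostly a matter of renaming columns. CNVkit provides an export seg subcommand for this; the sketch below only illustrates the same idea in plain Python. The function name cns_to_seg, the sample ID, the extra "ratio" field, and the SEG-style output column names are assumptions for illustration, so check which column names ABSOLUTE actually expects before relying on them.

```python
import csv
import io


def cns_to_seg(cns_text, sample_id):
    """Convert tab-separated CNVkit .cns text into SEG-style row dicts.

    The column names on the output side are an assumed SEG-like layout,
    not something confirmed in this thread.
    """
    reader = csv.DictReader(io.StringIO(cns_text), delimiter="\t")
    rows = []
    for rec in reader:
        log2 = float(rec["log2"])
        rows.append({
            "ID": sample_id,
            "chrom": rec["chromosome"],
            "loc.start": int(rec["start"]),
            "loc.end": int(rec["end"]),
            "num.mark": int(rec["probes"]),
            "seg.mean": log2,    # already the segment mean, in log2 scale
            "ratio": 2 ** log2,  # linear copy ratio, for reference only
        })
    return rows


# The two segments quoted earlier in the thread, as tab-separated text.
example = (
    "chromosome\tstart\tend\tlog2\tprobes\n"
    "chr1\t826717\t2410579\t-0.00659771\t487\n"
    "chr1\t2410780\t2787772\t-0.372291\t70\n"
)
segments = cns_to_seg(example, "sample1")
for seg in segments:
    print(seg)
```

Reading the header with csv.DictReader keeps the sketch independent of column order, which helps because newer .cns files carry additional columns.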

Thanks for your help!

I am now facing another problem with CNVkit; could you please give me some advice? I am running CNVkit to call CNVs from my whole-exome sequencing data, using a command like cnvkit.py batch -m amplicon -t targets.bed *.bam , but I do not have a targets.bed file. I also checked AstraZeneca's reference data repository but could not find one there either.

My questions are: 1) Is it correct to use -m amplicon ? 2) Is there a file containing all human exons that I can use with the guess_baits.py script? I am not sure where to get a complete BED file to use for guessing the baits.

I would appreciate any advice!

written 4 months ago by Laven90

For exome, -m hybrid is better than -m amplicon. You can verify that there are off-target reads by loading the BAM file in a viewer like IGV.

For guess_baits.py, try UCSC's RefSeq exons (refFlat.txt here), or another BED file of known genes from UCSC Genome Browser. Make sure the reference genome matches.

written 4 months ago by Eric T.2.5k
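
For readers who want exon-level candidate regions from refFlat.txt, here is a minimal sketch of pulling exon intervals out of the file. It assumes the standard UCSC refFlat column layout (geneName, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds). The function name refflat_to_exon_bed and the toy records are hypothetical, and this is not a description of what skg_convert.py does.

```python
def refflat_to_exon_bed(lines):
    """Return sorted, de-duplicated (chrom, start, end, gene) exon tuples.

    refFlat coordinates are 0-based, half-open, which matches BED,
    so no coordinate shift is applied.
    """
    exons = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        gene, chrom = fields[0], fields[2]
        # exonStarts / exonEnds are comma-separated lists with a trailing comma
        starts = [int(s) for s in fields[9].split(",") if s]
        ends = [int(e) for e in fields[10].split(",") if e]
        for start, end in zip(starts, ends):
            exons.add((chrom, start, end, gene))
    return sorted(exons)


# Toy records (made up): two transcripts of one gene sharing their first exon.
toy_refflat = [
    "GENE_A\tTX_1\tchr1\t+\t100\t500\t150\t450\t2\t100,300,\t200,500,\n",
    "GENE_A\tTX_2\tchr1\t+\t100\t500\t150\t450\t2\t100,400,\t200,500,\n",
]
exon_rows = refflat_to_exon_bed(toy_refflat)
for chrom, start, end, gene in exon_rows:
    print(f"{chrom}\t{start}\t{end}\t{gene}")
```

Identical exons shared between transcripts are collapsed by the set, but partially overlapping exons are kept as separate rows and may still need merging before use.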

Thanks a lot! I got it, but I also want to make sure I am doing the right thing. Here is what I did:

skg_convert.py refFlat.txt -t bed -o refFlat.bed
guess_baits.py bam1 bam2 -t refFlat.bed -o guess_baits.bed

But I get an error like this:

Loaded 80816 candidate regions from refFlat.bed
Evaluating targets in bam1
Processing reads in bam1
Time: 1281.040 seconds (205575 reads/sec, 61 bins/sec)
Summary: #bins=78477, #reads=263349347, mean=3355.7520, min=0.0, max=197074.45
Percent reads in regions: 279.509 (of 94218509 mapped)
Traceback (most recent call last):
  File "miniconda2/bin/guess_baits.py", line 246, in <module>
    baits = filter_targets(args.targets, args.sample_bams, args.processes)
  File "miniconda2/bin/guess_baits.py", line 54, in filter_targets
    "%d != %d" % (len(sample), len(baits))
AssertionError: 78477 != 80816

What does it mean?

modified 4 months ago • written 4 months ago by Laven90

Hmm, not sure, I'll take a look to see if there's a bug in guess_baits.py.

If you're building a pooled reference (multiple control samples), you can also just use the refFlat.bed file as-is and CNVkit will drop most of the uncaptured exons automatically.

written 3 months ago by Eric T.2.5k
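
The assertion in the traceback says that the number of bins with read counts (78477) differs from the number of candidate regions loaded (80816). One possible explanation, offered only as a guess and not confirmed in this thread, is that refFlat.bed contains several rows with identical coordinates, for example multiple transcripts of the same gene. A "Percent reads in regions" value above 100 would also be consistent with reads being counted in overlapping regions. A quick way to test that guess is to count distinct coordinates in the BED file; the function name count_duplicate_regions and the toy rows below are hypothetical.

```python
from collections import Counter


def count_duplicate_regions(bed_lines):
    """Return (total_rows, distinct_regions, duplicated) for BED lines.

    `duplicated` maps each (chrom, start, end) seen more than once
    to the number of rows that carry it.
    """
    counts = Counter()
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        counts[(chrom, int(start), int(end))] += 1
    total = sum(counts.values())
    duplicated = {region: n for region, n in counts.items() if n > 1}
    return total, len(counts), duplicated


# Toy rows (made up): two transcripts with identical coordinates.
toy_bed = [
    "chr1\t100\t500\tGENE_A\n",
    "chr1\t100\t500\tGENE_A\n",
    "chr1\t900\t1200\tGENE_B\n",
]
total, distinct, duplicated = count_duplicate_regions(toy_bed)
print(total, distinct, duplicated)

# For a real file (the path is an assumption):
# with open("refFlat.bed") as handle:
#     print(count_duplicate_regions(handle)[:2])
```

If the distinct count for refFlat.bed came out as 78477, that would support the duplicate-row explanation; if not, the cause lies elsewhere.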