Tool: ClinCNV: CNV detection from short reads
27 days ago, kuckunniwid wrote:

Dear community members,

we've prepared (yet another) tool for CNV detection, called ClinCNV. It has already been used to analyse around 5,000 samples sequenced on different platforms, and the results are quite good. We also benchmarked it and found that the tool is at least no worse than its competitors in the germline context and works better in the somatic context (using false discovery rate and concordance as metrics). It works for us and for our collaborators. However, since we aim to publish it, we would like to get some feedback first: is the README readable? Are the installation and file preparation easy? What can be improved?

The tool works on cohorts of samples and uses read depth (plus B-allele frequency, BAF, for somatic calling). It has quite a lot of features, such as clustering of samples prior to analysis, IGV visualization, calling of polymorphic regions, mosaic CNV calling, different options for FDR control, etc. For a quick overview, I'd recommend going directly to the docs. Try the test run with the command from here.

One limiting factor may be that we use ngs-bits for file preparation; however, it is an easy-to-install package, it is fast, and it has many useful features.

Please send me any feedback about the tool.

PS Yes, I could package it as an R package, but I do not see a huge advantage in doing so.

PPS I got a couple of questions in my mailbox; however, I think it makes sense to post anything of general interest here, so I encourage you to add comments or answers with questions, advice, or criticism here, and please, please, please let me know if nothing works.

UPD: the preprint describing the somatic part of ClinCNV is here. Please criticize it. I am not sure whether I am going to submit it anywhere, but at the very least I would like to update it to take your criticism into account. https://www.biorxiv.org/content/10.1101/837971v1

cnv tool variant calling cna • 317 views
modified 3 days ago • written 27 days ago by kuckunniwid

May have helped:

https://twitter.com/BiostarBot/status/1185192354438897665

:)

modified 27 days ago • written 27 days ago by Kevin Blighe

Thanks a lot, Kevin!

written 26 days ago by kuckunniwid

Hey man, I was doing the file preparation, and the manual says:

////// Then you need to merge your ".cov" files into one table. To do this, you can use script mergeFilesFromFolder.R script provided with ClinCNV using input_folder and output_folder as variables to keep your absolute paths:

Rscript mergeFilesFromFolder.R -i $input_folder -o $output_folder \\\\

But with --help you can see the following: -o CHARACTER, --out=CHARACTER output file name [default= out.txt]

The help text is right and the manual is not.

Also, could you make your script merge only the .cov files rather than everything in the folder? If it is not very hard, I think it would be good to allow wildcards: Rscript mergeFilesFromFolder.R -i *.cov -o batch.txt
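
The request to merge only the .cov files can be sketched in Python. This is a minimal sketch and not the logic of mergeFilesFromFolder.R. It assumes each per-sample .cov file is tab-separated, may start with header lines beginning with '#', and holds chr, start, end with the coverage in the last column. The function name merge_cov_files and the toy files are hypothetical.

```python
import glob
import os
import tempfile


def merge_cov_files(input_folder, pattern="*.cov"):
    """Merge per-sample coverage files matching `pattern` into one table.

    Assumed layout (hypothetical): tab-separated rows of
    chr, start, end, ..., coverage; lines starting with '#' are headers.
    """
    paths = sorted(glob.glob(os.path.join(input_folder, pattern)))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern} in {input_folder}")

    regions = None   # (chr, start, end) tuples taken from the first file
    samples = []     # sample names derived from the file names
    columns = []     # one list of coverage values per sample
    for path in paths:
        file_regions, coverage = [], []
        with open(path) as handle:
            for line in handle:
                if not line.strip() or line.startswith("#"):
                    continue
                fields = line.rstrip("\n").split("\t")
                file_regions.append(tuple(fields[:3]))
                coverage.append(fields[-1])
        if regions is None:
            regions = file_regions
        elif file_regions != regions:
            # Refuse to merge files computed on different region sets.
            raise ValueError(f"{path} does not cover the same regions")
        samples.append(os.path.splitext(os.path.basename(path))[0])
        columns.append(coverage)

    header = ["chr", "start", "end"] + samples
    rows = [list(region) + [column[i] for column in columns]
            for i, region in enumerate(regions)]
    return header, rows


# Toy demonstration: two .cov files and one unrelated file in the same folder.
with tempfile.TemporaryDirectory() as folder:
    for name, cov in (("s1.cov", "10.5"), ("s2.cov", "20.0")):
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("#chr\tstart\tend\tcov\n")
            handle.write(f"chr1\t100\t200\t{cov}\n")
    with open(os.path.join(folder, "notes.txt"), "w") as handle:
        handle.write("not a coverage file\n")
    header, rows = merge_cov_files(folder)

print("\t".join(header))
for row in rows:
    print("\t".join(row))
```

Matching on a glob pattern means stray files in the folder are ignored, and the region check stops the merge early if the samples were computed on different BED files.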

modified 8 days ago • written 8 days ago by danatyermakovich

Thanks a lot! I will fix it on Monday.

written 8 days ago by kuckunniwid

I've been trying to get past this error for nearly an hour. I think I can beat it, but I have to leave now. Maybe you can help me.

[1] "Percentage of regions remained after GC correction: 0.957518796992481"
Error in gcNormalisedCov[which(!bedFile[, 1] %in% c("chrX", "chrY")),  : 
  subscript out of bounds
Calls: writeOutLevelOfNoiseVersusCoverage -> apply
Execution halted

I processed files obtained with the TruSight Cardio Seq Kit (aligned to GRCh37_latest_genomic.fna). Yep, I think I haven't got chrY in my dataset.
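
A quick way to test the chrY suspicion is to count regions per chromosome in the BED file and in the merged coverage table. The sketch below is only an input check under stated assumptions; it does not reproduce what ClinCNV does internally. It assumes whitespace- or tab-separated rows whose second field is an integer start, and the function names and the toy coverage row are hypothetical.

```python
from collections import Counter


def chromosome_counts(lines):
    """Count data rows per chromosome, skipping headers and blank lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # A data row needs chr, start, end; headers have a non-numeric start.
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        counts[fields[0]] += 1
    return counts


def compare_inputs(bed_lines, cov_lines, expected=("chrX", "chrY")):
    """Report missing sex chromosomes and per-chromosome row mismatches."""
    bed = chromosome_counts(bed_lines)
    cov = chromosome_counts(cov_lines)
    mismatches = {
        chrom: (bed[chrom], cov[chrom])
        for chrom in sorted(set(bed) | set(cov))
        if bed[chrom] != cov[chrom]
    }
    return {
        "missing_in_bed": [c for c in expected if c not in bed],
        "missing_in_cov": [c for c in expected if c not in cov],
        "row_count_mismatch": mismatches,
    }


# Two BED rows taken from the listing in this post, plus a toy coverage table.
bed_lines = [
    "chr1\t2985722\t2985960\t0.7017",
    "chrX\t153607743\t153608479\t0.6644",
]
cov_lines = [
    "X.chr\tstart\tend\tsample_1",
    "chr1\t2985722\t2985960\t50.0",  # toy coverage value
]
report = compare_inputs(bed_lines, cov_lines)
print(report)
```

If chrY shows up under missing_in_bed, the panel simply has no chrY targets, which would be consistent with the guess above.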

#################################

Simple command (I made the simplest one for the first run on my data):

Rscript ~/progs/ClinCNV/clinCNV.R --normal $Files/batch_1.cov --bed $Path/gcAnnotated.extended_trusight.bed --out $Files/RES --numberOfThreads 8

The input files are shown below.

 1. head -n 10 my_bed.bed; tail -n 10 my_bed.bed

chr1    2985722 2985960 0.7017
chr1    3102587 3103138 0.6316
chr1    3160549 3160801 0.5556
chr1    3301612 3301950 0.5888
chr1    3303139 3303360 0.3801
chr1    3310955 3311158 0.7094
chr1    3312953 3313257 0.6151
chr1    3319253 3319662 0.6381
chr1    3321201 3321550 0.6476
chr1    3321957 3322312 0.7014
......
chrX    153607743   153608479   0.6644
chrX    153608492   153608827   0.5851
chrX    153609011   153609657   0.6022
chrX    153640079   153640651   0.6958
chrX    153641442   153641693   0.6096
chrX    153641717   153642004   0.6202
chrX    153642336   153642627   0.5258
chrX    153647780   153648185   0.5926
chrX    153648269   153648703   0.6221
chrX    153648895   153649443   0.6150

 2. head -n 5 batch_1.cov; tail -n 5 batch_2.cov

X.chr   start   end X100_S3_Srt X102_S5_Srt X104_S7_Srt X106_S4_Srt X107_S9_Srt X108_S10_Srt    X109_S11_Srt    X110_S5_Srt X111_S6_Srt X113_S8_Srt X114_S9_Srt X116_S10_Srt    X117_S11_Srt    X125_S2_Srt X127_S3_Srt X129_S4_Srt X130_S5_Srt X131_S6_Srt X132_S7_Srt X133_S8_Srt X135_S9_Srt X136_S10_Srt    X137_S11_Srt    X139_S12_Srt    X17_S2_Srt  X23_S5_Srt  X32_S4_Srt  X52_S1_Srt  X86_S2_Srt  B_S12_Srt   ry_S12_Srt
chr1    112318597   112319000   77.273  124.1538    120.196 27.2283 137.1762    137.6774    143.9801    26.3077 44.1663 28.0819 79.0943 37.2357 47.5509 108.5236    87.34   147.5881    79.5186 70.3871 100.9355    30.3772 129.4888    153.3052    90.6998 126.866 115.6725    68.5782 120.9082    114.8635    46.7395 52.9504 82.603
chr1    112319546   112319995   83.412  116.5367    115.8998    35.1849 107.0111    105.4454    127.0022    26.92247.9599   30.7461 100.5323    42.6303 49.9844 56.6414 66.5234 89.5278 55.8842 56.098  70.5702 23.2138 84.902  121.3163    61.9198 79.0757 119.4053    61.0913 138.4454    97.4232 52.4365 53  71.0045
chr1    112320956   112321214   57.124  111.7713    79.3837 22.593  88.5155 96.3605 114.5349    21.2364 38.155  12.2519 66.7907 23.5426 40.8295 71.0116 65.1822 64.062  43.4612 47.1124 78.155  24.1434 70.2442 85.6822 58.8837 54.2326 81.1705 46.4264 93.1008 65.2326 41.0659 32.8915 61.0543
chr1    112322745   112323036   109.0997    122.4158    125.433 48.9588 119.6632    116.6186    145.1684    36.7938 55.9072 38.3162 110.3196    48.0309 70.3643 85.457  65.7182 102.7148    57.1581 50.2749 83.9897 17.1031 79.8247 106.9244    85.1512 98.2027 98.6529 62.2027 109.9966    134.9485    74.5223 71.677  71.866
....
chrX    32867743    32868037    44.5816 61.1599 54.1429 30.1361 68.0204 122.5442    67.8163 9.6599  27.7211 7.1769  37.2517 35.2313 36.2755 43.0306 30.9966 31.5612 27.5816 50.4014 38.5986 26.8776 25.6769 57.3061 69.7891 30.051  52.8299 52.3639 41.0646 32.381  34.53421.2483   74.5102
chrX    33038154    33038417    35.711  107.3992    104.365 32.7452 108.3954    168.0798    78.057  8.4335  20.7376 8.045646.6882   31.5247 42.384  102.7224    47.8669 86.5247 62.7224 104.9696    97.1711 45.1977 80.4259 164.0875    176.8327    35.1255 56.5323 81.365  79.0951 32.4563 24.6388 13.3612 109.8175
chrX    33146162    33146382    65.8364 114.0682    107.6   48.7227 136.7   190.2455    110.3227    19.6545 35.5818 26.9364 89.5682 73.5955 81.7864 74.1    31.6636 48.15   42.4227 63.0727 78.4136 36.5591 52.2864 130.5909    144.1955    49.9455 94.1136 148.0182    102.3182    83.4455 48.0136 38.7682 126.65
chrX    33229297    33229529    43.3793 77.8534 79.5259 32.6121 84.7672 128.0345    79.2629 15.7974 23.2328 12.1983 36.9353 26.4569 44.2241 57.9828 35.8448 40.0517 33.5043 79.9526 58.3534 18.8578 51.2198 97.6681 105.6595    30.0259 65.0948 78.3103 48.8966 41.3922 27.4397 15.8922 96.7543
chrX    33357274    33357482    43.4183 98.2356 69.0529 34.0288 83.2548 155.5817    81.6442 8.5481  36.6731 15.9567 47.4519 36.0337 61.3413 54.2452 39.9567 58.2837 34.7596 72.8221 62.0721 32.8798 56.0385 117.0337    123.8413    51.4038 74.3077 78.7692 77.7548 55.899  23.7404 24.7452 96.7163

P.S. Biostar makes a hot mess of this post when publishing; I don't know how to preserve the table layout of the data.
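
Some values in the listing above appear fused, for example 26.92247.9599, most likely from the paste. A small check like the following can tell whether the file on disk has the same problem. It is a sketch under assumptions: the first line is taken as the header, the first three columns are chr, start, end, and every remaining column should parse as a number. The function name malformed_rows and the shortened toy table are hypothetical.

```python
def malformed_rows(lines):
    """Return (line_number, reasons) for rows that do not match the header."""
    expected = len(lines[0].split())
    problems = []
    for number, line in enumerate(lines[1:], start=2):
        fields = line.split()
        reasons = []
        if len(fields) != expected:
            reasons.append(f"{len(fields)} fields, expected {expected}")
        # Columns after chr, start, end should be numeric coverage values.
        for value in fields[3:]:
            try:
                float(value)
            except ValueError:
                reasons.append(f"not a number: {value}")
        if reasons:
            problems.append((number, reasons))
    return problems


# Shortened toy table; the fused value is copied from the listing above.
table = [
    "X.chr\tstart\tend\tS1\tS2\tS3",
    "chr1\t1\t2\t26.3077\t44.1663\t28.0819",
    "chr1\t3\t4\t26.92247.9599\t30.7461",
]
problems = malformed_rows(table)
for number, reasons in problems:
    print(f"line {number}: " + "; ".join(reasons))
```

An empty result on the real batch file would mean the table itself is intact and only the pasted view is damaged.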

modified 8 days ago by Kevin Blighe • written 8 days ago by danatyermakovich

Hey, I tidied your code and output via the 101 010 button.

written 8 days ago by Kevin Blighe

Tidied again

written 8 days ago by Kevin Blighe

Oh, thanks, I see now how the magic 101 010 button works :) Sorry for the mess; I think this is my first post on Biostar.

written 8 days ago by danatyermakovich

For now, ClinCNV does not like small gene panels, mainly due to a lack of testing: we simply have not included small panels in our test routine. ClinCNV prefers bigger panels because it performs GC and length normalization, which is not so easy in small panels. I'll work on it again on Monday, but what you can try right now is to divide your on-target BED file into pieces of, for example, 150 bp with the BedChunk command. How to use the command is described in the off-target reads section. Then recalculate the coverage and run it again. As far as I remember, this solved the problem for our collaborators with the same panel.
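
To illustrate what dividing a target region into pieces of roughly 150 bp means, here is a minimal Python sketch. It is not the BedChunk implementation, and its exact splitting rule may differ. The sketch assumes half-open BED coordinates and splits each region into near-equal pieces no longer than the requested size. The function name chunk_region is hypothetical.

```python
def chunk_region(chrom, start, end, chunk_size=150):
    """Split the half-open interval [start, end) into near-equal pieces.

    Each piece is at most `chunk_size` bp long and the pieces tile the
    region without gaps or overlaps.
    """
    length = end - start
    if length <= 0:
        raise ValueError("end must be greater than start")
    n_chunks = -(-length // chunk_size)   # ceiling division
    base, extra = divmod(length, n_chunks)
    pieces = []
    position = start
    for index in range(n_chunks):
        # The first `extra` pieces are one base longer than the rest.
        size = base + (1 if index < extra else 0)
        pieces.append((chrom, position, position + size))
        position += size
    return pieces


# A 551 bp region taken from the BED listing earlier in this thread.
pieces = chunk_region("chr1", 3102587, 3103138)
for chrom, piece_start, piece_end in pieces:
    print(chrom, piece_start, piece_end, sep="\t")
```

Near-equal pieces avoid leaving a very short last fragment, which would otherwise produce a noisy coverage value for that piece.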

written 8 days ago by kuckunniwid

okay, thanks. I'll try it today

written 3 days ago by danatyermakovich

I found a test case that reproduces your error. I will fix it ASAP and write to you once it is fixed.

written 3 days ago by kuckunniwid

I had some free time and sent my data to German. I did it a few minutes ago; it seems I was too late. Sorry :| Anyway, I hope the error can be fixed easily.

written 3 days ago by danatyermakovich

Try doing a git pull now =) and run the same command. It should work.

modified 3 days ago • written 3 days ago by kuckunniwid

Thanks for the data. It does work; I've sent the results back to you.

written 3 days ago by kuckunniwid