Advice on how to generate synthetic Copy Number Variation data
18 months ago
K.patel5 ▴ 140

Dear Biostars,

I have developed a CNV detection pipeline for my WES data. While I think it works reasonably well, I would like to test its performance against synthetic WES data containing CNVs that I have introduced myself. Does anyone have experience generating synthetic WES data from a FASTQ or BED file? Or would it be better to spike duplications or deletions into existing FASTQ files, which could then be used as controls? Any advice on tools that can do this would be really appreciated.

CNV genomics biology synthetic WES • 745 views
18 months ago
Prash ▴ 270

Dear K.patel5, this is excellent! Many exome pipelines predict putative or probable CNVs that may not be bona fide. Realistically, such calls are as good as synthetic until they are validated. You could treat them as positives to infer whether or not your pipeline performs well.

If your exome data yield fewer (and perhaps more accurate) CNVs, and you then check them against WGS, that should be fine. I am not sure this approach is sound, but to me it seems okay.

Our exomes have several such CNVs mapped too!

Regards Prash


Thank you for your reply, @Prash. Yes, I think this kind of validation would greatly benefit our analysis, but we lack any WGS data and only have WES samples. Are there any tools you could recommend for creating synthetic samples or spike-in controls?


My pleasure. Back around 2010, SLOPE was a wonderful tool, although the SVs called at the time were not especially precise: https://academic.oup.com/bioinformatics/article/26/21/2684/214667


Thanks @Prash, I can see that they demonstrate their detection tool by generating synthetic data, so I can follow this as a blueprint. I suppose there aren't many tools that can create deletions or duplications, so I will have to do this manually.
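For anyone going the manual route, here is a minimal sketch of one way to do it, not a tested pipeline. It assumes you edit a reference contig in memory (deleting a region, or inserting a tandem copy of it) and then simulate reads from the altered sequence with a read simulator of your choice. The function name `apply_cnvs` and the event format are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical event format: (start, end, kind), 0-based half-open
# coordinates on a single contig; kind is "DEL" or "DUP".
Event = Tuple[int, int, str]


def apply_cnvs(reference: str, events: List[Event]) -> str:
    """Return a copy of `reference` with each DEL region removed and each
    DUP region followed by one extra tandem copy. Events must not overlap."""
    pieces = []
    cursor = 0  # position in the reference up to which we have copied
    for start, end, kind in sorted(events):
        if start < cursor or start >= end or end > len(reference):
            raise ValueError(f"invalid or overlapping event: {(start, end, kind)}")
        # Copy the unchanged sequence before this event.
        pieces.append(reference[cursor:start])
        segment = reference[start:end]
        if kind == "DUP":
            pieces.append(segment * 2)  # original plus one tandem copy
        elif kind != "DEL":
            raise ValueError(f"unknown event kind: {kind}")
        # For "DEL" nothing is appended, so the segment is dropped.
        cursor = end
    # Copy whatever remains after the last event.
    pieces.append(reference[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    toy_reference = "AAAACCCCGGGGTTTT"
    toy_events = [(4, 8, "DEL"), (12, 16, "DUP")]
    print(apply_cnvs(toy_reference, toy_events))
```

Keeping the event list alongside the simulated reads gives you a truth set to score the pipeline's calls against. Note that this produces a homozygous change; for heterozygous events you would presumably mix reads simulated from the altered and the unaltered sequence.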

18 months ago
Prash ▴ 270

Yes, the best option would be to employ DeepVariant.
