Hello folks,
I am trying to call CNVs from a trio of exome samples (father, mother, son). The father and son are affected and the mother is unaffected. I used XHMM (eXome-Hidden Markov Model) to call CNVs from my exome sequencing reads. XHMM uses principal component analysis (PCA) normalization and a hidden Markov model (HMM) to detect and genotype copy number variation (CNV) from normalized read-depth data from targeted sequencing experiments.
First, I tried using only my trio samples, and the CNV output file was empty. Later I read that I should have at least 30 samples to predict CNVs. Therefore, I added 50 other exome samples that were sequenced in our centre and ran them together with these three samples.
The trio exome samples were sequenced using the Nextera Exome Capture Kit (214,126 intervals according to the capture BED file).
The 50 other exome samples were sequenced using the Nimblegen SeqCap EZ Exome Capture Kit (195,031 intervals according to the capture BED file).
Can I merge the exome interval lists from two different kits?
Is there any other alternative for me to identify potential CNVs?
Thanks, Sean, for your help; I really appreciate it. Just for testing purposes, I used the Nextera Exome Capture BED file for both the trio (originally Nextera) and the 50 other (originally Nimblegen) exome samples. I got ~2000 CNVs from the 50 samples and 340 CNVs from the trio samples.
I will read both papers and try EXCAVATOR and CoNIFER. I am also planning to use only the targets that intersect between the Nextera Exome and Nimblegen Exome capture kits. To get the intersecting targets, I am going to use bedtools with a 90% overlap requirement.
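To make the 90% overlap idea concrete, here is a minimal pure-Python sketch of that target intersection. It assumes "90% overlap" means reciprocal overlap (the shared segment covers at least 90% of both targets), which I believe corresponds to `bedtools intersect -a A.bed -b B.bed -f 0.90 -r`. The function names and the toy intervals are made up for illustration, and the nested loop is not optimized for whole-exome BED files.

```python
from collections import defaultdict

def read_bed(lines):
    """Parse BED lines into {chrom: sorted list of (start, end)}."""
    targets = defaultdict(list)
    for line in lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        targets[chrom].append((int(start), int(end)))
    for chrom in targets:
        targets[chrom].sort()
    return targets

def reciprocal_overlaps(a, b, min_frac=0.90):
    """Return shared segments where the overlap covers >= min_frac of BOTH targets."""
    shared = []
    for chrom in sorted(set(a) & set(b)):
        for a_start, a_end in a[chrom]:
            for b_start, b_end in b[chrom]:
                if b_start >= a_end:
                    break  # b is sorted by start, so no later target can overlap
                overlap = min(a_end, b_end) - max(a_start, b_start)
                if overlap <= 0:
                    continue
                if (overlap >= min_frac * (a_end - a_start)
                        and overlap >= min_frac * (b_end - b_start)):
                    shared.append((chrom, max(a_start, b_start), min(a_end, b_end)))
    return shared

# Toy intervals (hypothetical, not taken from either kit's BED file).
nextera = read_bed(["chr1\t100\t200", "chr1\t300\t400", "chr2\t50\t150"])
nimblegen = read_bed(["chr1\t105\t200", "chr1\t350\t500", "chr2\t50\t150"])
shared = reciprocal_overlaps(nextera, nimblegen)
for chrom, start, end in shared:
    print(chrom, start, end, sep="\t")
```

In the toy data, the chr1 targets at 300-400 and 350-500 share only 50 bp, so they are dropped. With the real kit files, the resulting shared target list would be the one to pass to the read-depth step for all 53 samples.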