I am trying to analyse read counts from targeted sequencing to identify potential copy number variants (CNVs).
I sequenced the exons of some mouse genes of interest with an Ion Torrent sequencer. This is DNA-seq, not RNA-seq, and the DNA was quantified prior to sequencing so that the same amount of DNA was used for each mouse.
So I have a matrix with the number of reads per amplicon for each of my mice. However, these are raw counts, so I can't compare them directly.
What is the best method to normalize my data?
First, I normalized each amplicon count by the total number of reads for that mouse. Second, I normalized each amplicon by its median, where the median of an amplicon is calculated from the values of that amplicon across all mice.
But I have a lot of false positives with this method.
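To make the two steps above concrete, here is a minimal sketch in Python with NumPy. It assumes the counts are stored as an amplicons x mice array; the function name `normalize_counts` and the toy values are hypothetical and only there for illustration, not real data.

```python
import numpy as np

def normalize_counts(counts):
    """Two-step normalization of raw read counts.

    counts: 2-D array, rows = amplicons, columns = mice (assumed layout).
    """
    counts = np.asarray(counts, dtype=float)
    # Step 1: divide each amplicon count by the total reads of its mouse
    # (column sums), giving the fraction of reads per amplicon.
    lib_norm = counts / counts.sum(axis=0, keepdims=True)
    # Step 2: divide each amplicon (row) by its median across all mice.
    # Note: an amplicon whose median is 0 would produce inf/nan here.
    amp_median = np.median(lib_norm, axis=1, keepdims=True)
    return lib_norm / amp_median

# Toy data, invented for illustration only: 3 amplicons x 4 mice.
toy = np.array([[100, 200, 150, 100],
                [ 50, 100,  75, 100],
                [ 50, 100,  75,  50]])
ratios = normalize_counts(toy)
print(ratios)
```

With this layout, a ratio close to 1 means the amplicon behaves like the median mouse, and values well above or below 1 are the candidates I am currently calling as CNV.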