Hope this isn't a silly question. I have identified a region of my bacterial genome with higher-than-average coverage. Using
bedtools genomecov, I calculated and plotted the coverage across a specific gene:
https://i.imgur.com/pMs5bbZ.png
Note: the X axis is the position in the genome, and the Y axis is the coverage per nucleotide.
The average coverage for this isolate is ~250x. Within this particular gene, coverage peaks at around 1500x but then dips to ~0.
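To put numbers on the plot, here is a minimal sketch of how the per-base depth could be expressed relative to the genome-wide baseline. It assumes the three-column per-base output of bedtools genomecov -d (chromosome, 1-based position, depth); the function names, the toy data and the choice of the median as the baseline are illustrative assumptions, not what was actually run to produce the plot.

```python
import io
from statistics import median


def load_depth(handle):
    """Parse per-base depth lines of the form chrom<TAB>pos<TAB>depth."""
    depth = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        chrom, pos, cov = line.split("\t")
        depth[(chrom, int(pos))] = int(cov)
    return depth


def region_ratio(depth, chrom, start, end):
    """Depth / genome-wide median for each base in the 1-based inclusive range."""
    # Assumption: the median is used as the baseline because a few very
    # high or very low regions would skew the mean.
    baseline = median(depth.values())
    if baseline == 0:
        raise ValueError("genome-wide median depth is 0; cannot normalise")
    return [depth.get((chrom, p), 0) / baseline for p in range(start, end + 1)]


# Toy data (hypothetical): six bases at 250x, two at 1500x, two at 0x.
toy_depths = [250] * 6 + [1500, 1500, 0, 0]
toy = io.StringIO(
    "".join(f"chr1\t{i}\t{d}\n" for i, d in enumerate(toy_depths, start=1))
)
depth = load_depth(toy)
ratios = region_ratio(depth, "chr1", 1, 10)
print(ratios)
```

With the figures above, a peak of 1500 over a baseline of 250 corresponds to a ratio of about 6, and the dip corresponds to a ratio near 0.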
My question is: how do I differentiate between a potential CNV (copy number variant) and a paralogous alignment? The graph above hints at paralogous sequence. However, there are two copies of the gene in the genome of the bacterium.