CNV detection in a small part of the genome is not something that many people do. There are a few reasons I can think of:
- Accurate CNV detection usually depends on training a hidden Markov model (HMM) on a specific type of data, and that data differs from platform to platform. What you would be asking of the program is effectively to create a new platform composed of just the SNPs you are interested in. This would require training the HMM on some samples of this type.
- Genomic waviness and population allele frequency depend, in large part, on your specific samples and the facility where they were genotyped. You will obtain more accurate CNV detection if you can evaluate these factors on a genome-wide scale.
- CNV border detection by HMM depends on evaluating a varying number of SNPs before and after the region you are interested in. By specifying an artificial start and stop position you may get inferior boundaries, since you will not be evaluating the information before and after your region.
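If you do restrict the analysis to a region anyway, one way to soften the boundary problem is to keep some flanking markers on each side. Here is a minimal Python sketch of that idea. The function name `region_with_flanks` and the default of 50 flanking markers are my own assumptions for illustration, not something any CNV caller prescribes.

```python
from bisect import bisect_left, bisect_right

def region_with_flanks(positions, start, end, flank=50):
    """Return (lo, hi) slice indices covering the markers in [start, end]
    plus up to `flank` extra markers on either side.

    `positions` must be the sorted base-pair positions of the markers on
    one chromosome. The flank size is an arbitrary illustrative default.
    """
    lo = bisect_left(positions, start)   # first marker at or after start
    hi = bisect_right(positions, end)    # one past the last marker at or before end
    return max(0, lo - flank), min(len(positions), hi + flank)

# Toy marker positions (hypothetical values).
positions = [100, 200, 300, 400, 500, 600, 700, 800]
lo, hi = region_with_flanks(positions, 350, 550, flank=2)
print(positions[lo:hi])
```

This only widens the window of markers handed to the caller. It does not replace genome-wide estimates of waviness or allele frequency.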
If you are truly interested in CNV detection in a specific region in many samples, you should check out methods that do not use an HMM, such as circular binary segmentation (CBS). The R DNAcopy package is an easy-to-use implementation. Keep in mind that the results from CBS may not be as accurate as applying an HMM to your data.
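To give a feel for what circular binary segmentation does, here is a rough Python sketch of the core idea: find the arc of consecutive markers whose mean differs most from the rest, split there if the difference is large enough, and recurse. This is not DNAcopy's implementation. In particular, DNAcopy judges significance with a permutation-based reference distribution, whereas this sketch uses a fixed cutoff. The names `best_arc` and `segment`, the `threshold` of 4.0 and the `min_size` of 3 are all assumptions I made for the example.

```python
import math

def best_arc(values, min_size):
    """Return (score, i, j) for the arc values[i:j] whose mean differs most
    from the mean of the remaining points, in standardized units."""
    n = len(values)
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    total = prefix[-1]
    mean = total / n
    # Scale estimate for the whole segment, floored to avoid dividing by ~0.
    sd = max(math.sqrt(sum((v - mean) ** 2 for v in values) / n), 1e-9)
    best = (0.0, None, None)
    for i in range(n):
        for j in range(i + min_size, n + 1):
            k = j - i                      # markers inside the arc
            if n - k < min_size:           # need enough markers outside too
                continue
            inside_sum = prefix[j] - prefix[i]
            diff = inside_sum / k - (total - inside_sum) / (n - k)
            score = abs(diff) / (sd * math.sqrt(1.0 / k + 1.0 / (n - k)))
            if score > best[0]:
                best = (score, i, j)
    return best

def segment(values, offset=0, threshold=4.0, min_size=3):
    """Recursively segment `values`; return (start, end, mean) tuples with
    `end` exclusive. The fixed threshold is an illustrative assumption."""
    n = len(values)
    if n >= 2 * min_size:
        score, i, j = best_arc(values, min_size)
        if i is not None and score >= threshold:
            out = []
            for a, b in ((0, i), (i, j), (j, n)):
                if b > a:                  # skip empty flanks
                    out += segment(values[a:b], offset + a, threshold, min_size)
            return out
    return [(offset, offset + n, sum(values) / n)]

# Toy log R ratio values with a six-marker deletion (hypothetical data).
lrr = [0.0] * 10 + [-0.6] * 6 + [0.0] * 10
for start, end, mean in segment(lrr):
    print(start, end, round(mean, 3))
```

The arc search is quadratic in the number of markers, which is fine for a small region but is one reason real implementations are more careful. For actual data I would still use DNAcopy rather than something like this.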
If you are worried about speed, you might try PennCNV. This will be faster than running GenomeStudio, and if you have multiple cores you can split your input files and run multiple instances at once. I routinely run CNV analysis on 3000 samples using 12 cores, and it takes a couple of hours on an Illumina 1M platform.
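Splitting the input across cores can be as simple as dividing the list of per-sample signal files into batches and starting one instance per batch. Below is a small Python sketch of the splitting step only. The file names, the `batch_XX.list` naming and both function names are hypothetical, and I am assuming your PennCNV setup can take a text file with one signal file path per line, so check that against your version's documentation.

```python
from pathlib import Path

def split_into_batches(sample_files, n_batches):
    """Distribute files round-robin into at most `n_batches` non-empty lists."""
    batches = [sample_files[k::n_batches] for k in range(n_batches)]
    return [b for b in batches if b]

def write_batch_lists(batches, out_dir):
    """Write one list file per batch (one path per line) and return the paths.
    The batch_XX.list naming scheme is an arbitrary choice for this sketch."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for k, batch in enumerate(batches):
        path = out_dir / f"batch_{k:02d}.list"
        path.write_text("\n".join(batch) + "\n")
        paths.append(path)
    return paths

# Ten hypothetical signal files spread over three batches.
samples = [f"sample_{i:04d}.txt" for i in range(10)]
batches = split_into_batches(samples, 3)
print([len(b) for b in batches])
```

Each list file can then be handed to its own instance, for example from a shell loop or a job scheduler, with one instance per core.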