Can someone help me figure out the basic biological difference between differential gene expression analysis and copy number variation analysis between two conditions (case/control), using either array or sequencing data?
In common parlance, copy number analysis looks at genomic regions (so, DNA).
Differential gene expression assays transcription of genes (so, it's kind of like copy number of RNA transcripts).
In both cases it is helpful to compare to controls. So for copy number analysis we usually compare to genomic variation across the population. For differential gene expression it is helpful to compare disease state vs controls -- the relative difference in RNA transcript abundance is your differential expression.
Either can be assayed via array or sequencing -- although there are major strengths and weaknesses of both.
Alex already explained it, but I'll just restate it slightly differently. You have to remember your basic biology. Copy number analysis refers to measuring DNA in the cell. Specifically, how many copies of a given locus exist in the cell? In most eukaryotic cells there is only one copy of the genome (haploid) or two copies of the genome (diploid). If a locus gets duplicated, or an entire chromosome gets duplicated, this can be detected by microarray or sequencing, because the number of copies increases by 1. Given diploid cells with an extra copy of some locus, the difference between 2 and 3 copies of something is detectable.
Differential gene expression, on the other hand, is a very different scenario. You're detecting the product of gene expression: RNA. When genes are expressed they produce multiple copies of RNA corresponding to the sequence at the gene locus. Say a given cell has 25,000 loci that can be expressed (still only 2 copies of the genome, but 25,000 places on the genome that can produce RNA), and each locus can produce between 0 and 50,000 copies of a transcript per cell. Thus between 2 conditions you have 25,000 elements to measure, and you can't make assumptions about the concentration of any specific element in condition 1 or condition 2. But you can measure the difference with 25,000 probes on an array, or map sequence reads to the 25,000 loci and count them under each condition.
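To make the "map reads to loci and count them under each condition" idea a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the gene names and read counts are made up, the counts-per-million scaling and the pseudocount of 1 are just one simple way to make two samples comparable, and real tools use more careful normalization and statistics than this.

```python
import math

def counts_per_million(counts):
    """Scale raw read counts by library size so two samples are comparable."""
    total = sum(counts.values())
    return {locus: 1e6 * n / total for locus, n in counts.items()}

def log2_fold_changes(case_counts, control_counts, pseudocount=1.0):
    """Per-locus log2(case / control) on CPM values.

    The pseudocount (an assumption, not a standard) avoids taking log of zero
    when a locus has no reads in one condition.
    """
    case_cpm = counts_per_million(case_counts)
    control_cpm = counts_per_million(control_counts)
    return {
        locus: math.log2(
            (case_cpm[locus] + pseudocount) / (control_cpm[locus] + pseudocount)
        )
        for locus in control_counts
    }

# Made-up read counts for three hypothetical loci
control = {"geneA": 500, "geneB": 500, "geneC": 0}
case = {"geneA": 500, "geneB": 100, "geneC": 400}

lfc = log2_fold_changes(case, control)
for locus, value in sorted(lfc.items()):
    print(f"{locus}\t{value:+.2f}")
```

A locus that is unchanged comes out near 0, a locus that drops 5-fold comes out around -2.3, and a locus that is silent in the control but expressed in the case gets a very large positive value, which is the kind of spread you don't see in copy number data.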
So to summarize, for copy number analysis, the expectation is that you have large regions of DNA, most of which will not change their concentration by more than 2-fold (or at least by only a small fold change) between two conditions. Whereas for differential gene expression you have thousands of regions in the genome that can change their concentration (via RNA) over 4 or 5 orders of magnitude between two conditions. So you can use the same tools to collect data on each scenario, but your expectations of how the data behave will be different.
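If it helps to see the difference in scale as numbers, here is a small Python sketch that puts both cases on the same log2 ratio axis. The copy numbers and the 50,000 transcripts-per-cell figure come from the explanation above; using 1 transcript as the low end (instead of 0, which has no defined log ratio) is my own assumption for the sake of the arithmetic.

```python
import math

def log2_ratio(observed, reference):
    """log2 of observed over reference copies (works for DNA or RNA counts)."""
    return math.log2(observed / reference)

# DNA: diploid reference (2 copies) vs. a one-copy loss, a one-copy gain, or 4 copies
for copies in (1, 3, 4):
    print(f"DNA {copies} vs 2 copies: log2 ratio = {log2_ratio(copies, 2):+.3f}")

# RNA: a locus going from 1 to 50,000 transcripts per cell
# (the low end of 1 is an assumption; the high end is the figure used above)
rna_low, rna_high = 1, 50_000
print(f"RNA {rna_high} vs {rna_low} copies: log2 ratio = {log2_ratio(rna_high, rna_low):+.1f}")
print(f"  which is about {math.log10(rna_high / rna_low):.1f} orders of magnitude")
```

The DNA ratios all sit within about one log2 unit of zero, while the RNA example spans more than 15 log2 units, which is why the same array or sequencing platform needs different expectations (and usually different downstream analysis) for the two questions.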