First, you need to ask the people who submitted the samples whether they are true biological replicates or technical replicates.
Technical replicates are good for knowing how much variability your library prep and instrument add. A technical replicate would be like taking the liver from one mouse, cutting it into 4 pieces, and treating them like 4 different samples. Any variation between the samples should be an artifact of the library prep and sequencing procedure. So if the exact same sample prepped multiple ways leads to big swings in RPKM, then you know that your prep is lousing things up, and you are going to have very little precision in your estimate of what the "real" expression was in the one sample.
In general, Illumina instruments do a good job with technical replicates, so your samples should be very, very similar to each other. If that's the case, I think combining them might be okay.
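If you want to check that before combining, here is a minimal sketch. It assumes you have raw read counts per gene for each technical replicate; the gene names, the counts, and the 0.95 cutoff are all made up for illustration. It correlates the log-scaled counts and only sums the replicates if they agree closely.

```python
import math

# Hypothetical raw read counts per gene for two technical replicates
# of the same sample (made-up numbers, for illustration only).
tech_rep_1 = {"geneA": 120, "geneB": 15, "geneC": 980, "geneD": 0, "geneE": 43}
tech_rep_2 = {"geneA": 131, "geneB": 12, "geneC": 1010, "geneD": 1, "geneE": 39}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


genes = sorted(tech_rep_1)
# log2(count + 1) keeps a few highly expressed genes from dominating.
log_1 = [math.log2(tech_rep_1[g] + 1) for g in genes]
log_2 = [math.log2(tech_rep_2[g] + 1) for g in genes]
r = pearson(log_1, log_2)

# 0.95 is an arbitrary illustrative cutoff, not an established standard.
if r > 0.95:
    # Sum raw counts per gene to merge the technical replicates.
    combined = {g: tech_rep_1[g] + tech_rep_2[g] for g in genes}
else:
    combined = None

print(f"correlation = {r:.3f}; combined = {combined}")
```

Summing works on raw counts; if all you have is RPKM, you would need to go back to the counts (or re-normalize) rather than add the RPKM values together.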
Biological replicates are when you, say, expose 4 organisms to the same condition. Differences between the biological replicates are likely not artifacts, but are due to real variation between organisms. You hope that your condition is powerful enough that the differences between organisms exposed to the same condition will be quite a bit smaller than the differences between organisms exposed to different conditions.
For instance, let's say you check one control animal and one condition animal. For one gene, the control animal has an RPKM of 3 and the condition animal has an RPKM of 8. A big difference, right?
Well, now you check a bunch of different control animals, and you see that the range of RPKM among control animals is 2-10, and the range among your condition animals is 3-11. Now it looks like the condition doesn't actually change expression of that gene; it's naturally pretty variable, and the two sets of animals look pretty much the same.
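To put numbers on that, here is a small sketch. The individual RPKM values are hypothetical, picked only so that they span the 2-10 and 3-11 ranges from the example; the point is how much the two groups overlap compared with the single-animal difference of 5.

```python
# Hypothetical RPKM values for one gene, chosen only to match the
# ranges in the example (control 2-10, condition 3-11).
control = [2, 3, 5, 6, 8, 10]
condition = [3, 4, 6, 8, 9, 11]


def overlap(a, b):
    """Return the (low, high) interval shared by the ranges of a and b, or None."""
    low = max(min(a), min(b))
    high = min(max(a), max(b))
    return (low, high) if low <= high else None


# The one-animal-per-group comparison from the example: 8 vs 3.
single_pair_diff = 8 - 3
shared = overlap(control, condition)

print(f"single-pair difference: {single_pair_diff}")
print(f"control range: {min(control)}-{max(control)}")
print(f"condition range: {min(condition)}-{max(condition)}")
print(f"shared interval: {shared}")
```

Nearly the whole control range sits inside the condition range, so a difference of 5 between two individual animals is well within what you would see between two animals from the same group.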
So for biological replicates, you need to keep them separate, because you need to know the average and the variance for each gene. If there is a lot of organism-to-organism variability, you need to know how big it is relative to the difference between conditions, and combining all the biological replicates together will lose that information.
So biological replicates are required for good RNA-seq experiments. But people sometimes cut corners, so don't assume that that's what you have.
It's not the most sophisticated analysis around, but working out the average and variance for each gene in each group would be a useful, simple place to start. Then do unpaired t-tests, which compare the averages of the two groups in light of their variances and give you a p-value: how likely you would be to see a difference this big by chance if your condition didn't really change the expression of the gene. A small p-value suggests the change is real.
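As a rough sketch of that starting point, the snippet below works out the per-gene mean and variance in each group and then an unpaired t statistic. It uses Welch's version of the unpaired t-test (which doesn't assume equal variances in the two groups); that choice, the gene names, and the RPKM values are my assumptions for illustration, not something taken from your data.

```python
import math
import statistics

# Hypothetical RPKM values: gene -> one value per biological replicate.
control = {
    "geneA": [2.0, 4.0, 6.0, 10.0],
    "geneB": [20.0, 22.0, 19.0, 21.0],
}
treated = {
    "geneA": [3.0, 5.0, 7.0, 11.0],
    "geneB": [40.0, 44.0, 39.0, 41.0],
}


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two groups."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance is the sample variance (n - 1 denominator);
    # dividing by n gives the variance of each group's mean.
    var_mean_a = statistics.variance(a) / len(a)
    var_mean_b = statistics.variance(b) / len(b)
    t = (mean_b - mean_a) / math.sqrt(var_mean_a + var_mean_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_mean_a + var_mean_b) ** 2 / (
        var_mean_a ** 2 / (len(a) - 1) + var_mean_b ** 2 / (len(b) - 1)
    )
    return t, df


results = {}
for gene in control:
    t, df = welch_t(control[gene], treated[gene])
    results[gene] = (t, df)
    print(
        f"{gene}: control mean={statistics.mean(control[gene]):.2f} "
        f"var={statistics.variance(control[gene]):.2f}, "
        f"treated mean={statistics.mean(treated[gene]):.2f} "
        f"var={statistics.variance(treated[gene]):.2f}, "
        f"t={t:.2f}, df={df:.1f}"
    )
```

Here geneA has a small t because the groups overlap heavily, while geneB has a large one because the group means are far apart compared with the spread inside each group. To turn t and df into a p-value you need the t distribution; if SciPy is installed, `scipy.stats.ttest_ind(a, b, equal_var=False)` does the whole calculation in one call.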