I have a ChIP-seq dataset with the following design: two replicates of control, sequenced together with two replicates each of two transcription factor treatments. That makes a total of 6 FASTQ files, all from the same lane.
My question is: how do I normalize the data to compare each treatment with the control when the read depths differ this much? There are about 70 million and 8 million reads for rep1 and rep2 of the first treatment, and 2 million and 5 million reads for the controls. I am not sure about the total number of reads in the second treatment. I have pasted the bowtie2 alignment summaries below.
8077966 reads; of these:
  8077966 (100.00%) were unpaired; of these:
    2270927 (28.11%) aligned 0 times
    2605491 (32.25%) aligned exactly 1 time
    3201548 (39.63%) aligned >1 times
71.89% overall alignment rate
70910425 reads; of these:
  70910425 (100.00%) were unpaired; of these:
    32129717 (45.31%) aligned 0 times
    18056752 (25.46%) aligned exactly 1 time
    20723956 (29.23%) aligned >1 times
54.69% overall alignment rate
5435992 reads; of these:
  5435992 (100.00%) were unpaired; of these:
    1252404 (23.04%) aligned 0 times
    1898388 (34.92%) aligned exactly 1 time
    2285200 (42.04%) aligned >1 times
76.96% overall alignment rate
2755776 reads; of these:
  2755776 (100.00%) were unpaired; of these:
    2129160 (77.26%) aligned 0 times
    277810 (10.08%) aligned exactly 1 time
    348806 (12.66%) aligned >1 times
22.74% overall alignment rate
Should I just merge the sorted BAM files of the replicates and use the merged file as MACS input, or analyze each replicate individually? I did the latter, and the results differed considerably: about 50 peaks for one and 400 peaks for another. I am not sure whether I should trust these analyses.
Another option is to normalize all three sample groups (C, T1, T2) together and perhaps look for coordinated regulation between T1 and T2 with respect to C. But my main concern is normalizing in such a way that each treatment can be compared to the control to identify direct targets.
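To make concrete what I mean by normalization, here is a rough sketch of plain depth scaling in Python, using the "aligned exactly 1 time" counts from the bowtie2 summaries above. The labels `lib_1` to `lib_4` are placeholders in the order the summaries were pasted, not real sample names, and using uniquely aligned reads as the library size is my own assumption. Scaling everything down to the smallest library is only one possible choice; whether it is appropriate here is exactly what I am asking.

```python
# Uniquely aligned read counts ("aligned exactly 1 time") copied from the
# bowtie2 summaries above, in the order they were pasted.
# NOTE: the lib_* labels are hypothetical placeholders, not sample names.
unique_reads = {
    "lib_1": 2605491,
    "lib_2": 18056752,
    "lib_3": 1898388,
    "lib_4": 277810,
}


def scale_factors(counts):
    """Return per-library factors that scale each library down to the smallest.

    A factor of 1.0 means the library is already the smallest; a factor of
    0.1 means signal from that library would be multiplied by 0.1 (or the
    library downsampled to 10% of its reads) to match the smallest depth.
    """
    smallest = min(counts.values())
    return {name: smallest / n for name, n in counts.items()}


factors = scale_factors(unique_reads)

for name, factor in factors.items():
    print(f"{name}: {unique_reads[name]} unique reads, scale factor {factor:.4f}")
```

With the shallowest library having only about 0.28 million uniquely aligned reads, this kind of scaling throws away most of the signal from the deep library, which is part of why I doubt it is the right approach.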
Thanks for suggestions and ideas. :)
P.S.: I played no role in designing this experiment ;P . The biologists have no clue as to why they did this.