I'd like to learn how to process ChIP-seq data by myself, so I chose a paper and am trying to mimic its data analysis steps. The details of the data: cell line: MCF7 // Illumina HiSeq 2000 // 50 bp // single-end // phred+33
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52964
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP033/SRP033492
I've checked the metadata one by one, and it really confuses me.
> GSM1278641 Xu_MUT_rep1_BAF155_MUT SRR1042593
> GSM1278642 Xu_MUT_rep1_Input SRR1042594
> GSM1278643 Xu_MUT_rep2_BAF155_MUT SRR1042595
> GSM1278644 Xu_MUT_rep2_Input SRR1042596
> GSM1278645 Xu_WT_rep1_BAF155 SRR1042597
> GSM1278646 Xu_WT_rep1_Input SRR1042598
> GSM1278647 Xu_WT_rep2_BAF155 SRR1042599
> GSM1278648 Xu_WT_rep2_Input SRR1042600
Is it possible for the control (input) data to be bigger than the treatment (ChIP) samples' data?
621M Jun 27 14:03 SRR1042593.sra (16.9M reads)
2.2G Jun 27 15:58 SRR1042594.sra (60.6M reads)
541M Jun 27 16:26 SRR1042595.sra (14.6M reads)
2.4G Jun 27 18:24 SRR1042596.sra (65.9M reads)
814M Jun 27 18:59 SRR1042597.sra (22.2M reads)
2.1G Jun 27 20:30 SRR1042598.sra (58.1M reads)
883M Jun 27 21:08 SRR1042599.sra (24.0M reads)
2.8G Jun 28 11:53 SRR1042600.sra (76.4M reads)
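To put a number on the size difference, here is a quick sketch that pairs each ChIP run with its input and divides the read counts. The read counts are copied from the listing above, and the pairing is my assumption from the GEO sample names (each `Input` matched to the BAF155 sample of the same condition and replicate); the names `reads_m`, `pairs` and `input_to_chip_ratio` are just made up for illustration.

```python
# Read counts in millions, copied from the file listing above.
reads_m = {
    "SRR1042593": 16.9, "SRR1042594": 60.6,
    "SRR1042595": 14.6, "SRR1042596": 65.9,
    "SRR1042597": 22.2, "SRR1042598": 58.1,
    "SRR1042599": 24.0, "SRR1042600": 76.4,
}

# (label, ChIP run, input run); pairing assumed from the GEO sample names.
pairs = [
    ("MUT_rep1", "SRR1042593", "SRR1042594"),
    ("MUT_rep2", "SRR1042595", "SRR1042596"),
    ("WT_rep1", "SRR1042597", "SRR1042598"),
    ("WT_rep2", "SRR1042599", "SRR1042600"),
]


def input_to_chip_ratio(chip_run, input_run):
    """Return how many times more reads the input has than the ChIP sample."""
    return reads_m[input_run] / reads_m[chip_run]


ratios = {label: round(input_to_chip_ratio(chip, ctrl), 2)
          for label, chip, ctrl in pairs}

for label, ratio in ratios.items():
    print(f"{label}: input has {ratio}x the reads of the ChIP sample")
```

In every pair the input has roughly 2.6 to 4.5 times as many reads as the matching ChIP sample.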
Wow, that's a huge font. I have no idea what you want to do, but it's still quite impressive.
It might be useful if you could tell us some things like what your experiment entails, how the libraries were created, where your data comes from, and so forth. Please be as complete as possible if you want useful responses. It's certainly possible for random people on the internet to research the sra files in your post, but I think most people are not interested in doing that. So if you want answers, make things as easy as possible for your potential answerers. It looks like you are trying to redo an existing analysis, but unless you state exactly what you are trying to accomplish, it's difficult to give advice.
Oh, sorry about that. I did give the link to NCBI, so anyone interested can look through the details on the GEO page.
That's why I didn't repeat the description of the experiment.
It's just a ChIP-seq data analysis question.
BAF155 is an important TF that can be methylated by CARM1.
So they did two types of ChIP-seq experiments:
One checks where BAF155 acts on the genome.
The other checks how the function of BAF155 changes when a mutation prevents it from being methylated.