ChIP-Seq Data Analysis, Newbie To-Do Steps
12.8 years ago
Patrick ▴ 40

Hi,

I am new to ChIP-seq data analysis. Given a genomic position range, I would like to find enrichment for:

  • H3K4me1 and H3K4me3
  • H3K9,14Ac
  • P300 occupancy
  • DNase activity sites
  • TFBS

I would like to know where to start: which data do I need, and where do I get it? I've seen a lot of data at ftp://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/ but the README file is not informative for a newbie.

My second question: what are the best open-source tools for this kind of analysis, and what steps should I follow? All the tutorials I have found explain what ChIP-seq is, not how to analyze the data.

Thank you for your help

chip-seq analysis • 4.1k views

Did you want to do the peak calling yourself, or use peaks picked already by ENCODE?


If the peaks are already called, then yes, I can use them at first and refine later if needed.


Also you've listed data you're interested in, but not any of the questions you're interested in.


Bailey T, Krajewski P, Ladunga I, Lefebvre C, Li Q, et al. (2013) Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data. PLoS Comput Biol 9(11): e1003326. doi:10.1371/journal.pcbi.1003326

http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003326

12.8 years ago
Aaron Statham ★ 1.1k

The ENCODE consortium makes raw and processed data available for download here. For example, to get H3K4me3 peaks in hg19, I would follow the Broad Histone link and download all files ending in "h3k4me3stdpk.broadpeak.gz".

Before you publish, though, make sure the data you use are past the embargo date listed on the left.
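
Once a peak file is downloaded, pulling out the peaks that fall inside a region of interest takes only a few lines. The sketch below is one minimal way to do it, not an ENCODE tool. It assumes a whitespace-delimited file whose first three columns are chromosome, start, and end in BED-style half-open coordinates. The function names `peaks_in_region` and `read_region` are made up for illustration.

```python
import gzip

def peaks_in_region(lines, chrom, start, end):
    """Yield (chrom, peak_start, peak_end) for peaks overlapping [start, end).

    Assumes BED-style half-open coordinates in the first three columns.
    """
    for line in lines:
        fields = line.split()
        # Skip blank lines, header lines, and other chromosomes.
        if len(fields) < 3 or fields[0] != chrom:
            continue
        peak_start, peak_end = int(fields[1]), int(fields[2])
        # Two half-open intervals overlap when each starts before the other ends.
        if peak_start < end and start < peak_end:
            yield fields[0], peak_start, peak_end

def read_region(path, chrom, start, end):
    """Convenience wrapper for a gzipped peak file (path is supplied by the caller)."""
    with gzip.open(path, "rt") as handle:
        return list(peaks_in_region(handle, chrom, start, end))
```

Because `peaks_in_region` accepts any iterable of lines, it can be tried on a few pasted lines before pointing `read_region` at a real download.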

12.8 years ago
Ian 6.0k

I recommend Galaxy for comparing genome coordinates.

Make sure all your datasets use the same reference genome, e.g. hg18.
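
Outside Galaxy, the same kind of coordinate comparison can be sketched in plain Python. The example below is only an illustration and is not how Galaxy works internally. It assumes each dataset is a list of (chrom, start, end) tuples on the same assembly with half-open coordinates; the function name `overlapping_pairs` and the coordinates in the usage lines are invented.

```python
def overlapping_pairs(peaks_a, peaks_b):
    """Return every pair (a, b) of intervals that overlap on the same chromosome."""
    # Group the second dataset by chromosome so each peak in the first
    # dataset is only compared against peaks on its own chromosome.
    by_chrom = {}
    for peak in peaks_b:
        by_chrom.setdefault(peak[0], []).append(peak)

    pairs = []
    for a in peaks_a:
        for b in by_chrom.get(a[0], []):
            # Half-open intervals overlap when each starts before the other ends.
            if a[1] < b[2] and b[1] < a[2]:
                pairs.append((a, b))
    return pairs

# Invented coordinates, both lists assumed to be on the same assembly.
h3k4me1 = [("chr22", 100, 200), ("chr22", 500, 600)]
p300 = [("chr22", 150, 250), ("chr1", 150, 250)]
print(overlapping_pairs(h3k4me1, p300))
```

This compares every pair of peaks within a chromosome, which is fine for a handful of regions; for genome-wide peak sets a dedicated interval tool is a better fit.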

12.8 years ago

A broadPeak file looks like this:

chr22   16847536        16863983        .       294     .       1.877598        12.7    -1
chr22   16850062        16850215        .       1000    .       13.626036       6.0     -1
chr22   16850752        16850925        .       1000    .       19.582503       15.4    -1
chr22   17306120        17307007        .       482     .       4.994549        6.9     -1
chr22   17394530        17395284        .       452     .       4.493068        3.2     -1

What do the columns refer to? I understand the first three, but what about the others?
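
For reference, the ENCODE broadPeak format as documented by UCSC has nine columns: chrom, chromStart, chromEnd, name, score (0 to 1000), strand, signalValue, pValue, and qValue. The pValue and qValue columns are -log10 values, with -1 meaning "not assigned", and "." in the name or strand column means none was given. The sketch below assumes the lines above follow that layout; the class name `BroadPeak` and the function `parse_broadpeak` are hypothetical.

```python
from typing import NamedTuple

class BroadPeak(NamedTuple):
    chrom: str           # chromosome name
    start: int           # chromStart (0-based)
    end: int             # chromEnd (exclusive)
    name: str            # "." when no name is assigned
    score: int           # 0-1000, used for display shading
    strand: str          # "+", "-", or "." when not applicable
    signal_value: float  # average enrichment over the region
    p_value: float       # -log10 p-value, -1 if not assigned
    q_value: float       # -log10 q-value, -1 if not assigned

def parse_broadpeak(line):
    """Parse one whitespace-delimited broadPeak line into a BroadPeak."""
    f = line.split()
    if len(f) != 9:
        raise ValueError(f"expected 9 columns, got {len(f)}")
    return BroadPeak(f[0], int(f[1]), int(f[2]), f[3], int(f[4]),
                     f[5], float(f[6]), float(f[7]), float(f[8]))

# Second line of the excerpt above.
peak = parse_broadpeak(
    "chr22   16850062        16850215        .       1000    .       13.626036       6.0     -1"
)
print(peak.chrom, peak.end - peak.start, peak.signal_value)
```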


Please create a new question for the above - it is now scheduled to be deleted because it is not an answer but a new question.
