Hi everyone,

I am very new to the bioinformatics field. I have been tasked with what seems like an insurmountable job: taking publicly available ChIP-seq data sets, mapping them, and overlaying the peak locations.

The data sets all come as different types of files, and I have no idea where to start.

I am not asking to be spoon-fed information; I would just like to know a workflow. Where do I start, and how do I progress?

Any help will be very much appreciated!

This workflow seems like a nice starting point: "NGS data analysis with R / Bioconductor: ChIP-Seq workflow"

There are several resources online (a web search will turn up more). These two should get you going:

http://www.biologie.ens.fr/~mthomas/other/chip-seq-training

http://www.nature.com/nprot/journal/v7/n1/full/nprot.2011.420.html

I have come across a problem though. My code for bowtie doesn't seem to be working...

bowtie index=mm8.1.ebwt -q SRR1514105.fastq  -v 2 -m 1 -3 1 -S 2> SRR1514105.out > SRR1514105.sam


It shoots back a message saying:

Error: unexpected symbol in "bowtie index"


When I remove "index" I get a similar message.

Secondly, I am struggling to understand how to load up MACS. The instructions provided with it are rather cryptic. I feel way out of my depth!

bowtie -v 2 -m 1 -3 1 -q -S mm8 SRR1514105.fastq 2> SRR1514105.out > SRR1514105.sam
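
For what it's worth, "Error: unexpected symbol in ..." is the message the R console prints when it cannot parse a line as R code, so my guess is that the command was typed at the R prompt. bowtie is a standalone command-line program and has to be run from a terminal (bash or similar) instead. Here is a minimal sketch of the same command as a small shell script, assuming bowtie (version 1) is on your PATH and the mm8 index files (mm8.1.ebwt and so on) are in the current directory. The variable names are only for illustration.

```shell
#!/bin/sh
# Run this from a terminal (bash/sh), not from the R console.
index=mm8                  # basename of the index files (mm8.1.ebwt, ...); assumed to be in this directory
reads=SRR1514105.fastq     # single-end reads in FASTQ format
sam="${reads%.fastq}.sam"  # alignments are written here
log="${reads%.fastq}.out"  # bowtie's alignment summary (stderr) is written here

if command -v bowtie >/dev/null 2>&1 && [ -f "$reads" ]; then
    # -v 2: allow at most 2 mismatches
    # -m 1: discard reads that align to more than one place
    # -3 1: trim 1 base from the 3' end of each read
    # -q:   input is FASTQ;  -S: output is SAM
    bowtie -v 2 -m 1 -3 1 -q -S "$index" "$reads" 2> "$log" > "$sam"
else
    echo "bowtie or $reads not found; check your PATH and working directory" >&2
fi
```

Note that the index is given as a bare basename (mm8), not as "index=mm8.1.ebwt", and that all options come before the index and the reads file.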


Regarding MACS (I presume MACS2), I'm not sure what you mean by "load" in this context. If you have it installed then you just run it (use the callpeak subcommand).
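
As a rough sketch, a callpeak run on the SAM file from the bowtie step could look like the following. I'm assuming a mouse sample (hence -g mm) and no control/input sample; if you have one, add it with -c. The output prefix and directory are placeholders, so rename them as you like.

```shell
#!/bin/sh
treatment=SRR1514105.sam   # ChIP alignments from bowtie
name=SRR1514105            # prefix for the output files (placeholder)
outdir=macs2_out           # output directory (placeholder)

if command -v macs2 >/dev/null 2>&1 && [ -f "$treatment" ]; then
    # -t: treatment (ChIP) alignments;  -f SAM: input format
    # -g mm: built-in effective genome size for mouse
    # -n: output file prefix;  --outdir: where results are written
    macs2 callpeak -t "$treatment" -f SAM -g mm -n "$name" --outdir "$outdir"
else
    echo "macs2 or $treatment not found; check your PATH and working directory" >&2
fi
```

With these settings the peak calls should end up in macs2_out/SRR1514105_peaks.narrowPeak, a BED-like file that is convenient for comparing peak locations between data sets.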

Thanks, I will try that. I'm very new to all of this, so I will probably have some dumb questions...

With MACS, I'm not really sure how to install it. I have everything downloaded, but unlike the packages in R, I cannot find any way to get it to install...

pip install --user macs2


Since you appear to be generally new to using the command line, you might also find "bioconda" to be useful. There, we present a single method to install most common bioinformatics packages.
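
For example, something along these lines should set up both tools from this thread in one go. This is only a sketch: it assumes conda (e.g. Miniconda) is already installed, and the environment name "chipseq" is made up for illustration.

```shell
#!/bin/sh
# Assumes conda is already installed; "chipseq" is just an example environment name.
env_name=chipseq
packages="bowtie macs2"

if command -v conda >/dev/null 2>&1; then
    # Create a separate environment holding bowtie and MACS2, pulled from the
    # conda-forge and bioconda channels ($packages is left unquoted on purpose
    # so that it expands to two separate package names).
    conda create -y -n "$env_name" -c conda-forge -c bioconda $packages
    echo "Done. Activate it with: conda activate $env_name"
else
    echo "conda not found; install Miniconda first" >&2
fi
```

After activating the environment, both bowtie and macs2 should be available as ordinary commands in that terminal session.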

