Question: Meaning of "#NAME?" columns in MACS2 peak output file
4.7 years ago by
nash.claire390 wrote:

Hi everyone,

I have just gone through my first ChIP-seq analysis with Galaxy and MACS2. To start, I'd like to point out that I have read the MACS2 and MACS 1.4.2 READMEs over and over again, and done many Google and forum searches for this answer, but I'm still not clear and would like some confirmation so that I'm not making the wrong assumptions.

After calling peaks with MACS2, I get a bed file and an xls file for my peaks. When I look at the column headers in my xls file, I have 10 columns like this:

chr start end length abs_summit pileup #NAME? fold_enrichment #NAME? name
chr1 860036 860236 201 860135 25 16.6324 7.08525 11.87312 MACS2_peak_1
chr1 879295 879531 237 879459 19 10.08018 5.00001 6.17729 MACS2_peak_2
chr1 895015 895266 252 895104 20 10.92848 5.25001 6.86955 MACS2_peak_3
chr1 933521 933733 213 933677 15 9.8888 5.66653 6.02773 MACS2_peak_4
chr1 949415 949781 367 949584 23 14.15974 6.28273



I have to say, I don't find the MACS documentation clear or all that useful for beginners, especially when it says things like "Information include: chromosome name, start position of peak, end position of peak etc etc". My table clearly includes the column headers described in the MACS README files, but I have 2 extra columns that aren't described and aren't labelled, and I'd like to know what they are. I don't want to assume the wrong things. Can anyone tell me what the "#NAME?" columns I have here are? My bed file seems to lack column headers for this information too.

Also, can someone help me understand the pileup column? I understand it to be the number of reads at the summit location. Is this number normalised to reads per million, or, if I want that information, do I divide these numbers by the library size after running MACS2?



Claire (a frustrated tired newbie)

chip-seq
4.7 years ago by
UNC Chapel Hill
Joseph Pearson450 wrote:

Check the formula for those two column headers.

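I'm betting those are -log10(pvalue) and -log10(qvalue). A spreadsheet program reads the leading minus sign as the start of a formula, can't evaluate it, and shows #NAME? instead.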
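One way to see the real headers without a spreadsheet is to read the file as plain tab-separated text. Below is a minimal sketch, assuming the usual MACS2 peaks xls layout (comment lines starting with `#`, then a header row, then one row per peak). The function name `read_macs2_xls` and the sample text are illustrative only; the two data rows are copied from the table in the question.

```python
import csv
import io


def read_macs2_xls(handle):
    """Parse a MACS2 peaks .xls file (really tab-separated text) into dicts.

    Assumes '#' comment lines and blank lines come before the header row.
    """
    data_lines = [
        line for line in handle
        if line.strip() and not line.startswith("#")
    ]
    return list(csv.DictReader(data_lines, delimiter="\t"))


# Illustrative sample. A spreadsheet may display the two "-log10(...)"
# header names as "#NAME?" because they start with a minus sign.
sample = (
    "# (comment lines written by MACS2 would appear here)\n"
    "\n"
    "chr\tstart\tend\tlength\tabs_summit\tpileup\t-log10(pvalue)\t"
    "fold_enrichment\t-log10(qvalue)\tname\n"
    "chr1\t860036\t860236\t201\t860135\t25\t16.6324\t7.08525\t11.87312\tMACS2_peak_1\n"
    "chr1\t879295\t879531\t237\t879459\t19\t10.08018\t5.00001\t6.17729\tMACS2_peak_2\n"
)

peaks = read_macs2_xls(io.StringIO(sample))
print(list(peaks[0].keys()))        # column names as written in the file
print(peaks[0]["-log10(pvalue)"])   # values are kept as strings
```

With a real file, the same function could be called on `open(path)` instead of the `io.StringIO` sample.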

As to your second question, here is a post from the MACS Google Groups describing what the pileup is.

(short answer: no, unless you used a flag to make it normalized)
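If you do want a per-million figure, the division asked about in the question is simple arithmetic. This is a sketch only: `pileup_per_million` is a made-up helper, and the library size of 20 million reads is an assumed example value, not a number from this thread. Whether this scaling is appropriate depends on how MACS2 computed the pileup for your particular run.

```python
def pileup_per_million(pileup, library_size):
    """Scale a raw pileup value to 'per million reads'.

    library_size is the total number of reads in the library (assumed
    to be known from elsewhere, e.g. your alignment statistics).
    """
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return pileup / (library_size / 1_000_000)


# Pileup of 25 (first peak in the question) with a hypothetical
# library of 20 million reads.
print(pileup_per_million(25, 20_000_000))
```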

Good luck!


Thanks Joseph!

3.8 years ago by
ariel.balter140 wrote:

The -n/--name parameter is just for your own use. For instance, I use a string made up of the treatment specs (<protein>_<[chip|input]>_<replicate#>).
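A naming scheme like that can be built with a small helper so every run is labelled the same way. This is just a sketch: `macs2_name` is a hypothetical function, and "CTCF" is a placeholder protein, not one mentioned in this thread.

```python
def macs2_name(protein, sample_type, replicate):
    """Build a run name of the form <protein>_<chip|input>_<replicate#>."""
    if sample_type not in ("chip", "input"):
        raise ValueError("sample_type must be 'chip' or 'input'")
    return f"{protein}_{sample_type}_{replicate}"


# Placeholder values for illustration.
name = macs2_name("CTCF", "chip", 1)
print(name)
```

The resulting string is what you would pass to `-n`. Judging by the MACS2_peak_1 entries in the question, the name also seems to end up as the prefix of each peak's name.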

