Meaning of "#NAME?" columns in MACS2 peak output file
8.8 years ago
nash.claire ▴ 490

Hi everyone,

I have just finished my first ChIP-seq analysis with Galaxy and MACS2. I have read the MACS2 and MACS 1.4.2 READMEs repeatedly and searched Google and the forums, but I'm still not clear on a few points and would like some confirmation so that I don't make the wrong assumptions.

After calling peaks with MACS2, I get a BED file and an xls file for my peaks. The xls file has 10 columns, with these headers:

chr      start      end        length     abs_summit   pileup    #NAME?     fold_enrichment   #NAME?      name
chr1     860036     860236     201        860135       25        16.6324    7.08525           11.87312    MACS2_peak_1
chr1     879295     879531     237        879459       19        10.08018   5.00001           6.17729     MACS2_peak_2
chr1     895015     895266     252        895104       20        10.92848   5.25001           6.86955     MACS2_peak_3
chr1     933521     933733     213        933677       15        9.8888     5.66653           6.02773     MACS2_peak_4
chr1     949415     949781     367        949584       23        14.15974   6.28273           9.6336      MACS2_peak_5

I have to say, I don't find the MACS documentation clear or very useful for beginners, especially when it says things like "Information include: chromosome name, start position of peak, end position of peak etc etc". My table clearly includes the column headers described in the MACS README files, but it also has 2 columns that aren't described or labelled, and I'd like to know what they are. Can anyone tell me what the "#NAME?" columns are? My BED file also lacks this information in its column headers.

Also, can someone help me understand the pileup column? As I understand it, this is the number of reads at the summit location. Is this number normalised to reads per million, or do I need to divide it by the library size myself after running MACS2?

Thanks!

Claire (a frustrated tired newbie)

ChIP-Seq • 5.3k views
8.8 years ago

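Click on those two header cells and check the formula bar.

I'm betting those are -log10(pvalue) and -log10(qvalue). Because the headers start with a minus sign, the spreadsheet program tries to evaluate them as formulas and displays "#NAME?" instead.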
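One way to check this without a spreadsheet program is to read the xls file as plain text, since MACS2 writes it as a tab-delimited table. The following is a minimal sketch, assuming the file has "#" comment lines and a blank line before the header row; the sample text and the `read_peaks` helper are made up for illustration, and the two header names starting with "-" follow the guess above.

```python
import csv
import io

# Sample text mimicking a MACS2 *_peaks.xls file (really tab-delimited text).
# The two header names starting with "-" are assumed, following the guess above.
SAMPLE = (
    "# comment lines at the top of the file start with '#'\n"
    "\n"
    "chr\tstart\tend\tlength\tabs_summit\tpileup\t-log10(pvalue)\t"
    "fold_enrichment\t-log10(qvalue)\tname\n"
    "chr1\t860036\t860236\t201\t860135\t25\t16.6324\t7.08525\t11.87312\tMACS2_peak_1\n"
    "chr1\t879295\t879531\t237\t879459\t19\t10.08018\t5.00001\t6.17729\tMACS2_peak_2\n"
)


def read_peaks(handle):
    """Return (header, rows) from a MACS2-style table, skipping '#' and blank lines."""
    kept = (line for line in handle if line.strip() and not line.startswith("#"))
    reader = csv.reader(kept, delimiter="\t")
    header = next(reader)  # first remaining line is the header row
    rows = [dict(zip(header, fields)) for fields in reader]
    return header, rows


header, rows = read_peaks(io.StringIO(SAMPLE))
print(header[6], header[8])  # the two columns shown as "#NAME?" in the spreadsheet
print(rows[0]["name"], rows[0]["-log10(pvalue)"])
```

For a real file, pass an open file handle (for example `open("my_peaks.xls")`, a hypothetical path) in place of the `io.StringIO` sample.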

As to your second question, here is a post from the MACS Google Groups describing what the pileup is.

(short answer: no, unless you used a flag to make it normalized)
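If you do want a per-million value, the division described in the question can be sketched as below. This is only an illustration: `pileup_per_million` is a hypothetical helper, the library size of 20 million reads is invented, and which read count to use (total reads, or reads kept after filtering) is left as an assumption for you to decide.

```python
def pileup_per_million(pileup, library_size):
    """Scale a raw pileup value to a per-million-reads value."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return pileup * 1_000_000 / library_size


# Pileup of 25 (MACS2_peak_1 in the question) with a hypothetical
# library of 20 million reads.
print(pileup_per_million(25, 20_000_000))
```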

Good luck!


Thanks Joseph!

7.9 years ago
ariel.balter ▴ 260

The -n/--name parameter is just for your own bookkeeping. For instance, I use a string made up of the treatment specs (<protein>_<[chip|input]>_<replicate#>).
