MACS2 Broad peaks output BED format
8.2 years ago

Hi,

I am trying to call peaks for a broad histone mark using the MACS2 broad peak caller (macs2 bdgbroadcall), which takes the bedGraph file produced by MACS2 as input.

The problem is the output file of macs2 bdgbroadcall: it appears to be a BED file with 15 columns and no header, so apart from the chromosome and the peak coordinates I don't know what I am looking at.

Any idea what the column headers are?

Thanks

BED15 MACS2 • 25k views
8.2 years ago
dally ▴ 210

Why not just call peaks using macs2 callpeak with the --broad option?

EDIT: Mis-read the question. Information below available at https://genome.ucsc.edu/FAQ/FAQformat.html.

This is the gappedPeak format, which is used to provide called regions of signal enrichment based on pooled, normalized (interpreted) data where the regions may be spliced or incorporate gaps in the genomic sequence. It is a BED12+3 format.

  1. chrom - Name of the chromosome (or contig, scaffold, etc.).
  2. chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
  3. chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.
  4. name - Name given to a region (preferably unique). Use '.' if no name is assigned.
  5. score - Indicates how dark the peak will be displayed in the browser (0-1000). If all scores were '0' when the data were submitted to the DCC, the DCC assigned scores 1-1000 based on signal value. Ideally the average signalValue per base spread is between 100-1000.
  6. strand - +/- to denote strand or orientation (whenever applicable). Use '.' if no orientation is assigned.
  7. thickStart - The starting position at which the feature is drawn thickly. Not used in gappedPeak type, set to 0.
  8. thickEnd - The ending position at which the feature is drawn thickly. Not used in gappedPeak type, set to 0.
  9. itemRgb - An RGB value of the form R,G,B (e.g. 255,0,0). Not used in gappedPeak type, set to 0.
  10. blockCount - The number of blocks (exons) in the BED line.
  11. blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
  12. blockStarts - A comma-separated list of block starts. The first value must be 0 and all of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.
  13. signalValue - Measurement of overall (usually, average) enrichment for the region.
  14. pValue - Measurement of statistical significance (-log10). Use -1 if no pValue is assigned.
  15. qValue - Measurement of statistical significance using false discovery rate (-log10). Use -1 if no qValue is assigned.
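
As a rough illustration of the column list above, here is a minimal Python sketch that labels the 15 tab-separated fields of one line and converts the block columns into absolute coordinates. The function names (`parse_gapped_peak_line`, `block_intervals`) are made up for this example and are not part of MACS2 or UCSC tooling, and it assumes your file really does follow the BED12+3 layout described above.

```python
# Column names taken from the BED12+3 (gappedPeak) description above.
GAPPED_PEAK_COLUMNS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "itemRgb", "blockCount", "blockSizes",
    "blockStarts", "signalValue", "pValue", "qValue",
]

INT_COLUMNS = ("chromStart", "chromEnd", "score", "thickStart", "thickEnd", "blockCount")
FLOAT_COLUMNS = ("signalValue", "pValue", "qValue")
LIST_COLUMNS = ("blockSizes", "blockStarts")


def parse_gapped_peak_line(line):
    """Turn one tab-separated 15-column line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GAPPED_PEAK_COLUMNS):
        raise ValueError(f"expected 15 columns, found {len(fields)}")
    record = dict(zip(GAPPED_PEAK_COLUMNS, fields))
    for key in INT_COLUMNS:
        record[key] = int(record[key])
    for key in FLOAT_COLUMNS:
        record[key] = float(record[key])
    for key in LIST_COLUMNS:
        # BED block lists may end with a trailing comma, so drop empty items.
        record[key] = [int(item) for item in record[key].split(",") if item]
    return record


def block_intervals(record):
    """Return absolute (start, end) pairs; blockStarts are relative to chromStart."""
    return [
        (record["chromStart"] + start, record["chromStart"] + start + size)
        for start, size in zip(record["blockStarts"], record["blockSizes"])
    ]


if __name__ == "__main__":
    # Hypothetical line, not real MACS2 output.
    example = "chr1\t1000\t2000\tpeak_1\t50\t.\t1000\t2000\t0\t2\t100,200\t0,800\t5.5\t7.2\t4.1"
    rec = parse_gapped_peak_line(example)
    print(rec["name"], block_intervals(rec))
```

Because pValue and qValue are stored as -log10 values, larger numbers mean stronger significance, and -1 means no value was assigned.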

Thanks, I will try it with the --broad parameter.

8.2 years ago
igor 13k

MACS README explains the output columns. Scroll down to the "Output files" section.

NAME_peaks.narrowPeak is a BED6+4 format file which contains the peak locations together with the peak summit, p-value and q-value. You can load it into the UCSC genome browser. The definitions of some specific columns are:

5th: integer score for display

7th: fold-change

8th: -log10pvalue

9th: -log10qvalue

10th: relative summit position to peak start

NAME_peaks.broadPeak is in BED6+3 format which is similar to narrowPeak file, except for missing the 10th column for annotating peak summits.
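
If it helps, the column definitions quoted above can be applied with a small Python sketch like the one below. It tells the two formats apart purely by column count (10 for narrowPeak, 9 for broadPeak). The function name `label_peak_line` and the dictionary keys are my own choices for illustration, not names used by MACS2.

```python
# Column labels based on the README excerpt above; the key names are illustrative.
BROAD_PEAK_COLUMNS = [
    "chrom", "start", "end", "name", "score", "strand",
    "fold_change", "neg_log10_pvalue", "neg_log10_qvalue",
]
NARROW_PEAK_COLUMNS = BROAD_PEAK_COLUMNS + ["summit_offset"]


def label_peak_line(line):
    """Label one narrowPeak (10 columns) or broadPeak (9 columns) line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) == len(NARROW_PEAK_COLUMNS):
        names = NARROW_PEAK_COLUMNS
    elif len(fields) == len(BROAD_PEAK_COLUMNS):
        names = BROAD_PEAK_COLUMNS
    else:
        raise ValueError(f"expected 9 or 10 columns, found {len(fields)}")
    record = dict(zip(names, fields))
    for key in ("start", "end", "score"):
        record[key] = int(record[key])
    for key in ("fold_change", "neg_log10_pvalue", "neg_log10_qvalue"):
        record[key] = float(record[key])
    if "summit_offset" in record:
        record["summit_offset"] = int(record["summit_offset"])
        # The 10th column is relative to the peak start, so add it to get the
        # absolute summit coordinate.
        record["summit"] = record["start"] + record["summit_offset"]
    return record


if __name__ == "__main__":
    # Hypothetical lines, not real MACS2 output.
    narrow = "chr2\t500\t900\tpeak_a\t120\t.\t6.5\t12.0\t9.5\t150"
    broad = "chr2\t500\t900\tpeak_a\t120\t.\t6.5\t12.0\t9.5"
    print(label_peak_line(narrow)["summit"])
    print(sorted(label_peak_line(broad)))
```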
