Bigwig format and operation
2.3 years ago

Dear users, I have read through the UCSC wiggle (BigWig) format documentation in depth. I understand variable-step and fixed-step encoding and how bigWig files are used for visualization in a genome browser. However, I am not clear on how ChIP-Seq and ATAC-Seq data are represented in bigWig format.

Below is an example of the Wig output obtained by converting a bigWig file with bigWigToWig. This example data is from ATAC-Seq.

#bedGraph section chr1:0-870999
chr1    0       9999    0
chr1    9999    10099   16.561
chr1    10099   10199   24.2045
chr1    10199   10299   2.54784
chr1    10299   10399   5.09568
chr1    10399   10499   11.4653
chr1    10499   10599   7.64352
chr1    10599   10699   3.82176
chr1    10699   13199   0


In this context, could someone help me understand:

1. What is the real-number value in the 4th column? Is it the read depth for that interval, or some transformed value of read depth? In ChIP-Seq or ATAC-Seq, what value would one typically put in this column?

2. If this is a threshold value, how do users specify thresholds for selecting significantly enriched regions? (I am not sure whether any statistical test is associated with this significance. I cannot find a reference, but users call it so.)

3. I have 6 ATAC-Seq bigWig files for 6 different samples. How do I find the regions of interest that are present in at least 4 of the samples?


oriolebaltimore: Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

I am not sure whether you added the " characters to try to format the example data. I left them in. You can edit the post and take them out if they are not part of the example.

Thank you!


Thanks. I added them on purpose for formatting, as I felt this is not code and saw no other formatting option for the sample lines from the wig file.


I cannot comment on the specifics of ATAC-Seq but your data is actually in bedGraph format, as described here. The value in the 4th column is the Y-axis value for a graph where the X-axis values are the range between columns 2 and 3.
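To make the "Y value over an X range" reading concrete, here is a minimal Python sketch that parses the lines from the question and looks up the value at a single position. The helper names `parse_bedgraph` and `value_at` are made up for this illustration, and the lookup assumes the usual bedGraph convention of 0-based, half-open intervals.

```python
SAMPLE = """\
#bedGraph section chr1:0-870999
chr1    0       9999    0
chr1    9999    10099   16.561
chr1    10099   10199   24.2045
chr1    10199   10299   2.54784
chr1    10699   13199   0
"""

def parse_bedgraph(text):
    """Return (chrom, start, end, value) tuples, skipping blank and '#' lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, start, end, value = line.split()
        records.append((chrom, int(start), int(end), float(value)))
    return records

def value_at(records, chrom, position):
    """Return the 4th-column value of the interval containing position.

    Assumes 0-based, half-open intervals (start <= position < end).
    Returns None when no interval covers the position.
    """
    for rec_chrom, start, end, value in records:
        if rec_chrom == chrom and start <= position < end:
            return value
    return None

records = parse_bedgraph(SAMPLE)
print(value_at(records, "chr1", 10050))
```

Every base from 9999 up to (but not including) 10099 shares the same value, which is why a browser draws this interval as one flat bar.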


I also responded to another similar post and asked the same question in the same forum. I apologize if that violates the duplicate-question rules.

2.3 years ago
ATpoint 48k

The bedGraph (or bigWig) format is always the same: chr, start, end, value. The value can be anything that can be associated with the stretch of DNA defined by the first three columns. It can be the raw read count for that interval, a normalized read count such as reads per million, an enrichment score for an experimental condition over a control experiment, the mean methylation score, the GC content, etc. Most commonly, people use bedGraph/bigWig to create browser tracks displaying the normalized read count across the genome, and in that case it does not matter whether the data is ATAC-seq, ChIP-seq or Whatever-seq. One simply counts the number of reads that cover each base and aggregates adjacent bases with equal coverage into one interval to make the files smaller, so if the first 100 bases of a chromosome have a coverage of 0, one would write:

chr1    0   100 0

instead of:

chr1    0   1   0
chr1    1   2   0
chr1    2   3   0
(...)
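The aggregation step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular coverage tool implements it; the function name `coverage_to_bedgraph` and the toy coverage vector are invented for the example, and intervals are written as 0-based, half-open.

```python
from itertools import groupby

def coverage_to_bedgraph(chrom, per_base_coverage):
    """Collapse runs of equal per-base coverage into bedGraph intervals.

    per_base_coverage[i] is the coverage of base i (0-based).
    Returns a list of (chrom, start, end, value) with half-open intervals.
    """
    intervals = []
    position = 0
    for value, run in groupby(per_base_coverage):
        run_length = sum(1 for _ in run)
        intervals.append((chrom, position, position + run_length, value))
        position += run_length
    return intervals

# Toy example: 100 uncovered bases, then 50 bases at depth 3, then 10 at depth 5.
coverage = [0] * 100 + [3] * 50 + [5] * 10
for chrom, start, end, value in coverage_to_bedgraph("chr1", coverage):
    print(f"{chrom}\t{start}\t{end}\t{value}")
```

The 160 per-base values collapse to three lines, the first of which is the `chr1 0 100 0` line shown above.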


For statistical analysis, one typically calls peaks (e.g. with MACS) and then builds a count matrix holding the raw counts for each replicate per peak. Significant differences between conditions are then inferred with appropriate statistical frameworks such as DESeq2, edgeR or csaw. Please use the search function and Google for differential analysis of ATAC-seq data; there is plenty of material available.
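For the "at least 4 of 6 samples" part of the question, one possible approach, once peaks have been called per sample, is a simple sweep over the peak boundaries. The sketch below is an assumption-laden illustration rather than an established pipeline: the function name `regions_in_at_least` and the toy peak lists are made up, all peaks are taken to be on one chromosome with 0-based, half-open coordinates, and peaks within a single sample are assumed not to overlap each other.

```python
def regions_in_at_least(peaks_per_sample, min_samples):
    """Return regions covered by peaks from at least min_samples samples.

    peaks_per_sample: one list of (start, end) peaks per sample, all on the
    same chromosome, half-open, non-overlapping within a sample (assumed).
    """
    events = []
    for peaks in peaks_per_sample:
        for start, end in peaks:
            events.append((start, 1))   # a peak opens
            events.append((end, -1))    # a peak closes
    # At equal positions handle openings first, so regions that touch stay joined.
    events.sort(key=lambda event: (event[0], -event[1]))

    regions = []
    depth = 0
    region_start = None
    for position, delta in events:
        previous = depth
        depth += delta
        if previous < min_samples <= depth:
            region_start = position
        elif depth < min_samples <= previous:
            if position > region_start:  # skip zero-length regions
                regions.append((region_start, position))
    return regions

# Toy peak calls for six hypothetical samples.
samples = [
    [(100, 200)],
    [(120, 220)],
    [(150, 250)],
    [(180, 300)],
    [(400, 500)],
    [],
]
print(regions_in_at_least(samples, 4))
```

In this toy input only the stretch from 180 to 200 is covered by four samples at once. Real data would need this run per chromosome, and the raw counts in the resulting regions could then feed the count matrix mentioned above.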


Perfect. Got it!! Thanks ATpoint.