Sample matrix for profile plots and heatmaps from computeMatrix (deepTools)
5 months ago
Farzaneh • 0

Hi everyone, I have a count matrix from featureCounts and a couple of peak (.bed) files. I want to visualize all the peaks together to show the coverage and compare them overall. I was going to use plotProfile and plotHeatmap from deepTools, but they need a matrix made by computeMatrix, which in turn depends on other deepTools steps such as binning the genome with bamCoverage. For some reason I cannot get computeMatrix to work; it gives me lots of errors. I think I'm not sure what the input files should look like. I have the coordinates of the peaks; should I annotate them first? Even when I use the annotated version, it still stops. Besides, it never accepts the .bw file that I already made with bamCoverage from the BAM files of the same replicate that I called the peaks on and am now trying to visualize.

Can anyone please send me the structure (a sample) of the matrix computed by this tool, so I can restructure my featureCounts matrix and use that instead? Thanks so much.

chip-seq deepTools chip-sequencing • 587 views
5 months ago

The basic computeMatrix call is:

computeMatrix <mode> -S <bigwig file(s)> -R <bed file(s)> --upstream <bp upstream> --downstream <bp downstream> -o <output matrix.gz>


The mode parameter is either scale-regions or reference-point. BigWigs from bamCoverage should be fine, and bed files need at least 3 columns: chromosome, start, end (to be safe, use the .bed extension).

The output matrix structure is:

• a first line with all the information about region ranges, labels, etc.
• the first 6 columns correspond to the columns of your regions (chromosome, start, end, regionName, score, strand); missing values are filled in automatically. If multiple bed files are provided, the regions of the second bed are appended to those of the first, and so on.
• after the first 6 columns there are as many columns as the number of bins requested (range_region_to_plot / binSize). This set of columns is repeated for each bigWig signal.
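
To make the layout above concrete, here is a minimal Python sketch that reads such a matrix back in. It assumes the file is gzip-compressed and tab-separated, and that the first line is an "@" followed by a JSON object, which is how deepTools appears to write it; the function name read_compute_matrix is made up for this example. As a rough check on the column count: with 1000 bp upstream, 1000 bp downstream and a bin size of 10 bp, each bigWig would contribute (1000 + 1000) / 10 = 200 columns.

```python
import gzip
import json


def read_compute_matrix(path):
    """Return (header, regions, values) from a computeMatrix-style .gz file.

    Assumption: line 1 is '@' + JSON; every other line is tab-separated,
    with 6 region columns followed by one value per bin.
    """
    with gzip.open(path, "rt") as handle:
        first_line = handle.readline()
        header = json.loads(first_line.lstrip("@"))
        regions, values = [], []
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # chromosome, start, end, regionName, score, strand
            regions.append(fields[:6])
            # per-bin signal; float() also accepts the string "nan"
            values.append([float(x) for x in fields[6:]])
    return header, regions, values
```

Calling read_compute_matrix on your own output file returns the header, the 6 region columns of each row, and the per-bin values, so you can inspect the structure directly. Note that a featureCounts table has one number per peak and sample rather than one per bin, so as far as I can tell it cannot simply be reshaped into this format.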

Have a look at panel A of the workflow figure in the Rseb package vignette.


I am trying to use your R package and things are fine, but I am still confused. Let me explain better. This is the format of my count matrix [Table1]:

gene_id               D1   D2   D3   I1   I2   I3
chr1.721200.726999    287  415  342  373  349  341
chr1.817800.821799    135  163  169  175  183  180
chr1.824800.826199    101  135  133  119  90   146
chr1.926400.928399    117  153  117  146  119  171
chr1.1030200.1030799  15   13   33   26   28   27


I have three replicates for each condition (D, the control, and I). The IDs are the peak coordinates, and the numbers are the read counts for each range in each replicate.

Range                      baseMean     log2FC        lfcSE        stat          pvalue       padj         DE.status
chr11.134033800.134035999  94.90617597  -1.164292979  0.185663217  -6.270994327  3.59E-10     3.49E-07     Down
chr13.87546000.87547399    47.82502954  -0.790517087  0.246031943  -3.213066871  0.001313257  0.043670531  Down
chrX.85982600.85984399     75.36774214  -0.717773107  0.200643491  -3.577355549  0.000347088  0.018684549  Down
chr18.12093200.12094199    61.20629885  -0.716081087  0.222221387  -3.222377008  0.001271317  0.043118093  Down


The table above shows my DESeq2 results on the regions/peaks. I treated the peak regions like genes and ran DESeq2. Now I have the regions that change significantly between my conditions. Then I went back to my peak count matrix [Table1], pulled out the rows for the significant regions, and made [Table3]:

Range                     baseMean     log2FC       lfcSE        stat         pvalue       padj         DE.status  D1   D2   D3   I1   I2   I3
chr1.1047400.1050399      235.7753956  0.502184082  0.117133876  4.287265981  1.81E-05     0.002247641  Up         176  241  170  273  259  305
chr1.108688000.108689599  83.88195318  0.779272925  0.188518673  4.133664386  3.57E-05     0.00371855   Up         61   64   59   108  104  108
chr1.115014800.115016999  94.1785547   0.688516703  0.176135657  3.909013738  9.27E-05     0.007322158  Up         64   88   65   124  118  110
chr1.118018000.118019999  90.61371494  0.604711897  0.179055698  3.3772279    0.000732204  0.030584954  Up         60   79   77   118  113  100
chr1.119548400.119550799  111.5057031  0.563434199  0.161707008  3.484290536  0.000493444  0.02387482   Up         82   94   93   141  133  128


Finally, I annotated these using ChIPseeker. These are the column headers:

seqnames    start   end width   strand  annotation  geneChr geneStart   geneEnd geneLength  geneStrand  geneId  transcriptId    distanceToTSS


At this point, I don't know what to do! I know that a profile plot is not just about significant peaks, and I have the same problem with your package when I want to make a matrix. Should the bigWig files come from the peak files (.bed) or from the read files (.bam)? I can't make wig files from peaks.


I am not really sure what exactly you want to plot.

Anyway, to make the plots you do not need to annotate the bed files. You just need the bed files.
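
If your peaks only exist as IDs like chr1.721200.726999 (as in the count matrix above), a small script can turn them back into a 3-column bed file. This is a minimal sketch, assuming every ID has the form chromosome.start.end and that the coordinates already follow the convention your peak caller used; the function names are made up for illustration.

```python
def peak_id_to_bed(peak_id):
    """Split 'chrom.start.end' into (chrom, start, end).

    Splitting from the right keeps chromosome names that themselves
    contain a dot (for example 'GL000219.1') intact.
    """
    chrom, start, end = peak_id.rsplit(".", 2)
    return chrom, int(start), int(end)


def write_bed(peak_ids, path):
    """Write one tab-separated 'chrom start end' line per peak ID."""
    with open(path, "w") as out:
        for peak_id in peak_ids:
            chrom, start, end = peak_id_to_bed(peak_id)
            out.write(f"{chrom}\t{start}\t{end}\n")
```

For example, write_bed(["chr1.721200.726999", "chr1.817800.821799"], "peaks.bed") would write two bed lines; peaks.bed is just a placeholder name.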

You also do not need to find differential peaks in order to plot them.

As input you need bigWigs and the regions (bed files) to plot. The bigWigs hold the signal over the whole genome and can be obtained, for example, with bamCoverage from deepTools, which converts .bam files into bigWigs.
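
A minimal sketch of that step in Python, building the bamCoverage call as an argument list. The file names and helper names are placeholders, the bin size of 10 is only an example value, and the BAM file is assumed to be sorted and indexed.

```python
import subprocess


def bamcoverage_command(bam, bigwig, bin_size=10):
    """Build the argument list for one bamCoverage run (BAM -> bigWig)."""
    return [
        "bamCoverage",
        "--bam", bam,               # sorted, indexed BAM of one replicate
        "--outFileName", bigwig,    # bigWig to be written
        "--binSize", str(bin_size),
    ]


def run_bamcoverage(bam, bigwig, bin_size=10):
    """Run bamCoverage; requires deepTools to be installed and on PATH."""
    subprocess.run(bamcoverage_command(bam, bigwig, bin_size), check=True)
```

You would call run_bamcoverage("D1.bam", "D1.bw") once per replicate (hypothetical file names) and then pass the resulting bigWigs to computeMatrix.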

Once you have your bigWigs and your beds (peaks), use deepTools to generate a matrix; you can then use Rseb to plot it.

For example, to make a plot +/-1 kb around the center of all your peaks, you can run deepTools as follows (--referencePoint center is needed because the default reference point is the start of each region):

computeMatrix reference-point --referencePoint center -S <path to bigwig file(s)> -R <path to bed file(s)> --upstream 1000 --downstream 1000 -o <path to matrix.gz>
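
The same call can be scripted. Below is a minimal Python sketch, assuming deepTools is installed and on your PATH; the helper names and all file names are placeholders. It passes --referencePoint center explicitly, because by default the reference point is the start of each region rather than its center, and it also builds the follow-up plotProfile and plotHeatmap calls on the resulting matrix.

```python
import subprocess


def compute_matrix_command(bigwigs, beds, matrix, flank=1000):
    """Build a computeMatrix call centered on each region, +/- flank bp."""
    return [
        "computeMatrix", "reference-point",
        "--referencePoint", "center",
        "-S", *bigwigs,              # one or more bigWig signal files
        "-R", *beds,                 # one or more bed files with regions
        "--upstream", str(flank),
        "--downstream", str(flank),
        "-o", matrix,                # gzipped matrix written by deepTools
    ]


def plot_commands(matrix, prefix):
    """Build plotProfile and plotHeatmap calls for an existing matrix."""
    return [
        ["plotProfile", "-m", matrix, "-o", prefix + "_profile.png"],
        ["plotHeatmap", "-m", matrix, "-o", prefix + "_heatmap.png"],
    ]


def run_all(bigwigs, beds, matrix, prefix):
    """Run computeMatrix, then both plotting tools, stopping on any error."""
    subprocess.run(compute_matrix_command(bigwigs, beds, matrix), check=True)
    for cmd in plot_commands(matrix, prefix):
        subprocess.run(cmd, check=True)
```

If memory is a problem, you can call run_all on a subset of the bigWigs at a time (for example the control plus one treatment) instead of all samples at once.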

I hope that is clear?


My system couldn't build a single matrix with all the replicates and samples I have. That was the issue.

I ended up doing it separately: control versus each of the different treatments.