deeptools plotProfile plots inconsistently
4.2 years ago
Ekalavya ▴ 10

Hi, I am trying to plot a few ChIP-Seq signals over a region set using deepTools plotProfile. I have used this tool before, but only recently noticed this unusual behavior: when I plotted identical data 4 times, the outputs were very different. When I checked the corresponding output BED files, the regions were indeed clustered differently. The same is true of plotHeatmap. In addition, the clustered output BED file from plotProfile does not match the one from plotHeatmap (in terms of cluster members). This happens with the kmeans clustering option. Below are the commands I used and the results.

computeMatrix reference-point \
  -R regionset.bed \
  -S H3K27ac.bw H3K4me1.bw H3K27me3.bw H3K4me3.bw \
  --referencePoint center \
  -b 3000 -a 3000 \
  --verbose \
  --missingDataAsZero \
  -p max/2 \
  --sortRegions descend \
  --sortUsingSamples 1 \
  --skipZeros \
  -o "matrix.gz"

plotProfile -m matrix.gz --perGroup --kmeans 8 --numPlotsPerRow 4 -out "profile.png" --outFileSortedRegions "profile_K8.bed"

plotHeatmap -m matrix.gz \
        --boxAroundHeatmaps no \
        --whatToShow 'plot, heatmap and colorbar' \
        --legendLocation none \
        --colorList white,red \
        --kmeans 8 \
        --outFileNameMatrix "hpMatrix.tab" \
        --outFileSortedRegions "heatmap_k8.bed" \
        --outFileName "heatmap.png"

Image: profile plots of 4 runs with identical data

Is it expected that the clustering will be different every time, or am I missing something here?

Please help.

Thank you

plotProfile deepTools plotHeatmap • 2.4k views

Sorry, somehow the image was not uploaded. It can be found here: https://i.ibb.co/c1ggNJc/profile-Plots.jpg


You should have used this link and not the link to the website containing the image (https://ibb.co/xDCCMhS, which you had used originally). I've made the necessary changes (and also formatted your post better) now.

4.2 years ago
colin.kern ★ 1.1k

The clustering can be different each time with kmeans if the initial cluster centroids are chosen randomly. I don't see any information in the deepTools documentation about which initialization method is used, but based on your results it seems the centroids may be chosen randomly. You can try a different number of clusters to see if you get a more stable result, or try the hierarchical clustering option, which should produce the same results every time.
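To illustrate the point about random starting centroids, here is a minimal sketch in plain Python. It is a generic Lloyd's k-means written for this example, not deepTools code, and the toy data, the `kmeans` and `partition` helpers, and the `seed` parameter are all assumptions made for illustration. It clusters the same points with several seeds and compares the resulting cluster memberships while ignoring cluster numbering.

```python
import random


def kmeans(points, k, seed, n_iter=100):
    """Lloyd's k-means; the initial centroids are k data points drawn at random."""
    rng = random.Random(seed)          # the seed fully determines the starting centroids
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:       # assignments stopped changing: converged
            break
        labels = new_labels
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def partition(labels):
    """Cluster membership as a set of frozensets, so cluster numbering is ignored."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(i)
    return {frozenset(g) for g in groups.values()}


# Toy data without clear cluster structure (hypothetical stand-in for signal profiles).
data_rng = random.Random(42)
points = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]

runs = [partition(kmeans(points, k=8, seed=s)) for s in range(4)]
print("distinct partitions across 4 seeds:", len(set(map(frozenset, runs))))
print("same seed reproduces run 0:", partition(kmeans(points, k=8, seed=0)) == runs[0])
```

With a fixed seed the result is reproducible by construction; with different seeds the partitions may or may not coincide, depending on how well separated the data are.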


Thank you very much colin.kern! It is indeed the random starting points of the clusters. It would be good to have control over this. I am now using hierarchical clustering. I am also trying to find the optimal number of clusters.
