Hi all- I have coverage files from MNase-seq experiments in bigWig format which have been normalized to show nucleosome occupancy at each position in the genome. I have been very successful in using these coverage files in deepTools, along with bed files containing regions of interest (including TSS) to show occupancy at these regions and in flanking sequences. Now, I would like to add an extra layer to this analysis. Rather than simply aligning around the TSS, I would now like to sort the heatmaps by gene expression. So, I have coverage files, and I have access to some RNA-seq data in BAM format. What are the steps I would need to take in order to sort by gene expression? I would prefer to continue using deepTools if possible. I know that R is often used to generate heatmaps as well, but the R language seems far more complicated than the command line options in deepTools. Keep in mind that I have never performed RNA-seq analysis, and that I am relatively new to bioinformatics in general. Thanks!
Hi Ryan, thanks for your suggestion. I've got some questions in regard to the two methods.
1st method (bed file presorted by expression):
2nd method (by computeMatrixOperations cbind):
If this method is used, shouldn't the matrices be generated in scale-regions mode (from TSS to TES for the expression data)? Can I set a different region length for the RNA-seq and occupancy data (e.g. RNA-seq data: only TSS to TTS; occupancy data: 3 kb upstream of the TSS to 3 kb downstream of the TTS)?
In the RNA-seq files, how does the value computed for each row (by --sortUsing in computeMatrix) compare or relate to FPKM?
Thanks!
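The first method mentioned above (a BED file presorted by expression) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and expression values are made up, the function name is hypothetical, and the gene name is assumed to sit in column 4 of a tab-separated BED6 file.

```python
# Sketch: reorder BED6 rows by a per-gene expression value, highest first.
# Gene names and expression numbers below are invented for illustration.

def sort_bed_by_expression(bed_lines, expression, default=0.0):
    """Return BED lines sorted by descending expression.

    bed_lines  -- iterable of tab-separated BED6 lines (gene name in column 4)
    expression -- dict mapping gene name -> expression value (e.g. FPKM)
    default    -- value used for genes that are missing from `expression`
    """
    rows = []
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and common header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        rows.append(line.split("\t"))
    # The sort is stable, so genes with equal expression keep their input order.
    rows.sort(key=lambda f: expression.get(f[3], default), reverse=True)
    return ["\t".join(f) for f in rows]


bed = [
    "chr1\t100\t900\tgeneA\t0\t+",
    "chr1\t2000\t3500\tgeneB\t0\t-",
    "chr2\t50\t700\tgeneC\t0\t+",
]
fpkm = {"geneA": 2.5, "geneB": 40.0}  # geneC is missing and falls back to 0.0

sorted_bed = sort_bed_by_expression(bed, fpkm)
print([row.split("\t")[3] for row in sorted_bed])  # ['geneB', 'geneA', 'geneC']
```

The sorted lines could then be written to a file and passed to computeMatrix as the regions file. To stop deepTools from re-sorting the rows, recent versions offer a "keep" choice for --sortRegions in computeMatrix and plotHeatmap; it is worth checking the --help output of your installed version before relying on it.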
Thanks Ryan. I'm now trying to construct the matrix for FPKM plotting. In parallel, I need to run computeMatrix on the ChIP-seq files. I am creating the BED file for that but want to clarify some concepts. When running computeMatrix, is the --metagene option needed for processing the ChIP-seq files? How do I specify in the command that BED6 format is used? Thanks
For BED6 files --metagene won't do anything, since you don't have exon information. In general, there's little sense in using --metagene for ChIPseq data, since you generally care whether introns have signal.
Thanks Ryan. Do I need to save the BED6 file as .bed6 when inputting into computeMatrix?
No, the file extension doesn't matter.
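Since the extension plays no role, what makes a file "BED6" is simply that each record has six tab-separated columns (chrom, start, end, name, score, strand), whereas BED12 carries the extra block columns that hold exon information. A quick way to see what a file contains is to count columns. The sketch below assumes tab-separated input; the function name and the example records are hypothetical.

```python
# Sketch: report which column counts occur in a BED file's records.

def bed_column_counts(lines):
    """Return the set of column counts found in the data lines."""
    counts = set()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and common header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        counts.add(len(line.split("\t")))
    return counts


example = [
    "chr1\t100\t900\tgeneA\t0\t+",
    "chr1\t2000\t3500\tgeneB\t0\t-",
]
print(bed_column_counts(example))  # {6}
```

A result containing more than one number means the records are not all the same width, which is worth fixing before handing the file to any tool.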
Hi Ryan, it's me again. When I ran computeMatrixOperations using the self-constructed matrices, a warning appeared. I think that something went wrong during the matrix construction process. Please find the first few lines of the matrix:
To save the matrix and convert it to .gz, I did the following in Ubuntu: 1) I saved the matrix as .txt (Character Encoding: Current Locale (UTF-8); Line Ending: Unix/Linux) with "Text Editor"; 2) then I compressed it into .gz by right-clicking it in the "Files" list.
Do you have a standard workflow for doing this (e.g. which OS, format, extension, etc.)?
Thanks!
The OS won't matter, though I imagine this would be hard to do on Windows. File extensions are meaningless on Linux and MacOS X, you can change any file extension to anything you want and nothing will change. I would need to see the whole file to be able to figure out what's wrong. My guess is that there's a single line wrong somewhere.
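One way to avoid the text editor and file manager steps, and to look for a single malformed line, is to write and check the gzipped file from a script. The following is a minimal sketch, not an official deepTools workflow. It assumes the matrix is a gzipped text file whose first line is "@" followed by a JSON header and whose remaining lines are tab-separated rows of equal width. The toy header here is deliberately incomplete (a real deepTools header has many more fields), and the function names are hypothetical.

```python
import gzip
import json
import os
import tempfile


def write_gzip_lines(path, lines):
    """Write text lines to a gzip file with Unix line endings."""
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")


def check_matrix(path):
    """Return (line_number, n_fields) for rows whose width differs from the first row."""
    problems = []
    expected = None
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        header = handle.readline()
        if not header.startswith("@"):
            raise ValueError("first line does not start with '@'")
        json.loads(header[1:])  # raises ValueError if the header is not valid JSON
        for number, line in enumerate(handle, start=2):
            fields = line.rstrip("\n").split("\t")
            if expected is None:
                expected = len(fields)  # width of the first data row
            elif len(fields) != expected:
                problems.append((number, len(fields)))
    return problems


# Toy data: six BED columns plus three values per row; the last row lacks a value.
toy = [
    '@{"sample_labels": ["toy"]}',  # placeholder header, not a full deepTools header
    "chr1\t100\t200\tgeneA\t0\t+\t1.0\t2.0\t3.0",
    "chr1\t300\t400\tgeneB\t0\t-\t4.0\t5.0\t6.0",
    "chr2\t10\t110\tgeneC\t0\t+\t7.0\t8.0",
]
path = os.path.join(tempfile.mkdtemp(), "toy_matrix.gz")
write_gzip_lines(path, toy)
print(check_matrix(path))  # [(4, 8)]
```

The check only covers row width and header syntax; a file that passes can still have a header that disagrees with the data (for example, boundaries that do not match the number of rows or value columns).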