I'm using the program MAnorm to find differential histone binding sites between 2 of my chip-seq samples. If anyone has used it before can they sort my problem? MAnorm can be obtained from here:
It takes as input 4 files:
sample1_peaks, sample2_peaks, sample1_alignment, sample2_alignment where the files look like this:
Chromosome Start End 1 194749677 194750748 1 13570102 13576846
and alignment files:
1 35 99 HWI-ST1014:143:D1PYRACXX:1:2107:10201:100917/2 25 + 1 39 91 HWI-ST1014:143:D1PYRACXX:1:1308:15197:167335/2 25 +
Now when I run the program on my files, it gives me the following error while reading the alignment files: ??? Error using ==> textscan Invalid file identifier. Use fopen to generate a valid file identifier.
Error in ==> step3_split_sequencing_data_2chr at 61 Error in ==> MAnorm_run_through at 63
Ive checked these lines multiple times and I don't see anything wrong with these. My files are in the right folder which is on the path but still it gives me this error. I have another pair of samples for which this program runs perfectly and gives me no error. I just don't understand why it doesn't work with these samples and why it's not reading the alignment files. What could be going wrong?
Any advice would be appreciated!
Edit: commands to run the program:
% Note1: Change the parameter values inside the parameter region to run MAnorm on your samples; % Note2: All the input files should be of tab-delimited text format. % Note3: To know the usage of a MATLAB function, type "help function_name", e.g. help addpath, in MATLAB command line. clear; addpath('C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\CodeLib'); %%%%%% start of parameter region %%%%%% nchr = 30; % number of chromosomes; 24 for human, 21 for mouse % the bed file name of peak coordinate of sample 1 peakfile_1 = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\Input\ss89_k27ac_peaks-Copy-test.bed'; % the bed file name of read coordinate of sample 1: Yale ENCODE MYC ChIP-seq data replicate 1 in HelaS3 cells rawdatafile_1 = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\Input\ss89_k27ac_final.bed'; % shift size of reads for sample 1; which equal to the half of the DNA fragment after size selection tag_shift_1 = 123; % length of reads for sample 1 tag_length_1 = 69; % the bed file name of peak coordinate of sample 2 peakfile_2 = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\Input\ss12_k27ac_peaks-Copy-test.bed'; % the bed file name of read coordinate of sample 2: Yale ENCODE MYC ChIP-seq data replicate 1 in K562 cells rawdatafile_2 = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\Input\ss12_k27ac_final.bed'; % shift size of reads for sample 1; which equal to the half of the DNA fragment after size selection tag_shift_2 = 104; % length of reads for sample 1 tag_length_2 = 50; % directory of work folder to store temperary files; should be made before analysis workdir = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\'; % directory of output folder to store output files; should be made before analysis outputdir = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\output_89_12\'; % name of comparison MApair = 'ss89_ss12'; extention = 10000; % 2*extention = the window size to calculate read density; we recommend extention=1000 for histone modification, extention=500 for transcription factor summit2summit_dist = 5000; % the selected common peaks much have summit-to-summit distance smaller than this value to as matched common peaks; suggest to be half of extention Mcut_unbiased = 1; % Curoff of |M| value to define unbiased peaks between 2 samples; here M value is the log2 fold change at peak region between two samples, value 1 means fold change 2 % Non-cell type-specific binding regions: all peaks associated with |M|<Mcut_unbiaseda. Mcut_biased = 1; % Cutoff of M value cutoff to define biased peaks between 2 samples: Pcut_biased = 1e-2; % Cutoff of P value (of differential binding) cutoff to define biased peaks between 2 samples; % Sample 1-specific binding regions: all peaks associated with M>=Mcut_biased and P<Pcut_biased; % Sample 2-specific binding regions: all peaks associated with M<=-Mcut_biased and P<Pcut_biased; %%%%%% end of parameter region %%%%%% % read peaks; disp(' '); disp('Step1: read peaks'); peak_name_1 = step1_read_MACS_peaks(peakfile_1, nchr, workdir); peak_name_2 = step1_read_MACS_peaks(peakfile_2, nchr, workdir); % classify common or unique peaks disp(' '); disp('Step2: classify common or unique peaks'); Nsamp = 100; % number of random permutation to calculate fold enrichment of overlap step2_classify_2peak_sets_by_overlap(peak_name_1, peak_name_2, workdir, Nsamp); % split raw data by chromosome disp(' '); disp('Step3: read and split raw data by chromosome'); [pathstr1, marker_name_1, ext1, versn1] = fileparts(rawdatafile_1); step3_split_sequencing_data_2chr(rawdatafile_1, nchr, workdir); [pathstr2, marker_name_2, ext2, versn2] = fileparts(rawdatafile_2); step3_split_sequencing_data_2chr(rawdatafile_2, nchr, workdir); % calculate read density in peaks disp(' '); disp('Step4: calculate read density in peaks'); step4_calculate_peak_read_density(peak_name_1, peak_name_2, marker_name_1, marker_name_2, extention, workdir, nchr, tag_shift_1, tag_shift_2, tag_length_1, tag_length_2); % calculate normalized M-A values disp(' '); disp('Final Step: normalize and output the M-A values'); step5_rescale_peak_strength_by_common_peaks(peak_name_1, peak_name_2, marker_name_1, marker_name_2, summit2summit_dist, workdir, MApair, outputdir); % output the normalized M-value to wig file (in order to upload to genome browsers) disp(' '); disp('Summarize data: output the M value of 2 peak sets to wig files'); step6_output_peak_Mvalue_2wig_file(peak_name_1, workdir, MApair, outputdir); step6_output_peak_Mvalue_2wig_file(peak_name_2, workdir, MApair, outputdir); % merge common peaks and define unbiased and biased peaks between two samples based on given M-value and P-value cutoff disp(' '); disp('Summarize data: merge common peaks and extract differential and non-differential binding regions'); step7_define_biased_unbiased_peaks(peak_name_1, peak_name_2, workdir, MApair, outputdir, Mcut_unbiased, Mcut_biased, Pcut_biased) % clean up rmpath('C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\CodeLib'); clear all;
These are the commands that came with the package to run the program. I just changed the file names. The rest is as it is in the package.