I'm using the program MAnorm to find differential histone binding sites between 2 of my chip-seq samples. If anyone has used it before can they sort my problem? MAnorm can be obtained from here:
It takes as input 4 files:
sample1_peaks, sample2_peaks, sample1_alignment, sample2_alignment where the files look like this:
Chromosome Start End 1 194749677 194750748 1 13570102 13576846
and alignment files:
1 35 99 HWI-ST1014:143:D1PYRACXX:1:2107:10201:100917/2 25 + 1 39 91 HWI-ST1014:143:D1PYRACXX:1:1308:15197:167335/2 25 +
Now when I run the program on my files, it gives me the following error while reading the alignment files: ??? Error using ==> textscan Invalid file identifier. Use fopen to generate a valid file identifier.
Error in ==> step3_split_sequencing_data_2chr at 61 Error in ==> MAnorm_run_through at 63
Ive checked these lines multiple times and I don't see anything wrong with these. My files are in the right folder which is on the path but still it gives me this error. I have another pair of samples for which this program runs perfectly and gives me no error. I just don't understand why it doesn't work with these samples and why it's not reading the alignment files. What could be going wrong?
Any advice would be appreciated!
Edit: commands to run the program:
% Note1: Change the parameter values inside the parameter region to run MAnorm on your samples; % Note2: All the input files should be of tab-delimited text format. % Note3: To know the usage of a MATLAB function, type "help function_name", e.g. help addpath, in MATLAB command line. clear; addpath('C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\CodeLib'); %%%%%% start of parameter region %%%%%% nchr = 30; % number of chromosomes; 24 for human, 21 for mouse % the bed file name of peak coordinate of sample 1 peakfile_1 = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\Input\ss89_k27ac_peaks-Copy-test.bed'; % the bed file name of read coordinate of sample 1: Yale ENCODE MYC ChIP-seq data replicate 1 in HelaS3 cells rawdatafile_1 = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\Input\ss89_k27ac_final.bed'; % shift size of reads for sample 1; which equal to the half of the DNA fragment after size selection tag_shift_1 = 123; % length of reads for sample 1 tag_length_1 = 69; % the bed file name of peak coordinate of sample 2 peakfile_2 = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\Input\ss12_k27ac_peaks-Copy-test.bed'; % the bed file name of read coordinate of sample 2: Yale ENCODE MYC ChIP-seq data replicate 1 in K562 cells rawdatafile_2 = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\Input\ss12_k27ac_final.bed'; % shift size of reads for sample 1; which equal to the half of the DNA fragment after size selection tag_shift_2 = 104; % length of reads for sample 1 tag_length_2 = 50; % directory of work folder to store temperary files; should be made before analysis workdir = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\'; % directory of output folder to store output files; should be made before analysis outputdir = 'C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\output_89_12\'; % name of comparison MApair = 'ss89_ss12'; extention = 10000; % 2*extention = the window size to calculate read density; we recommend extention=1000 for histone modification, extention=500 for transcription factor summit2summit_dist = 5000; % the selected common peaks much have summit-to-summit distance smaller than this value to as matched common peaks; suggest to be half of extention Mcut_unbiased = 1; % Curoff of |M| value to define unbiased peaks between 2 samples; here M value is the log2 fold change at peak region between two samples, value 1 means fold change 2 % Non-cell type-specific binding regions: all peaks associated with |M|<Mcut_unbiaseda. Mcut_biased = 1; % Cutoff of M value cutoff to define biased peaks between 2 samples: Pcut_biased = 1e-2; % Cutoff of P value (of differential binding) cutoff to define biased peaks between 2 samples; % Sample 1-specific binding regions: all peaks associated with M>=Mcut_biased and P<Pcut_biased; % Sample 2-specific binding regions: all peaks associated with M<=-Mcut_biased and P<Pcut_biased; %%%%%% end of parameter region %%%%%% % read peaks; disp(' '); disp('Step1: read peaks'); peak_name_1 = step1_read_MACS_peaks(peakfile_1, nchr, workdir); peak_name_2 = step1_read_MACS_peaks(peakfile_2, nchr, workdir); % classify common or unique peaks disp(' '); disp('Step2: classify common or unique peaks'); Nsamp = 100; % number of random permutation to calculate fold enrichment of overlap step2_classify_2peak_sets_by_overlap(peak_name_1, peak_name_2, workdir, Nsamp); % split raw data by chromosome disp(' '); disp('Step3: read and split raw data by chromosome'); [pathstr1, marker_name_1, ext1, versn1] = fileparts(rawdatafile_1); step3_split_sequencing_data_2chr(rawdatafile_1, nchr, workdir); [pathstr2, marker_name_2, ext2, versn2] = fileparts(rawdatafile_2); step3_split_sequencing_data_2chr(rawdatafile_2, nchr, workdir); % calculate read density in peaks disp(' '); disp('Step4: calculate read density in peaks'); step4_calculate_peak_read_density(peak_name_1, peak_name_2, marker_name_1, marker_name_2, extention, workdir, nchr, tag_shift_1, tag_shift_2, tag_length_1, tag_length_2); % calculate normalized M-A values disp(' '); disp('Final Step: normalize and output the M-A values'); step5_rescale_peak_strength_by_common_peaks(peak_name_1, peak_name_2, marker_name_1, marker_name_2, summit2summit_dist, workdir, MApair, outputdir); % output the normalized M-value to wig file (in order to upload to genome browsers) disp(' '); disp('Summarize data: output the M value of 2 peak sets to wig files'); step6_output_peak_Mvalue_2wig_file(peak_name_1, workdir, MApair, outputdir); step6_output_peak_Mvalue_2wig_file(peak_name_2, workdir, MApair, outputdir); % merge common peaks and define unbiased and biased peaks between two samples based on given M-value and P-value cutoff disp(' '); disp('Summarize data: merge common peaks and extract differential and non-differential binding regions'); step7_define_biased_unbiased_peaks(peak_name_1, peak_name_2, workdir, MApair, outputdir, Mcut_unbiased, Mcut_biased, Pcut_biased) % clean up rmpath('C:\MATLAB\MAnorm_MATLAB_Package\MAnorm_MATLAB_Package\CodeLib'); clear all;
These are the commands that came with the package to run the program. I just changed the file names. The rest is as it is in the package.
This might be a better question to email the authors since it sounds like either a bug or a command input error. What the error message is saying is that it cannot open the file that is being asked for, would you mind pasting the command you used to run?
Thanks for your reply, I've pasted the commands used to run the program. This is the file that came with the program.