I downloaded the two paired-end FASTQ files of an MNase-seq dataset documenting nucleosome occupancy across the mouse genome from ENA (http://www.ebi.ac.uk/ena/data/view/SRR2034494) and mapped them with bowtie2:
bowtie2 -x mm9 --very-sensitive -p 8 -1 SRR1.fastq -2 SRR2.fastq -S out.sam
94.83% overall alignment rate
Everything seems normal here, but the problem is that the BED file that results from sam2bed
sam2bed < out.sam > out.bed
has intervals that are typically only 50-70 bp long. This cannot be right: a nucleosome protects about 147 bp of DNA, so the fragments should be around that length, not less than half of it.
After receiving some advice, I now suspect that the intervals are too short because of sam2bed itself, presumably because it writes one interval per read rather than one per read pair. Does anyone know of a way to account for paired ends in a SAM file when converting it to a BED file?
NOTE: I have also tried this with other, independent nucleosome datasets and the same thing happens, so it is not specific to this dataset.
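
To make concrete what I mean by accounting for paired ends, here is a minimal Python sketch of the conversion I am after: one BED interval per fragment instead of one per read. It is not how sam2bed works, only an illustration under some assumptions: both mates of a pair are present in the SAM file, the aligner sets the "properly paired" flag (0x2), and the TLEN column (column 9) holds the fragment length, positive on the leftmost mate. The function name sam_to_fragment_bed and the demo records are made up for the example.

```python
def sam_to_fragment_bed(sam_lines):
    """Yield one (chrom, start, end, name) BED tuple per properly paired fragment.

    Assumes TLEN (column 9) is the fragment length and is positive on the
    leftmost mate, so each pair is reported exactly once.
    """
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        # Keep only properly paired records (0x2) and only the leftmost mate.
        if not (flag & 0x2) or tlen <= 0:
            continue
        start = int(fields[3]) - 1        # SAM is 1-based, BED is 0-based
        yield fields[2], start, start + tlen, fields[0]


# Made-up pair: two 50 bp reads spanning a 147 bp fragment on chr1.
demo_sam = [
    "@SQ\tSN:chr1\tLN:197195432\n",
    "r1\t99\tchr1\t101\t42\t50M\t=\t198\t147\t" + "A" * 50 + "\t" + "I" * 50 + "\n",
    "r1\t147\tchr1\t198\t42\t50M\t=\t101\t-147\t" + "A" * 50 + "\t" + "I" * 50 + "\n",
]

fragments = list(sam_to_fragment_bed(demo_sam))
for chrom, start, end, name in fragments:
    print(chrom, start, end, name, sep="\t")
```

On real data the same function could be fed an open file handle for out.sam instead of the demo list. Read pairs that are not flagged as properly paired are dropped in this sketch, which may or may not be what is wanted.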