Hi,
I ran into a problem when running Bismark + DMAP. The data I used is the [Test dataset 1 for DMAP](http://biochem.otago.ac.nz/assets/software/test_MDS_data_large.tar.gz) from http://biochem.otago.ac.nz/research/databases-software/
The commands I used are:
bismark_genome_preparation ./
bismark -q -n 1 -l 40 ~/projects/Methylation/BS_seq/reference1/ ../MDS_chr1_maps.fastq
bismark_methylation_extractor -s --bedGraph --counts --buffer_size 10G --cytosine_report --genome_folder ~/projects/Methylation/BS_seq/reference1/ MDS_chr1_maps.fastq_bismark.sam
diffmeth -e 40,220 -g ~/projects/Methylation/BS_seq/reference1/Homo_sapiens.GRCh37.75.dna.chromosome. -R MDS_chr1_maps.fastq_bismark.sam |awk -f ~/software/meth_progs_dist/src/getcpgpcmeth.awk >diffmeth.single.MDS.list
This gives me two files: MDS_chr1_maps.fastq_bismark.CpG_report.txt and diffmeth.single.MDS.list. The former contains:
1       10469   +       0       0       CG      CGC
1       10470   -       0       0       CG      CGA
1       10471   +       0       0       CG      CGG
1       10472   -       0       0       CG      CGC
1       10484   +       0       0       CG      CGG
1       10485   -       0       0       CG      CGG
1       10489   +       0       0       CG      CGC
1       10490   -       0       0       CG      CGG
1       10493   +       0       0       CG      CGC
1       10494   -       0       0       CG      CGG
1       10497   +       43      8       CG      CGG
1       10498   -       0       0       CG      CGG
1       10525   +       47      5       CG      CGC
1       10526   -       58      11      CG      CGG
1       10542   +       48      4       CG      CGA
1       10543   -       63      7       CG      CGG
1       10563   +       40      12      CG      CGC
1       10564   -       64      6       CG      CGT
1       10571   +       46      5       CG      CGC
1       10572   -       64      5       CG      CGG
1       10577   +       0       0       CG      CGC
1       10578   -       44      25      CG      CGA
1       10579   +       0       0       CG      CGG
1       10580   -       39      30      CG      CGC
1       10589   +       0       0       CG      CGG
1       10590   -       63      5       CG      CGG
and the latter:
1       10497   84.31
1       10525   86.78
1       10542   90.98
1       10563   85.25
1       10571   91.67
1       10577   63.77
1       10579   56.52
1       10589   92.65
1       10609   -
1       10617   -
1       10620   -
1       10631   -
1       10633   -
1       10636   -
1       10638   -
1       15720   -
1       15749   -
1       15769   -
1       15789   -
1       15834   -
1       15849   94.74
1       15865   94.74
1       15882   100.00
1       15912   94.74
1       17562   100.00
I don't know how to explain the differences between the two files, for example at position 10577. Have I done something wrong, or is this a real bug in the software?
Moreover, I think the latter only shows cytosines on the + strand. If that is right, how can I get the information for the - strand? If not, how can I get the strand information directly? I have read the descriptions of the options but could not find an answer.
Thanks for your help
You're piping things through an awk script before writing that file. What's in the awk script? Perhaps that's causing weird results (and anyway, it's hard to know what the 3rd column even means without more information).
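One way to test what the 3rd column might be, without knowing what the awk script does, is to recompute a percentage from the CpG report and compare. Below is a minimal sketch in Python. It assumes columns 4 and 5 of the CpG report are the methylated and unmethylated counts, and that the `-` strand cytosine of a CpG sits one base after the `+` strand one, as the paired positions in the listing suggest. `pooled_cpg_percent` is a hypothetical helper name, and the strand pooling is only a guess at what diffmeth or the awk script does, not a confirmed description of either.

```python
def pooled_cpg_percent(report_lines):
    """Pool + and - strand counts per CpG; return {(chrom, plus_pos): percent or None}."""
    pooled = {}
    for line in report_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        chrom, pos, strand = fields[0], int(fields[1]), fields[2]
        meth, unmeth = int(fields[3]), int(fields[4])
        # Assumption: key each CpG by its + strand position (minus strand C is at pos + 1).
        key = (chrom, pos if strand == "+" else pos - 1)
        m, u = pooled.get(key, (0, 0))
        pooled[key] = (m + meth, u + unmeth)
    # Percent methylation = methylated / (methylated + unmethylated); None if no coverage.
    return {
        key: (round(100.0 * m / (m + u), 2) if (m + u) > 0 else None)
        for key, (m, u) in pooled.items()
    }


# Rows copied from the CpG report in the question.
rows = """\
1 10497 + 43 8 CG CGG
1 10498 - 0 0 CG CGG
1 10577 + 0 0 CG CGC
1 10578 - 44 25 CG CGA
""".splitlines()

for (chrom, pos), pct in pooled_cpg_percent(rows).items():
    print(chrom, pos, pct)
```

For these rows the sketch prints 84.31 for position 10497 and 63.77 for position 10577, which are the values in diffmeth.single.MDS.list. That would be consistent with the list reporting one pooled value per CpG at the + strand coordinate, so the 10577 entry would come from the counts at 10578 on the - strand. This is only an inference from the numbers shown here.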