Hi to all,
I am trying to align a small test NOMe-seq dataset (few reads) using Bismark with the NOMe-seq option.
In the output I see a mapping efficiency of around 13-23%, and I wonder whether this is right. I am wondering if something is wrong with this alignment, for instance whether I ended up aligning against only one of the 4 possible converted genomes, which could produce this fraction of alignments, a bit lower than 1/4. Thank you very much if you can help, and sorry in advance if I missed something obvious.
The procedure I used is as follows.
- data received are paired-end, trimmed fastq files.
- loading bismark 0.22.3 and bowtie 2.1.0
- preparing a regular conversion of the mm10 genome using bismark_genome_preparation. As far as I understand, I need to prepare the genome the same way as when looking for CpG, except that in the end I will look for GpC. In either case, we are looking at methylated Cs.
- bismark --bowtie2 --non_directional --multicore 4 -I 0 -X 1000 --score_min L,0,-0.6 -L 5 -N 1 --genome_folder $genome -1 $fastq_R1 -2 $fastq_R2 -o bismark_out/ (Note: I tried changing the 0.6 value in --score_min to 10 or more to increase the number of aligned reads. With that I can reach 80% mapping and probably more, but the quality of the alignments dropped, with CIGAR strings that include insertions and deletions.)
- in the end, I get the report below. As you can see, the number of read pairs (21344) is low.
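A side note on the --score_min setting used in the command above. Bowtie 2 reads `L,a,b` as a linear function of read length, so the minimum accepted alignment score is `a + b * read_length`. The sketch below is only an illustration under stated assumptions: 100 bp reads (the read length is not given in this post), the Bowtie 2 default penalties for end-to-end mode, and the reading that "10" means a slope of -10. The function names are hypothetical.

```python
import math

# Assumed Bowtie 2 end-to-end defaults (not taken from this post).
# With --ignore-quals every mismatch costs the maximum mismatch penalty.
MISMATCH_PENALTY = 6


def min_score(read_length, intercept, slope):
    """Threshold for '--score_min L,intercept,slope': intercept + slope * length."""
    return intercept + slope * read_length


def max_mismatches(read_length, intercept, slope):
    """Most mismatches (gap-free alignment) that still pass the threshold."""
    budget = -min_score(read_length, intercept, slope)
    # Small tolerance guards against floating-point rounding in the slope.
    allowed = math.floor(budget / MISMATCH_PENALTY + 1e-9)
    return min(read_length, allowed)


if __name__ == "__main__":
    read_length = 100  # assumption, adjust to the real read length
    # -0.2 is the Bismark default slope, -0.6 is the value used above,
    # -10 is one reading of the "10 or more" experiment.
    for slope in (-0.2, -0.6, -10):
        print(slope, min_score(read_length, 0, slope),
              max_mismatches(read_length, 0, slope))
```

Under these assumptions, a slope of -0.6 tolerates up to 10 mismatches in a 100 bp read, against 3 for the Bismark default of -0.2, while a slope of -10 puts the threshold below the penalty of a read that mismatches at every position. That would be consistent with the jump to 80% mapping and the indel-heavy CIGAR strings, although it does not by itself say which threshold is appropriate for this data.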
==========================================================================================
Bismark report for: sample_R1_001.final.fastq.gz and sample_R2_001.final.fastq.gz (version: v0.22.3)
Bismark was run with Bowtie 2 against the bisulfite genome of genomes/mm10/current/bowtie/2.1.0/ with the specified options: -q -N 1 -L 5 --score-min L,0,-0.6 --ignore-quals --no-mixed --no-discordant --dovetail --minins 0 --maxins 1000
Option '--non_directional' specified: alignments to all strands were being performed (OT, OB, CTOT, CTOB)

Final Alignment report
Sequence pairs analysed in total: 21344
Number of paired-end alignments with a unique best hit: 3778
Mapping efficiency: 17.7%
Sequence pairs with no alignments under any condition: 16504
Sequence pairs did not map uniquely: 1062
Sequence pairs which were discarded because genomic sequence could not be extracted: 0

Number of sequence pairs with unique best (first) alignment came from the bowtie output:
CT/GA/CT: 1947 ((converted) top strand)
GA/CT/CT: 13 (complementary to (converted) top strand)
GA/CT/GA: 12 (complementary to (converted) bottom strand)
CT/GA/GA: 1806 ((converted) bottom strand)

Final Cytosine Methylation Report
Total number of C's analysed: 131387

Total methylated C's in CpG context: 3086
Total methylated C's in CHG context: 3630
Total methylated C's in CHH context: 10640
Total methylated C's in Unknown context: 43

Total unmethylated C's in CpG context: 6499
Total unmethylated C's in CHG context: 25829
Total unmethylated C's in CHH context: 81703
Total unmethylated C's in Unknown context: 611

C methylated in CpG context: 32.2%
C methylated in CHG context: 12.3%
C methylated in CHH context: 11.5%
C methylated in unknown context (CN or CHN): 6.6%

Bismark completed in 0d 0h 8m 43s
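
To make the numbers in the report easier to check, here is a minimal sketch that recomputes the percentages from the raw counts. All counts are copied from the report above; the variable names and the `percent` helper are only illustrative and not part of Bismark.

```python
# Counts copied from the "Final Alignment report" section.
pairs_total = 21344
unique_hits = 3778
no_alignment = 16504
not_unique = 1062

# Unique best alignments per strand, labelled as in the report.
strand_counts = {
    "CT/GA/CT (top strand)": 1947,
    "GA/CT/CT (complementary to top strand)": 13,
    "GA/CT/GA (complementary to bottom strand)": 12,
    "CT/GA/GA (bottom strand)": 1806,
}

# Counts copied from the "Final Cytosine Methylation Report" section.
methylated = {"CpG": 3086, "CHG": 3630, "CHH": 10640}
unmethylated = {"CpG": 6499, "CHG": 25829, "CHH": 81703}


def percent(part, whole):
    """Percentage rounded to one decimal, as printed in the report."""
    return round(100.0 * part / whole, 1)


# Mapping efficiency is unique hits over all analysed pairs.
mapping_efficiency = percent(unique_hits, pairs_total)

# Share of the unique alignments that fall on each strand.
strand_share = {name: percent(count, unique_hits)
                for name, count in strand_counts.items()}

# Methylation per context: methylated / (methylated + unmethylated).
methylation = {ctx: percent(methylated[ctx],
                            methylated[ctx] + unmethylated[ctx])
               for ctx in methylated}

if __name__ == "__main__":
    print("mapping efficiency:", mapping_efficiency)
    for name, share in strand_share.items():
        print(name, share)
    for ctx, value in methylation.items():
        print(ctx, value)
```

Read this way, the 17.7% is the number of uniquely aligned pairs divided by all analysed pairs, not a share of the converted genomes. The unique alignments themselves split almost evenly between the top and bottom strands (about 51.5% and 47.8%), with well under 1% on each complementary strand, which might be the pattern expected from a directional library, though that is an interpretation rather than something the report states.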