Hi, Biostars forum members,
I have a sorted BAM file generated from mRNA-seq experiments (aligned to the human genome, hg38). I would like to calculate the mono- and dinucleotide frequencies of the mapped reads in the BAM file (e.g. the frequencies of A, C, G and T, and the frequencies of AA, AC, ...). What I did was first convert the BAM file to a BED file with bedtools, then convert the BED file to a FASTA file with bedtools, and finally calculate the nucleotide frequencies from the resulting FASTA file.
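To make the last step concrete, the counting itself can be sketched roughly as below. This is only a minimal illustration, assuming the sequences have already been read from the FASTA file into a list of strings; the function name `kmer_frequencies` and the toy reads are hypothetical, and dinucleotides are counted as overlapping windows.

```python
from collections import Counter


def kmer_frequencies(sequences, k):
    """Return the relative frequency of each k-mer over all sequences.

    Hypothetical helper for illustration: k=1 gives mononucleotide
    frequencies, k=2 gives (overlapping) dinucleotide frequencies.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k along the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows that contain ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Toy sequences standing in for the FASTA records (made-up data).
reads = ["ACGT", "AACC"]
mono = kmer_frequencies(reads, 1)
di = kmer_frequencies(reads, 2)
```

Here `mono` maps each base to its share of all counted bases, and `di` does the same for each dinucleotide.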
The whole process takes a long time, especially the step that converts the BED file to the FASTA file. Is there an easier or faster way to do this? Any input is appreciated.