So I would like to get the sequence of the genomic regions with mapped reads after an RNA-seq analysis.
First, I have mapped the reads into a reference genome using hisat2 and
Does anyone have a suggestion on how I could extract the regions of the genome with mapped reads?
My plan was to use
bedtools genomecov on the bam file to find the regions with coverage > 0 and then use
bedtools getfasta using the bed with the regions and the genome in fasta format.
However, I'm not sure on how to keep strand information.
I think I have to include
-strand in genomecov and
-s in getfasta, but I have found that
genomecov gives the same results with and without the
-strand flag. I was expecting including
-strand I would get strand information in the bed file but it is not the case. This is the command I have tried:
bedtools genomecov -ibam input.bam -strand -bga | awk '$4>0' >> output.bed
Does anyone have any comment?
Thank you for your time