Question: Masking reference for RNA-seq alignments
thjnant110 wrote:


I have a general question about mapping RNA-seq data to a reference genome. Does it matter whether the reference genome is repeat-masked when mapping RNA-seq reads? If so, should it be hard-masked or soft-masked?


Devon Ryan wrote:

Most aligners ignore soft-masking, and there's typically no point in hard-masking when aligning RNA-seq datasets (unless perhaps you have a non-standard dataset). If, for some reason, you need to avoid aligning to repeat regions, then use a hard-masked genome.
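For anyone unsure of the difference: by convention a soft-masked FASTA writes repeat bases in lowercase, while a hard-masked FASTA replaces them with N. Here is a minimal Python sketch of that convention. The function names are hypothetical and it works on a plain sequence string, not on any particular aligner's input.

```python
def hard_mask(seq: str) -> str:
    """Turn a soft-masked sequence into a hard-masked one.

    Lowercase (soft-masked) bases become 'N'; uppercase bases are kept.
    """
    return "".join("N" if base.islower() else base for base in seq)


def unmask(seq: str) -> str:
    """Drop soft-masking by uppercasing every base."""
    return seq.upper()


def soft_masked_fraction(seq: str) -> float:
    """Fraction of bases that are lowercase, i.e. soft-masked."""
    if not seq:
        return 0.0
    return sum(base.islower() for base in seq) / len(seq)


if __name__ == "__main__":
    example = "ACGTacgtACGT"  # made-up sequence, middle four bases soft-masked
    print(hard_mask(example))
    print(unmask(example))
    print(soft_masked_fraction(example))
```

A hard-masked region gives the aligner nothing to match against, which is why hard-masking is the option to use if reads must not land in repeats. Soft-masking keeps the bases, so an aligner that ignores case sees the same sequence as in the unmasked genome.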

Thanks for your reply. No, I have a standard RNA-seq dataset. Although I generated the masked genomes, I used the unmasked version for alignment. I thought masking should not matter for RNA-seq data because the reads come from exons, which I assume contain no long-range repeats.

