Mapping short 26-32 nt RPF reads to the human genome
4.7 years ago
Adrian Pelin ★ 2.6k

I am wondering what the main concerns are when mapping short reads to the human genome. I realize that early Illumina instruments produced reads as short as 36 bp, whereas standard Illumina sequencing now yields reads longer than 100 bp, often paired-end. My reads come from ribosome profiling, in which the mRNA regions shielded by the ribosome remain undigested (ribosome-protected fragments, RPFs). These fragments can be sequenced but are very short: 26-32 nt.
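For concreteness, this is roughly how the 26-32 nt window could be applied to the reads before mapping. It is a minimal sketch, assuming adapter-trimmed, well-formed four-line FASTQ records; the function names `parse_fastq` and `filter_by_length` are hypothetical and not part of any pipeline mentioned here.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, quality) for one FASTQ record
Record = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from FASTQ text, assuming exactly four lines per record."""
    it = (line.rstrip("\n") for line in lines)
    # zip over the same iterator four times groups lines into records;
    # an incomplete trailing record is silently dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header, seq, qual


def filter_by_length(
    records: Iterable[Record], min_len: int = 26, max_len: int = 32
) -> Iterator[Record]:
    """Keep only reads whose length lies in [min_len, max_len] (inclusive)."""
    for header, seq, qual in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq, qual


def make_read(name: str, length: int) -> list:
    """Build a toy FASTQ record of the given length (illustration only)."""
    return ["@" + name, "A" * length, "+", "I" * length]


# Toy input: four reads of length 20, 26, 32 and 40 nt.
toy_lines = []
for read_name, read_len in [("r20", 20), ("r26", 26), ("r32", 32), ("r40", 40)]:
    toy_lines.extend(make_read(read_name, read_len))

kept = list(filter_by_length(parse_fastq(toy_lines)))
kept_names = [header for header, _seq, _qual in kept]
print(kept_names)
```

With real data, the lines would come from an open (possibly gzip-decompressed) FASTQ file handle instead of the toy list.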

1) What is the best genome to map to? Currently I map all my transcriptome experiments to the Ensembl dna_sm version of the human genome (repetitive regions soft-masked, i.e. in lowercase). Does that increase the odds of erroneous mapping, or should I map to dna_rm instead (repetitive regions masked with N)?

2) Which aligner and quantifier should I use? I typically use HISAT2 to map and StringTie to get TPM values. Would something like Bowtie2 work better? Should I pass any additional parameters to HISAT2?

3) When calculating transcript abundance, do tools like Cufflinks and StringTie normalize by the total number of reads mapped to the genome, or by the number of reads falling within the regions of the supplied GTF file?
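To make question 3 concrete, here is a toy sketch of why the choice of denominator matters. It uses the textbook definitions of TPM and RPKM, not the internals of Cufflinks or StringTie, and every number and name in it (`txA`, `txB`, the read counts) is made up for illustration. Which library size a given tool actually uses is exactly what the question asks, so the sketch simply computes both.

```python
def tpm(counts: dict, lengths: dict) -> dict:
    """Textbook TPM: length-normalized rates rescaled to sum to one million.

    The denominator only involves reads assigned to the listed transcripts.
    """
    rates = {t: counts[t] / lengths[t] for t in counts}
    total_rate = sum(rates.values())
    return {t: rate / total_rate * 1e6 for t, rate in rates.items()}


def rpkm(counts: dict, lengths: dict, library_size: float) -> dict:
    """Textbook RPKM: reads per kilobase of transcript per million reads.

    library_size is supplied by the caller, so either denominator can be tried.
    """
    return {
        t: counts[t] / (lengths[t] / 1e3) / (library_size / 1e6)
        for t in counts
    }


# Hypothetical toy data: 400 reads fall inside annotated transcripts,
# while 1000 reads mapped to the genome overall.
counts = {"txA": 100, "txB": 300}
lengths = {"txA": 1000, "txB": 3000}  # transcript lengths in nt
reads_in_annotation = sum(counts.values())
reads_mapped_to_genome = 1000

tpm_values = tpm(counts, lengths)
rpkm_genome = rpkm(counts, lengths, reads_mapped_to_genome)
rpkm_annotation = rpkm(counts, lengths, reads_in_annotation)

for name in counts:
    print(name, tpm_values[name], rpkm_genome[name], rpkm_annotation[name])
```

In this toy case the RPKM values differ by a factor of 2.5 between the two denominators, while TPM is unaffected by reads outside the annotation because it is rescaled over the annotated transcripts only.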

Thanks

RNA-Seq Ribosomal profiling Mapping RPF • 764 views