Question

Off topic: Removed rRNAs, tRNAs, snoRNAs, etc. still have FPKM values

6.9 years ago

kyusikkim ▴ 20

Hello all! I've been analysing some ribosome profiling data and can't seem to get rid of irrelevant RNAs (such as rRNA) in StringTie's output. I believe I am following best practices, as described below, and am working with the Ensembl yeast (S. cerevisiae) genome assembly R64-1-1.

1. Use Cutadapt to remove the first base of each read, exclude reads shorter than 25 nt, remove the adapter sequence, and discard any reads without an adapter sequence.
2. Generate a "filter" FASTA file by going to Ensembl and downloading the cDNA sequences of all non-protein-coding transcripts.
3. Use Bowtie2 (very-sensitive setting) to map the reads against the "filter" FASTA file and write all unaligned reads to a new FASTQ file.
4. Use HISAT2 to map the filtered reads to the genome and convert the output SAM to BAM using Samtools.
5. Run StringTie on the HISAT2 output to generate a GTF file with FPKMs for the mapped transcripts.
6. Check the output GTF file: it still has FPKM values for tRNAs, rRNAs, snoRNAs, etc.
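
For reference, the steps above could be sketched roughly as the commands below, written as a small Python helper that only assembles the argument lists and runs nothing. This is my own mapping of the description onto tool options, not the exact commands I ran: every file and index name is a placeholder, the adapter is assumed to be a 3' adapter (`-a`), and passing the annotation to StringTie with `-G` is an assumption.

```python
import shlex


def build_pipeline(adapter, raw_fq, filter_fa, genome_index, annotation_gtf):
    """Return one argument list per step; nothing is executed here.

    Every file and index name is a placeholder, not taken from the post.
    """
    return [
        # Step 1: drop the first base (-u 1), trim the 3' adapter (-a),
        # discard reads without an adapter, and discard reads < 25 nt (-m 25).
        ["cutadapt", "-u", "1", "-a", adapter, "--discard-untrimmed",
         "-m", "25", "-o", "trimmed.fastq", raw_fq],
        # Step 2: index the "filter" FASTA of non-coding transcripts.
        ["bowtie2-build", filter_fa, "filter_index"],
        # Step 3: map against the filter; --un keeps reads that did NOT align.
        ["bowtie2", "--very-sensitive", "-x", "filter_index",
         "-U", "trimmed.fastq", "--un", "filtered.fastq",
         "-S", "filter_hits.sam"],
        # Step 4: map the surviving reads to the genome, then sort into BAM.
        ["hisat2", "-x", genome_index, "-U", "filtered.fastq",
         "-S", "aligned.sam"],
        ["samtools", "sort", "-o", "aligned.bam", "aligned.sam"],
        # Step 5: quantify. Using -G here is an assumption about the StringTie call.
        ["stringtie", "aligned.bam", "-G", annotation_gtf,
         "-o", "stringtie_out.gtf"],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline("ADAPTER_SEQ", "reads.fastq", "filter.fa",
                              "genome_index", "annotation.gtf"):
        print(shlex.join(cmd))
```

Note that the filtering in step 3 works on transcript sequences, while the quantification in step 5 works on genome alignments, so the two steps do not share a coordinate system.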

Am I missing something or did I do something wrong?

Note: I have not yet checked the BAM file to see whether reads still map to the filtered transcripts. If the BAM file turns out to be clean and I still get this result, where does the problem lie?
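
One way I could do that BAM check is sketched below. It assumes the Ensembl GTF carries a `gene_biotype` attribute on its `gene` records and treats everything that is not `protein_coding` as "filtered", which may be broader than the filter FASTA actually was. The function name is made up for illustration.

```python
import re


def noncoding_bed_lines(gtf_lines):
    """Yield BED lines for 'gene' records whose biotype is not protein_coding."""
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip GTF header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        biotype = re.search(r'gene_biotype "([^"]+)"', fields[8])
        if biotype is None or biotype.group(1) == "protein_coding":
            continue
        gene_id = re.search(r'gene_id "([^"]+)"', fields[8])
        name = gene_id.group(1) if gene_id else "."
        # GTF is 1-based inclusive; BED is 0-based half-open.
        start = int(fields[3]) - 1
        yield f"{fields[0]}\t{start}\t{fields[4]}\t{name}:{biotype.group(1)}"


def write_noncoding_bed(gtf_path, bed_path):
    """Write the non-coding gene intervals of gtf_path to bed_path."""
    with open(gtf_path) as gtf, open(bed_path, "w") as bed:
        for bed_line in noncoding_bed_lines(gtf):
            bed.write(bed_line + "\n")
```

The resulting BED file could then be given to `samtools view -c -L noncoding.bed aligned.bam`, which counts the alignments overlapping those regions. A count well above zero would suggest that reads are still reaching the non-coding loci despite the Bowtie2 filter.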

stringtie hisat2 • 1.7k views

updated 6.9 years ago by Carlo Yague 8.7k • written 6.9 years ago by kyusikkim ▴ 20