I am trying to do some QC on raw RNA-seq reads. According to the FastQC results, there is some rRNA, bacterial RNA, and polyA contamination. I have the following questions:
- I have no idea how serious the contamination is. How can I tell from the FastQC results?
- Is it necessary to remove the contamination? Or is there a cutoff beyond which I should remove it?
How can I remove these contaminants?
- How can I remove the polyA and bacterial RNA contamination?
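To get a rough idea of how widespread the polyA signal is, I could count the reads that contain a long run of A (or T, for the reverse strand). The following is only a sketch: the function name `polya_fraction` and the 20-base run length are my own arbitrary choices, and it assumes an uncompressed FASTQ file with exactly four lines per record.

```shell
#!/usr/bin/env bash
# Rough estimate of polyA/polyT content in a FASTQ file.
# Assumptions: uncompressed FASTQ, 4 lines per record;
# the default run length of 20 is an illustrative threshold, not a standard.

polya_fraction() {
    local fastq="$1"
    local min_run="${2:-20}"
    awk -v n="$min_run" '
        BEGIN {
            # Build literal homopolymer strings of length n
            for (i = 0; i < n; i++) { a = a "A"; t = t "T" }
        }
        NR % 4 == 2 {
            # Line 2 of each record is the sequence
            total++
            if (index($0, a) > 0 || index($0, t) > 0) hits++
        }
        END {
            if (total == 0) { print "0 0 0.00"; exit }
            # Output: reads_with_run total_reads percent
            printf "%d %d %.2f\n", hits, total, 100 * hits / total
        }
    ' "$fastq"
}
```

Running `polya_fraction sampleA.1.fq` would print the number of reads with such a run, the total number of reads, and the percentage. This only measures how common the runs are; it does not trim or remove anything.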
For rRNA, I have tried the following: (1) downloaded the Mt_rRNA, rRNA, and Mt_tRNA sequences from Ensembl BioMart; (2) used bowtie2 to remove the rRNA and tRNA reads.
Step 1: create the index:
bowtie2-build rRNA.fasta rRNA.index
Step 2: align to the rRNA index in order to get an rRNA-free FASTQ file:
bowtie2 -x rRNA.index -1 sampleA.1.fq -2 sampleA.2.fq --phred33 -N 0 --un-conc sampleA-filter.fq --al-conc rRNA.fq -p 8
Is this correct?
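To put a number on the rRNA fraction after the bowtie2 step above, I could compare the read counts of the two outputs. This is a minimal sketch with some assumptions: the helper names `count_reads` and `rrna_percent` are made up for illustration, the files are uncompressed four-line FASTQ, and the file names rely on bowtie2 inserting `.1`/`.2` before the final dot of the `--al-conc` and `--un-conc` paths. Only pairs that align concordantly are counted as rRNA here, so the figure may differ from the overall alignment rate that bowtie2 prints in its summary.

```shell
#!/usr/bin/env bash
# Count reads in an uncompressed FASTQ file (assumes 4 lines per record).
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Percentage of read pairs that aligned concordantly to the rRNA index.
# $1 = mate-1 file from --al-conc, $2 = mate-1 file from --un-conc
rrna_percent() {
    local aligned unaligned
    aligned=$(count_reads "$1")
    unaligned=$(count_reads "$2")
    awk -v a="$aligned" -v u="$unaligned" 'BEGIN {
        if (a + u == 0) { print "0.00"; exit }
        printf "%.2f\n", 100 * a / (a + u)
    }'
}

# Assumed file names, derived from the --al-conc / --un-conc arguments above
if [ -f rRNA.1.fq ] && [ -f sampleA-filter.1.fq ]; then
    echo "rRNA read pairs: $(rrna_percent rRNA.1.fq sampleA-filter.1.fq)%"
fi
```

Counting only the mate-1 files is enough, since each pair is written to both mate files of the same output.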
Thank you very much!