Mapping RNA-Seq reads onto viral genome

2.3 years ago

nik.kraemer ▴ 10

Hi everyone,

I have 6 files of paired-end 75 nt RNA-Seq reads from HEK293 cells that I want to map onto the AAV genome. I got the reference genome as a FASTA file and the annotation as GFF3/GTF from NCBI. For mapping onto the human genome, I used the STAR mapper, which worked brilliantly. But I do not know how to proceed with the much smaller viral genome and its transcript variants. How do I quantify the different transcript variants in this case?

I tried replacing "CDS" with "exon" in the GTF file, because a file without any exon entries seemed to cause fatal errors, but that did not work out as expected: a lot of the mapped reads are either ambiguous or multimapped.
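For reference, the replacement described above can be sketched in a few lines of Python. This is only an illustration of the edit itself: it rewrites the feature type in column 3 and leaves everything else untouched. The sequence name `AAV_ref`, the coordinates, and the attributes in the example are made-up placeholders, not values from the NCBI file, and the sketch does nothing about the ambiguous or multimapped reads.

```python
def cds_to_exon(lines):
    """Return GTF lines with the feature type 'CDS' (column 3) changed to 'exon'."""
    converted = []
    for line in lines:
        # Leave comment lines and blank lines untouched.
        if line.startswith("#") or not line.strip():
            converted.append(line)
            continue
        fields = line.split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature type.
        if len(fields) >= 9 and fields[2] == "CDS":
            fields[2] = "exon"
        converted.append("\t".join(fields))
    return converted


# Hypothetical example records (placeholder names and coordinates).
example = [
    "#!made-up header",
    'AAV_ref\tRefSeq\tCDS\t100\t900\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'AAV_ref\tRefSeq\tgene\t100\t900\t.\t+\t.\tgene_id "g1";',
]
for record in cds_to_exon(example):
    print(record)
```

Matching on column 3 only, rather than a plain text search-and-replace, avoids accidentally changing "CDS" where it appears inside the attribute column.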

These are the STAR settings I used:

(--genomeFastaFiles) AAV reference genome from NCBI
(--genomeSAindexNbases) 5, based on the ~4700 bp genome
(--sjdbGTFfile) GTF file derived from the NCBI GFF3 file, with CDS replaced by exon
(--sjdbOverhang) 148, because the average read length was 149

All other parameters were left as default.
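As a sanity check on the two numeric settings, here is a small sketch of the rules of thumb from the STAR manual as I understand them: for small genomes, --genomeSAindexNbases is scaled down to min(14, log2(GenomeLength)/2 - 1), and the suggested --sjdbOverhang is the length of one mate minus 1. The function names are my own, and the 75 nt mate length is taken from the description of the reads above, so treat the second result as an assumption to verify against the actual data rather than a definitive setting.

```python
import math


def sa_index_nbases(genome_length):
    """STAR manual rule of thumb: min(14, log2(GenomeLength)/2 - 1), rounded down."""
    return min(14, int(math.log2(genome_length) / 2 - 1))


def sjdb_overhang(mate_length):
    """STAR manual rule of thumb: length of one mate minus 1."""
    return mate_length - 1


print(sa_index_nbases(4700))  # 5, matching the value used above
print(sjdb_overhang(75))      # 74, if each mate really is 75 nt long
```

The average read length of 149 may be the combined length of both mates of a pair, in which case 74 rather than 148 would follow from this rule.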

Btw, I am working with Galaxy.

Any tips are appreciated!

Cheers

Galaxy mapping fasta gtf RNASeq • 680 views
