Question

kallisto/deseq2 - quantifying viral RNA reads and mouse transcriptome at the same time

3.3 years ago

ayy ▴ 10

I have some RNA-seq data from influenza-infected mouse lungs, and I want to quantify the viral reads in the samples.

I found the sequences for the influenza virus and manually created a FASTA file from them. I was then able to use kallisto to build an index and run pseudoalignment against it, but I now have concerns about normalization. As I understand it, DESeq2 normalizes count values by relying on genes that are similarly distributed across samples, so I was thinking the best approach would be to build a combined reference from the mouse transcriptome and the influenza transcriptome, and then let DESeq2 do the normalization for me. I would then pull out the normalized influenza counts to determine the viral titer in each sample.

So my questions are: (1) How do I concatenate the two FASTA files? Is it as simple as appending one file to the other? (2) How do I prepare a reference for tximport so that the resulting data frame contains both the murine and viral reads?

Presumably once I have (1) and (2) solved, I can just run DESeq2 normally and pull out the viral reads to determine the titer.

Thanks for your input!

kallisto deseq2 • 782 views

updated 3.3 years ago by LChart 5.2k • written 3.3 years ago by ayy ▴ 10

score 0 · Answer 1 · 2022-07-08

The contigs and genes for the two species have different names, so nothing collides when the references are combined. It really is as simple as concatenating the files (and stripping out file-level header lines where necessary, e.g. when combining annotation or mapping tables). For transcript quantification with kallisto, concatenate the two transcript FASTA files and build a single index from the result. For tximport, supplement (read: rbind) the standard mouse tx2gene table with the transcript-to-gene mapping built from the GFF or GTF of the influenza reference you are using.
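
To make the two steps concrete, here is a minimal Python sketch (not part of the original answer). It assumes the sequence ID is the first whitespace-delimited token of each FASTA header, that the influenza annotation is a GTF carrying `gene_id` and `transcript_id` attributes, and that the transcript IDs in the GTF match the headers in the viral FASTA. The function names and the `TXNAME`/`GENEID` column labels are illustrative choices; as far as I know tximport only needs a two-column table with transcript IDs first and gene IDs second.

```python
import csv
import re

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def concat_fasta(in_paths, out_path):
    """Concatenate FASTA files, refusing duplicate sequence IDs."""
    seen = set()
    with open(out_path, "w") as out:
        for path in in_paths:
            with open(path) as fh:
                for line in fh:
                    if line.startswith(">"):
                        # Assumption: the ID is the first token of the header.
                        fields = line[1:].split()
                        seq_id = fields[0] if fields else ""
                        if seq_id in seen:
                            raise ValueError(f"duplicate sequence ID: {seq_id}")
                        seen.add(seq_id)
                    # Guard against a missing newline at the end of a file.
                    out.write(line if line.endswith("\n") else line + "\n")
    return seen


def tx2gene_from_gtf(gtf_path):
    """Return unique (transcript_id, gene_id) pairs found in a GTF file."""
    pairs, seen = [], set()
    with open(gtf_path) as fh:
        for line in fh:
            if line.startswith("#") or not line.strip():
                continue  # skip comment/header lines and blank lines
            cols = line.rstrip("\n").split("\t")
            if len(cols) < 9:
                continue  # not a well-formed GTF record
            attrs = dict(ATTR_RE.findall(cols[8]))
            tx, gene = attrs.get("transcript_id"), attrs.get("gene_id")
            if tx and gene and tx not in seen:
                seen.add(tx)
                pairs.append((tx, gene))
    return pairs


def write_tx2gene(pairs, out_path):
    """Write (transcript, gene) pairs as a two-column CSV with a header row."""
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(["TXNAME", "GENEID"])
        writer.writerows(pairs)
```

With hypothetical file names, usage would look like `concat_fasta(["mouse_cdna.fa", "influenza.fa"], "combined.fa")` followed by `write_tx2gene(mouse_pairs + tx2gene_from_gtf("influenza.gtf"), "tx2gene.csv")`, where `mouse_pairs` is the existing mouse mapping. The combined FASTA then goes to `kallisto index`, and the CSV can be read into R and passed as `tx2gene` to tximport. The duplicate-ID check is there because a name collision between the two references would otherwise go unnoticed.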