Virus RNA Seq reads mapping with salmon
5.1 years ago
lokraj2003 ▴ 120

I did RNA-seq on mammalian cells infected with a poxvirus, so my read files contain both host and viral reads. I want to align the reads to both the host and the viral genome. I was thinking I could concatenate the host and virus genomes into one file and run Salmon against the concatenated reference. However, Salmon recommends a transcriptome file as its reference, and transcriptomes are not available for viruses. Virus genomes are available from NCBI in GenBank or GFF3 format. Is there a way to convert these formats into something Salmon can use? Or is there another way to use a virus genome as a reference in Salmon?

Thanks

RNA-Seq • 1.7k views
5.1 years ago

You provide Salmon with a transcriptome FASTA, so merging the human transcriptome with the poxvirus genome FASTA file should work.

Note: I linked to the coding region sequences; if you use my suggestion, it might be wise to include the ncRNA as well.
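
A minimal sketch of the merge-then-index idea, in case it helps. The file names, the `combined_index` directory, and the `ORFV_` prefix are placeholders of mine, not anything from the thread; the prefix is only there so viral entries are easy to pick out of the results later. The Salmon calls use the standard `index -t/-i` and `quant -i/-l/-1/-2/-o` options, but check them against your installed version.

```shell
#!/usr/bin/env bash
# Sketch: build one Salmon reference from host transcripts plus viral sequences.
# All file names and the ORFV_ prefix are placeholders (assumptions).

# Copy the host FASTA, then append the viral FASTA with every header prefixed.
build_combined_fasta() {
    local host_fa="$1" virus_fa="$2" out_fa="$3" prefix="${4:-ORFV_}"
    # "awk 1" re-prints each line, guaranteeing a trailing newline.
    awk 1 "$host_fa" > "$out_fa"
    awk -v p="$prefix" '/^>/ { sub(/^>/, ">" p) } { print }' "$virus_fa" >> "$out_fa"
}

# Index the combined FASTA and quantify one paired-end sample.
run_salmon() {
    local combined_fa="$1" r1="$2" r2="$3" outdir="$4"
    salmon index -t "$combined_fa" -i combined_index
    salmon quant -i combined_index -l A -1 "$r1" -2 "$r2" -o "$outdir"
}
```

Usage would look something like `build_combined_fasta host_transcripts.fa virus_cds.fa combined.fa` followed by `run_salmon combined.fa reads1.fq reads2.fq sample1_quant`.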


Thanks for the suggestion and links. I will probably try with ncRNA too.

5.1 years ago
Adrian Pelin ★ 2.6k

Which poxvirus are you sequencing? I do this all the time with Vaccinia, except I use HISAT and StringTie to quantify transcript levels. Whichever poxvirus you have, you can always extract the CDS regions and treat them as the transcriptome, since there is not a lot of splicing in poxviruses.
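
One way the CDS extraction could be scripted, as a rough sketch rather than a tested pipeline. It assumes the viral genome FASTA holds a single sequence, that each CDS is one unspliced GFF3 line, and that the `ID=` attribute is a usable name; `extract_cds` is a name I made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: cut CDS features out of a single-sequence genome FASTA using GFF3.
# Assumes unspliced CDS (one GFF3 line each) and exactly one genome sequence.
extract_cds() {
    local genome_fa="$1" gff="$2"
    awk -F'\t' '
        # Reverse-complement a DNA string (used for minus-strand genes).
        # Anything other than A/C/G/T becomes N.
        function revcomp(s,    i, out, c, r) {
            out = ""
            for (i = length(s); i >= 1; i--) {
                c = toupper(substr(s, i, 1))
                r = (c in comp) ? comp[c] : "N"
                out = out r
            }
            return out
        }
        BEGIN { comp["A"] = "T"; comp["T"] = "A"; comp["C"] = "G"; comp["G"] = "C" }
        # First file (genome FASTA): join all sequence lines into one string.
        NR == FNR { if ($0 !~ /^>/) genome = genome $0; next }
        # Second file (GFF3): columns 4 and 5 are 1-based inclusive coordinates.
        $0 !~ /^#/ && $3 == "CDS" {
            seq = toupper(substr(genome, $4, $5 - $4 + 1))
            if ($7 == "-") seq = revcomp(seq)
            id = "cds_" $4 "_" $5
            if (match($9, /ID=[^;]+/)) id = substr($9, RSTART + 3, RLENGTH - 3)
            print ">" id
            print seq
        }
    ' "$genome_fa" "$gff"
}
```

Something like `extract_cds virus_genome.fa virus.gff3 > virus_cds.fa` would then give a FASTA that can stand in for the viral transcriptome.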


I am using Orf virus. Is there a script to combine the virus CDS with the host genome, or do you do it manually? Also, what do you do after getting read counts? I mean, what tool do you use to separate viral and host reads?

Thank you


I would use bbsplit to separate the host (sheep?) reads from the virus reads. It can map to multiple references at once. As an example, the command below creates files for E. coli reads and Salmonella reads, plus files for reads that don't map to either; substitute your own host and virus FASTA files.

bbsplit.sh in1=reads1.fq in2=reads2.fq ref=ecoli.fa,salmonella.fa basename=out_%.fq outu1=clean1.fq outu2=clean2.fq
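
If quantification is done with Salmon against a combined host-plus-virus reference instead of splitting the reads first, the host and viral totals can be pulled straight out of `quant.sf`. This is only a sketch: it assumes viral entries were given a recognisable name prefix when the reference was built (`ORFV_` here is a placeholder), and relies on `quant.sf` being tab-separated with `Name` in column 1 and `NumReads` in column 5. `sum_reads_by_origin` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: total Salmon NumReads for host vs viral entries in quant.sf.
# Assumes viral names start with a known prefix (placeholder default: ORFV_).
sum_reads_by_origin() {
    local quant_sf="$1" prefix="${2:-ORFV_}"
    awk -F'\t' -v p="$prefix" '
        NR == 1 { next }                          # skip the header row
        index($1, p) == 1 { virus += $5; next }   # name starts with the prefix
        { host += $5 }                            # everything else is host
        END { printf "host\t%.1f\nvirus\t%.1f\n", host, virus }
    ' "$quant_sf"
}
```

Called as `sum_reads_by_origin sample1_quant/quant.sf`, it prints one line for host and one for virus.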