How to create a merged human host viral RefSeq?
3.3 years ago
MatStat ▴ 160

Hi all,

I am trying to create a single viral reference file containing all viral RefSeq genomes known to have a human host. Basically something like the file created here, but updated for 2021:

Create viral reference We were interested in exploring all viruses existing in humans. So we first obtained reference genomes of all known and sequenced human viruses obtained from NCBI (as of Sep 2015), and merged them into one file (referred to as the "viral reference file") in fasta file format. Merge all virus fasta file into one big fasta file called viruses.fa

Reference for the above citation: GitHub viGEN tutorial

I'd like to know the appropriate way to merge the files into a single *.fa file. Alternatively, if anyone has come across a recently published reference file like this, that would also help at this point.

All the best

RNA-Seq viral genome RefSeq fasta fastq • 1.0k views
  • You can use NCBI Datasets to download all viral genomes (use relevant taxID) you are interested in.
  • RefSeq viral genomes can be found on the RefSeq FTP site.

The .fa files can simply be concatenated with cat to make a multi-fasta file.
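As a rough illustration of the concatenation step, here is a minimal Python sketch. It is not taken from the viGEN tutorial; the function name `merge_fasta` and the directory name in the comment are made up for this example. The one thing it adds over a plain cat is a guard for input files that lack a trailing newline, which would otherwise glue the next header onto the end of a sequence line.

```python
from pathlib import Path


def merge_fasta(input_paths, output_path):
    """Concatenate FASTA files into one multi-FASTA file.

    Returns the number of records (header lines) written.
    """
    n_records = 0
    with open(output_path, "w") as out:
        for path in input_paths:
            last_line = "\n"  # treat an empty file as already terminated
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        n_records += 1
                    out.write(line)
                    last_line = line
            # If the file did not end with a newline, add one so the next
            # file's header starts on its own line.
            if not last_line.endswith("\n"):
                out.write("\n")
    return n_records


# Hypothetical usage, assuming the downloaded genomes sit in ./refseq_viral:
# merge_fasta(sorted(Path("refseq_viral").glob("*.fa")), "viruses.fa")
```

The inputs are passed as an explicit list so the merge order is reproducible; sorting the paths, as in the commented usage line, is one simple way to get that.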

human host viral RefSeq

This part is going to be tricky. Unless you have a clear list of the viruses you are interested in, host information may not always be available.
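If you do end up with a clear list of accessions for the viruses of interest, restricting a merged file to that list is straightforward. The sketch below assumes, purely for illustration, that each FASTA header starts with the accession (as in `>NC_001802.1 ...`) and that your list uses the same versioned form; the function name `filter_fasta_by_accession` is hypothetical, and how the accession list is obtained is left open.

```python
def filter_fasta_by_accession(fasta_path, accessions, output_path):
    """Copy only the records whose accession is in `accessions`.

    The accession is assumed to be the first whitespace-delimited token
    of the header line, without the leading '>'.
    Returns the number of records kept.
    """
    wanted = set(accessions)
    kept = 0
    keep_current = False
    with open(fasta_path) as src, open(output_path, "w") as out:
        for line in src:
            if line.startswith(">"):
                tokens = line[1:].split()
                accession = tokens[0] if tokens else ""
                keep_current = accession in wanted
                if keep_current:
                    kept += 1
            # Sequence lines follow the decision made at their header.
            if keep_current:
                out.write(line)
    return kept
```

Comparing the returned count with the length of the accession list is a quick check for accessions that were not found, for example because of a version mismatch.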


Hi GenoMax,

Thanks for the answer.

Which part did you mean is tricky: filtering the viral RefSeq genomes by human host? If so, do you think the following link solves it?

https://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=10239&host=human

Best
