How to create a GTF file and build an ANNOVAR database for influenza virus to annotate viral variants?
8.5 years ago

Dear All,

I am working on influenza virus and would like to annotate its variants using ANNOVAR, but I am having trouble building the database. I have only the 8 genes of the influenza virus and their RefSeq IDs. I tried to find the genome ID but couldn't. So I saved the FASTA sequences of all 8 genes in one file and used it to build the ANNOVAR database. ANNOVAR throws an error saying that I don't have a refGene.txt file.

Can I create a custom GTF file for influenza virus based on these 8 genes?

Has anyone faced a similar issue?

Can you suggest any other variant annotation tool apart from ANNOVAR?

variants annovar SNP • 7.8k views
8.5 years ago
pld 5.1k

The problem is that a GFF is basically a list of where genes and other features are located within a given chromosome (the genome, in the case of viruses). You can't make one with just CDS sequences. You need the genome and the locations of those sequences within it.
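To illustrate what "the locations of those sequences" means in practice, here is a minimal sketch that finds where a CDS sits inside its genome segment and reports 1-based inclusive coordinates, which is the convention GFF and GTF use. The function name `locate_cds` and the toy sequences are made up for illustration. It assumes the CDS is one contiguous stretch on the forward strand, so it will not handle spliced influenza products such as M2 or NEP.

```python
def locate_cds(segment_seq, cds_seq):
    """Return 1-based inclusive (start, end) of cds_seq in segment_seq, or None.

    Assumption: the CDS is contiguous and on the forward strand.
    """
    segment = segment_seq.upper()
    cds = cds_seq.upper()
    offset = segment.find(cds)  # 0-based position, -1 if absent
    if offset == -1:
        return None
    # Convert the 0-based offset to 1-based inclusive coordinates.
    return offset + 1, offset + len(cds)


# Toy sequences, not real influenza data.
toy_segment = "GGCCATGAAATTTTAGCC"
toy_cds = "ATGAAATTTTAG"
print(locate_cds(toy_segment, toy_cds))
```

A result of `None` suggests the CDS came from a different strain or is spliced, in which case the coordinates have to be taken from the GenBank record instead.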

http://www.ncbi.nlm.nih.gov/genome/?term=influenza

There are a few genomes here, and some of them have GFFs; otherwise you can use one of the available tools to convert the .gbk to GTF. For viruses this can be tricky. I've switched to making them by hand; it isn't too hard.

http://www.ncbi.nlm.nih.gov/genomes/FLU/Database/nph-select.cgi?go=genomeset

There are genomes here, but they're in FASTA format, so you'd have to do some legwork to create a GFF.
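A first piece of that legwork is usually collecting the sequence name and length of each segment, since every GFF line has to refer to a sequence name that matches the FASTA header. A minimal standard-library sketch follows; the function name `segment_lengths` and the record names are hypothetical, and it assumes the sequence ID is the first whitespace-delimited token of each header line.

```python
def segment_lengths(lines):
    """Map each FASTA record ID to its sequence length.

    Assumption: the ID is the first whitespace-delimited token after '>'.
    """
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


# Toy records standing in for genome segments (names are placeholders).
toy_fasta = [">segment_1 toy record", "ACGT", "AC", ">segment_2", "GGG"]
print(segment_lengths(toy_fasta))
```

For a real file, pass an open file handle instead of the list, for example `segment_lengths(open("genome.fasta"))`, where the filename is a placeholder.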

You could also try here. If you're using publicly available data, this might be the best option.

http://www.fludb.org/brc/home.spg?decorator=influenza


Thanks Joe for your suggestions. For example, I am interested in these two strains: Influenza A virus (A/California/07/2009(H1N1)) and (A/Texas/50/2012(H3N2)). From the following link, http://www.ncbi.nlm.nih.gov/genome/10290, I have the following IDs,

I don't have whole genomes for these two strains; instead I have the RefSeq IDs of the segments. Which fields do I need to capture manually?
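For reference, a GTF line has nine tab-separated columns: seqname, source, feature, start, end, score, strand, frame, and attributes. With segment-level records, each segment's sequence ID can serve as the seqname. The sketch below formats one such line; the helper name `gtf_line`, the seqname `segment_4`, the source label `manual`, and the coordinates are all invented placeholders, not values for either strain.

```python
def gtf_line(seqname, source, feature, start, end, strand, frame,
             gene_id, transcript_id):
    """Format one GTF record (nine tab-separated columns).

    start and end are 1-based and inclusive; score is left as '.'.
    """
    attributes = f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'
    fields = [seqname, source, feature, str(start), str(end),
              ".", strand, str(frame), attributes]
    return "\t".join(fields)


# Placeholder values only: replace them with the segment ID and the CDS
# coordinates taken from the segment's own GenBank/RefSeq record.
example = gtf_line("segment_4", "manual", "CDS", 1, 300, "+", 0,
                   "HA", "HA_t1")
print(example)
```

The seqname must match the sequence name in the FASTA file and the chromosome column of the VCF exactly, otherwise the annotator will not connect the variants to the genes.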

2.2 years ago
antcart22 • 0

Hi,

I have a VCF file of a segmented virus genome that I would like to annotate using SnpEff. I realized that I need a database, or at the very least the GFF files, to be able to use SnpEff. I have the FASTA sequences for each of the genome segments. Is it relatively straightforward to build the GFF files and validate them? Are there tools that are particularly useful for this?
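As a rough idea of what basic validation could look like, the sketch below runs a few sanity checks on GFF/GTF lines against the segment lengths taken from the FASTA: nine columns, a known sequence name, integer coordinates, and coordinates that fall inside the segment. These checks are my own assumptions about what "valid" should mean here, not SnpEff's own validation, and the names `check_annotation` and `segment_4` are hypothetical.

```python
def check_annotation(lines, lengths):
    """Return a list of human-readable problems found in GFF/GTF lines.

    lengths maps each sequence name to its length in bases.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        fields = line.split("\t")
        if len(fields) != 9:
            problems.append(f"line {number}: expected 9 columns, found {len(fields)}")
            continue
        seqname, start_text, end_text = fields[0], fields[3], fields[4]
        if seqname not in lengths:
            problems.append(f"line {number}: unknown sequence {seqname!r}")
            continue
        try:
            start, end = int(start_text), int(end_text)
        except ValueError:
            problems.append(f"line {number}: non-integer coordinates")
            continue
        if start < 1 or start > end or end > lengths[seqname]:
            problems.append(f"line {number}: coordinates {start}-{end} outside 1-{lengths[seqname]}")
    return problems


# Toy data: the second record runs past the end of its 300-base segment.
toy_lengths = {"segment_4": 300}
toy_lines = [
    "segment_4\tmanual\tCDS\t1\t300\t.\t+\t0\tID=cds1",
    "segment_4\tmanual\tCDS\t10\t400\t.\t+\t0\tID=cds2",
]
for problem in check_annotation(toy_lines, toy_lengths):
    print(problem)
```

Checks like these only catch structural mistakes. Translating each CDS and comparing it with the expected protein would be a stronger test.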

Thanks


Don't post a new question as an answer to a really old post; create a new question instead.
