Question: How to create a GTF file for influenza virus and build an ANNOVAR database for annotating viral variants?
4.9 years ago by
United States
bioinforesearchquestions280 wrote:

Dear All,

I am working on influenza virus and would like to annotate viral variants using ANNOVAR, but I am having trouble building the database. I have just 8 genes for the influenza virus and have their RefSeq IDs. I tried to find a genome ID but couldn't, so I saved the FASTA sequences of all 8 genes in one file and used it to build the ANNOVAR database. ANNOVAR throws an error saying that I don't have a refGene.txt file.

Can I create a custom GTF file for influenza virus based on these 8 genes?

Has anyone faced a similar issue?

Can you suggest any other variant annotation tool besides ANNOVAR?

4.9 years ago by
United States
pld4.8k wrote:

The problem is that a GFF is basically a list of where genes and other features are located within a given chromosome (or genome, in the case of viruses). You can't make one from CDS sequences alone; you need the genome and the locations of those sequences within it.
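To make the "list of locations" idea concrete, here is a minimal sketch of writing GTF records by hand in Python. It assumes the standard nine tab-separated GTF columns (seqname, source, feature, start, end, score, strand, frame, attributes). The sequence name `segment4`, the gene and transcript IDs, and the coordinates are placeholders for illustration only, not real annotation for any strain; the `seqname` would have to match the header of the genome FASTA you actually use.

```python
def gtf_line(seqname, feature, start, end, strand,
             gene_id, transcript_id, source="manual", frame="."):
    """Format one GTF record (nine tab-separated columns, 1-based inclusive coordinates)."""
    if start < 1 or end < start:
        raise ValueError("GTF coordinates are 1-based with start <= end")
    attributes = f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'
    # Column 6 (score) is left as "." because we have no score to report.
    return "\t".join([seqname, source, feature, str(start), str(end),
                      ".", strand, frame, attributes])


# Placeholder values: replace with the real sequence name and coordinates
# taken from the GenBank/RefSeq record of each segment.
lines = [
    gtf_line("segment4", "exon", 1, 1701, "+", "HA", "HA.1"),
    gtf_line("segment4", "CDS", 1, 1701, "+", "HA", "HA.1", frame="0"),
]

for line in lines:
    print(line)
```

Each gene needs at least one such record per feature, and the coordinates only make sense relative to the sequence named in the first column.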

There are a few genomes here, and some of them have GFFs; alternatively, you can use one of the available tools to convert the .gbk to GTF. For viruses this can be tricky, so I've switched to making them by hand. It isn't too hard.

There are genomes here, but they're in FASTA format, so you'd have to do some legwork to create a GFF.
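Part of that legwork is knowing the exact sequence names and lengths in the FASTA, since every annotation coordinate must fall inside them. The following is a rough sketch, using only the Python standard library, that reads a multi-FASTA and prints one GFF3-style `region` line per record. The tiny in-memory FASTA and the names `segA` and `segB` are made up for the example; real gene and CDS coordinates would still have to come from the GenBank records.

```python
import io


def fasta_lengths(handle):
    """Return {sequence_id: length} for every record in a FASTA file handle."""
    lengths = {}
    current = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


# Made-up FASTA content standing in for a real file opened with open(path).
example = io.StringIO(">segA demo\nACGT\nAC\n>segB\nGGG\n")
seg_lengths = fasta_lengths(example)

for seq_id, length in seg_lengths.items():
    # One whole-sequence "region" feature per record, in GFF3 column order.
    print(f"{seq_id}\tmanual\tregion\t1\t{length}\t.\t+\t.\tID={seq_id}")
```

The IDs printed in the first column are the names the annotation file has to reuse so that the annotation and the sequence file agree.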

You could also try here; if you're using publicly available data, this might be the best option.

written 4.9 years ago by pld4.8k

Thanks, Joe, for your suggestions. For example, I am interested in these two strains: Influenza A virus (A/California/07/2009(H1N1)) and (A/Texas/50/2012(H3N2)). From the following link, I have the following IDs:

I don't have genomes for these two strains; instead, I have the RefSeq IDs of the segments. Which fields do I need to capture manually?

written 4.9 years ago by bioinforesearchquestions280
