Question: variant annotation issue
prasundutta87 wrote:

Hi,

I recently annotated variants in water buffalo using SnpEff. The variants were called from RNA-seq reads with the bcftools call program (the multiallelic caller) from the SAMtools package. The BAM files were produced by aligning the RNA-seq reads to the water buffalo genome available on the NCBI Genome website.

SnpEff annotated many variants as downstream gene variants, which does not seem right because the variants were called from RNA-seq reads. Can anyone shed any light on this?

written 2.1 years ago by prasundutta87
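
A first step toward answering the question is to measure how many annotations are actually downstream ones. The sketch below is a minimal example, assuming the VCF carries SnpEff's `ANN` INFO field with the annotation term in the second `|`-separated position; the function name `count_effects` and the example record (genes `GENE1`/`GENE2`, transcripts `T1`/`T2`) are hypothetical. Note that one variant can carry several `ANN` entries, one per nearby transcript, so a variant inside one gene may also be reported as downstream of a neighbouring gene.

```python
from collections import Counter


def count_effects(vcf_lines):
    """Count annotation terms over all ANN entries in SnpEff-annotated VCF lines."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        info = line.rstrip("\n").split("\t")[7]
        for field in info.split(";"):
            if not field.startswith("ANN="):
                continue
            for entry in field[4:].split(","):
                parts = entry.split("|")
                if len(parts) > 1:
                    # A single entry may combine several terms with '&'.
                    for term in parts[1].split("&"):
                        counts[term] += 1
    return counts


# Hypothetical record: one variant, two ANN entries (two transcripts).
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "scaffold_1\t100\t.\tA\tG\t50\tPASS\tDP=10;"
    "ANN=G|missense_variant|MODERATE|GENE1|GENE1|transcript|T1|protein_coding|2/5|c.10A>G|p.Lys4Glu|10/900|10/600|4/199||,"
    "G|downstream_gene_variant|MODIFIER|GENE2|GENE2|transcript|T2|protein_coding||c.*120A>G|||||120|\n",
]
print(count_effects(example))
```

If most downstream annotations sit on variants that also have an exonic annotation for another transcript, they are likely a side effect of per-transcript reporting rather than a sign that the calls are wrong.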

Are you sure you used the same assembly for mapping and annotation?

written 2.1 years ago by Pierre Lindenbaum

Yes, I did. There is only one draft assembly in the NCBI Genome database for the Mediterranean water buffalo.

written 2.1 years ago by prasundutta87

I hope I am not missing anything logically. I am not sure what exactly to look for or check in order to find the reason behind it. Any suggestion is welcome.

written 2.1 years ago by prasundutta87
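
One concrete thing to check, following the assembly question above, is whether the sequence names in the VCF all appear in the annotation file used to build the SnpEff database. The sketch below assumes tab-delimited VCF and GFF input, both of which keep the sequence name in the first column; the helper name `seqids` and the `scaffold_*` names are hypothetical. Matching names are necessary but not sufficient: two versions of an assembly can share names and still differ in coordinates.

```python
def seqids(lines):
    """Collect first-column sequence names from tab-delimited VCF or GFF lines."""
    names = set()
    for line in lines:
        # Skip comments, header lines, and anything that is not tab-delimited.
        if line.startswith("#") or "\t" not in line:
            continue
        names.add(line.split("\t", 1)[0])
    return names


# Hypothetical inputs: the VCF mentions a scaffold the annotation lacks.
vcf_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "scaffold_1\t100\t.\tA\tG\t50\tPASS\tDP=10\n",
    "scaffold_2\t250\t.\tC\tT\t40\tPASS\tDP=8\n",
]
gff_lines = [
    "##gff-version 3\n",
    "scaffold_1\tsource\tgene\t50\t900\t.\t+\t.\tID=gene1\n",
]

missing = seqids(vcf_lines) - seqids(gff_lines)
print("In VCF but not in annotation:", sorted(missing))
```

Variants on sequences that the annotation never mentions cannot receive a gene-level annotation at all, so a long `missing` list would point to a naming or assembly mismatch.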

You could try annotating with a different annotator and compare results. That could probably help you understand what's going wrong.

written 2.1 years ago by vakul.mohanty

Yes, I am trying to do that with VEP, but Ensembl does not have the water buffalo genome. I am trying to use their standalone Perl version with the water buffalo genome and annotation file from the NCBI Genome database.

written 2.1 years ago by prasundutta87
Brian Bushnell wrote:

Genomes are often higher quality than transcriptomes. If the transcriptome is incomplete (which it always is, for vertebrates) you will get RNA reads mapping to genes that were not annotated. This is a good thing, as it can advance science! You've possibly discovered previously unknown genes.

written 2.1 years ago by Brian Bushnell

The thing is that I annotated the variants with SnpEff using the -onlyProtein option (meaning: only use protein-coding transcripts). I thought that, because of this, non-coding genes and probable genes would be avoided, since I am focusing on only a specific type of gene. I guess the -onlyProtein option did not work correctly, or I misunderstood it.

written 2.1 years ago by prasundutta87
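
Going by its description as quoted above, -onlyProtein restricts which transcripts are used, not which effect types are reported, so a variant lying just past the annotated end of a protein-coding transcript can still be labelled as downstream of it. SnpEff's help text also lists `-no-downstream` and `-ud` options that act at annotation time. The sketch below instead filters after the fact, assuming the `ANN` INFO layout with the annotation term in the second `|`-separated position; `drop_effects` is a hypothetical helper and the `ANN` entries in the example are shortened for readability.

```python
UNWANTED = frozenset({"downstream_gene_variant", "upstream_gene_variant"})


def drop_effects(info, unwanted=UNWANTED):
    """Return an INFO string without ANN entries whose terms are all unwanted."""
    kept_fields = []
    for field in info.split(";"):
        if not field.startswith("ANN="):
            kept_fields.append(field)
            continue
        kept_entries = []
        for entry in field[4:].split(","):
            parts = entry.split("|")
            terms = set(parts[1].split("&")) if len(parts) > 1 else set()
            # Keep malformed entries and entries with at least one wanted term.
            if not terms or not terms <= unwanted:
                kept_entries.append(entry)
        if kept_entries:
            kept_fields.append("ANN=" + ",".join(kept_entries))
    return ";".join(kept_fields) or "."


# Shortened, hypothetical ANN entries.
info = "DP=10;ANN=G|missense_variant|MODERATE|GENE1,G|downstream_gene_variant|MODIFIER|GENE2"
print(drop_effects(info))
```

Filtering only hides the entries; whether a downstream call reflects an unannotated 3' extension of the transcript is still worth checking against the read alignments.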

It seems like you might be confused about the nature of annotation. The water buffalo genome is not complete. The water buffalo transcriptome is even less complete. So, you can map reads to an official water buffalo reference, but the results will not be perfectly correct. It does not matter which variant-caller or annotation software you use - if the reference or annotation is incomplete, you will get incorrect results.

written 2.1 years ago by Brian Bushnell

Yes, the genome/transcriptome is not complete. What I will do is complement my RNA-seq variant calling with variant calls from DNA-seq data and hopefully get a consensus.

written 2.1 years ago by prasundutta87
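
The consensus idea can be sketched as a set intersection on (chromosome, position, REF, ALT) keys. This is a minimal illustration, assuming both call sets were made against the same reference and normalized the same way (for example, indels left-aligned consistently); `variant_keys` and the example records are hypothetical.

```python
def variant_keys(vcf_lines):
    """Return a set of (chrom, pos, ref, alt) keys, one per ALT allele."""
    keys = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            if allele != ".":  # '.' means no alternate allele was called
                keys.add((chrom, int(pos), ref, allele))
    return keys


# Hypothetical call sets from the two data types.
rna_calls = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "scaffold_1\t100\t.\tA\tG\t50\tPASS\tDP=10\n",
    "scaffold_1\t300\t.\tC\tT\t35\tPASS\tDP=6\n",
]
dna_calls = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "scaffold_1\t100\t.\tA\tG,T\t60\tPASS\tDP=30\n",
    "scaffold_2\t42\t.\tG\tA\t45\tPASS\tDP=25\n",
]

rna, dna = variant_keys(rna_calls), variant_keys(dna_calls)
print("Shared:", sorted(rna & dna))
print("RNA-seq only:", sorted(rna - dna))
```

DNA-only calls are expected wherever a region is not expressed, so the RNA-seq-only set is usually the more informative one to inspect.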