According to the Variant Effect Predictor tool from Ensembl, the consequence types upstream gene variant and downstream gene variant are located within 5 kb of the transcript. I would like to create 2 categories, "genic" and "non genic", and assign up- and downstream gene variants (as well as all other consequence types from Ensembl) to one of them. Where should I include up- and downstream variants? Thanks very much
Ensembl uses Sequence Ontology (SO) terms to describe the consequences of variants. The upstream/downstream consequence type is defined as being within 5 kb of the transcript: http://www.sequenceontology.org/browser/current_svn/term/SO:0001631
So, a particular variant could be an upstream variant of all of (or a subset of) the different transcripts of a gene.
For example, if you consider the different transcripts of BRCA2 (http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000139618;r=13:32315474-32400266), a single variant might be an upstream variant of the BRCA2-001 transcript, but not of the BRCA2-003 transcript.
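To make the per-transcript point concrete, here is a rough Python sketch of the 5 kb window check. It is only an illustration, not how VEP is implemented: the function name `is_upstream`, the transcript labels and all coordinates are made up for the example and are not real BRCA2 positions.

```python
def is_upstream(pos, tx_start, tx_end, strand, window=5000):
    """Return True if pos lies within `window` bp upstream of a transcript.

    tx_start < tx_end are genomic coordinates; "upstream" is relative to
    the direction of transcription, so it flips on the reverse strand.
    """
    if strand == "+":
        return tx_start - window <= pos < tx_start
    if strand == "-":
        return tx_end < pos <= tx_end + window
    raise ValueError("strand must be '+' or '-'")


# Hypothetical transcripts of one gene with different start positions.
transcripts = {
    "tx_A": (10_000, 20_000, "+"),
    "tx_B": (16_000, 20_000, "+"),
}
variant_pos = 9_000  # hypothetical variant position

calls = {name: is_upstream(variant_pos, *tx) for name, tx in transcripts.items()}
print(calls)
```

With these invented coordinates the variant sits 1 kb before the start of `tx_A` but 7 kb before the start of `tx_B`, so it would only count as upstream for the first one.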
Technically the upstream/downstream variants fall outside of the defined gene boundaries, but the decision whether to include them in one category or another for a particular analysis really depends on the analysis you wish to perform and the definitions of variants you wish to apply during your analysis.
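If it helps, one way to keep that decision explicit is to make it a switch in your own code. The sketch below is only a suggestion: the function `categorise`, the `flanking_as_genic` flag and the choice of which SO terms count as "non genic" are assumptions for illustration, so adjust the sets to your own definitions.

```python
# SO terms for variants within the flanking window of a transcript.
FLANKING = frozenset({"upstream_gene_variant", "downstream_gene_variant"})

# Assumption: terms treated as clearly outside any gene for this sketch.
NON_GENIC = frozenset({
    "intergenic_variant",
    "regulatory_region_variant",
    "TF_binding_site_variant",
})


def categorise(consequence, flanking_as_genic=False):
    """Map an SO consequence term to 'genic' or 'non genic'.

    Flanking terms follow the flag; any term not listed above is
    assumed to be genic, which you may want to tighten.
    """
    if consequence in FLANKING:
        return "genic" if flanking_as_genic else "non genic"
    if consequence in NON_GENIC:
        return "non genic"
    return "genic"


for term in ("missense_variant", "upstream_gene_variant", "intergenic_variant"):
    print(term, categorise(term), categorise(term, flanking_as_genic=True))
```

Running the analysis once with each setting of the flag shows how sensitive your results are to where the flanking variants are placed.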
Ben Ensembl Helpdesk