Question

How to get short and sweet gene names/descriptions for gene IDs?

6.4 years ago

WUSCHEL ▴ 860

I am working on an Arabidopsis thaliana omics project. How can I get a short, proper gene description for each AGI number (gene ID)? The database I downloaded from TAIR10 has pretty lengthy names, which are difficult to work with in downstream steps such as plotting and summarizing.

e.g.

AT1G01050       Soluble inorganic pyrophosphatase 1 OS=Arabidopsis thaliana (sp|q93v56|ipyr1_arath : 419.0)

AT1G01800   Enzyme classification.EC_1 oxidoreductases.EC_1.1 oxidoreductase acting on CH-OH group of donor(50.1.1 : 434.7) & (+)-neomenthol dehydrogenase OS=Arabidopsis thaliana (sp|q9m2e2|sdr1_arath : 357.0) (original description: none)

Where can I get short and sweet names, or how can I shorten these?

RNA-Seq R proteomics • 2.2k views

score 6 · Accepted Answer · 2019-02-18

6.4 years ago

finswimmer 16k

Go to Ensembl's BioMart
Choose the database Ensembl Plants Genes and the dataset Arabidopsis thaliana genes
Choose Filters -> Gene -> Input external references ID list -> Gene stable ID(s) and paste your IDs into the text field
Choose Attributes -> Gene and select Gene stable ID and Gene name
Click Results and download in the format you like
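Once the result is downloaded, the short names can be attached to your own ID list. This is only a sketch under assumptions: it supposes a tab-separated export with a header line and two columns (gene stable ID, gene name), the file names mart_export.txt, my_ids.txt and short_names.tsv are hypothetical, and the names in the sample are placeholders rather than real annotations. Genes without a name in the export fall back to their AGI ID.

```shell
# Hypothetical stand-in for a BioMart TSV export; NAME_A is a placeholder.
printf 'Gene stable ID\tGene name\n' > mart_export.txt
printf 'AT1G01050\tNAME_A\n' >> mart_export.txt
printf 'AT1G01800\t\n' >> mart_export.txt   # a gene with an empty name field

# Hypothetical list of the IDs you want labels for, one per line.
printf 'AT1G01800\nAT1G01050\n' > my_ids.txt

# First file (NR == FNR): skip the header and store ID -> name for non-empty names.
# Second file: print each ID with its short name, or the ID itself if no name is known.
awk -F'\t' -v OFS='\t' '
    NR == FNR { if (FNR > 1 && $2 != "") name[$1] = $2; next }
    { print $1, (($1 in name) ? name[$1] : $1) }
' mart_export.txt my_ids.txt > short_names.tsv
```

The fallback keeps every row labelled, which matters for plotting, since some genes may have no gene name in the export.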

You can extract the IDs from the example file in the question with a simple cut -f1 input_file > gene_ids.txt. This assumes the columns are separated by tabs, which is the default delimiter for cut.
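If the file mixes tabs and spaces, or lists the same gene more than once, a slightly more defensive variant may help. This is a sketch, not a definitive recipe: the three sample lines are shortened, made-up rows in the layout shown in the question, and the pattern assumes AGI IDs of the form AT, a chromosome code (1-5, C or M), G, then digits.

```shell
# Hypothetical sample rows: one tab-separated, one space-separated, one repeated ID.
printf 'AT1G01050\tSoluble inorganic pyrophosphatase 1 OS=Arabidopsis thaliana\n' > input_file
printf 'AT1G01800   Enzyme classification.EC_1 oxidoreductases\n' >> input_file
printf 'AT1G01050\trepeated row for the same gene\n' >> input_file

# awk splits on any run of spaces or tabs by default, so $1 is the ID either way.
# Keep only AGI-shaped first fields, then sort and drop duplicates.
awk '$1 ~ /^AT[1-5CM]G[0-9]+$/ { print $1 }' input_file | sort -u > gene_ids.txt
```

The resulting gene_ids.txt has one unique ID per line and can be pasted straight into the BioMart filter field.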

fin swimmer

Thanks heaps, finswimmer :)

6.4 years ago by WUSCHEL ▴ 860