Question: How to get short and sweet gene names/description for gene IDs?
13 months ago by
Wox320 wrote:

I am working on an Arabidopsis thaliana omics project. How can I get a short, proper gene description for each AGI (gene ID)? The annotation I downloaded from TAIR10 has rather lengthy names that are difficult to work with in downstream steps such as plotting and summarizing:


AT1G01050       Soluble inorganic pyrophosphatase 1 OS=Arabidopsis thaliana (sp|q93v56|ipyr1_arath : 419.0)

AT1G01800   Enzyme classification.EC_1 oxidoreductases.EC_1.1 oxidoreductase acting on CH-OH group of donor(50.1.1 : 434.7) & (+)-neomenthol dehydrogenase OS=Arabidopsis thaliana (sp|q9m2e2|sdr1_arath : 357.0) (original description: none)

Where can I get short names, or how can I shorten these?

rna-seq R proteomics • 399 views
13 months ago by
finswimmer13k wrote:
  • Go to Ensembl's BioMart
  • Choose Dataset: Ensembl Plants Genes, then Arabidopsis thaliana genes
  • Choose Filters -> Gene -> Input external references ID list -> Gene stable ID(s) and paste your IDs into the text field
  • Choose Attributes -> Gene and select Gene stable ID and Gene name
  • Click Results and download in the format you like
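Once the results are downloaded, the names can be joined back onto your own table. A minimal sketch, assuming the export is tab-separated with a header line, the gene ID in column 1 and the gene name in column 2, and that your annotation table is also tab-separated with the ID first. The file names and the NAME_A value are made-up placeholders, not real BioMart output:

```shell
# Hypothetical sample data standing in for the real files.
workdir=$(mktemp -d)
printf 'Gene stable ID\tGene name\nAT1G01050\tNAME_A\nAT1G01800\t\n' \
    > "$workdir/mart_export.txt"
printf 'AT1G01050\tSoluble inorganic pyrophosphatase 1\nAT1G01800\t(+)-neomenthol dehydrogenase\n' \
    > "$workdir/annotations.tsv"

# Pass 1 (export): remember ID -> name, skipping the header and empty names.
# Pass 2 (your table): print the ID and its short name,
# falling back to the ID itself when no name is available.
awk -F'\t' -v OFS='\t' '
    NR == FNR { if (FNR > 1 && $2 != "") name[$1] = $2; next }
    { print $1, (($1 in name) ? name[$1] : $1) }
' "$workdir/mart_export.txt" "$workdir/annotations.tsv" > "$workdir/short_names.tsv"

cat "$workdir/short_names.tsv"
```

Falling back to the ID keeps every row labelled, since not every gene necessarily has a name assigned.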

You can extract the IDs from the file example above with a simple cut -f1 input_file > gene_ids.txt.
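Note that cut -f1 splits on tabs by default. If the columns turn out to be separated by spaces, a whitespace-tolerant variant may be safer. A minimal sketch, assuming the AGI is the first whitespace-delimited token on every non-empty line; the sample file is made up from the lines in the question:

```shell
workdir=$(mktemp -d)

# Sample input: space-separated, with a blank line and a duplicated ID.
cat > "$workdir/input_file" <<'EOF'
AT1G01050       Soluble inorganic pyrophosphatase 1 OS=Arabidopsis thaliana

AT1G01800   (+)-neomenthol dehydrogenase OS=Arabidopsis thaliana
AT1G01050       Soluble inorganic pyrophosphatase 1 OS=Arabidopsis thaliana
EOF

# awk splits on any run of spaces or tabs by default; the NF condition
# skips blank lines, and sort -u drops duplicate IDs.
awk 'NF { print $1 }' "$workdir/input_file" | sort -u > "$workdir/gene_ids.txt"

cat "$workdir/gene_ids.txt"
```

The resulting gene_ids.txt has one unique ID per line and can be pasted into the BioMart filter field.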

fin swimmer


Thanks heaps, finswimmer :)

written 13 months ago by Wox