EMBL-EBI gene IDs do not match data downloaded from other databases
5.9 years ago
Bioin ▴ 10

Dear Biostars,

I would like to use the Sorghum Expression Atlas data (E-MTAB-3839), downloaded from https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-3839/. I have previously downloaded Sorghum data from other databases such as Phytozome. The problem is that the gene IDs in the EMBL data differ from those in the data downloaded from other sources.

Gene ID example from other databases: Sobic.001G000100
Gene ID example from ArrayExpress data: SORBI_3003G276100

Is there any way to convert or map EMBL/ArrayExpress gene IDs to Phytozome Sorghum gene IDs? Kindly help me resolve this issue. Thank you.

gene genome assembly next-gen SNP • 1.5k views

The correct observation should be: the IDs from the others are different from those from EBI/EMBL ;).

The best thing to do is to check whether Phytozome offers alternative IDs. Otherwise, have a look at the locus_tag info in the EMBL data; that one should (in theory) more closely reflect the IDs used by other databases.
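
If the EMBL data is available as a flat file, the locus_tag values could be pulled out with something like the rough Python sketch below. It assumes the standard EMBL feature-table layout, where qualifiers sit on `FT` lines as `/locus_tag="..."`. The function name `locus_tags` and the sample IDs are made up for illustration and are not real Sorghum identifiers.

```python
import re

# Matches qualifier lines such as:
# FT                   /locus_tag="EXAMPLE_0001"
LOCUS_TAG_RE = re.compile(r'^FT\s+/locus_tag="([^"]+)"', re.MULTILINE)


def locus_tags(embl_text):
    """Return the unique locus_tag values in order of first appearance."""
    # dict.fromkeys keeps insertion order and drops repeats
    # (a gene and its CDS usually carry the same locus_tag).
    return list(dict.fromkeys(LOCUS_TAG_RE.findall(embl_text)))


# Made-up flat-file fragment, for illustration only.
sample = (
    "FT   gene            1..300\n"
    'FT                   /locus_tag="EXAMPLE_0001"\n'
    "FT   CDS             1..300\n"
    'FT                   /locus_tag="EXAMPLE_0001"\n'
    "FT   gene            400..900\n"
    'FT                   /locus_tag="EXAMPLE_0002"\n'
)
tags = locus_tags(sample)
```

The resulting list can then be compared against the IDs from the other database to see whether any of them match.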


Thanks for your suggestions. Unfortunately, neither of them worked for the Sorghum data.


OK, if you can't find a textual link between the two IDs, you can probably only fall back on creating it yourself.

One approach is: get the CDS FASTA files of the annotations from both EMBL and Phytozome, blastn them against each other, and then create a correspondence table for the IDs. This will work in most cases, but it is unlikely to be a 100% watertight approach.
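
A minimal Python sketch of the correspondence-table step, assuming blastn was run in both directions with tabular output (`-outfmt 6`, default columns, bit score in column 12). It keeps only reciprocal best hits, which is one common way to pair IDs but not the only one. The function names and the IDs in the example (`e1`, `p1`, ...) are hypothetical placeholders, not real gene IDs.

```python
import csv
import io


def best_hits(tabular_text):
    """Map each query ID to (subject ID, bit score) of its top-scoring hit."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pair IDs whose best hit in each direction is the other one."""
    fwd = best_hits(a_vs_b)
    rev = best_hits(b_vs_a)
    return {a: b for a, (b, _) in fwd.items()
            if b in rev and rev[b][0] == a}


def _row(query, subject, bitscore):
    # Helper that builds one outfmt-6 line; only the IDs and score matter here.
    return "\t".join([query, subject, "99.0", "300", "1", "0",
                      "1", "300", "1", "300", "1e-150", str(bitscore)])


# Hypothetical BLAST results in both directions.
ensembl_vs_phytozome = "\n".join([
    _row("e1", "p1", 540), _row("e1", "p2", 200),
    _row("e2", "p2", 500), _row("e3", "p1", 300),
])
phytozome_vs_ensembl = "\n".join([
    _row("p1", "e1", 540), _row("p2", "e2", 500), _row("p3", "e1", 100),
])
table = reciprocal_best_hits(ensembl_vs_phytozome, phytozome_vs_ensembl)
```

IDs without a reciprocal partner (here `e3` and `p3`) are left out of the table, which is where the approach stops being watertight: paralogs and split or merged gene models need a manual look.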


I will try this approach. Thank you.

5.9 years ago
Denise CS ★ 5.2k

The gene ID in Expression Atlas is the same as in Ensembl Plants, i.e. SORBI_3003G27610. It seems the gene annotation in Ensembl Plants is provided by Phytozome, so if the IDs do not match those in Phytozome, it is worth bringing this up with both Ensembl Plants and Phytozome. I had a look at Ensembl Plants BioMart and could not see an option to convert Ensembl Plants (or ArrayExpress) IDs to Phytozome IDs. You can convert them to NCBI IDs, if this is of any help. For SORBI_3003G27610, we have 8069790 as the Entrez Gene ID.
