Getting The Chromosome Number From Gene IDs For A Lot Of Genes
10.5 years ago
thjnant ▴ 160

Hello,

I have a VCF file that contains gene IDs in place of chromosome numbers, like this:

ENSGALG00000000011|ENSGALT00000000012|57|1123|1125
ENSGALG00000000011|ENSGALT00000000012|57|1123|1125 ENSGALG00000000011|ENSGALT00000000012|57|1123|1125

I want to know which chromosome each of these gene IDs belongs to, and to extract all the genes that are on the Z or W chromosome. One way is to search for each gene ID in Ensembl, but there are more than 7000 genes. Is there an automated way of doing this?

Thank you very much in advance, Cheers, Sandra

The answer to most "map ID X to ID Y" questions is BioMart; see the answer from lelle and numerous other answers on this site.

10.5 years ago
lelle ▴ 830

You can use the Ensembl BioMart instance to do this.
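Once BioMart has returned a table, picking out the Z and W genes is a simple filter. The following is a minimal sketch, assuming the result was exported as tab-separated text with the columns "Gene stable ID" and "Chromosome/scaffold name" (these column names are an assumption and may differ in your export; the function name and the example rows are hypothetical, made up for illustration).

```python
import csv
import io

def genes_on_chromosomes(tsv_text, wanted=("Z", "W"),
                         id_col="Gene stable ID",
                         chrom_col="Chromosome/scaffold name"):
    """Return {gene_id: chromosome} for genes located on the wanted chromosomes.

    The column names are assumptions about the export; pass different
    ones if your table uses other headers.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[id_col]: row[chrom_col]
            for row in reader
            if row[chrom_col] in wanted}

# Hypothetical rows, not real annotations.
example = (
    "Gene stable ID\tChromosome/scaffold name\n"
    "ENSGALG00000099990\t1\n"
    "ENSGALG00000099991\tZ\n"
    "ENSGALG00000099992\tW\n"
)

sex_linked = genes_on_chromosomes(example)
for gene_id, chrom in sex_linked.items():
    print(gene_id, chrom)
```

For a real export, read the file contents into a string (or adapt the function to take an open file) and pass it in place of `example`.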

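You can extract all the gene IDs with awk:

awk -F '[| \t]+' '{for(i=1;i<=NF;++i) if ($i ~ /^ENSGALG/) print $i}' input | sort -u

This splits each line on "|" and whitespace, keeps only the fields that start with ENSGALG (the gene IDs, not the ENSGALT transcript IDs or the numbers), and removes duplicates. It assumes your input contains only what you've shown here. You can then paste those IDs into BioMart and quickly get the conversion, as lelle stated.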
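The same extraction can be sketched in Python if that is easier to fit into a larger script. This assumes gene IDs always look like `ENSGALG` followed by 11 digits, as in the lines shown in the question; the function name is hypothetical.

```python
import re

# Assumed ID pattern: "ENSGALG" followed by 11 digits, as in the question.
GENE_ID = re.compile(r"ENSGALG\d{11}")

def unique_gene_ids(lines):
    """Return the gene IDs found in the lines, deduplicated, in first-seen order."""
    seen = {}
    for line in lines:
        for gene_id in GENE_ID.findall(line):
            seen.setdefault(gene_id, None)
    return list(seen)

# The two example lines from the question.
sample = [
    "ENSGALG00000000011|ENSGALT00000000012|57|1123|1125",
    "ENSGALG00000000011|ENSGALT00000000012|57|1123|1125 "
    "ENSGALG00000000011|ENSGALT00000000012|57|1123|1125",
]

ids = unique_gene_ids(sample)
print("\n".join(ids))
```

To run it on a file, pass an open file handle instead of `sample`; the printed list can then be used as the ID filter in BioMart.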
