How to filter out the non-coding genes?
9.9 years ago
liux.bio ▴ 360

Hi Biostars,

I have a list of genes with Ensembl gene IDs, and I want to filter out the non-coding genes and keep only the protein-coding ones. I am using the Bioconductor package biomaRt, but I can't find a direct way to do this. Any suggestions?

Many thanks!

bioconductor genome • 4.3k views

Can you post the R script you used to figure this out? I am trying to do something similar now and am having trouble.

9.9 years ago
Prakki Rama ★ 2.7k

I used Transcript Biotype in the attribute to check if it is actually protein coding or not. It seems pretty straightforward using Biomart.
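A minimal sketch of that check, assuming the query returned one row per transcript with the biomaRt attribute names `ensembl_gene_id` and `transcript_biotype`. The table is made-up illustration data standing in for a real query result, and `coding_genes_from_transcripts` is a hypothetical helper name, not part of biomaRt.

```r
# Hypothetical stand-in for a query result with the attributes
# "ensembl_gene_id" and "transcript_biotype"; the IDs are made up.
tx <- data.frame(
  ensembl_gene_id    = c("ENSG_A", "ENSG_A", "ENSG_B", "ENSG_C"),
  transcript_biotype = c("protein_coding", "retained_intron",
                         "lncRNA", "protein_coding"),
  stringsAsFactors = FALSE
)

# Keep a gene if at least one of its transcripts is protein coding.
coding_genes_from_transcripts <- function(tx) {
  unique(tx$ensembl_gene_id[tx$transcript_biotype == "protein_coding"])
}

coding_tx_genes <- coding_genes_from_transcripts(tx)
```

Note that a gene can have both coding and non-coding transcripts, so this transcript-level rule may keep genes that a gene-level biotype check would classify differently.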


Got it. Thank you!

4.6 years ago
Scott McKay ▴ 30

Can you post the R script you used to figure this out? I am having a terribly hard time trying to do something similar here.


See the biomaRt vignette. If your IDs are Ensembl gene IDs, it is pretty easy. Let x be a character vector of your Ensembl gene IDs (the 'ensembl_gene_id' filter used below expects IDs without the version suffix):

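library(biomaRt)
# Assumption: human genes; pick the dataset that matches your species.
ensembl = useMart('ensembl', dataset = 'hsapiens_gene_ensembl')
goids = getBM(attributes = c('ensembl_gene_id', 'gene_biotype'), 
              filters = 'ensembl_gene_id', 
              values = x, 
              mart = ensembl)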

The gene biotypes are described here:

https://useast.ensembl.org/info/genome/genebuild/biotypes.html
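The query above only annotates each gene with its biotype, so one more step is needed to keep the protein-coding genes. A minimal sketch, assuming `goids` has the two columns requested above. The data frame and IDs here are made up so the example runs without a network connection, and `strip_version` is a hypothetical helper, not a biomaRt function.

```r
# Made-up IDs standing in for a real input list; some carry a version suffix.
x <- c("ENSG_A.5", "ENSG_B", "ENSG_C.12")

# Strip a trailing ".<digits>" version so the IDs match the
# 'ensembl_gene_id' filter, which uses unversioned IDs.
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)
x_clean <- strip_version(x)

# Hypothetical stand-in for the getBM() result.
goids <- data.frame(
  ensembl_gene_id = c("ENSG_A", "ENSG_B", "ENSG_C"),
  gene_biotype    = c("protein_coding", "lncRNA", "protein_coding"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose gene biotype is protein_coding.
protein_coding <- goids[goids$gene_biotype == "protein_coding", ]
coding_ids <- protein_coding$ensembl_gene_id
```

With a real result, only the last two statements are needed; `x_clean` would be passed as `values` in the query.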

Also, I think you should have added your post as a comment rather than an answer.
