Annotation best practices
3.9 years ago

What are the recommended strategies and packages for reducing RNA-seq data from Ensembl gene IDs to protein-coding genes with a valid NCBI Gene ID? I normally use the annotables package and then filter for “protein_coding” and !is.na(entrez). After that I end up with a few duplicated Entrez IDs. What’s the best way to deal with those?
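
For concreteness, the filter described above could look roughly like the sketch below. It uses base R on a small invented table rather than the real annotation data. The column names (ensgene, entrez, biotype) are assumed to mirror those in the annotables tables, and the IDs and rows are made up purely for illustration.

```r
# Toy stand-in for an annotation table. Column names are assumed to mirror
# the annotables tables (ensgene, entrez, biotype); all rows are invented.
anno <- data.frame(
  ensgene = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D", "ENSG_E"),
  entrez  = c(101L, 102L, 102L, NA, 103L),
  biotype = c("protein_coding", "protein_coding", "protein_coding",
              "protein_coding", "lincRNA"),
  stringsAsFactors = FALSE
)

# Keep protein-coding genes that have an Entrez (NCBI Gene) ID.
pc <- anno[anno$biotype == "protein_coding" & !is.na(anno$entrez), ]

# Entrez IDs that occur on more than one row after filtering.
dup_ids <- unique(pc$entrez[duplicated(pc$entrez)])

# All rows involved in a duplicated mapping, for inspection.
dup_rows <- pc[pc$entrez %in% dup_ids, ]
print(dup_rows)
```

Listing the affected rows this way at least shows how many genes are involved before deciding what to do with them.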

RNA-Seq gene • 822 views

That is a different question and a different answer than I would give. The OP wants to know how to deal with Ensembl IDs that map to two or more NCBI Gene IDs, neither of which is as, um, unreliable as a HUGO Gene Symbol.

I tend to favor sticking with either Ensembl IDs or NCBI Gene IDs, and ignoring the differences between the two - there's no profit in trying to figure out why a given Ensembl ID maps to two NCBI Gene IDs or vice versa. Just use one or the other and be done with it.
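
One way to read "use one or the other" is sketched below, under stated assumptions: results stay keyed by Ensembl ID throughout, and the other identifier is attached only as an annotation column. The data frames res and map, their column names, and all values are hypothetical. Taking the first listed Entrez ID is an arbitrary choice made for the sketch, not a recommendation from this thread.

```r
# Hypothetical per-gene results keyed by Ensembl ID (values invented).
res <- data.frame(
  ensgene = c("ENSG_A", "ENSG_B", "ENSG_C"),
  log2fc  = c(1.2, -0.4, 0.9),
  stringsAsFactors = FALSE
)

# Hypothetical mapping table in which ENSG_B maps to two Entrez IDs.
map <- data.frame(
  ensgene = c("ENSG_A", "ENSG_B", "ENSG_B", "ENSG_C"),
  entrez  = c(101L, 102L, 202L, 103L),
  stringsAsFactors = FALSE
)

# Attach an Entrez ID as an annotation column only.
# match() returns the position of the first hit, so res keeps exactly one
# row per Ensembl ID regardless of how many Entrez IDs the gene maps to.
res$entrez <- map$entrez[match(res$ensgene, map$ensgene)]
print(res)
```

Because the Ensembl ID remains the key, the multi-mapping never changes the number of rows in the results.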


Thanks. I normally prefer Entrez IDs and convert to gene names only at the very end, either to make sense of the results or when a tool only accepts HUGO symbols.


Originally posted on Bioconductor https://support.bioconductor.org/p/131622/
