Gene names starting with the "LOC" prefix?
3.8 years ago
sunnykevin97 ▴ 980

Hi,

I'm working with RNA-Seq data from non-model organisms. After differential gene expression analysis, I found that there are a lot of genes starting with the prefix "LOC". Searching the web, I read that these are genes which don't have any orthologs. In downstream analysis, I was unable to convert the LOC IDs into Entrez or Ensembl IDs using clusterProfiler (the bitr function). How do I proceed with downstream analysis, something like GO/KEGG analysis? Should I ignore these genes completely? I have 7 samples in total; after differential gene expression analysis, there were 4501 of them for each sample.

If I search for these IDs in NCBI, I do get the gene information.

Suggestions please!

RNA-Seq gene sequence • 4.5k views

Can you provide an example or two? Sometimes LOCs have informative aliases that you can use. If you have the Entrez Gene IDs you can fetch a list of all aliases for each of them.


The number after the LOC prefix is the Entrez Gene ID. You can access these entries at the URL https://www.ncbi.nlm.nih.gov/gene/{number}

For your examples,

LOC117740983    117740983    https://www.ncbi.nlm.nih.gov/gene/117740983
LOC117726460    117726460    https://www.ncbi.nlm.nih.gov/gene/117726460
LOC117746502    117746502    https://www.ncbi.nlm.nih.gov/gene/117746502
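
For a longer list, the same table can be generated on the command line. This is only a minimal sketch: it assumes one LOC symbol per line on standard input, and `loc_to_url` is a hypothetical helper name, not part of any NCBI tool.

```shell
# Hypothetical helper: strip the "LOC" prefix from each symbol and print
# three tab-separated columns: symbol, Entrez Gene ID, NCBI Gene URL.
# Lines that are not of the form LOC<digits> are skipped.
loc_to_url() {
  awk '/^LOC[0-9]+$/ {
    id = substr($0, 4)   # everything after the 3-character "LOC" prefix
    printf "%s\t%s\thttps://www.ncbi.nlm.nih.gov/gene/%s\n", $0, id, id
  }'
}

# Example with the three symbols from this thread
printf 'LOC117740983\nLOC117726460\nLOC117746502\n' | loc_to_url
```

Piping the result through `cut -f2` leaves just the numeric IDs, one per line.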

I thought the same, thanks for the suggestions.


Hi, this is very useful, thanks. How would I go about running GO enrichment analysis with this list?


Since LOC genes are uncharacterized, there is likely no way to run GO enrichment analysis on them.

3.8 years ago
vkkodali_ncbi ★ 3.7k

These don't appear to have useful gene symbols. But you can get names for these using Entrez Direct as follows:

esearch -db gene -query 'LOC117740983' | esummary | xtract -pattern DocumentSummary -element Id,Name,Description

Most LOC entries are uncharacterized loci, at least in the human and mouse genomes.


How do I run esearch in batch mode? I have more than 1000 gene IDs in a file and would like to use the above command on all of them. Suggestions please.


Use the epost method. Put your queries in a file, one ID per line.

$ more tt
117740983
117726460
117746502

$ epost -db gene -input tt | esummary | xtract -pattern DocumentSummary -element Id,Name,Description
117746502   LOC117746502    neuropilin-1a-like
117740983   LOC117740983    transmembrane protein 230-like
117726460   LOC117726460    NADH-cytochrome b5 reductase 3
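
Once the output above is saved to a file, the descriptions can be joined back onto a gene list from the differential expression results. The following is a rough sketch, not a tested pipeline: it assumes the xtract output is tab-separated (Id, Name, Description) and that the gene list has the symbol in its first column. `annotate_genes` and `GENE_NOT_IN_TABLE` are made-up names for illustration.

```shell
# Hypothetical helper: annotate_genes ANNOT GENES
#   ANNOT: tab-separated Id, Name, Description (saved xtract output)
#   GENES: gene symbols, one per line (first tab-separated column is used)
# Prints each symbol with its description, or "NA" if it is not in ANNOT.
annotate_genes() {
  awk -F '\t' '
    NR == FNR { desc[$2] = $3; next }   # first file: symbol -> description
    { print $1 "\t" (($1 in desc) ? desc[$1] : "NA") }
  ' "$1" "$2"
}

# Example using two rows from the output above plus one unmatched symbol
annot=$(mktemp)
genes=$(mktemp)
printf '117746502\tLOC117746502\tneuropilin-1a-like\n' > "$annot"
printf '117740983\tLOC117740983\ttransmembrane protein 230-like\n' >> "$annot"
printf 'LOC117740983\nGENE_NOT_IN_TABLE\nLOC117746502\n' > "$genes"
annotate_genes "$annot" "$genes"
rm -f "$annot" "$genes"
```

Descriptions ending in "-like" name a similar characterized gene, which may give a starting point for the functional annotation asked about earlier in the thread.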