Question: Gene names starting with the "LOC" prefix?
sunnykevin97 wrote, 10 weeks ago:

Hi,

I'm working with RNA-seq data from non-model organisms. After differential gene expression analysis I found that a lot of genes start with the prefix "LOC". Searching the web, I read that these are genes which don't have any orthologs. In downstream analysis I was unable to convert the LOC symbols into Entrez IDs/Ensembl IDs using clusterProfiler (bitr function). How do I proceed with downstream analysis such as GO/KEGG analysis? Should I ignore them completely? I had a total of 7 samples; after differential gene expression analysis there were 4501 of them for each sample.

If I search for these IDs at NCBI, I do get the gene information.

Suggestions please!

rna-seq sequence gene • 224 views
ADD COMMENTlink written 10 weeks ago by sunnykevin97100

Can you provide an example or two? Sometimes LOCs have informative aliases that you can use. If you have the Entrez Gene IDs you can fetch a list of all aliases for each of them.

ADD REPLYlink written 10 weeks ago by vkkodali2.1k

https://www.ncbi.nlm.nih.gov/search/all/?term=LOC117740983
https://www.ncbi.nlm.nih.gov/gene/117726460
https://www.ncbi.nlm.nih.gov/search/all/?term=LOC117746502

ADD REPLYlink modified 10 weeks ago • written 10 weeks ago by sunnykevin97100

The number after the LOC prefix is the Entrez Gene ID. You can access these entries at the URL https://www.ncbi.nlm.nih.gov/gene/{number}

For your examples,

LOC117740983    117740983    https://www.ncbi.nlm.nih.gov/gene/117740983
LOC117726460    117726460    https://www.ncbi.nlm.nih.gov/gene/117726460
LOC117746502    117746502    https://www.ncbi.nlm.nih.gov/gene/117746502
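
For a longer list, the same mapping can be generated rather than typed. This is only a minimal sketch: the file names loc_symbols.txt and loc_table.tsv are made up for illustration, and it assumes every symbol of interest has the exact form LOC followed by digits.

```shell
# Hypothetical input file: one LOC symbol per line.
printf '%s\n' LOC117740983 LOC117726460 LOC117746502 > loc_symbols.txt

# For lines of the form LOC<digits>, drop the 3-character prefix and
# print symbol, Entrez Gene ID and gene page URL, tab-separated.
awk '/^LOC[0-9]+$/ {
  id = substr($0, 4)
  printf "%s\t%s\thttps://www.ncbi.nlm.nih.gov/gene/%s\n", $0, id, id
}' loc_symbols.txt > loc_table.tsv

cat loc_table.tsv
```

Symbols that do not match the LOC<digits> pattern are skipped silently, so it may be worth comparing line counts of the input and output files.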
ADD REPLYlink modified 10 weeks ago • written 10 weeks ago by RamRS30k

I thought the same, thanks for the suggestions.

ADD REPLYlink written 10 weeks ago by sunnykevin97100

These don't appear to have useful gene symbols. But you can get names for these using Entrez Direct as follows:

esearch -db gene -query 'LOC117740983' | esummary | xtract -pattern DocumentSummary -element Id,Name,Description
ADD REPLYlink written 10 weeks ago by vkkodali2.1k

Most LOC entries are uncharacterized loci, at least in the human and mouse genomes.

ADD REPLYlink written 10 weeks ago by RamRS30k

How do I run esearch in batch mode? I have more than 1000 gene IDs in a file and want to run the above command on all of them. Suggestions please.

ADD REPLYlink written 7 weeks ago by sunnykevin97100

Use epost. Put your gene IDs in a file, one per line.

$ more tt
117740983
117726460
117746502

$ epost -db gene -input tt | esummary | xtract -pattern DocumentSummary -element Id,Name,Description
117746502   LOC117746502    neuropilin-1a-like
117740983   LOC117740983    transmembrane protein 230-like
117726460   LOC117726460    NADH-cytochrome b5 reductase 3
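
If the starting point is a list of gene symbols from the differential expression results rather than bare IDs, the input file for epost could be prepared along these lines. This is a rough sketch: the file names de_genes.txt, gene_ids.txt and gene_names.tsv are hypothetical, and the lookup step only runs when Entrez Direct is installed and network access is available.

```shell
# Hypothetical list of gene symbols from the DE results, one per line.
printf '%s\n' LOC117740983 tmem230 LOC117726460 LOC117740983 > de_genes.txt

# Keep only LOC<digits> entries, strip the prefix, and de-duplicate,
# leaving one Entrez Gene ID per line.
grep -E '^LOC[0-9]+$' de_genes.txt | sed 's/^LOC//' | sort -u > gene_ids.txt

# Same epost pipeline as above, run only if Entrez Direct is on the PATH.
if command -v epost >/dev/null 2>&1; then
  epost -db gene -input gene_ids.txt \
    | esummary \
    | xtract -pattern DocumentSummary -element Id,Name,Description \
    > gene_names.tsv
fi
```

Genes that already have a regular symbol (tmem230 in the made-up input) are left out of gene_ids.txt and would need to be handled separately.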
ADD REPLYlink written 7 weeks ago by genomax89k