How to retrieve gene IDs and gene descriptions after cuffdiff analysis
6.8 years ago

I used the genome and annotation files from NCBI.

I ran a cuffdiff analysis on two samples; representative results (first four columns) look as follows:

test_id      gene_id      gene                   locus
XLOC_000287  XLOC_000287  -                      NC_029256.1:528755-528975
XLOC_000329  XLOC_000329  LOC9268150,LOC9272685  NC_029256.1:1111166-1342272
XLOC_000358  XLOC_000358  LOC4325981             NC_029256.1:1742769-1791492
XLOC_000428  XLOC_000428  LOC4324516             NC_029256.1:2707641-2711452
XLOC_000506  XLOC_000506  LOC4326215             NC_029256.1:3858575-3884051
XLOC_000507  XLOC_000507  LOC107275818           NC_029256.1:3893929-3900126
XLOC_000524  XLOC_000524  LOC4323878             NC_029256.1:4127236-4140633

What does the gene ID (e.g. XLOC_000329 or XLOC_000358) mean in this table?

Also, the gene column contains entries such as LOC9268150 or LOC4325981, but these do not look like gene names.

How can I retrieve the gene ID and gene description for each entry?

How can I do gene ontology (GO) and KEGG analysis after that?

6.8 years ago
Chirag Parsania ★ 2.0k

The XLOC ID is assigned by Cufflinks: "A unique identifier describing the gene being tested". Basically, it represents a unique gene model created by Cufflinks. The LOC ID comes from the GFF file that you supplied for the cuffdiff run. Your IDs seem to be from the plant Oryza sativa. You can get the details of each gene from the GFF file you used. For GO analysis you can download this file. Use software such as BiNGO or Enrichment Map to find significantly enriched GO terms. Here I assume that your LOC IDs are present in the downloaded file. If they are not in the GO mapping file, you will have to convert your IDs to the format used in the go_association file.
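As a rough illustration of how the two ID types relate, here is a minimal Python sketch that reads the cuffdiff table and maps each XLOC ID to the reference gene IDs listed in its gene column. It assumes the file is tab-separated with the header names shown in the question; the function name `loc_ids_from_cuffdiff` is made up for this example.

```python
import csv
import io


def loc_ids_from_cuffdiff(handle):
    """Map each XLOC gene_id to the list of reference IDs in the 'gene' column."""
    reader = csv.DictReader(handle, delimiter="\t")
    mapping = {}
    for row in reader:
        gene_field = row["gene"].strip()
        # "-" means the locus did not match any gene in the annotation.
        ids = [] if gene_field == "-" else gene_field.split(",")
        mapping[row["gene_id"]] = ids
    return mapping


# Two rows copied from the table in the question, rebuilt with tabs.
example = (
    "test_id\tgene_id\tgene\tlocus\n"
    "XLOC_000287\tXLOC_000287\t-\tNC_029256.1:528755-528975\n"
    "XLOC_000329\tXLOC_000329\tLOC9268150,LOC9272685\tNC_029256.1:1111166-1342272\n"
)
xloc_to_loc = loc_ids_from_cuffdiff(io.StringIO(example))
for xloc, locs in xloc_to_loc.items():
    print(xloc, locs)
```

For a real run you would pass an open file handle for your own cuffdiff output instead of the `io.StringIO` example. Note that one XLOC locus can cover several LOC IDs, as XLOC_000329 does.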

~C.

Thank you, Chirag.

I searched for each locus individually in the GFF file and could find the gene details as you suggested, but is there a way to retrieve the details for all genes at once?

How did you get the gene association file from Gramene?

I tried searching for my LOC IDs in gene_association.gramene_oryza, but could not find them.

You suggested that if my LOC IDs are not in the GO mapping file, I have to convert them to the format of the go_association file. How do I do that?

Hi Padmaja,

To get the details of each gene, you have to process the last column of the GFF/GTF file. If you are not comfortable with programming, you can even do this in Excel with a few find-and-replace operations.
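If you do want to script it, the following is a minimal Python sketch of processing the last column for all genes at once. It assumes a GFF3 file whose ninth column holds `key=value` pairs separated by semicolons, and that the gene name sits under a `gene` key with the description under `description` or `product`; please check which keys your own file actually uses. The example rows and the function names are made up for illustration.

```python
import io
from urllib.parse import unquote


def parse_attributes(field):
    """Turn a GFF3 attribute string 'key=value;key=value' into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters inside values.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gene_descriptions(handle):
    """Collect the first description found for each gene name."""
    details = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a standard feature row
        attrs = parse_attributes(cols[8])
        gene = attrs.get("gene")  # assumed attribute key
        text = attrs.get("description") or attrs.get("product")  # assumed keys
        if gene and text and gene not in details:
            details[gene] = text
    return details


# Made-up rows for illustration only; not taken from a real annotation.
example_gff = "\n".join([
    "##gff-version 3",
    "\t".join(["chrX", "example", "gene", "100", "900", ".", "+", ".",
               "ID=gene-LOC_EXAMPLE1;Name=LOC_EXAMPLE1;gene=LOC_EXAMPLE1"]),
    "\t".join(["chrX", "example", "mRNA", "100", "900", ".", "+", ".",
               "ID=rna-1;Parent=gene-LOC_EXAMPLE1;gene=LOC_EXAMPLE1;"
               "product=made-up example protein"]),
]) + "\n"

details = gene_descriptions(io.StringIO(example_gff))
print(details)
```

With your own annotation you would pass an open file handle instead of the `io.StringIO` example, then look up each LOC ID from the cuffdiff gene column in the resulting dictionary.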

If you look at the gene association file carefully, you will see that it contains gene-to-GO mappings for various species of rice. First you have to filter this file for your species. You can do this by filtering on the taxon column (column 13). I believe the taxon ID for your species is 39947 (please confirm the taxon ID before you filter the file). Once you have the GO association file for your species, you can use it as the background dataset for GO enrichment.
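The taxon filter can be sketched in Python as below. This assumes the association file follows the usual tab-separated GAF layout, where comment lines start with `!` and column 13 holds values like `taxon:39947` (possibly several joined with `|`). The example rows, their IDs, and the function name `filter_gaf_by_taxon` are made up for illustration.

```python
import io


def filter_gaf_by_taxon(handle, taxon_id):
    """Yield rows (as lists of columns) whose taxon column mentions taxon_id."""
    wanted = f"taxon:{taxon_id}"
    for line in handle:
        if line.startswith("!"):
            continue  # GAF comment/header line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 13:
            continue  # malformed or truncated row
        # Column 13 (index 12) may hold several taxa separated by "|".
        if wanted in cols[12].split("|"):
            yield cols


# Made-up association rows (15 columns each), for illustration only.
rows = [
    ["DB", "EXAMPLE_ID_1", "geneA", "", "GO:0005515", "REF:0", "IEA", "",
     "F", "", "LOC_EXAMPLE1", "protein", "taxon:39947", "20100101", "DB"],
    ["DB", "EXAMPLE_ID_2", "geneB", "", "GO:0003677", "REF:0", "IEA", "",
     "F", "", "LOC_EXAMPLE2", "protein", "taxon:39946", "20100101", "DB"],
]
example_gaf = "!gaf-version: 2.0\n" + "\n".join("\t".join(r) for r in rows) + "\n"

kept = list(filter_gaf_by_taxon(io.StringIO(example_gaf), 39947))
for cols in kept:
    # object ID (column 2), GO term (column 5), synonyms (column 11)
    print(cols[1], cols[4], cols[10])
```

The filtered rows can then be written back out as a smaller association file and used as the background for the enrichment tool of your choice.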

I checked the file and it does contain LOC IDs. If an ID of interest is still missing, you can convert your LOC ID to a UniProt ID and use that for GO enrichment. You can use an ID conversion tool to get the UniProt ID corresponding to each LOC ID.

Read the statistics for the GO association file here.

~C.

Thank you so much.
