readGAF in GO analysis via GOseq
3.5 years ago
citronxu ▴ 20

Hi there,

I work mainly on A. thaliana and would like to run a GO analysis with the R package GOseq. Because the local database does not support A. thaliana, I need to supply the GO annotation myself. I downloaded the A. thaliana GO annotation from the Gene Ontology website (the maintainer is listed as TAIR), but a problem occurs when I use the readGAF function in R. It simply returns the error below:

Error in readGFF("~/Downloads/tair.gaf") : 
  reading GFF file: line 1 has less than 8 tab-separated columns

I then checked the contents of the file and found that the first lines are header information rather than annotations, like:

!gaf-version: 2.1
!
!Generated by GO Central
!
!Date Generated by GOC: 2020-09-10
!
!Header from source association file:
!=================================
!
!Generated by GO Central
!
!Date Generated by GOC: 2020-09-10
!
!Header from tair source association file:
!=================================
!Project_name: The Arabidopsis Information Resource (TAIR)
!URL: http://www.arabidopsis.org
!Contact Email: curator@arabidopsis.org
!Last Updated: 2020-07-01
!=================================
!
!Header copied from paint_tair_valid.gaf
!=================================
!Created on Tue Sep  1 09:52:00 2020.
!PANTHER version: v.15.0.
!GO version: 2020-08-10.
!
!=================================
!
!Documentation about this header can be found here: https://github.com/geneontology/go-site/blob/master/docs/gaf_validation.md
!
TAIR    locus:2008970   AT1G11880       GO:0000009  TAIR:AnalysisReference:501756966    IEA InterPro:IPR007315  F   AT1G11880   AT1G11880|F12F1.28|F12F1_28 protein taxon:3702  20200618    InterPro        TAIR:locus:2008970

So I deleted those lines and retried readGAF, but got another error:

Error in readGFF("~/Downloads/tair.gaf") : 
  reading GFF file: line 1 has more than 9 tab-separated columns

Basically, the format of the file does not seem to fit. Did I download the wrong file, or should I use a different function in R?

Any ideas are welcome. Many thanks in advance.

rna-seq R • 1.2k views
3.5 years ago
ATpoint 81k

You are using the wrong function. You need readGAF, but you called readGFF ;-)
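
A minimal base-R sketch of why readGFF rejects this file. It assumes the wider gaps in the annotation line quoted in the question are empty fields (Qualifier and Annotation Extension). A GFF line has 9 tab-separated columns, while a GAF 2.1 annotation line has 17, and GAF header lines start with `!` and contain no tabs at all.

```r
# One "!" header line plus the annotation line quoted in the question,
# rebuilt with explicit tabs. The two empty strings are the assumed blank
# Qualifier and Annotation Extension fields.
gaf_lines <- c(
  "!gaf-version: 2.1",
  paste(c("TAIR", "locus:2008970", "AT1G11880", "", "GO:0000009",
          "TAIR:AnalysisReference:501756966", "IEA", "InterPro:IPR007315",
          "F", "AT1G11880", "AT1G11880|F12F1.28|F12F1_28", "protein",
          "taxon:3702", "20200618", "InterPro", "", "TAIR:locus:2008970"),
        collapse = "\t")
)

# Header lines are the ones that begin with "!".
is_header <- startsWith(gaf_lines, "!")

# Count tab-separated fields on the remaining annotation lines.
n_fields <- lengths(strsplit(gaf_lines[!is_header], "\t", fixed = TRUE))
print(n_fields)
```

The header line explains the "less than 8 tab-separated columns" message, and the annotation line explains the "more than 9" message once the header has been deleted.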

Right... many thanks, how stupid of me :)

Don’t worry, happens to all of us :)

Hello, another error has come up...

Error in readGAF("~/Downloads/tair.gaf") : 
  At least one DB object ID has multiple DB object symbols or names in gene ontology annotation file.

Is that a problem with the file itself, or is there an argument that could solve it? Many thanks :)
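
One possible way to investigate, sketched in base R rather than taken from this thread. The message suggests that readGAF rejects files in which one DB object ID (column 2) is paired with more than one symbol or name; that reading is an assumption. The sketch parses GAF lines directly, lists such IDs, and builds a two-column gene-to-GO table. The toy records and the `toy_row` helper are hypothetical, and which column matches your gene IDs (`gene_col`) is also an assumption to check against your own data.

```r
# Column names following the GAF 2.1 field order.
gaf_cols <- c("DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
              "DB_Reference", "Evidence_Code", "With_From", "Aspect",
              "DB_Object_Name", "DB_Object_Synonym", "DB_Object_Type",
              "Taxon", "Date", "Assigned_By", "Annotation_Extension",
              "Gene_Product_Form_ID")

# Hypothetical helper that builds one toy GAF line (not real TAIR data).
toy_row <- function(id, symbol, go, name) {
  paste(c("TAIR", id, symbol, "", go, "REF:0", "IEA", "", "F", name,
          "", "protein", "taxon:3702", "20200618", "TAIR", "",
          paste0("TAIR:", id)),
        collapse = "\t")
}
gaf_lines <- c(
  "!gaf-version: 2.1",
  toy_row("locus:1", "GENE1",  "GO:0000001", "AT1G00001"),
  toy_row("locus:1", "GENE1A", "GO:0000002", "AT1G00001"),
  toy_row("locus:2", "GENE2",  "GO:0000001", "AT1G00002")
)
# For a real file: gaf_lines <- readLines("~/Downloads/tair.gaf")

# Drop "!" header lines, then parse the rest as tab-separated text.
gaf_lines <- gaf_lines[!startsWith(gaf_lines, "!")]
gaf <- read.delim(text = gaf_lines, header = FALSE, sep = "\t", quote = "",
                  comment.char = "", col.names = gaf_cols,
                  colClasses = "character")

# IDs that appear with more than one distinct symbol/name combination.
id_keys <- unique(gaf[, c("DB_Object_ID", "DB_Object_Symbol",
                          "DB_Object_Name")])
counts <- table(id_keys$DB_Object_ID)
ambiguous_ids <- names(counts)[counts > 1]
print(ambiguous_ids)

# Two-column gene-to-GO mapping; gene_col is an assumption.
gene_col <- "DB_Object_Name"
gene2cat <- unique(gaf[, c(gene_col, "GO_ID")])
print(gene2cat)
```

goseq accepts such a two-column data frame through its `gene2cat` argument, for example `goseq(pwf, gene2cat = gene2cat)` with `pwf` coming from `nullp()`, so this route may avoid readGAF altogether.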
