gff file
2.6 years ago
devhimd ▴ 10

I have a UniProt GFF file and want to extract the lipidation and modified-residue features, the specific modification for each, and the residue number. How can I do that?

I want only the lipidation and modified-residue entries, their modifications, and the residue numbers, written to a CSV file.

Is there any program or awk script that can do this?

GFF UNIPROT • 1.4k views

I think it would be better if you posted a sample of the GFF file and showed what you want to extract from it.

##gff-version 3
##sequence-region P01112 1 189
P01112  UniProtKB   Chain   1   186 .   .   .   ID=PRO_0000042996;Note=GTPase HRas  
P01112  UniProtKB   Initiator methionine    1   1   .   .   .   Note=Removed%3B alternate;Ontology_term=ECO:0000269;evidence=ECO:0000269|Ref.12 
P01112  UniProtKB   Chain   2   186 .   .   .   ID=PRO_0000326476;Note=GTPase HRas%2C N-terminally processed    
P01112  UniProtKB   Propeptide  187 189 .   .   .   ID=PRO_0000042997;Note=Removed in mature form   
P01112  UniProtKB   Nucleotide binding  13  18  .   .   .   Note=GTP;Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:16698776;Dbxref=PMID:16698776    
P01112  UniProtKB   Nucleotide binding  29  35  .   .   .   Note=GTP;Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:16698776;Dbxref=PMID:16698776    
P01112  UniProtKB   Nucleotide binding  59  60  .   .   .   Note=GTP;Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:16698776;Dbxref=PMID:16698776    
P01112  UniProtKB   Nucleotide binding  116 119 .   .   .   Note=GTP;Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:16698776;Dbxref=PMID:16698776    
P01112  UniProtKB   Nucleotide binding  145 147 .   .   .   Note=GTP;Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:16698776;Dbxref=PMID:16698776    
P01112  UniProtKB   Region  166 185 .   .   .   Note=Hypervariable region   
P01112  UniProtKB   Motif   32  40  .   .   .   Note=Effector region    
P01112  UniProtKB   Modified residue    1   1   .   .   .   Note=N-acetylmethionine%3B in GTPase HRas%3B alternate;Ontology_term=ECO:0000269;evidence=ECO:0000269|Ref.12    
P01112  UniProtKB   Modified residue    2   2   .   .   .   Note=N-acetylthreonine%3B in GTPase HRas%2C N-terminally processed;Ontology_term=ECO:0000269;evidence=ECO:0000269|Ref.12    
P01112  UniProtKB   Modified residue    118 118 .   .   .   Note=S-nitrosocysteine;Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:9020151;Dbxref=PMID:9020151    
P01112  UniProtKB   Modified residue    186 186 .   .   .   Note=Cysteine methyl ester;Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:8626715;Dbxref=PMID:8626715    
P01112  UniProtKB   Lipidation  181 181 .   .   .   Note=S-palmitoyl cysteine;Ontology_term=ECO:0000269,ECO:0000269,ECO:0000269,ECO:0000269;evidence=ECO:0000269|PubMed:15705808,ECO:0000269|PubMed:16000296,ECO:0000269|PubMed:2661017,ECO:0000269|PubMed:8626715;Dbxref=PMID:15705808,PMID:16000296,PMID:2661017,PMID:8626715 
P01112  UniProtKB   Lipidation  184 184 .   .   .   Note=S-(15-deoxy-Delta12%2C14-prostaglandin J2-9-yl)cysteine%3B alternate;Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:12684535;Dbxref=PMID:12684535   
P01112  UniProtKB   Lipidation  184 184 .   .   .   Note=S-palmitoyl cysteine%3B alternate;Ontology_term=ECO:0000269,ECO:0000269,ECO:0000269,ECO:0000269;evidence=ECO:0000269|PubMed:15705808,ECO:0000269|PubMed:16000296,ECO:0000269|PubMed:2661017,ECO:0000269|PubMed:8626715;Dbxref=PMID:15705808,PMID:16000296,PMID:2661017,PMID:8626715    
P01112  UniProtKB   Lipidation  186 186 .   .   .   Note=S-farnesyl cysteine;Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:8626715;Dbxref=PMID:8626715  
P01112  UniProtKB   Glycosylation   35  35  .   .   .   Note=(Microbial infection) O-linked (Glc) threonine%3B by P.sordellii toxin TcsL;Ontology_term=ECO:0000269,ECO:0000269,ECO:0000269,ECO:0000269;evidence=ECO:0000269|PubMed:19744486,ECO:0000269|PubMed:8626575,ECO:0000269|PubMed:8626586,ECO:0000269|PubMed:9632667;Dbxref=PMID:19744486,PMID:8626575,PMID:8626586,PMID:9632667    
P01112  UniProtKB   Cross-link  170 170 .   .   .   Note=Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in ubiquitin);Ontology_term=ECO:0000269;evidence=ECO:0000269|PubMed:30442762;Dbxref=PMID:30442762   
P01112  UniProtKB   Alternative sequence    152 189 .   .   .   ID=VSP_041597;Note=In isoform 2. VEDAFYTLVREIRQHKLRKLNPPDESGPGCMSCKCVLS->SRSGSSSSSGTLWDPPGPM;Ontology_term=ECO:0000303,ECO:0000303;evidence=ECO:0000303|PubMed:14500341,ECO:0000303|PubMed:15489334;Dbxref=PMID:14500341,PMID:15489334  
P01112  UniProtKB   Natural variant 12  12  .   .   .   ID=VAR_026106;Note=In CSTLO. G->A;Ontology_term=ECO:0000269,ECO:0000269,ECO:0000269;evidence=ECO:0000269|PubMed:16170316,ECO:0000269|PubMed:16329078,ECO:0000269|PubMed:16443854;Dbxref=dbSNP:rs104894230,PMID:16170316,PMID:16329078,PMID:16443854 

This is the GFF file. I want to extract only lipidation, nitrocysteine, and cysteine thioester from it. I want to make a CSV file with the modification in one column and the residue numbers of that modification in another column. I don't want the other information, such as the evidence and the ECO terms.

So how should I proceed: with a program, an awk script, or a Perl script?


I suppose you are referring to this GFF file: https://www.uniprot.org/uniprot/P01112.gff

I am not a regular user of awk or Perl, but I use R regularly, so I can suggest a simple R script:

library(ape)  # if ape is not installed, install it with install.packages("ape")

This is how you read the GFF file:

uniprot_gff = read.gff("uniprot.gff", GFF3 = TRUE)

This is how you can subset the "table" of the GFF file. In this file the "type" column holds "Lipidation" and "Modified residue"; names such as S-nitrosocysteine or cysteine methyl ester are not types but appear in the Note field of the "attributes" column, so keep that column together with the residue positions:

uniprot_gff_subset = uniprot_gff[uniprot_gff$type %in% c("Lipidation", "Modified residue"), c("type", "start", "end", "attributes")]

Can you write it in Python?


I have been out of touch with Python, but I am sure you can use similar logic (possibly even a similar package) in Python.
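
For what it is worth, here is a minimal standard-library sketch of that logic in Python, not a tested solution for every UniProt entry. It assumes the file is tab-delimited, as in the GFF3 download from UniProt, and that the modification name sits in the Note attribute, as in the sample above. The function names `extract_modifications` and `write_csv`, the CSV column names, and the file names in the usage comment are made up for illustration.

```python
import csv
from urllib.parse import unquote

# Feature types to keep (column 3 of the GFF); adjust as needed.
WANTED_TYPES = {"Lipidation", "Modified residue"}


def extract_modifications(lines, wanted_types=WANTED_TYPES):
    """Return (type, modification, residue) tuples from UniProt GFF3 lines."""
    rows = []
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and "##" header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] not in wanted_types:
            continue
        feature_type, start, end, attributes = fields[2], fields[3], fields[4], fields[8]
        # Attributes look like "Note=...;Ontology_term=...;evidence=...".
        note = ""
        for attribute in attributes.split(";"):
            key, _, value = attribute.partition("=")
            if key == "Note":
                # GFF3 percent-encodes ";" and "," as %3B and %2C.
                note = unquote(value)
                break
        residue = start if start == end else f"{start}-{end}"
        rows.append((feature_type, note, residue))
    return rows


def write_csv(rows, path):
    """Write the extracted rows to a CSV file with a header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["type", "modification", "residue"])
        writer.writerows(rows)


# Example usage (file names are placeholders):
# with open("P01112.gff") as handle:
#     write_csv(extract_modifications(handle), "modifications.csv")
```

If you only want particular modifications (for example only S-nitrosocysteine), you could filter the returned rows on the modification text before writing the CSV.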
