PDB glossary for three letter codes
8 months ago
joe ▴ 230

Can someone point me to a glossary of all the three-letter codes one might encounter in a .cif file for RNA and DNA?

There are some standard ones, as outlined in the CIF dictionary, but I keep finding many more in actual files...

PDB cif R bio3d Rpdb • 231 views
8 months ago
joe ▴ 230

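In case someone comes back to this, below is something in R that works. It downloads Components-pub.cif and builds a table with the ID, name, type and pdbx_type of every component in it: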

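library(stringr)  # for str_remove_all()

# Download the dictionary into the working directory (the file is large; needs wget on the PATH)
system("wget http://ligand-expo.rcsb.org/dictionaries/Components-pub.cif")
cif.ref.path <- file.path(getwd(), "Components-pub.cif")
cif.ref.data <- readLines(cif.ref.path)

# Every component block starts with a "data_<ID>" line; anchor the pattern so that
# "data_" appearing elsewhere in a line is not picked up
data.grep <- grep("^data_", cif.ref.data)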

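all.data <- c()
for(i in seq_along(data.grep)){
  # A block ends just before the next "data_" line; the last block ends at the end of the file
  block.end <- if (i < length(data.grep)) data.grep[i+1] - 1 else length(cif.ref.data)
  this.cif <- cif.ref.data[data.grep[i]:block.end]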

  # Tag patterns are anchored and the dot is escaped so that tags from other categories cannot match.
  # The ID is the last token on its line; trimws() guards against trailing whitespace.
  this.chem_comp.id <- strsplit(trimws(this.cif[grep("^_chem_comp\\.id", this.cif)]), " ")[[1]]
  this.chem_comp.id.take <- this.chem_comp.id[length(this.chem_comp.id)]

  # Name and type can contain spaces: drop the tag, drop empty tokens, strip quotes, rejoin.
  # Values stored on the following lines as ";"-delimited text blocks are not handled here.
  this.chem_comp.name <- strsplit(this.cif[grep("^_chem_comp\\.name", this.cif)], " ")[[1]]
  this.chem_comp.name.1 <- this.chem_comp.name[2:length(this.chem_comp.name)]
  this.chem_comp.name.2 <- this.chem_comp.name.1[which(this.chem_comp.name.1 != "")]
  this.chem_comp.name.3 <- str_remove_all(this.chem_comp.name.2, '\"')
  this.chem_comp.name.take <- paste0(this.chem_comp.name.3, collapse = " ")

  this.chem_comp.type <- strsplit(this.cif[grep("^_chem_comp\\.type", this.cif)], " ")[[1]]
  this.chem_comp.type.1 <- this.chem_comp.type[2:length(this.chem_comp.type)]
  this.chem_comp.type.2 <- this.chem_comp.type.1[which(this.chem_comp.type.1 != "")]
  this.chem_comp.type.3 <- str_remove_all(this.chem_comp.type.2, '\"')
  this.chem_comp.type.take <- paste0(this.chem_comp.type.3, collapse = " ")

  # pdbx_type is a single token, so the last token on the line is the value
  this.chem_comp.pdbx_type <- strsplit(trimws(this.cif[grep("^_chem_comp\\.pdbx_type", this.cif)]), " ")[[1]]
  this.chem_comp.pdbx_type.take <- this.chem_comp.pdbx_type[length(this.chem_comp.pdbx_type)]

  this.data <- cbind(this.chem_comp.id.take, this.chem_comp.name.take, this.chem_comp.type.take, this.chem_comp.pdbx_type.take)
  all.data <- rbind(all.data, this.data)
}


all.data.df <- data.frame(all.data)
colnames(all.data.df) <- c("chem_comp.id", "chem_comp.name", "chem_comp.type", "chem_comp.pdbx_type")
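
To get from that table to an actual RNA/DNA glossary, one option is to filter on the chem_comp.type column. Below is a minimal sketch of that step. It runs on a small hand-typed stand-in (toy.df, a hypothetical name) rather than the real all.data.df, and it assumes that nucleic-acid components can be recognised by "RNA" or "DNA" appearing in their type string, which is worth checking against the types actually present in your download.

```r
# Toy stand-in for all.data.df; the rows are hand-typed for illustration only
toy.df <- data.frame(
  chem_comp.id   = c("A", "DA", "ALA", "PSU", "HOH"),
  chem_comp.name = c("ADENOSINE-5'-MONOPHOSPHATE",
                     "2'-DEOXYADENOSINE-5'-MONOPHOSPHATE",
                     "ALANINE",
                     "PSEUDOURIDINE-5'-MONOPHOSPHATE",
                     "WATER"),
  chem_comp.type = c("RNA LINKING", "DNA LINKING", "L-PEPTIDE LINKING",
                     "RNA LINKING", "NON-POLYMER"),
  stringsAsFactors = FALSE
)

# Assumption: a type string mentioning RNA or DNA marks a nucleic-acid component.
# The match is case-insensitive in case the capitalisation of the type varies.
is.nucleic <- grepl("RNA|DNA", toy.df$chem_comp.type, ignore.case = TRUE)
nucleic.df <- toy.df[is.nucleic, ]

# Named vector usable as a glossary: code -> full name
glossary <- setNames(nucleic.df$chem_comp.name, nucleic.df$chem_comp.id)
print(glossary[["PSU"]])
```

With the real table, swapping toy.df for all.data.df should be enough, and table(all.data.df$chem_comp.type) is a quick way to see which type strings exist before settling on a pattern.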