I have a large collection of PDB files, each containing the structure of a protein in complex with an RNA molecule. I would like to determine the type of each RNA molecule (tRNA, mRNA, rRNA) without going through every PDB file by hand. Is there a program or web service that can help me do this?
One way to do this is to use PDBML (XML) files rather than PDB-format files. These contain the element PDBx:pdbx_description, which holds a free-text description of each molecule in the structure. For example, for entry 3J13:
grep "PDBx:pdbx_description" 3J13.xml
We see (just showing the first 3 lines):
<PDBx:pdbx_description>16S ribosomal RNA</PDBx:pdbx_description>
<PDBx:pdbx_description>mRNA</PDBx:pdbx_description>
<PDBx:pdbx_description>P site tRNA</PDBx:pdbx_description>
You can then parse the XML files in the language of your choice (Ruby, Python, Perl, ...) and match each description against keywords such as tRNA, mRNA, or ribosomal RNA.
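As a rough illustration of the parsing step, here is a minimal Python sketch that does the same job as the grep above and then guesses the RNA type from the description text. The function names and the keyword rules in RNA_PATTERNS are my own assumptions, not part of any PDB tool, and the descriptions are free text written by depositors, so expect to extend the patterns for your data. Tags are matched by local name only, because the PDBx namespace URI differs between schema versions.

```python
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def descriptions(xml_source):
    """Return the text of every pdbx_description element, in document order.

    xml_source may be a file name or an open file object.
    """
    tree = ET.parse(xml_source)
    return [
        (el.text or "").strip()
        for el in tree.iter()
        if local_name(el.tag) == "pdbx_description"
    ]


# Hypothetical keyword rules (an assumption); extend them to fit your data.
RNA_PATTERNS = [
    ("tRNA", re.compile(r"tRNA|transfer RNA", re.IGNORECASE)),
    ("mRNA", re.compile(r"mRNA|messenger RNA", re.IGNORECASE)),
    ("rRNA", re.compile(r"rRNA|ribosomal RNA", re.IGNORECASE)),
]


def classify(description):
    """Return 'tRNA', 'mRNA', 'rRNA', or None if no keyword matches."""
    for label, pattern in RNA_PATTERNS:
        if pattern.search(description):
            return label
    return None


def rna_types(xml_source):
    """Pair each description that matches a keyword with its guessed type."""
    pairs = ((d, classify(d)) for d in descriptions(xml_source))
    return [(d, label) for d, label in pairs if label is not None]
```

To use it on a directory of files, you could loop over the XML files and call rna_types("3J13.xml") on each one. Note that protein descriptions can also mention an RNA type (a tRNA synthetase, for instance), so it is worth spot-checking the matches or restricting the search to the RNA chains first.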