Getting sequence information by using atoms
5 weeks ago

I have a file in CIF format, and its contents are as follows.

My goal is to get the sequence (one-letter amino acid codes) from the atom records. Is there a Python function that does this?

atoms python cif protein
awk '($1=="ATOM") {print$6}' < in.cif | paste -s -d '-'


??


Please do not post images of the data. Always post the data itself and the expected output, so the issue is easier to understand.

5 weeks ago
Joe 20k

You should be able to use Biopython for this; it already has functions for this type of thing. Alternatively, you could use a structure viewer such as UCSF Chimera or PyMOL.
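As a rough sketch of the Biopython route, something like the following should work, assuming Biopython is installed and the file parses cleanly. The function name `chain_sequences` and the path `in.cif` are placeholders of mine, not anything from the original post. It uses `MMCIFParser` to read the structure and `PPBuilder` to turn each chain into one or more peptide fragments.

```python
def chain_sequences(cif_path):
    """Return {chain_id: [fragment sequences]} for the first model of an mmCIF file.

    chain_sequences is a hypothetical helper name used for illustration.
    """
    # Third-party dependency (pip install biopython); imported here so it is
    # only required when the function is actually called.
    from Bio.PDB import MMCIFParser, PPBuilder

    parser = MMCIFParser(QUIET=True)  # QUIET suppresses parser warnings
    structure = parser.get_structure("model", cif_path)
    first_model = next(structure.get_models())

    builder = PPBuilder()
    sequences = {}
    for chain in first_model:
        # build_peptides splits a chain wherever consecutive residues are not
        # bonded, so one chain can yield several fragments.
        peptides = builder.build_peptides(chain)
        sequences[chain.id] = [str(pp.get_sequence()) for pp in peptides]
    return sequences
```

Calling `chain_sequences("in.cif")` would give one list of fragment strings per chain; a chain with more than one fragment is a hint that some residues were not resolved in the structure.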

If you want to use generic command-line tools, as Pierre pointed out, the information you need (the three-letter residue name) can be obtained from column 6 of the ATOM records.
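Note that the awk one-liner prints the residue name once per atom, so each residue shows up many times. A minimal pure-Python sketch of the same idea that keeps one entry per residue and converts to one-letter codes could look like this. It assumes the file has a standard `_atom_site` loop with `label_comp_id`, `label_asym_id` and `label_seq_id` columns, and that no field contains embedded whitespace; the function name `sequence_from_cif_lines` is my own.

```python
# Standard 20 amino acids; anything else is reported as "X".
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_cif_lines(lines):
    """Return {chain_id: one-letter sequence} from the ATOM rows of an mmCIF file.

    Hypothetical helper: assumes whitespace-separated fields with no quoted
    values containing spaces.
    """
    columns = []   # column names of the _atom_site loop, in file order
    seen = set()   # (chain, residue number) pairs already emitted
    chains = {}    # chain id -> list of one-letter codes
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("_atom_site."):
            columns.append(fields[0].split(".", 1)[1])
        elif fields[0] == "ATOM" and len(fields) == len(columns):
            row = dict(zip(columns, fields))
            key = (row["label_asym_id"], row["label_seq_id"])
            if key not in seen:  # first atom of a new residue
                seen.add(key)
                code = THREE_TO_ONE.get(row["label_comp_id"], "X")
                chains.setdefault(row["label_asym_id"], []).append(code)
    return {chain: "".join(codes) for chain, codes in chains.items()}
```

Usage would be along the lines of `with open("in.cif") as handle: print(sequence_from_cif_lines(handle))`. Looking columns up by name rather than by position means it does not depend on the residue name being in column 6.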

Note, however, that extracting a sequence this way may not be so simple: crystal structure files often have discontinuities (residues with no atom records) when compared with the sequence from their genomic annotation.
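One way to spot such discontinuities is to look for jumps in the residue numbering of the atoms that are present. A small sketch, assuming you have already collected the integer residue numbers for one chain (for example the `label_seq_id` values from the ATOM rows); `find_gaps` is a hypothetical helper name:

```python
def find_gaps(seq_ids):
    """Return (first_missing, last_missing) ranges in a chain's residue numbering."""
    ordered = sorted(set(seq_ids))
    # A step larger than 1 between neighbouring numbers means residues are missing.
    return [(a + 1, b - 1) for a, b in zip(ordered, ordered[1:]) if b - a > 1]


print(find_gaps([1, 2, 3, 7, 8, 12]))  # [(4, 6), (9, 11)]
```

This only finds internal gaps; residues missing at either end of the chain will not show up, so comparing against the full reference sequence is still the safer check.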