How to retrieve representatives at 30% sequence identity from the PDB
4.5 years ago
bosimiya • 0

I plan to use the PDB advanced search to filter sequences. I need to create a test set of protein sequences. The selection conditions are probably chain length, resolution, macromolecule type, etc., which are all easy to implement.

But there is one more restriction: I need to keep only representatives at 30% sequence identity. How do I achieve this?

PDB • 834 views
4.5 years ago
Mensur Dlakic ★ 30k

First, download a FASTA file with all PDB sequences.

Next, cluster them down to 30% identity using MMseqs2 and keep one representative sequence per cluster.
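The two steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive pipeline. It assumes that the FASTA file is the wwPDB `pdb_seqres.txt` dump, already downloaded and decompressed, with headers of the form `>101m_A mol:protein length:154  MYOGLOBIN` and one sequence line per record, and that the `mmseqs` binary is on your PATH. The helper names, the output prefix `pdb30`, the 30-residue minimum length and the 0.8 coverage threshold are illustrative choices, not something prescribed in this thread.

```python
import shutil
import subprocess
from pathlib import Path


def parse_seqres_header(header):
    """Parse a header like '>101m_A mol:protein length:154  MYOGLOBIN'.

    Returns (chain_id, molecule_type, length). The header layout is an
    assumption based on the wwPDB pdb_seqres.txt file.
    """
    fields = header.lstrip(">").split()
    chain_id = fields[0]
    info = dict(f.split(":", 1) for f in fields[1:3] if ":" in f)
    return chain_id, info.get("mol"), int(info.get("length", 0))


def filter_protein_chains(lines, min_length=30):
    """Yield (header, sequence) for protein chains with >= min_length residues.

    Assumes each record is one header line followed by one sequence line.
    """
    header = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            header = line
        elif header is not None and line:
            _, mol, length = parse_seqres_header(header)
            if mol == "protein" and length >= min_length:
                yield header, line
            header = None


def mmseqs_cluster_command(fasta, prefix, tmp_dir, min_seq_id=0.3, coverage=0.8):
    """Build an 'mmseqs easy-cluster' call that clusters at min_seq_id identity."""
    return [
        "mmseqs", "easy-cluster", str(fasta), str(prefix), str(tmp_dir),
        "--min-seq-id", str(min_seq_id),  # 0.3 = 30% sequence identity
        "-c", str(coverage),              # minimum alignment coverage (illustrative)
        "--cov-mode", "0",                # coverage required on both sequences
    ]


if __name__ == "__main__":
    source = Path("pdb_seqres.txt")       # assumed local copy of the PDB FASTA
    filtered = Path("pdb_protein.fasta")  # hypothetical intermediate file

    # Step 1: keep protein chains only, dropping very short peptides.
    with source.open() as fin, filtered.open("w") as fout:
        for header, seq in filter_protein_chains(fin):
            fout.write(f"{header}\n{seq}\n")

    # Step 2: cluster at 30% identity with MMseqs2.
    if shutil.which("mmseqs") is None:
        raise SystemExit("mmseqs not found on PATH")
    subprocess.run(mmseqs_cluster_command(filtered, "pdb30", "tmp"), check=True)
```

With the `pdb30` prefix, MMseqs2 should write the representatives to `pdb30_rep_seq.fasta` and the cluster memberships to `pdb30_cluster.tsv`. Filters that need structural metadata, such as resolution, are not in the FASTA headers and would have to be applied separately, for example by intersecting the representative IDs with the result of a PDB advanced search.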
