Our goal is to select high-quality structures for simulation from each
metafold, where possible, in our Consensus Domain Dictionary (CDD). The
structures of the domains within each metafold in our CDD were examined.
Where there was more than one structure available within a metafold,
domains comprising an entire chain in the PDB were preferred over
domains from multimers and multi-domain proteins. Shorter domains were
preferred over larger ones, with a maximum cutoff of 450 residues, and
high-resolution crystal structures (resolution cutoff of 3 Å) were preferred over
NMR structures. Potential protein domains were also selected to be
self-contained and likely to fold independently. We preferred domains
that lacked cofactors (with the exception of heme, Zn and Ca), were not part
of a membrane and were globular with regular secondary structure
elements. Finally, for each metafold, the list of suitable protein
target structures was reduced by selecting those systems with
available experimental data in the form of NMR observables, folding and
unfolding studies, drug design, or protein interaction. If at least one
domain within a metafold met these criteria, that domain was assigned as
the fold representative for that metafold and simulated. If no domains
within a metafold met all of the above criteria, a domain was assigned
as the fold representative and the reasons for rejection were annotated.
In the 2009 CDD, 807 metafolds had at least one domain suitable for
simulation while 888 metafolds were rejected.
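
The selection procedure described above can be sketched in code. The following is a minimal illustration only, not the pipeline actually used: the Domain record, its field names, the function names and the cofactor codes are hypothetical, the order in which the preferences are applied is an assumption, and the cutoffs are treated as inclusive. Criteria that require judgment (globularity, regular secondary structure, likelihood of independent folding) are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Cutoffs taken from the text; treating them as inclusive is an assumption.
MAX_RESIDUES = 450
MAX_RESOLUTION_A = 3.0
# Hypothetical identifiers for the tolerated cofactors (heme, Zn, Ca).
ALLOWED_COFACTORS = frozenset({"HEM", "ZN", "CA"})


@dataclass
class Domain:
    """Hypothetical record for one candidate domain in a metafold."""
    domain_id: str
    n_residues: int
    method: str                      # "xray" or "nmr"
    resolution: Optional[float] = None
    whole_chain: bool = False        # domain spans an entire PDB chain
    cofactors: frozenset = frozenset()
    membrane: bool = False
    has_experimental_data: bool = False


def rejection_reasons(d: Domain) -> List[str]:
    """Return the list of criteria a domain fails (empty if suitable)."""
    reasons = []
    if d.n_residues > MAX_RESIDUES:
        reasons.append("longer than 450 residues")
    if d.method == "xray" and (d.resolution is None
                               or d.resolution > MAX_RESOLUTION_A):
        reasons.append("resolution worse than 3 A")
    if d.cofactors - ALLOWED_COFACTORS:
        reasons.append("disallowed cofactor")
    if d.membrane:
        reasons.append("membrane associated")
    if not d.has_experimental_data:
        reasons.append("no experimental data")
    return reasons


def preference_key(d: Domain) -> Tuple[bool, bool, int]:
    """Sort key: whole chains first, then crystal structures, then shorter.

    The relative priority of these three preferences is an assumption.
    """
    return (not d.whole_chain, d.method != "xray", d.n_residues)


def pick_representative(domains: List[Domain]) -> Tuple[Domain, List[str]]:
    """Pick a fold representative and any rejection annotations.

    Returns (domain, []) when a suitable domain exists; otherwise returns
    the most preferred domain together with the reasons it was rejected.
    """
    ranked = sorted(domains, key=preference_key)
    suitable = [d for d in ranked if not rejection_reasons(d)]
    if suitable:
        return suitable[0], []
    fallback = ranked[0]
    return fallback, rejection_reasons(fallback)
```

Called on the domains of a single metafold, pick_representative returns either a domain with an empty annotation list (the metafold is simulated) or a domain with one or more rejection reasons (the metafold is rejected but still has an annotated representative).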