This paper summarizes the findings of a recent, British Library-funded research project into computer techniques for searching the three-dimensional protein structures that occur in the Protein Data Bank. The work focuses on the secondary structures of proteins and utilizes both angular and distance geometric information. Algorithms are presented for the automatic identification of secondary structure elements, of secondary structure motifs and of proteins with similar secondary structures.
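As an illustration of what "angular and distance geometric information" could mean for a pair of secondary structure elements, the following is a minimal sketch rather than the method of the paper. It assumes, hypothetically, that each helix or strand is approximated by a straight line segment along its axis; the function names `axis_angle_deg` and `midpoint_distance` and the sample coordinates are invented for this example.

```python
import math

# Assumption: a secondary structure element is approximated by a line
# segment ((x, y, z) start, (x, y, z) end) along its axis. This
# representation is hypothetical and is not taken from the abstract.

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axis_angle_deg(seg1, seg2):
    """Angle in degrees between the direction vectors of two segments."""
    d1 = _sub(seg1[1], seg1[0])
    d2 = _sub(seg2[1], seg2[0])
    cos_t = _dot(d1, d2) / math.sqrt(_dot(d1, d1) * _dot(d2, d2))
    # Clamp to guard against rounding just outside [-1, 1].
    cos_t = max(-1.0, min(1.0, cos_t))
    return math.degrees(math.acos(cos_t))

def midpoint_distance(seg1, seg2):
    """Euclidean distance between the midpoints of two segments."""
    m1 = tuple((a + b) / 2 for a, b in zip(seg1[0], seg1[1]))
    m2 = tuple((a + b) / 2 for a, b in zip(seg2[0], seg2[1]))
    diff = _sub(m1, m2)
    return math.sqrt(_dot(diff, diff))

# Two made-up elements: one along the x axis, one along the z axis.
element_a = ((0.0, 0.0, 0.0), (10.0, 0.0, 0.0))
element_b = ((0.0, 5.0, 0.0), (0.0, 5.0, 10.0))

angle = axis_angle_deg(element_a, element_b)
distance = midpoint_distance(element_a, element_b)
```

Under this assumption, a motif query could be expressed as a set of such angle and distance values with tolerances, and a stored structure would match if some subset of its elements satisfies all of them.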