Abstract

I was pleased to be invited to review this book on methods in predictive toxicology, edited by Christoph Helma and colleagues. The book is divided into 14 chapters as follows:
Chapter 1: A Brief Introduction to Predictive Toxicology
In this chapter the author gives a succinct and clear definition of predictive toxicology as used in the context of this book, which nicely sets the scene for the topics covered in the remaining chapters. Because there are many interpretations of the phrase Predictive Toxicology, it is important to set reader expectations early. The author also highlights that predictive toxicology is a blend of many scientific disciplines and that, as such, no one person will have a deep knowledge of all the areas involved. This is also an important point to stress for a reader hoping to learn more about this field of science.
Chapter 2: Description and Representation of Chemicals
This section deals with the different methods by which chemical structures can be represented mathematically so that they can be correlated with some measure of the compound's activity, in this case toxicity. The authors provide a brief review of the approaches that can be applied, but it falls short of being a comprehensive discussion of the methods in the area. Given that this topic could fill a book of its own, this is an acceptable overview for the inexperienced reader.
Chapter 3: Computational Biology and Toxicogenomics
In this chapter the authors provide an excellent overview of the field of toxicogenomics and highlight how this experimental technique can be used to study and predict the effects of chemicals on a biological system. The chapter covers some of the common approaches to data normalization and the subsequent analysis techniques that are often used in this field. It also includes a discussion of the ability to use toxicogenomics to predict toxicity through comparison to a reference database and through various classification methods. Toxicogenomics has been applied to the prediction of numerous toxicological endpoints, and incorporating a few of these would have provided useful examples to give the reader a fuller comprehension of the subject matter. In addition, the authors stop short of describing how gene expression analysis may provide insights through biological pathway analysis or could be combined with other -omic technologies to gain greater insight into the underlying toxicological mechanisms.
Chapter 4: Toxicological Information for Use in Predictive Modeling: Quality Sources and Databases
In this chapter the author outlines the critical need for access to toxicological data in order to develop predictive models. It is often understated that much of the work in developing a predictive toxicity model lies in the curation and cleaning of toxicological data. Quite correctly, the author highlights the importance of data quality when considering the use of predictive modeling techniques and goes on to provide examples of criteria that could be used to assess data quality. The section also provides an excellent review of the public domain sources where toxicity data can be found, although no attempt is made to describe the overall quality of each source.
Chapter 5: The Use of Expert Systems for Toxicology Risk Prediction
The authors of this section introduce the concept of expert systems and how these methods may be applied to prediction of toxicity. The authors go on to describe in some detail two examples of expert systems developed specifically for this application. The section is somewhat disappointing in that it focuses on the early work conducted in the mid-1990s as part of the StAR project but falls short of describing in any depth the more recent developments in the area. However, the authors provide a useful discussion on the use of argumentation in predictive toxicology whereby facts and inferences can be used in a balanced approach to understanding the hazards and risks presented by a novel chemical substance.
Chapters 6, 7, and 8: Regression- and Projection-Based Approaches in Predicting Toxicology; Machine Learning and Data Mining; Neural Networks and Kernel Machines for Vector and Structural Data
These three chapters provide an introduction to some of the basic techniques used in QSAR and predictive toxicology. They cover a variety of techniques including multiple linear regression, principal component analysis, data mining, and machine learning algorithms such as neural networks and support vector machines. Each chapter focuses heavily on the underlying theory behind the approaches, with some good worked examples. Disappointingly, though, they provide too few examples of where and how these techniques have been successfully applied to specific problems in the field of predictive toxicology. Whilst these chapters are excellent summaries of the science behind these computational approaches, they also illustrate the breadth of scientific disciplines involved in the field of predictive toxicology. As such, the reader should be prepared to read some of the recommended introductory texts in order to fully appreciate and understand these theories.
Chapter 9: Applications of Substructure-Based SAR in Toxicology
This section deals with the use of qualitative associations between chemical substructures and a corresponding toxicological action, more commonly referred to as structure-activity relationships (SARs). It provides an excellent discussion of how these approaches are developed and used as part of an overall strategy for hazard and risk assessment. It is important for the reader to understand the limitations of these SAR approaches, as well as those of the methods discussed in other chapters, so that they can be employed most effectively. At present there are no predictive toxicology systems that are sufficiently well developed that they could be used in the absence of expert opinion. However, although the author begins with a broader discussion of multiple SAR systems and approaches, the chapter rapidly narrows to the MultiCASE system as its primary example, and one therefore questions the value of this section in light of Chapter 12, which is entirely dedicated to this application.
Chapter 10: OncoLogic: A Mechanism-Based Expert System for Predicting the Carcinogenic Potential of Chemicals
The author of this chapter provides a brief review of the underlying approaches used in the development of mechanism-based SAR and describes in some detail the OncoLogic system developed for the prediction of carcinogenicity. The chapter provides excellent insight into how the developers and expert panel at the US EPA organized their thought processes when considering the hazards presented by a novel chemical, and how these may be encapsulated in a computer system. The examples provided help enormously in understanding the depth and breadth of knowledge that is incorporated into OncoLogic; however, it was disappointing to see no screenshots from the system that would illustrate how one might actually use it in practice.
Chapter 11: Meta: An Expert System for the Prediction of Metabolic Transformations
This chapter is the weakest section in the book in that it makes no attempt to provide the reader with an obvious link to the overarching theme, i.e., that of predictive toxicology. The author discusses only a single approach to metabolism prediction without highlighting the need to consider metabolic activation and its role in the expression of toxicity. In addition, the review provides little or no insight into how well the system's predictions compare with the known metabolic fates of chemicals. I was left with the feeling that this chapter was written and included as an afterthought to the rest of the book.
Chapter 12: MC4PC—An Artificial Intelligence Approach to the Discovery of Quantitative Structure-Toxic Activity Relationships
In this section the authors outline the theory behind one approach for predictive toxicology used in the development of a family of applications, the MCASE and MC4PC software. They provide excellent insight into how these systems work and could be used in the mining for structure-activity relationships in a set of diverse chemicals. However, the authors do not adequately stress the importance of data quality in these automated systems for predictive modeling. This is an important factor for the novice user of a system and should not be overlooked when using the application. It would also have been useful for the authors to illustrate how the program could be used in an example where the toxicology is not necessarily categorical (i.e., positive or negative) but has a more continuous scale, for example, in the case of hepatotoxicity.
In the Ames prediction example provided by the authors, an overall positive prediction is called only where a chemical is predicted positive in more than one Ames strain/S9 combination. This seems to be in direct contradiction to the way the biological assay is interpreted and would lead to questions around the ability of the system to adequately distinguish positive from negative compounds. Finally, when reporting the evaluation of a system using an unbalanced test set (in terms of the ratio of active to inactive compounds), the Kappa statistic may provide a more appropriate measure of performance than sensitivity, specificity, and concordance.
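The point about unbalanced test sets can be illustrated with a minimal sketch. The function name and the confusion-matrix counts below are hypothetical and are not taken from the book; they simply show how concordance can look strong while Cohen's kappa, which corrects for agreement expected by chance, stays low.

```python
def agreement_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, concordance and Cohen's kappa
    from the four cells of a 2x2 confusion matrix."""
    n = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)      # fraction of actives predicted active
    specificity = tn / (tn + fp)      # fraction of inactives predicted inactive
    concordance = (tp + tn) / n       # overall fraction of correct calls

    # Agreement expected by chance, computed from the marginal totals.
    chance_pos = ((tp + fn) / n) * ((tp + fp) / n)
    chance_neg = ((tn + fp) / n) * ((tn + fn) / n)
    chance = chance_pos + chance_neg

    kappa = (concordance - chance) / (1 - chance)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "concordance": concordance,
        "kappa": kappa,
    }


# Hypothetical unbalanced test set: 10 actives and 90 inactives.
metrics = agreement_metrics(tp=2, fn=8, tn=88, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts the concordance is 0.90, yet kappa is only about 0.24, because a model that called almost everything inactive would already agree with the data most of the time.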
Chapter 13: PASS: Prediction of Biological Activity Spectra for Substances
This section describes the development of a system designed to predict the biological activity spectrum of a novel chemical based on an analysis of its chemical structure and comparison to a training set of compounds classified as active or inactive. The primary descriptor used is the multilevel neighborhoods of atoms, which describe unique fragments of a molecule by growing each fragment from its starting atom to an increasing depth and complexity based on each atom's neighbors. The authors explain that the training set is derived from a database of some 50,000 known actives but do not adequately explain how inactive compounds are derived. Assuming that a lack of evidence for activity indicates an absence of effect is dangerous and will ultimately undermine the success of any method applied. Similarly, the authors fail to explain how the PASS system would be able to compensate for the effects of stereochemistry, which is particularly relevant as they use thalidomide as a worked example in the manuscript.
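The idea of growing a fragment outward from each atom can be sketched as follows. This is a simplified illustration of a neighborhood-style descriptor, not the PASS implementation: the function name, the label format, and the toy molecule (the heavy atoms of ethanol, with hydrogens and bond types omitted) are all assumptions made for the example.

```python
def neighborhood_descriptors(atoms, bonds, depth):
    """Label each atom by its element plus the sorted labels of its
    neighbors, repeating the expansion `depth` times.

    atoms: dict mapping atom index -> element symbol
    bonds: list of (index, index) pairs
    """
    neighbors = {i: [] for i in atoms}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Level 0: each atom is described by its element symbol alone.
    current = dict(atoms)
    for _ in range(depth):
        # Next level: element symbol followed by the sorted previous-level
        # labels of the directly bonded atoms.
        current = {
            i: atoms[i] + "(" + "".join(sorted(current[j] for j in neighbors[i])) + ")"
            for i in atoms
        }
    return current


# Toy molecule: C-C-O (ethanol heavy atoms only; hypothetical input format).
atoms = {0: "C", 1: "C", 2: "O"}
bonds = [(0, 1), (1, 2)]

level1 = neighborhood_descriptors(atoms, bonds, depth=1)
level2 = neighborhood_descriptors(atoms, bonds, depth=2)
print(level1)
print(level2)
```

Because a label of this kind is built from connectivity alone, two enantiomers would receive identical descriptors, which is why the question of stereochemistry matters for any method based on such fragments.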
Chapter 14: lazar: Lazy Structure-Activity Relationships for Toxicity Prediction
In this chapter the author describes a novel method and how it has been applied to the area of predictive toxicology. It uses a fragment-based approach to identify fragments that are present in a test molecule and are abundant in a known set of active compounds. Initial results from this approach look promising, yielding an overall accuracy of 78% for Ames mutagenicity prediction, which is comparable to many other systems. The author provides the reader with a thoughtful analysis of where modifications to the approach could improve its performance, including the question of coverage in terms of unknown fragments and the nonlinear relationships between activating and inactivating fragments.
In conclusion, this book is an excellent introduction to some of the scientific disciplines involved in predictive toxicology and I would recommend this to anyone starting out in this area of science and research. It does not, however, provide an experienced reader with sufficient details or in-depth discussion for any of the individual topics and as such cannot be used as a replacement for more detailed texts on computational approaches or issues related to toxicology. Although it has some shortcomings with respect to in-depth details, the book provides an intelligent and comprehensive overview of the field and the editors and authors are to be congratulated.
