Abstract
The objective of this panel is to better understand how machine learning (ML) systems influence users and to identify strategies to make such interactions more effective when designing decision support systems and joint cognitive systems for real-time control. The panelists will draw upon their insights from the fields of aviation, defense, ground transportation and medicine, as well as the literature on human-automation and human-AI interaction. The panelists will discuss findings regarding both the positive and negative influences that the design of such systems can have on perceptual, cognitive and decision making processes. They will further provide concrete examples of system designs. Topics for discussion will include: (a) The Ironies of AI Based on Machine Learning: Challenges and New Directions, (b) Understanding the Influences of ML on Joint Cognitive Performance, (c) Mixed Initiative Access to Information, (d) Focusing and Accelerating Decision Making, and (e) Designing the Operational Domain.
Overview
The objective of this panel is to better understand how machine learning (ML) systems influence the perception, decision making and performance of users and to identify strategies to make such interactions more effective when designing decision support systems and joint cognitive systems for real-time control. The panelists will draw upon their insights from the fields of aviation, defense, ground transportation and medicine, as well as the literature on human-automation and human-AI interaction.
In response to questions and comments, panelists will discuss findings regarding both the positive and negative influences that the design of such systems can have on perceptual, cognitive and decision making processes of users and on resultant system performance. They will further provide concrete examples of system designs.
While the panel will apply the expertise of the panelists on this topic in response to the inputs from the audience, panelists will be specifically prepared to highlight insights in the following areas.
Understanding the Influences of ML on Joint Cognitive Performance
Studies from fields such as radiology illustrate both the challenges and the opportunities offered by ML. As an example, studies using one particular ML system indicated that, for the detection of lung lesions, the technology alone performed with a sensitivity of 0.83 versus 0.52 for practicing radiologists working without this technology. On the other hand, for the detection of pleural effusions, the ML system had a sensitivity of 0.74 versus 0.88 for the radiologists (Niehof et al., 2023).
Other studies in radiology have demonstrated challenges in using this technology. Mehrizi et al. (2023), for example, conducted a study in which radiologists could access the analysis of an ML system upon request for 2,760 decisions. Among other results, they found:
When looking only at cases where the AI offered correct suggestions, the average accuracy of human decisions was 78%.
When looking only at cases where the AI offered incorrect suggestions, 28% of the time radiologists accepted these incorrect diagnoses.
To address these concerns, a number of approaches have been investigated, including:
Solution 1. Develop more interpretable explanations for an ML system’s conclusions to improve the ability of the user to evaluate the recommendation of the ML.
Solution 2. Design to support complementary performance of the person and the ML system, providing the person with information displays that support a role allowing an independent assessment of a situation (such as development of a differential diagnosis) in conjunction with considering and evaluating the recommendations of the ML.
The Ironies of AI Based on Machine Learning: Challenges and New Directions
Despite the many accomplishments of AI systems based on ML, shortcomings exist that pose substantial challenges for human interaction with these systems. Following on from Bainbridge’s “Ironies of Automation,” five ironies of AI can be identified. These include the difficulties people have in understanding AI and in forming appropriate adaptations to its shortcomings; the opaqueness of ML systems, which shrouds needed information on AI limitations and biases and in turn drives human decision biases and difficulty in judging AI reliability; and the fact that AI remains insufficiently intelligent for many of its intended applications.
Several avenues are available for overcoming these challenges through design of the user interface, including improving the transparency and understandability of ML based AI systems, exposing biases and limitations, and improving information needed to assess system reliability and attribution.
While research indicates that such approaches may be helpful, it also shows that achieving success will depend on carefully identifying users’ situation awareness needs and presenting the information effectively so as to avoid overload. A methodology for identifying the situation awareness needs of users working with AI systems will be highlighted for discussion, along with examples of its application for creating more transparent AI systems.
Fostering Effective Joint Performance
A primary challenge in the design of ML decision support systems (DSS) is to foster better joint performance than either the person or the ML could achieve alone. This challenge also applies to applications where the ML is supporting teams of people. Decades of human factors research with non-ML DSS have shown that providing a high-performing DSS does not guarantee that joint performance will be better than either alone. This finding has been replicated in the ML literature.
One approach to overcoming this challenge currently being explored in the ML literature is to generate explanations for why the ML came up with its decision (e.g., feature-based, example-based, and counterfactual explanations). This approach to explanation has led to mixed results at best.
An alternative approach to “explanation” will be highlighted for further discussion that focuses on helping the person on the scene answer the question “Is this proposed solution correct, will it work?” rather than “Why did the ML come up with this solution?” The objective is to provide the person on the scene with the information they need to evaluate the appropriateness of the recommended solution for themselves. This includes providing information needed to answer: (a) Does the ML correctly understand the current situation (i.e., is it solving the right problem?) and (b) is the proposed solution likely to work (i.e., is it meeting my goals and respecting the problem constraints I am aware of?).
Examples of this approach that should lead to more effective joint performance will be presented.
Mixed Initiative Access to Information
AI, in the form of Large Language Models, seems well-positioned to augment human memory through methods such as Retrieval Augmented Generation (RAG). Herein, artifacts such as technical data related to maintenance procedures could be uploaded and retrieved on-demand. Key considerations for this use of AI would include creating accurate prompts, ensuring data accuracy and currency, and ensuring appropriate access.
Focusing and Accelerating Decision Making
AI can also help to overlay structure on inherently unstructured tasks. Herein, AI tools could help identify patterns among massive data points and streams, and they can offer recommended courses of action (COAs) for humans to reason over. Focusing and accelerating the decision space are two key potential benefits of using AI in this way.
Ultimately, the ability to pair humans and machines will be the key technology, not the AI in isolation. For ML tools, one must consider the quality, accuracy, breadth, and currency of the training data, and one must put thought into how well the training context matches the targeted domain.
Designing the Operational Domain
Many cognitive engineering approaches focus on interface and interaction design. Examples of this include calls for explainable and transparent AI. Cognitive engineering needs to go beyond the interface to guide development of the underlying model. Creating a robust joint-cognitive system depends on interface and interaction design, as well as on operational domain design.
Examples from vehicle automation and large language models (LLMs) demonstrate the need to design the operational domain. In the case of vehicle automation, some manufacturers explicitly define the operational domain, indicate it in the interface, and structure the interaction to prohibit use in situations outside the operational domain. A similar approach seems needed with LLMs, where they amplify human productivity in domains where they perform well, but degrade performance when people cross the jagged edge of the implicit operating domain.
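The vehicle-automation pattern above, where the system defines its operational domain and prohibits use outside it, can be sketched as a simple gate. The condition names and thresholds here are hypothetical illustrations, not any manufacturer's actual limits.

```python
from dataclasses import dataclass

@dataclass
class Conditions:
    speed_kph: float
    weather: str    # e.g., "clear", "rain", "snow"
    road_type: str  # e.g., "highway", "urban"

# Hypothetical operational domain for a driving-automation feature.
ODD = {
    "max_speed_kph": 130,
    "allowed_weather": {"clear", "rain"},
    "allowed_roads": {"highway"},
}

def within_odd(c: Conditions, odd=ODD) -> bool:
    """True only when every current condition falls inside the defined domain."""
    return (c.speed_kph <= odd["max_speed_kph"]
            and c.weather in odd["allowed_weather"]
            and c.road_type in odd["allowed_roads"])

def automation_state(c: Conditions) -> str:
    # The interface would surface this state and refuse engagement outside the ODD.
    return "available" if within_odd(c) else "unavailable: outside operational domain"
```

An analogous explicit gate is what LLM deployments largely lack: the "jagged edge" is implicit, so users have no equivalent signal when they leave the region where the model performs well.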
Operational domain design includes defining a representative sample of training data, selecting a cost function that emphasizes robustness, and avoiding wicked domains. Poor robustness often stems from training the model with data that are not representative of the full operating domain. Selecting a cost function that rewards robust rather than optimal performance addresses a second failure mode. Finally, operational domain design involves avoiding “wicked” domains, those defined by distributions with long, twitchy tails.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
