
Abstract

Families making medical decisions for an incapacitated loved one need to process medical information for various care pathways while balancing different perspectives and experiencing distress. A decision aid tool to help family members make medical decisions on the patient’s behalf should be easy to use and not create additional burdens. A formative usability study with high, medium, and low verbosity versions of an initial decision aid was conducted with university students before clinical testing. The results showed no significant differences in usability metrics between different verbosity levels, but the qualitative findings indicated areas of improvement in the organization of information that could lead to improved usability of the decision aid. While the findings are limited due to the study being void of some of the burdens families face in true clinical situations, the study allowed us to identify numerous usability issues before testing with target users.

Keywords

usability, healthcare, human factors, decision aid, user experience

Introduction

Decisional incapacity in the clinical context occurs when a patient loses the ability to make or communicate medical decisions for themselves and to understand the potential benefits, harms, and alternatives of proposed healthcare (Pope, 2023). The prevalence of decisional incapacity is estimated at about 40% among adults in inpatient and residential hospice settings and more than 90% among adults in intensive care units (DeMartino et al., 2017). When a patient experiences decisional incapacity, making medical decisions falls to the patient’s family or chosen decision-maker.

The task of making medical decisions for an incapacitated loved one is fraught with complicating factors. These can include emotional distress from seeing a loved one in an incapacitated state, the need to process large amounts of medical information to understand relevant medical outcomes, and the need to weigh one’s financial and logistical capacity to provide the support that different decisions require. In light of these complexities, guardians of patients with clinical incapacity would greatly benefit from a decision aid that helps them communicate with the healthcare team and make these difficult decisions.

These decisions are often made during high emotional load, and a decision aid tool to help family members make medical decisions on the patient’s behalf should be easy to use and not create additional burdens for caregivers. In designing such a decision aid, there needs to be a careful balance between providing sufficient information to aid decision-making and not providing an overwhelming amount of information, which could create additional burdens. To reach a balance between the two, user feedback through iterative evaluations is necessary. However, given the sensitive nature of caregivers’ situation when making healthcare decisions for loved ones, conducting formative evaluations with them should be carefully considered.

To reduce the burden placed on caregivers, an initial decision aid design was used to conduct a formative usability study with a convenience sample of Rice University students. To find a balance between providing sufficient information and not overwhelming users, the initial design was used to create three verbosity conditions: low, medium, and high. Quantitative usability metrics and qualitative feedback from participants were collected to gather insights for usability improvements. This approach allows us to identify and resolve significant usability issues with the decision aid before conducting evaluations with actual caregivers in the next phase of testing.

Methods

Formative usability testing of three verbosity conditions of an initial decision aid in a between-subjects study design was conducted with Rice University undergraduate students. The study was reviewed and approved by the Institutional Review Board (IRB) of Rice University.

Materials

An initial decision aid design was the basis for the creation of three versions of the decision aid, presented as each of the three verbosity conditions randomly assigned to participants in a between-subjects study design. The initial decision aid comprised three sections spanning pages 1-10. The first section, pages 1-3, was not informative on care options but contained prompts for users to consider (1) the context surrounding the family member of the patient, (2) the condition of the patient prior to the hospital visit, and (3) what risks and quality of life the patient may accept. The second section, pages 4-9, contained information regarding (4) the three types of medical care to consider, (5) intensive care, (6) risks and benefits of intensive care, (7) conservative care, (8) comfort care, and (9) cardiopulmonary resuscitation (CPR). The third section was a final (10) decision page, which contained summarized care options presented as a decision grid.

The high verbosity condition was the entirety of the initial decision aid design containing all the sections. The third section, or the final decision page, contained a summarized reiteration of the information on the care options already presented in prior pages. Therefore, the third section alone was determined to be the low verbosity condition. The medium verbosity condition contained the first and third sections, which included question prompts along with the summary information of care options provided on the decision page. A summary of the contents of the verbosity conditions is as follows:

High Verbosity (HV) condition: all sections of the initial decision aid, pages 1 through 10

Medium Verbosity (MV) condition: first section, which comprises prompt pages 1 through 3, and last section, which is the final decision page 10

Low Verbosity (LV) condition: last section, which is the decision page 10

Since the decision aid was being tested on a set of proxy users in a laboratory setting, participants were given a hypothetical clinical scenario and asked to utilize the decision aid when formulating decisions. The hypothetical scenario described the clinical state of a grandparent who had critical injuries requiring intensive care. The scenario was printed on the instruction page, which asked participants to mark a care option on the last page of the decision aid for the grandparent if the scenario was true for them.

A printed survey was used to collect participants’ responses on five metrics determined to be important in assessing the usability of the decision aid. To understand user perception of ease of use, the positively worded System Usability Scale (SUS) (Brooke, 1995; Sauro & Lewis, 2011) was provided on the survey, without the item “I think I would like to use this system frequently” (Lewis & Sauro, 2017). To understand participants’ perception of workload, the Overall Workload (OW) scale (Hill et al., 1992) was included in the survey. Participants’ perception of the amount of information provided on the decision aid, information quantity, was measured on a 7-point Likert scale with anchors “did not have enough information” and “had too much information” at either end and a middle anchor “had the right amount of information.” Participants’ perception of the usefulness of the information, information usefulness, was measured on a 7-point Likert scale with anchors “not useful at all” and “extremely useful” at either end. Confidence in decision-making was measured using a 100-point confidence scale adapted from work by Jackson and Kleitman (2014), with anchors “not confident at all” and “extremely confident” at either end of the scale.
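As a minimal sketch of how a nine-item, positively worded SUS might be scored: assuming 5-point responses (1 = strongly disagree, 5 = strongly agree), each item contributes its response minus 1, and the sum is rescaled to 0-100 by 100/36 rather than the usual 2.5, the adjustment Lewis and Sauro (2017) describe for a dropped item. The function name and example responses are hypothetical, and the source does not state the exact scoring procedure that was used.

```python
def score_positive_sus(responses, scale_max=5):
    """Score a positively worded SUS on a 0-100 scale.

    Works for any number of items, so it covers the nine-item
    version (one item dropped) as well as the full ten-item form.
    """
    if not responses:
        raise ValueError("at least one response is required")
    if any(r < 1 or r > scale_max for r in responses):
        raise ValueError("responses must lie between 1 and scale_max")
    # Every item is positively worded, so each contributes (response - 1).
    raw = sum(r - 1 for r in responses)
    # Maximum raw score: 9 items * 4 points = 36 for the nine-item form.
    max_raw = len(responses) * (scale_max - 1)
    return 100.0 * raw / max_raw


# Hypothetical answers from one participant to the nine retained items.
example = [4, 5, 4, 4, 3, 5, 4, 4, 4]
print(round(score_positive_sus(example), 1))
```

A participant who skips an item, as one did in this study, cannot be scored this way without imputation, which is one reason to exclude that score from the analysis.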

Procedure

Usability Study

A total of 27 undergraduate students from Rice University were enrolled in the usability study, nine for each of the three verbosity conditions. A signed copy of the IRB-approved informed consent form was obtained from all participants, who then verbally reported their demographic information. Participants comprised twenty-three females and four males. The age of participants ranged from 18 to 21 years (M = 19.22, SD = 1.01). During this initial intake stage of the study, participants were verbally briefed on what to expect in a usability study, with emphasis on the fact that the study was testing the decision aid and not them, and that researchers would not assist them in their task. Participants were also given a breakdown of the study procedure, which included completing the task, filling out a survey, and participating in a short interview.

For the task, participants were first asked to read the scenario and the instructions for the usability task on the instruction page. After they indicated that they had finished reading, they were given the decision aid verbosity condition randomly assigned to them. A timer was started as soon as the decision aid was handed to them and stopped when participants indicated that they had marked their decision on the decision aid. Immediately afterward, participants were asked to complete the printed survey containing the SUS, the OW scale, and the scales measuring information quantity, information usefulness, and confidence in decision-making.
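One way to obtain nine participants per condition while keeping the assignment random is to shuffle a balanced list of condition slots. The sketch below illustrates that idea only; the source does not describe how randomization was carried out, and the function name, participant IDs, and seed are hypothetical.

```python
import random


def assign_conditions(participant_ids, conditions=("LV", "MV", "HV"), seed=None):
    """Randomly assign participants so each condition gets an equal share."""
    if len(participant_ids) % len(conditions) != 0:
        raise ValueError("participants must divide evenly across conditions")
    per_condition = len(participant_ids) // len(conditions)
    # One slot per participant: nine LV, nine MV, nine HV for 27 people.
    slots = [c for c in conditions for _ in range(per_condition)]
    rng = random.Random(seed)  # seeded only so the example is repeatable
    rng.shuffle(slots)
    return dict(zip(participant_ids, slots))


# Hypothetical participant IDs P01 through P27.
assignment = assign_conditions([f"P{i:02d}" for i in range(1, 28)], seed=1)
```

Shuffling slots rather than drawing a condition independently for each participant guarantees equal group sizes, which matters with a sample this small.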

During the semi-structured interview portion of the study, participants were asked how they used the decision aid and if they had any initial thoughts. Then they were asked to circle areas of the decision aid that they found helpful with a green highlighter, to circle areas they did not find useful with an orange highlighter, and asked why they found those areas useful or not useful. Finally, participants were asked about their opinions on and suggestions for the decision aid. Follow-up questions were asked to understand participant feedback further if anything was unclear.

After each study session, all feedback, written survey responses, and timer data were entered electronically into a cloud service in de-identified form. All hand markings and written records were stored securely in a file cabinet.

Analysis

The quantitative measures collected from the survey responses and task time were compared between verbosity conditions. One-way ANOVAs were conducted to compare the means of the low, medium, and high verbosity conditions for SUS, OW, information quantity, information usefulness, confidence in decision-making, and task time.
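For readers less familiar with the test, the following sketch shows how the F statistic of a one-way ANOVA is formed from between-group and within-group variability. It is written in plain Python for transparency and is not the authors' analysis script; the function name and the three small groups of numbers are hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: how far group means sit from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of observations around their own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three conditions (not data from this study).
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df_b, df_w)
```

Turning F into a p-value requires the F distribution, which a statistics package would normally supply; with nine participants per condition the degrees of freedom would be 2 and 24.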

There is no right or wrong way to use the decision aid; we were primarily interested in how users interacted with it and what issues they experienced while using it. As this formative usability study did not employ a think-aloud protocol, all qualitative insights were captured in the feedback provided by users during the semi-structured interview. A disadvantage of capturing feedback only through interviews is that participants may not report some of the usability issues they experienced. All reported feedback was compiled and coded in a spreadsheet. For each piece of feedback, we recorded how many participants expressed it, both overall and within each verbosity condition. All usability feedback was binned into distinct areas of improvement for the decision aid. Additional relevant comments other than usability feedback were noted.
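The tallying step described above can be sketched as follows. This is an illustration of the bookkeeping, assuming each coded comment is stored as a (participant, condition, code) record; the record layout, code labels, and function name are hypothetical and do not reproduce the authors' spreadsheet.

```python
from collections import defaultdict


def tally_feedback(records, group_sizes):
    """Percentage of participants per condition, and overall, who voiced each code.

    records: iterable of (participant_id, condition, feedback_code)
    group_sizes: mapping of condition -> number of participants
    """
    seen = set()
    counts = defaultdict(lambda: defaultdict(int))
    for pid, condition, code in records:
        if (pid, code) in seen:  # count each participant once per code
            continue
        seen.add((pid, code))
        counts[code][condition] += 1
    total_n = sum(group_sizes.values())
    table = {}
    for code, by_cond in counts.items():
        row = {c: round(100 * by_cond.get(c, 0) / n) for c, n in group_sizes.items()}
        row["Total"] = round(100 * sum(by_cond.values()) / total_n)
        table[code] = row
    return table


# Hypothetical coded comments; P01 repeats the same point twice.
records = [
    ("P01", "LV", "more_info"),
    ("P01", "LV", "more_info"),
    ("P02", "LV", "more_info"),
    ("P10", "MV", "more_info"),
    ("P20", "HV", "length"),
]
table = tally_feedback(records, {"LV": 9, "MV": 9, "HV": 9})
```

With nine participants per condition, each participant corresponds to roughly 11 percentage points within a condition and roughly 4 points overall.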

Results

Quantitative Usability Data

One participant enrolled in the medium verbosity condition missed scoring one statement on the SUS, and therefore their SUS score could not be included in the analysis. Descriptive statistics of the usability metrics collected are shown in Table 1. One-way ANOVAs showed no significant differences between the three verbosity conditions for all quantitative metrics: SUS, OW, information quantity, information usefulness, confidence in decision-making, and task time.

Table 1.

Descriptive statistics of quantitative usability metrics for low, medium, and high verbosity decision aids, as well as the overall average.

Quantitative Metrics	Low verbosity (LV)	Medium verbosity (MV)	High verbosity (HV)	Overall
System usability scale (SUS) (1–100)	76.2 (11.2)	81.9 (10.3)	75.2 (16.23)	77.6 (12.8)
Overall workload (OW) (1–100)	51.7 (26.7)	36.9 (25.6)	48.3 (24.6)	45.6 (25.5)
Information quantity (1–7)	3.67 (1.0)	3.33 (1.5)	3.78 (1.1)	3.6 (1.2)
Information usefulness (1–7)	5.11 (1.2)	4.78 (1.2)	5.22 (1.6)	5.0 (1.3)
Confidence in decision (1–100)	71.1 (18.3)	55.6 (19.4)	62.2 (22.2)	63.0 (20.3)
Task time (seconds)	73.4 (7.5)	342.2 (127.7)	770.9 (997.0)	395.5 (629.7)
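The Overall column can be cross-checked from the condition means by weighting each mean by its group size. The sketch below assumes group sizes of 9, 8, and 9 for SUS (one MV score was excluded) and 9 per condition for the other metrics; the function name is hypothetical.

```python
def weighted_mean(means, sizes):
    """Combine group means into an overall mean, weighting by group size."""
    total = sum(sizes)
    return sum(m * n for m, n in zip(means, sizes)) / total


# Condition means from Table 1, in the order LV, MV, HV.
sus_overall = weighted_mean([76.2, 81.9, 75.2], [9, 8, 9])
ow_overall = weighted_mean([51.7, 36.9, 48.3], [9, 9, 9])
print(round(sus_overall, 1), round(ow_overall, 1))  # 77.6 45.6
```

Both values agree with the Overall column of Table 1.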

Qualitative Usability Data

Most of the qualitative feedback regarding the usability of the decision aid could be categorized into two distinct areas of improvement: (a) how informational pages regarding the types of medical care were organized and (b) the final decision page. The remaining pieces of feedback that could not be binned with each other were labeled under a third category, (c) miscellaneous. Feedback that was only expressed by one participant and could not be grouped with other feedback under a common theme was not included in the analysis. The qualitative feedback received is illustrated in Table 2.

Table 2.

Usability feedback received from participants, grouped according to areas of improvement. Each feedback is presented along with the percentage of participants who expressed it in each of the verbosity conditions (% of LV, % of MV, and % of HV) and in the overall study (% of Total).

Areas of improvement	Qualitative feedback	% of LV	% of MV	% of HV	% of total
Description of types of medical care	More information regarding care options	56%	44%	11%	37%
	Risks and benefits are/would be helpful	33%	11%	33%	26%
	Outcome statistics are/would be helpful	11%	11%	33%	19%
	Break up large bodies of text, for example, with bullet points	11%	0%	11%	7%
	Same formatting for every care option	0%	11%	11%	7%
	Difference in care options made clearer	0%	0%	11%	4%
Final decision page	Spectrum most useful/helpful to compare options	56%	33%	22%	37%
	Photos were not useful and were repetitive	56%	0%	33%	30%
	Remove redundant chunks of text	33%	0%	0%	11%
	Present options on horizontal axis	22%	11%	0%	11%
	Remove generic titles for care options	78%	33%	11%	41%
Miscellaneous	Decision aid should be less cold/gloomy	33%	11%	22%	22%
	Relevance of condition prior to illness should be made clearer	11%	33%	0%	15%
	Addition of cost comparisons	33%	11%	0%	15%
	Question on value of patient helpful	0%	22%	22%	15%
	Length is intimidating	0%	0%	33%	11%
	Definition of family irrelevant	0%	11%	11%	7%

In addition to the usability feedback received, 21 participants (78%) expressed that they would want the decision aid when making medical decisions about an incapacitated loved one, 12 (44%) expressed that the information provided was sufficient to make a choice between the care options, 11 (41%) expressed that the information was not sufficient, and 10 (37%) expressed that they would want a doctor to be present in addition to having the decision aid. Seventeen (63%) participants noted that they read all parts of the decision aid, and several expressed that they would want to read all the information given to them if they were making a medical decision about a loved one.

Discussion

The sample size of nine participants per verbosity condition, for a total of 27 participants, is sufficient to capture an average of 85.5% to 94.7% of usability errors per verbosity condition (Faulkner, 2003). Therefore, the sample size was deemed sufficient for capturing usability issues prior to clinical testing. No significant differences were found between verbosity conditions on any of the quantitative metrics in one-way ANOVAs, but these tests were underpowered (power < .2 across all quantitative measures) given the small sample size. Because no differences reached statistical significance, we relied on qualitative usability feedback from participants to contextualize the quantitative trends.
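For intuition about why each added participant yields diminishing returns, a commonly used simplification models the expected share of problems found as 1 - (1 - p)^n, where p is the chance that a single participant encounters a given problem and n is the number of participants. This is not the resampling analysis reported by Faulkner (2003), and the value of p below is purely hypothetical.

```python
def expected_discovery(p, n):
    """Expected share of problems found by n participants.

    Assumes every problem is equally likely to be encountered and that
    participants are independent, which real studies only approximate.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


# Hypothetical per-participant detection probability of 0.2.
for n in (5, 9):
    print(n, round(expected_discovery(0.2, n), 3))
```

Under this model the gain from five to nine participants is real but smaller than the gain from one to five, which is the general pattern behind sample-size guidance for formative testing.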

An overall mean SUS score of 77.6 indicated that the perceived usability of the three verbosity conditions of the decision aid was “good” (Bangor et al., 2009) but could be improved. In this discussion, we examine how both the qualitative and quantitative data inform areas of usability improvement for the next iteration of the decision aid.

The MV condition had the best score for perceived ease of use, SUS (M = 81.9), indicating that the parts included in the MV condition may have had the highest ease of use. It is possible that the part unique to the HV condition, the description of the types of care options, lowered the overall SUS score of the HV condition (M = 75.2). This is supported by the fact that one of the major areas of improvement that emerged from participant feedback concerned the descriptions of care options. The lower score of the LV condition (M = 76.2) could also be explained by the fact that another major area of improvement concerned the decision page, which is all that was presented in the LV condition. These trends were mirrored in the perception of workload. The MV condition had the lowest OW score (M = 36.9), indicating the best usability, whereas the HV (M = 48.3) and LV (M = 51.7) conditions had higher OW scores, indicating more usability issues. Furthermore, the high variability in task time relative to the mean for the HV condition (M = 770.9, SD = 997.0) suggests a large positive skew in the distribution of data, indicating that some participants took much longer than others to get through the HV decision aid, possibly due to usability issues.
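The spread of task time relative to its mean can be expressed as a coefficient of variation (SD divided by mean). The sketch below uses the task-time values from Table 1; the function name is hypothetical, and reading a ratio above 1 as a sign of right skew is a rule of thumb for non-negative data rather than a formal test.

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a fraction of the mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return sd / mean


# Task time (mean, SD) in seconds from Table 1.
task_time = {"LV": (73.4, 7.5), "MV": (342.2, 127.7), "HV": (770.9, 997.0)}
cv = {cond: round(coefficient_of_variation(m, s), 2) for cond, (m, s) in task_time.items()}
print(cv)  # {'LV': 0.1, 'MV': 0.37, 'HV': 1.29}
```

Because task time cannot be negative, an SD larger than the mean, as in the HV condition, is hard to reconcile with a symmetric distribution and points to a few very long sessions.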

Users would like more information about their care options than what was presented in the MV and LV conditions, based on 10 participants sharing this feedback. This is further supported by the fact that the HV condition scored closest to the ideal value of four on information quantity (M = 3.78). Furthermore, a large portion of participants read every part of the decision aid, and several indicated that they wanted to know all the information available when making a medical decision about a loved one. Because users wanted more care information, as was provided in the HV condition, but reported lower ease of use and higher workload where the care descriptions were included, it was determined that the next iteration of the decision aid should include detailed descriptions of care options, reorganized to be more usable. Acting on usability feedback regarding the description of the care options, such as focusing on risks/benefits and outcome statistics, chunking large bodies of text into smaller sections, and formatting all care options consistently, could raise ease of use and reduce workload while increasing confidence in decision-making.
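The comparison against the ideal value can be made explicit by measuring each condition's distance from the scale midpoint of 4 ("had the right amount of information"). The sketch below uses the information quantity means from Table 1; the variable and function names are hypothetical.

```python
IDEAL = 4  # midpoint anchor: "had the right amount of information"


def distance_from_ideal(means, ideal=IDEAL):
    """Absolute distance of each condition mean from the ideal midpoint."""
    return {cond: round(abs(m - ideal), 2) for cond, m in means.items()}


# Information quantity means from Table 1.
info_quantity = {"LV": 3.67, "MV": 3.33, "HV": 3.78}
distance = distance_from_ideal(info_quantity)
closest = min(distance, key=distance.get)
print(distance, closest)  # {'LV': 0.33, 'MV': 0.67, 'HV': 0.22} HV
```

All three means fall below 4, that is, on the "not enough information" side of the scale, with HV nearest the midpoint.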

Ten participants indicated that they found the decision page helpful in comparing the different care options. This suggests that participants wanted a visual aid that presented the key pieces of information for each care option and allowed them to easily compare the options when making a medical decision. This is further supported by one of the higher scores in information usefulness seen in the LV condition (M = 5.11), which only contained the decision page. Because users wanted a visual aid with key pieces of information but also reported a large proportion of usability issues with it, it was determined that the next iteration of the decision aid should include a reformatted version of the decision page. Integrating usability feedback regarding the decision page, such as removing anchoring photos, making titles more descriptive/useful, removing redundant chunks of text, and presenting information in a horizontal format, could improve the overall usability of the decision aid.

Conclusion

We demonstrated that through a formative usability study of an initial decision aid with students, we were able to identify numerous usability improvements before testing with caregivers. In the practice of usability research, careful considerations need to be made around iterative testing with the target population in situations where they are challenging to recruit or the context around the situation being tested is of a sensitive nature. Therefore, it is important to solve as many usability problems as possible before testing with target users.

The study found no significant differences in usability between the verbosity conditions of the decision aid, suggesting that participants did not prioritize verbosity when using the decision aid. These results are mirrored in the qualitative feedback received, where despite making suggestions around areas of improvement, participants expressed that they would want to read all the information given to them if they were making a medical decision about an incapacitated family member. Despite participants expressing a desire for more information, the ease of use score was the lowest for the high verbosity condition, indicating that other aspects were at play in the usability of the decision aid. Participants provided usability feedback that boiled down to two main areas of improvement: (a) reorganizing the descriptions of the types of medical care and (b) reformatting the decision page to help users easily compare care options. For the next iteration of the decision aid, we will integrate the feedback regarding these two areas of improvement and consider integrating the miscellaneous feedback as well, which will likely raise the usability scores of the decision aid.

The findings in this study are limited by the fact that the study was conducted in a laboratory setting where participants were not experiencing the same emotional burdens as target users. Future studies will need to be conducted to confirm the final design of the decision aid in a clinical setting.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Shababa Matin

References

Bangor, A., Kortum, P., & Miller, J. (2009). Determining what individual SUS scores mean: Adding an adjective rating scale. Journal of Usability Studies, 4(3), 114–123.

Brooke, J. (1995). SUS: A quick and dirty usability scale. Usability Evaluation in Industry, 189, 4–7.

DeMartino, E. S., Dudzinski, D. M., Doyle, C. K., Sperry, B. P., Gregory, S. E., Siegler, M., Sulmasy, D. P., Mueller, P. S., & Kramer, D. B. (2017). Who decides when a patient can’t? Statutes on alternate decision makers. The New England Journal of Medicine, 376(15), 1478–1482. https://doi.org/10.1056/NEJMms1611497

Faulkner, L. (2003). Beyond the five-user assumption: Benefits of increased sample sizes in usability testing. Behavior Research Methods, Instruments, & Computers, 35, 379–383.

Hill, S. G., Iavecchia, H. P., Byers, J. C., Bittner, A. C., Zaklad, A. L., & Christ, R. E. (1992). Comparison of four subjective workload rating scales. Human Factors, 34(4), 429–439.

Jackson, S. A., & Kleitman, S. (2014). Individual differences in decision-making and confidence: Capturing decision tendencies in a fictitious medical test. Metacognition and Learning, 9, 25–49. https://doi.org/10.1007/s11409-013-9110-y

Lewis, J. R., & Sauro, J. (2017). Can I leave this one out? The effect of dropping an item from the SUS. Journal of Usability Studies, 13, 38–46.