Abstract
Self-service technology is widely used in the financial sector so that consumers can conduct financial transactions easily. For example, banks have successfully introduced cardless withdrawal services at ATMs. However, several factors still prevent cardless withdrawals from being widely adopted. In the preliminary investigation for this study, we examined the user experience design of cardless cash withdrawals at the three banks with the largest ATM market share in Taiwan. After summarizing the problems found, we optimized the design into two sets of typography and two sets of guidance. We used the task-oriented cognitive walkthrough method with 36 participants and adopted post-task and post-session self-report assessments, such as SEQ, NASA-TLX, satisfaction, preference, and NPS. Finally, semi-structured interviews were conducted to understand users’ performance, thoughts, and problems when completing the tasks. We found that users with different levels of experience have different needs for the interface, which are reflected in different workload items. In addition, we identified correlations between the scales and offer suggestions for industry practitioners who want to practice data-driven UX.
Plain language summary
Why was the study done? Modern ATMs prioritize usability as financial services shift towards user experience. Cardless services offer enhanced security because app verification reduces the risk of password theft, but their longer operation times hinder widespread adoption. This study examines the ATM cardless system with smartphone verification to assess user interactions and understanding. What did the researchers do? First, we carried out a situational survey. We assessed the cardless interfaces of the top three banks by ATM count, using a task-oriented cognitive walkthrough and feedback from 30 participants. We selected the bank with the best outcomes and designed two typography versions and two guidance versions. We then repeated the research steps of the initial phase. The experiment used four material combinations and two tasks: “applying for cardless service” and “cardless cash withdrawal.” Participants were unfamiliar with cardless transactions. What did the researchers find? We collected data from 36 participants across four groups with diverse ATM experience. Metrics included operation times, errors, and scores from SEQ, NASA-TLX, satisfaction, preference, SUS, willingness to use, and NPS. Interviews provided insights into issues and participants’ opinions. High- and low-experience users differed in interface usage and workload. We analyzed correlations in data-driven UX, discussed information architecture, visual hierarchy, and attention allocation, and offered UX recommendations. Our optimized design greatly surpassed the bank’s original interface in performance and user ratings. What do the findings mean? The optimized design outperformed the original bank interface. We refined the visual design by applying heuristic evaluation and visual hierarchy theories. Beyond task performance metrics, we gained an understanding of how the experience differs between seasoned and new users. In conclusion, we offer industry insights on user experience evaluation.
Keywords
Introduction
According to the latest statistics from Financial Supervisory Commission R.O.C.—Banking Bureau (2023), there were 32,913 ATMs in Taiwan at the end of August 2023, the highest ATM density in the world. ATMs not only help emerging and mature markets move towards urbanization and digitalization but also reduce the frequency of human error among bank branch employees, thereby ensuring efficient operations and saving consumers time in queues. ATMs will gradually cease to be just automated devices for withdrawals or deposits; in the future, banking services will be ubiquitous and consumer-led (King, 2012, 2018). NCR, an Australian ATM development company, has said that today’s ATMs have become an established and familiar part of the banking system and that improving the system’s usability will be the primary development focus in the future (Cluckey, 2020; NCR, 2021). The Financial Supervisory Commission launched the “Building Digital Financial Environment Plan 3.0” in 2015 (Financial Supervisory Commission R.O.C, 2013), which promoted broader application of ATMs than in the past, bringing more convenience and security to meet consumer demand for transactions. The service model of financial enterprises is becoming more user-experience-oriented. At the same time, the form and function of ATMs increasingly draw on the smartphone user experience: because users are already familiar with touch screens, pinch gestures, mobile authentication, and other smartphone operations, it is recommended that these interactions also be added to ATMs (NCR, 2021).
Coventry et al. (2003) found that successfully completing a task on an ATM at the first attempt increases user satisfaction and, thus, the willingness to use the feature. However, the needs and expectations of bank customers are also expanding, and financial services should offer independence, flexibility, and freedom to meet these needs (Khalfan & Alshawaf, 2014; Lustsik, 2004). Customer satisfaction with the use process ultimately leads to brand loyalty (Churchill & Surprenant, 1982; Jamal, 2007), and satisfaction arises when customers compare their perception of actual product or service performance with their expectations (Oliver, 1980). Many studies have also shown that the quality of e-banking services significantly affects customer satisfaction (Asiyanbi & Ishola, 2018; Bei & Chiao, 2006; Hammoud et al., 2018; Ranaweera & Neely, 2003; Zhou, 2004). Das et al. (1996) surveyed customer satisfaction with Canadian public services and found that user satisfaction is affected by user experience and consumer demand.
IBM considers security an important factor in operating ATMs and defines authentication in terms of four elements: what the consumer knows (password), what the consumer has (bank card), what the consumer is (fingerprint), and what the consumer can do (OTP, appointment number, QR code; Kessem, 2018; Lin, 2015). Cardless services have a security advantage: consumers verify their identity through mobile apps, so they need not worry about password theft (C. Gordon, 2016). In addition to contactless transactions through smartphones, today’s ATMs also offer card-not-present transaction services that use mobile device authentication in place of bank card identity verification. Forgetting to carry a bank card is the main motivation for consumers to use cardless services for the first time (Chang, 2019). Card-not-present services offer a variety of use cases to meet consumer demand for cash (CommBank, 2014).
Currently, banks provide various cardless cash withdrawal methods, including iris recognition, fingerprint recognition, face recognition, third-party application binding, NFC sensing, and smartphone verification. However, because of people’s doubts about and distrust of biometrics and the limited sensing functions of some smartphones, several of these methods are not applicable to most users. Among the various cardless withdrawal methods, NFC cardless services can reduce the transaction time from 30–40 s to about 10 s. Speed is an important part of the consumer experience, and such services can be more convenient than traditional ATM transactions (Walker, 2017). De Luca and Frauendienst (2008) evaluated ATM password authentication and found that operation time and the number of steps also affect users’ satisfaction with the system; they suggested that identity authentication should take no more than 10% of the total operation time and reported that the public tends to prefer passwords for identity authentication. Seifert et al. (2012) moved part of the ATM operation to the smartphone and found that hybrid operation of the ATM and smartphone is faster and more comfortable. Currently, cardless service is not a common way to operate ATMs because cardless withdrawals take more time than traditional transactions. As a result, banks are still unable to popularize cardless services despite continuous promotion (Chenlin, 2020).
Therefore, this study investigates the ATM cardless service that uses smartphone identity verification to understand how users perceive and act on the information in the interface. In addition to quantifying users’ operating performance in depth, we evaluate the service through users’ self-reports, including SEQ, NASA-TLX, a satisfaction scale and preference, SUS, “willingness to use,” and NPS. We hope to achieve three goals:
(1) Optimize the interface visual design to shorten the cardless withdrawal operation time and provide a good user experience.
(2) Understand, through various user experience survey methods, the pain points that different users encounter when operating cardless services on ATMs.
(3) Propose an ATM cardless service interface design scheme that conforms to user experience as a reference for future related interface design.
Literature Review
Evaluate the Core Concepts of UX
According to the ISO 9241-210 (International Standard, 2019) definition, “User experience is user’s perceptions and responses that result from the use and/or anticipated use of a system, product or service.…users’ perceptions and responses include the users’ emotions, beliefs, preferences, perceptions, comfort, behaviors, and accomplishments that occur before, during and after use.” Based on this definition, this study adopts some core concepts and indicators of user experience, among which the information architecture of the interface needs to be examined. Using websites as an example, Garrett (2002) proposes that the “Elements of UX,” ordered from abstract to concrete, are Strategy, Scope, Structure, Skeleton, and Surface. In a nutshell, the surface layer must map the structure layer to user needs and corporate goals. Rosenfeld et al. (2015) proposed that users, content, and context be taken as the core concepts of information architecture and applied to user experience. Camilli et al. (2011) found that simplifying the information architecture reduces user error rates in the operating system while effectively increasing user satisfaction. Visual hierarchy can be key in information architecture planning (Kingston, 2020). The “Usability Heuristics” are a concrete and effective approach when information architecture corresponds to the Skeleton and Surface layers. Nielsen’s (1994) “Usability Heuristics” are still applicable to user interface design; experts use them to evaluate “visibility of system status,” “user control and freedom,” “consistency,” “error prevention and recognition,” “flexibility and efficiency,” “aesthetic and minimalist design,” “providing user help,” and so on. Therefore, when we review the existing ATM interfaces, their information architecture is one of the critical points, and we use the “Usability Heuristics” to evaluate the design.
In addition, information architect Morville (2004) further proposed the seven facets of the user experience honeycomb as UX goals: usable, useful, desirable, findable, valuable, accessible, and credible.
Lee and Allaway (2002) found that enhancing the customer’s sense of personal control over the service experience reduces perceived risk (financial, performance, social, psychological, safety, and time/convenience losses; Dowling, 1986; Peter & Tarpey, 1975) and accelerates users’ willingness to use new self-service technology. Event predictability, controllability, and outcome desirability allow users to perceive personal control (Averill, 1973). This echoes the Usability Heuristics, in which user control is one of the items used to evaluate interface design. Years of practice and academic research show that interface design, information architecture, visual design, and user experience are inseparable.
Visual Hierarchy and User’s Attention
The visual hierarchy controls the delivery of the experience; if users do not know where to look, the layout likely lacks a clear visual hierarchy (K. Gordon, 2020). A design that follows visual design principles and is aesthetically pleasing can improve usability, trigger positive emotions, and strengthen brand recognition (Lupton, 2015; Poulin, 2018). Perhaps this explains why so many UX practitioners began as visual designers.
Faraday’s visual hierarchy guidelines (Faraday, 2000) divide the visual hierarchy model into two phases. The first is the “search phase,” in which motion, size, image, color, text style, and position, ordered from most to least impactful, guide the user to salient “entry points.” The second is the “scan phase,” in which the user extracts information from nearby elements; this is achieved by following grouping principles and reading conventions. However, Still’s (2018) study found that the entry point of visual attention is best predicted by position, color, or text style rather than size or image. Moreover, attention is not solely stimulus-driven, that is, triggered by visual features.
The early dichotomy of attention (bottom-up vs. top-down) has developed into a trichotomy: Theeuwes (2019) proposed three factors, namely stimulus-driven, goal-driven, and history-driven selection. His findings suggest that lingering biases from previous selection episodes significantly influence attentional selection. Therefore, history-driven selection can also be used at the visual level to guide the user’s attention; for example, the ATM interface can use general design patterns such as “center stage” and the “Z-layout” or “F-layout.”
Task-Oriented Cognitive Walkthrough
The “Cognitive Walkthrough” method is suitable for assessing how easy a system is to learn from the perspective of a new user (Salazar, 2022). It is a low-cost, fast, and effective method for evaluating ease of use (Lyon et al., 2021). C. Lewis et al. (1990) developed cognitive walkthroughs to evaluate ready-to-use interfaces such as kiosks and ATMs. In contrast to heuristic evaluation and usability testing, it focuses on the user’s cognitive activities, specifically their goals, when performing specific tasks (Mahatody et al., 2010). Cognitive walkthroughs are an influential method that designers use to solve cognitive challenges in design (Ning et al., 2019).
The design and implementation process includes determining the task conditions, layering the task analysis, prioritizing the tasks, converting the tasks into test scenarios, grouping the participants, and finally identifying and classifying usability problems. During the walkthrough, the participant goes through each task step, allowing the designer to identify problems in the interface design (C. Lewis & Wharton, 1997). Observers attend to four critical questions in the cognitive walkthrough method (Blackmon et al., 2002): (1) Will users try to achieve the right result? (2) Will users notice that the correct action is available? (3) Will users associate the correct action with the result they are trying to achieve? (4) After the action is performed, will users see that progress is made toward the goal? It is also necessary to record the completion time and number of errors for each task, note when potential problems appear in the participant’s behavior, and conduct interviews afterward.
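The recording requirements above can be pictured as a small data structure. The following is only a minimal sketch: the class names, field names, and step labels are hypothetical and are not taken from the study's materials.

```python
from dataclasses import dataclass, field

# The four walkthrough questions (Blackmon et al., 2002), in order.
QUESTIONS = (
    "tries to achieve the right result",
    "notices the correct action is available",
    "associates the correct action with the intended result",
    "sees progress toward the goal after acting",
)


@dataclass
class StepObservation:
    """Observer's yes/no answers to the four questions for one task step."""
    step: str
    answers: tuple  # four booleans, aligned with QUESTIONS
    note: str = ""

    def failed_questions(self):
        """Return the questions answered 'no' for this step."""
        return [q for q, ok in zip(QUESTIONS, self.answers) if not ok]


@dataclass
class TaskRecord:
    """Per-participant record for one task: time, errors, and step notes."""
    task: str
    seconds: float
    errors: int
    steps: list = field(default_factory=list)

    def problem_steps(self):
        """Steps where at least one of the four questions was answered 'no'."""
        return [s.step for s in self.steps if s.failed_questions()]


# Hypothetical example record (illustrative values only).
record = TaskRecord(task="Task 1", seconds=64.0, errors=1)
record.steps.append(StepObservation("open cardless menu", (True, True, True, True)))
record.steps.append(
    StepObservation("find application entry", (True, False, True, True),
                    note="looked at verification page first")
)
print(record.problem_steps())
```

Keeping the four answers per step makes it possible to link a timing or error anomaly to the specific question that failed, which can then be raised in the follow-up interview.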
User Self-Report Evaluation
In addition to observing the user’s key questions and performance in the cognitive walkthrough, post-task self-report assessments can measure overall perceived usability and allow existing problems to be compared with the observations (Laubheimer, 2018). This study used two types of questionnaires. Post-task questionnaires are completed immediately after a task and capture participants’ impressions of that task, yielding many subjective answers. Post-session questionnaires reflect the usability of the overall experience as reported by users because of the peak-end effect (Cockburn et al., 2015; Kahneman et al., 1993). The post-task questionnaires include the Single Ease Question (SEQ), NASA-TLX, satisfaction, and preference; the post-session questionnaires include SUS, the willingness-to-use score, and the Net Promoter Score (NPS).
SEQ assesses “ease” on a seven-point scale with a single item (Sauro & Dumas, 2009). Albert and Dixon (2003) consider it more important to compare the post-task evaluation with pre-task scores, an approach known as expectation measurement. The results are plotted with the average expectation score and the average experience score as the X and Y axes, respectively, which divides tasks into four quadrants: “Fix it fast,” “Don’t touch it,” “Promote It,” and “Big Opportunity.”
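The quadrant assignment can be sketched as a small function. This is an illustration under stated assumptions: the scale direction follows this study (1 = very easy, 7 = very difficult), the scale midpoint of 4 is used as the cut-off (a hypothetical choice, since cut-offs vary in practice), and the quadrant meanings follow the usual reading of Albert and Dixon's scheme.

```python
def expectation_quadrant(expected, experienced, midpoint=4.0):
    """Classify a task from its mean pre-task and post-task difficulty ratings.

    Lower scores mean easier (1 = very easy, 7 = very difficult).
    The midpoint cut-off is an assumption made for this sketch.
    """
    expected_easy = expected < midpoint
    experienced_easy = experienced < midpoint
    if expected_easy and experienced_easy:
        return "Don't touch it"    # expected easy, turned out easy
    if expected_easy:
        return "Fix it fast"       # expected easy, turned out difficult
    if experienced_easy:
        return "Promote it"        # expected difficult, turned out easy
    return "Big opportunity"       # expected difficult, turned out difficult


print(expectation_quadrant(expected=5.5, experienced=2.0))
```

A task that users expect to be easy but find difficult is the most urgent case, because the gap between expectation and experience is what drives dissatisfaction.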
NASA-TLX (Task Load Index) is a standard questionnaire used in many human factors and ergonomics studies. NASA-TLX assesses users’ workload during operation based on a weighted average of six indicators: Mental Demand (MD), Physical Demand (PD), Temporal Demand (TD), Performance (OP), Effort (EF), and Frustration (FR). The scale is administered in two steps. First, the user compares the six indicators in pairs to determine which are more important when operating the system. The user then rates each of the six indicators on a scale of 0 to 100, and the ratings are weighted to obtain the final score (Bianchi et al., 2010; Hart, 2006; Hart & Staveland, 1988; Laubheimer, 2018). After 35 years of development, Helton et al. (2022) argue that Physical Demand should be removed because it may not be meaningfully combinable with cognitive demands.
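The two-step procedure can be made concrete with a short sketch. It follows the standard weighted NASA-TLX computation (15 pairwise comparisons, each indicator's weight being the number of times it was chosen, and the weighted ratings divided by 15). The function names and the example answers are hypothetical, not data from this study.

```python
DIMENSIONS = ("MD", "PD", "TD", "OP", "EF", "FR")
N_PAIRS = 15  # number of distinct pairs among six indicators


def tlx_weights(winners):
    """Tally the indicator chosen in each of the 15 pairwise comparisons."""
    if len(winners) != N_PAIRS or any(w not in DIMENSIONS for w in winners):
        raise ValueError("expected 15 winners drawn from the six indicators")
    return {d: winners.count(d) for d in DIMENSIONS}


def tlx_weighted_score(weights, ratings):
    """Weighted workload: sum(weight * rating) / 15, ratings on 0 to 100."""
    if any(not 0 <= ratings[d] <= 100 for d in DIMENSIONS):
        raise ValueError("ratings must lie between 0 and 100")
    return sum(weights[d] * ratings[d] for d in DIMENSIONS) / N_PAIRS


# HYPOTHETICAL answers from one participant (illustrative only).
example_winners = ["MD"] * 5 + ["TD"] * 4 + ["FR"] * 3 + ["EF"] * 2 + ["OP"]
example_ratings = {"MD": 40, "PD": 5, "TD": 30, "OP": 20, "EF": 25, "FR": 10}

weights = tlx_weights(example_winners)
print(weights, tlx_weighted_score(weights, example_ratings))
```

Because an indicator that is never chosen receives a weight of zero, its rating does not contribute to the overall score at all, which is one reason the weights themselves are worth reporting alongside the workload.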
Perceived usability predicts users’ willingness to adopt mobile banking (Agyei et al., 2020). The System Usability Scale (SUS) is considered an inexpensive and effective tool for evaluating the ease of use of a system (Tassabehji & Kamala, 2012). It is the most widely used questionnaire for perceived usability in academia and industry. Studies have shown that SUS has high reliability and validity and is suitable for assessment in different contexts (Peres et al., 2013). Since its development in the mid-1980s, SUS’s position as an assessment tool for perceived ease of use is still considered robust and stable (J. R. Lewis, 2018).
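The text does not spell out how SUS responses are scored, so the following is a sketch of the standard scoring rule only. It assumes the 10 items alternate between positively and negatively worded statements rated from 1 to 5, as in the original scale; whether the adapted questionnaire in Appendix 1 keeps that alternation is an assumption.

```python
def sus_score(responses):
    """Standard SUS score (0 to 100) from ten 1-to-5 item responses.

    Odd-numbered items contribute (response - 1); even-numbered items
    contribute (5 - response); the sum is multiplied by 2.5.
    """
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("expected ten responses between 1 and 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # positively worded item
        else:
            total += 5 - response   # negatively worded item
    return total * 2.5


# Hypothetical responses from one participant (illustrative only).
print(sus_score([4, 2, 5, 1, 4, 2, 4, 1, 5, 2]))
```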
The Net Promoter Score (NPS) was introduced as a new metric by Reichheld (2003) with a single question, “How likely are you to recommend [OOO] to a friend or colleague?” This metric predicts future business growth better than previously used customer satisfaction or loyalty metrics. Since then, NPS has been widely used in different industries, including consumer marketing, IT services, and healthcare, as a powerful indicator of customer loyalty. Its strength lies in its ease of administration and understanding, and it helps to understand customer satisfaction with a service or product (Owen, 2019). Baehre et al. (2022) demonstrated that NPS has the greatest predictive value when predicting sales growth. Participants answer “Willingness to introduce this product to your family or friends in the future” based on their experience, on a scale from very reluctant (0) to very willing (10). Based on the score, participants are divided into Detractors (0–6 points), Passives (7–8 points), and Promoters (9–10 points). Past studies found correlations between ease of use and NPS (Friedman & Flaounas, 2018; Sauro, 2010). User experience variables account for 32%–40% of the likelihood that users will recommend a product (Bradner & Sauro, 2012).
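The grouping above leads to a single number in the usual NPS computation: the percentage of Promoters minus the percentage of Detractors. The sketch below assumes that standard formula, which the text does not state explicitly; the function names and example scores are hypothetical.

```python
def nps_category(score):
    """Map a 0-to-10 recommendation score to its NPS category."""
    if not 0 <= score <= 10:
        raise ValueError("score must lie between 0 and 10")
    if score >= 9:
        return "Promoter"
    if score >= 7:
        return "Passive"
    return "Detractor"


def net_promoter_score(scores):
    """NPS = % Promoters - % Detractors, ranging from -100 to 100."""
    if not scores:
        raise ValueError("at least one score is required")
    categories = [nps_category(s) for s in scores]
    promoters = categories.count("Promoter")
    detractors = categories.count("Detractor")
    return 100 * (promoters - detractors) / len(scores)


# HYPOTHETICAL scores (illustrative only, not the study's data).
print(net_promoter_score([10, 9, 8, 7, 6, 3]))
```

Passives count toward the denominator but toward neither group, so a sample with many scores of 7 or 8 pulls the NPS toward zero.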
Summary
Based on the above discussion, this study approaches the UX improvement of the ATM cardless service from the visual layer design. Cognitive walkthroughs and post-task and post-session user self-reports are used to gain insight into the problems to be solved and to achieve the objectives of this study.
Related Work
Taiwan has more than 30,000 ATMs. The three banks with the most ATMs were selected as the research objects in the early stage of this study: Bank A (6,023), Bank B (4,191), and Bank C (3,647). We collated and analyzed the user flows of these three banks’ ATMs, including the application process for the cardless service and the smartphone cardless withdrawal process. The investigation process at this stage was the same as the research method described later (refer to Figure 2). The purpose was to identify problems in the three banks’ interface designs, select one bank’s ATM interface, and determine the design directions and conditions that most needed optimization as the basis for the experimental design.
This stage had 30 participants, aged 20 to 39, who had not used cardless services. Bank B performed best among the three banks, so we chose Bank B for further optimization: we modified the user experience process and interface components of its ATM cardless withdrawal interface and developed two proposals for different ATM operation experiences, which served as materials for the later experiment. The willingness-to-use assessment showed that, for all three banks, participants’ willingness to use cardless withdrawal decreased significantly after operation, by 30% to 60%. The ATM cardless withdrawal interfaces of the three banks cannot fully meet participants’ operational needs, and there is still room for improvement.
From the survey results of this stage, we synthesized three problems; the proposed solutions and UX goals are presented in Table 1.
The Solution to the Current Problem, the Heuristic Evaluation, and the UX Goal.
Materials
Based on the survey described above, the ATM of the best-performing bank was selected. The user experience process and interface components of its ATM cardless withdrawal interface were further optimized and corrected, and two designs were proposed for subsequent experimental verification.
The bank’s smartphone app interface caused few problems during the experiment, whereas the ATM interface caused more. There were two main problems: first, the hierarchical placement of the entry to the cardless service application lengthened participants’ task time; second, appropriate guidance was lacking when switching between devices (ATM and smartphone). As a result, participants did not know how to complete the task and spent more time completing it. Therefore, we optimized the material design based on the survey results (please refer to Table 1). For details on the new design and the arrangement of the two sets of task pages, please refer to Figure 1.

Design and process of two sets of tasks.
Typography A continues the traditional form of interface information. The ATM operation interface adopts the “Center Stage” pattern, maintaining the principle that one screen focuses on one thing, with the user’s current operation displayed in the center of the interface. On the cardless withdrawal homepage, the Z-Layout pattern maintains the reading order from top to bottom and places the application entry in the lower right corner. The cardless withdrawal verification page lists the authentication methods the user can choose for the cardless withdrawal.
Typography B is divided into two sections. It structures the layout according to a left-to-right visual hierarchy and uses size to determine visual weight. The information is divided into chunks according to its importance and relevance, with the main information highlighted on the right. For example, applying for a cardless service is usually a one-time operation, so it is classified as secondary information. On the cardless withdrawal homepage, the application entry is placed on the left side of the interface, and the cardless withdrawal operations are placed on the right side. On the cardless withdrawal verification page, the user chooses a cardless withdrawal authentication method and then proceeds to the authentication operation.
Guidance a (Ga) uses a numbered list to present the six steps, with bold type and darker color blocks highlighting important keywords. Ga is equipped with small icons carrying the same serial numbers as on the smartphone for users’ reference, to help users quickly take in the guide, continue operating on the smartphone app, and obtain the smartphone cardless withdrawal serial number.
Guidance b (Gb) is mainly icon-based, using large icons that carry the same serial numbers as on the smartphone. It presents the six steps one per slide page, and users can tap the left and right arrow buttons on the interface to move between the previous and next pages. Important keywords are also highlighted in bold for users’ reference, to help users follow the guidance confidently and without panic, continue operating on mobile online banking, and obtain the mobile cardless withdrawal serial number.
Method
The experiment was divided into four stages: the first stage was a questionnaire on the participant’s personal background and experience with ATMs and cardless cash withdrawal; the second stage was the task-oriented cognitive walkthrough; the third stage was the post-task assessment, including SEQ, NASA-TLX, satisfaction, and preference; the fourth stage evaluated overall usability with SUS, the willingness-to-use score, and NPS, followed by semi-structured interviews after the assessment (Figure 2).

Experiment progress.
Task Design for Cognitive Walkthroughs
The cognitive walkthrough comprised two tasks: Task 1 was the cardless service application, and Task 2 was the cardless smartphone cash withdrawal. A researcher was responsible for moderating, observing, and documenting. The walkthrough provided two scenarios with the following task goals:
Task 1 was to apply for cardless services for the first time. The scenario was “You have recently learned about the trend of cardless services as fintech. To experience the new technology, try applying for a cardless service in case you need it.” The goal of the task was to complete the service activation of “Cardless smartphone Cash Withdrawal” and see the “Service Successfully Activate” screen.
Task 2 was to complete cardless withdrawals. The situation was, “You’re about to go out for lunch, and when you get to the restaurant, you find that your wallet was left at home. To purchase lunch smoothly, please withdraw 1,000 dollars via smartphone serial number without a card.” The goal was to obtain the withdrawal serial number from the smartphone interface and enter it in the ATM interface.
Since the material contained two typography designs (A/B) and two guidance designs (Ga/Gb), it was divided into four groups: A-Ga, A-Gb, B-Ga, and B-Gb. Each participant performed only one of these sets.
Participants
A total of 36 participants, aged between 20 and 29, were recruited, none of whom had experience with cardless services. Based on the results of the ATM-related experience survey in the first stage, we classified the participants into three experience levels (low, medium, and high) and distributed them evenly across the four experimental groups; that is, each group had nine participants, three each with low, medium, and high experience.
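The balanced allocation described above can be sketched as follows. The paper does not say how individuals were assigned within each experience level, so the round-robin rule here is an assumption, and the participant IDs are hypothetical placeholders.

```python
from itertools import product

TYPOGRAPHIES = ("A", "B")
GUIDANCES = ("Ga", "Gb")
LEVELS = ("low", "medium", "high")


def allocate(participants_by_level):
    """Spread each experience level evenly over the four design groups.

    Round-robin assignment is an assumption of this sketch; shuffling each
    level's list beforehand would give a randomized allocation.
    """
    groups = {f"{t}-{g}": [] for t, g in product(TYPOGRAPHIES, GUIDANCES)}
    names = list(groups)
    for level in LEVELS:
        for index, participant in enumerate(participants_by_level[level]):
            groups[names[index % len(names)]].append((participant, level))
    return groups


# HYPOTHETICAL IDs: 12 participants per experience level, 36 in total.
participants = {level: [f"{level}-{i:02d}" for i in range(12)] for level in LEVELS}
for name, members in allocate(participants).items():
    print(name, len(members))
```

Stratifying by experience level before assignment keeps the four groups comparable, so that differences between designs are less likely to be confounded with prior ATM experience.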
Equipment and Location
This study was affected by the global COVID-19 pandemic, which made in-person experiments with participants impossible, so the experiment was conducted remotely. The interactive interface was designed in Figma. We used Google Meet online conferencing software for remote calls. The researcher shared the experimental operation interface through the remote control software AnyDesk and granted participants screen-control permission. Before the experiment began, researchers confirmed that the ATM screen display window was maintained at 1,024 × 768 pixels and that the smartphone interface simulated the screen size of the iPhone 11 Pro. All scale assessments were completed in electronic questionnaires after the session.
Design of the Scale
SEQ is based on a 7-point Likert scale, where 1 is extremely easy and 7 is extremely difficult. After the researcher announced each task, the participant was asked, “How easy do you expect it to be to complete Task 1/Task 2?” After each task was completed, the participant was asked again, “After completing the task, how difficult do you think Task 1/Task 2 was?” The satisfaction question was “How satisfied are you with the ATM layout/guide screen?” and the preference question was the multiple-choice question “What is your preference for the layout/guide?” Participants saw all the designs only at this point and had to choose one of A and B and one of Ga and Gb. Please refer to Appendix 1 for the SUS Scale (10 questions). NASA-TLX is available on NASA’s official website (https://humansystems.arc.nasa.gov/groups/tlx/). The willingness-to-use question was “Based on your experience just now, how willing are you to use cardless withdrawals in the future?” The NPS was scaled from 1 to 10 and asked, “Will you introduce this feature to your family or friends in the future?”
Results
This experiment used four groups (A-Ga, A-Gb, B-Ga, and B-Gb), and data were obtained from 36 participants.
Cognitive Walkthrough
Before performing the cognitive walkthrough, participants estimated that Task 1, “Cardless Service Application,” would take an average of 3 min and 43 s. The results of Task 1 show that the average operation time was 64 s for Typography A and 59 s for Typography B. Although B’s average operation time was shorter, B produced more errors (seven) than A (six). Statistically, however, there was no significant difference between A and B. In both A and B, the information structure of the ATM cardless withdrawal service provides two application entries: one on the cardless withdrawal homepage and the “?” button on the cardless withdrawal verification page. In A, 13 participants applied from the homepage, while five applied from the cardless withdrawal verification page. In B, 17 participants applied from the homepage, and only one applied from the verification page, which is why B’s average task operation time is lower than A’s.
Before Task 2, participants estimated that a withdrawal would take an average of 2 min and 36 s. In Task 2, the average operation time was 65 s for Ga and 90 s for Gb. A t-test showed a significant difference between them (p = .002 < .01). The interviews revealed that Gb, which presents one step per page, requires more interaction and time for participants to understand how to obtain the withdrawal serial number, which is why the operation time of Gb is significantly higher than that of Ga.
In terms of the number of errors, Ga had a total of five errors, while Gb had four. Thus, although Ga had more errors, participants completed the cardless withdrawal task faster with Ga. In addition, two participants in Gb encountered touch-response problems caused by hardware issues, which led them to tap the next-step arrow multiple times without noticing, while one participant in Ga made an error by missing a step. However, the statistical analysis shows no significant difference between Ga and Gb in the number of errors.
We further examined the time spent on the guidance, measured from when participants clicked the “How to get the withdrawal serial number” button until, having read the guidance, they clicked the “I Got It” button at the bottom of the interface (Table 2). Participants spent 20 s on Ga and 41 s on Gb on average. A t-test showed a significant difference between them (p < .001).
Task Performance: Operation Time and Number of Errors.
Note. N = 36.
**p < .01. ***p < .001.
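The independent-samples comparisons above can be illustrated with a minimal sketch. It assumes a Student’s t-test with pooled variance (the text does not state which variant was used), the operation times are hypothetical values rather than the study’s raw data, and the function name `pooled_t` is introduced here only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Return Student's independent-samples t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical operation times in seconds (NOT the study's raw data).
ga_times = [60, 62, 70, 65, 68]
gb_times = [85, 92, 88, 95, 90]

t_stat, df = pooled_t(ga_times, gb_times)
print(f"t({df}) = {t_stat:.2f}")
```

Converting the statistic to a p value requires the t distribution, which a statistics package such as SciPy (`scipy.stats.ttest_ind`) provides.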
Post-Task Self-Report Assessment
SEQ Scale
The SEQ questionnaire at this stage explored how easy participants found the two tasks. Participants were asked to estimate the difficulty of each task before performing it and to rate it again afterward. Finally, we calculated the growth rate between the two ratings.
As shown in Table 3, in Task 1 the average estimated difficulty before the task for A was 3.3 points and the average difficulty after the task was 1.4 points, a reduction of 57.6%. For B, the average estimated difficulty before the task was 2.8 points and the average difficulty after the task was 1.5 points, a reduction of 46.4%. Both proposals were therefore easier than participants had imagined. However, in Task 1, B was slightly more difficult for participants.
Experiment 2: Task 1: Average Score and Growth Rate of SEQ Single Task Difficulty for Each Page Proposal.
Note. N = 36; 7-point Likert scale, ranging from very easy at 1 point to very difficult at 7 points.
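The growth rate can be reproduced from the Task 1 means reported above. The sketch below assumes the rate is the percentage change from the pre-task estimate to the post-task rating; the function name `growth_rate` is ours.

```python
def growth_rate(before, after):
    """Percentage change from the pre-task estimate to the post-task rating."""
    return (after - before) / before * 100


# Mean SEQ difficulty for Task 1 (1 = very easy, 7 = very difficult).
rate_a = growth_rate(3.3, 1.4)
rate_b = growth_rate(2.8, 1.5)
print(f"A: {rate_a:.1f}%  B: {rate_b:.1f}%")  # A: -57.6%  B: -46.4%
```

A negative rate means the task turned out easier than participants expected.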
NASA-TLX Workload Assessment
This section first analyzes how participants weighted the five workloads. As shown in Table 4, Mental Demand (MD), Temporal Demand (TD), and Frustration (FR) are the three main indicators that participants pay attention to when using an ATM.
Weights for NASA-TLX Workload.
In Task 1, cardless service application, the average workload of the 18 participants who used A is 16.81, while the average workload of the other 18 participants who used B is 21.52. The workload of A is therefore lower than that of B, and A’s average loads for MD, OP, EF, and FR are each lower than B’s. An independent-samples t-test shows a significant difference between A and B in the MD index (p = .038 < .05). Interviews indicated that B differs from previous ATM interface layouts. Low-experience participants are less accustomed to the sequential layout of the left-to-right visual hierarchy and spend more mental effort understanding the new interface.
In Task 2, cardless cash withdrawal, the average weighted workload of the 18 participants who used Ga is 19.31, and that of the 18 participants who used Gb is 21.18. The workload of Ga is therefore lower than that of Gb, and Ga’s average loads for TD and OP are each lower than Gb’s, but the differences are not statistically significant. Interviews indicated that the higher interaction cost of Gb increases the time participants need to operate the task, which is reflected in task performance (Table 5).
NASA-TLX Workload Weighted Average Scores for Tasks 1 and 2.
Note. N = 36.
*p < .05.
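A weighted workload score of this kind can be sketched as follows. This is an illustration under assumptions, not the study’s exact procedure: it follows the common NASA-TLX weighting scheme, in which each dimension’s weight is the number of times it is chosen in pairwise comparisons (10 comparisons for five dimensions) and the overall score is the weight-averaged rating. The ratings and weights are hypothetical, and the function name `weighted_workload` is ours.

```python
def weighted_workload(ratings, weights):
    """Weight-averaged workload: sum(weight * rating) / sum(weights)."""
    total_weight = sum(weights.values())
    return sum(ratings[d] * weights[d] for d in ratings) / total_weight


# Hypothetical participant: ratings on a 0-100 scale per dimension.
ratings = {"MD": 30, "TD": 20, "OP": 10, "EF": 25, "FR": 15}
# Hypothetical weights: times each dimension won a pairwise comparison (sum = 10).
weights = {"MD": 4, "TD": 3, "OP": 0, "EF": 1, "FR": 2}

overall = weighted_workload(ratings, weights)
print(overall)  # 23.5
```

Dimensions with a weight of zero drop out of the overall score, which is why the weights in Table 4 matter when interpreting the per-dimension loads.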
Satisfaction and Preference Assessment
Participants were asked to rate their satisfaction with their assigned design proposal and operation process. Satisfaction was scored on a 7-point Likert scale, with one being strongly dissatisfied and seven being strongly satisfied. Preference was assessed with a multiple-choice question: participants looked at all interfaces simultaneously during the assessment (that is, they saw the set of designs they had not used only at this point) and chose one of A and B and one of Ga and Gb.
Table 6 shows that participants have higher satisfaction with and preference for A. Interviews revealed that participants are more accustomed to A’s operating behavior and reading order, and that using an interface that focuses on one thing makes them feel more comfortable and at ease. However, some participants in B mentioned that the interface presentation felt innovative and fresh compared with the layouts they were used to, and that they could complete tasks successfully after becoming familiar with the operations.
Satisfaction Score and Preferred Number of People for Each Design.
Note. N = 36; 7-point Likert scale, ranging from strongly dissatisfied at 1 point to strongly satisfied at 7 points.
Table 6 also shows that participants have higher satisfaction with and preference for Ga. In the interviews, 67% (12) of Ga participants said that the bold fonts and darker color blocks of the guidance instructions, combined with the icons, effectively highlighted important keywords and helped them quickly understand the guidance. However, some Gb participants felt that the interaction cost of constantly checking back and forth between the ATM and mobile online banking increased their burden.
The Difference Between High and Low Experience
This section presents the impact of the typography and guidance proposals on the task performance of participants with different levels of ATM experience. We conducted a t-test analysis on the NASA-TLX results and added opinions and data from the post-task interviews for reference.
Comparison of Typography A and B
The overall mean workload of low-experience participants in Typography A (M = 20.17) is lower than that in B (M = 17.08). Among the workload items, the mean Mental Demand score of A (M = 77.50) is lower than that of B (M = 185.83), and the difference is significant (p = .008 < .01). Low-experience participants in B said that B is different from previous ATM interfaces. Because they are not used to the reading order of this page, they need to spend more mental energy to understand the new interface. They further stated that they worried that a mistake while operating this layout would cause their bank cards to be retained by the ATM, so they lacked the confidence to complete the task successfully with the unfamiliar B.
The average workload of high-experience participants in B (M = 9.25) is lower than that in A (M = 15.62). The mean Temporal Demand (TD) of B (M = 35.83) is lower than that of A (M = 97.50), and the difference is significant (p = .012 < .05). B’s participants particularly emphasized that the progress bar on the left side of the interface during the cardless service application process lets them clearly confirm the completed steps and their own progress, which helps them estimate the time needed to complete the subsequent steps. In addition, six participants said that the information presented by B is visually more distinct, which helps experienced users retrieve necessary information quickly and effectively from left to right on the ATM interface. The interviews also revealed that five of A’s high-experience participants needed to switch between a physical numeric keypad and a virtual numeric keypad during the operation, which increased their mental load during the task.
Comparison of Ga and Gb
The mean workload of low-experience participants in Ga (M = 29.54) is higher than that in Gb (M = 22.29), but the difference is not significant. During the interviews, four of Ga’s low-experience participants stated that placing all the guidance on one page presented a relatively large amount of information, so they worried about accidentally omitting a step and causing errors, and they needed to operate more carefully.
The mean workload of high-experience participants in Ga (M = 11.43) is lower than that in Gb (M = 18.42), and the mean TD in Ga (M = 49.17) is lower than that in Gb (M = 147.50), a significant difference (p = .009 < .01). Nine high-experience participants said that Gb’s one-step-per-page guide made them think the operation process was more complicated and rigorous, so they read each step carefully to ensure that no step was missed. However, most high-experience participants felt after completing the operation that the steps were easier than expected and that the task could have been completed faster. Notably, five high-experience participants in Gb did not return to the instructions at the ATM after completing step one, yet still completed the task successfully. In addition, after completing step one, some high-experience participants returned to the ATM to skim the remaining steps before going back to their mobile phones to obtain the serial number. They explained that the steps to obtain the serial number were relatively simple, so this design increased the interaction cost of constantly checking back and forth between the ATM and their smartphones.
Correlation Between Task Performance and Scales
We ran a Spearman correlation test of task performance with NASA-TLX, satisfaction, preference, willingness to use, usability, and NPS. Operation time was positively correlated with workload and negatively correlated with preference in both tasks. However, satisfaction, preference, and ease of use were not correlated with NPS in either task (Table 7).
Spearman Correlation Test of Task Performance with NASA-TLX, Satisfaction, Preference, Willingness to Use, Usability, and NPS.
Note. Using Spearman correlation testing, sample size N = 36; M is mean; SD is standard deviation; 1 to 8 is rho value.
*p < .05 (two-tailed). **p < .01 (two-tailed).
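For readers who want to reproduce this kind of analysis, the following is a minimal sketch of Spearman’s rho: rank both variables (tied values share the mean of their ranks) and compute the Pearson correlation of the ranks. The data are hypothetical, not the study’s measurements, and the helper names are ours.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical per-participant data (NOT the study's measurements).
times = [55, 60, 72, 80, 95, 110]      # operation time in seconds
workload = [12, 15, 14, 22, 30, 28]    # weighted workload score

rho = spearman_rho(times, workload)
print(f"rho = {rho:.3f}")
```

Because the method works on ranks, it suits ordinal Likert-type scales such as satisfaction and NPS ratings.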
Comparison of the Original Interface with Two Sets of Interfaces Optimized for the Design
Comparison of Task Performance with SEQ
This section compares participants’ average operation time and number of errors in the two tasks between the new designs and the bank’s original interface. We used one-way analysis of variance (ANOVA) to confirm whether the differences between them are statistically significant.
In Task 1, A and B are better than the original interface regarding operation time and number of errors. Moving the cardless service application entrance to the first page can reduce the time and number of errors required to complete the task, allowing participants to complete the task on the first page confidently.
We further used one-way ANOVA to confirm whether the operation time and number of errors differ significantly between the original interface and the new designs. The results showed significant differences in both operation time (p < .001) and number of errors (p = .002 < .01).
In Task 2, the operation time of Gb is higher than that of the bank’s existing interface, and the number of errors of both Ga and Gb is also higher than that of the existing interface. Some participants in Gb encountered an unresponsive interface, which led to accidental touches. Therefore, the number of errors for the two guidance proposals is higher than that of the original interface. However, Ga is better than the original interface in operation time, and the SEQ scores of Ga and Gb are both better than those of the existing interface. This study infers that the guidance instructions help participants obtain the withdrawal serial number effectively. Therefore, even though Ga had more errors, it effectively reduces the time participants need to complete the task (Table 8).
The Average Operation Time, Number of Errors, and Average SEQ Scores of Individual Tasks in the Original Interface and Proposals.
Note. One-way ANOVA was used; N = 36; M is mean; SD is standard deviation.
**p < .01 (two-tailed). ***p < .001 (two-tailed).
The one-way ANOVA of task operation time, number of errors, and SEQ showed that, in Task 1, operation time (p < .001) and number of errors (p = .002 < .01) differed significantly, whereas SEQ scores did not (p = .258 > .05). Task 2 showed a significant difference only in operation time (p = .009 < .01).
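The one-way ANOVA used in this comparison can be sketched as the ratio of between-group to within-group mean squares. The example below is illustrative only: the three groups and their operation times are hypothetical, and the function name `one_way_anova_f` is ours.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return the F statistic with its between- and within-group degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared distance to the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each value to its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical operation times in seconds (NOT the study's raw data).
original = [100, 110, 120]
layout_a = [60, 65, 70]
layout_b = [55, 60, 65]

f_stat, df1, df2 = one_way_anova_f(original, layout_a, layout_b)
print(f"F({df1}, {df2}) = {f_stat:.1f}")  # F(2, 6) = 45.5
```

A large F indicates that the group means differ by more than the spread within groups would suggest; the p value then follows from the F distribution with the reported degrees of freedom.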
Post-Session Self-Report Assessment
SUS
The questionnaire consists of 10 questions; each question is a sentence describing the system being evaluated and uses a 5-point Likert scale based on the user’s subjective feelings, ranging from strongly disagree (one point) to strongly agree (five points). Through a specific algorithm, the score range will fall between 0 and 100 (Brooke, 1996). Past research found that SUS can be used as a scoring standard through the Curved Grading Scale (CGS; Sauro & Lewis, 2011).
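The scoring algorithm can be sketched as follows, assuming the standard SUS procedure described by Brooke (1996): odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5. The responses are hypothetical, and the function name `sus_score` is ours.

```python
def sus_score(responses):
    """Convert ten 1-5 Likert responses (Q1..Q10) into a 0-100 SUS score."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for item, response in enumerate(responses, start=1):
        if item % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


# Hypothetical participant responses to Q1..Q10.
example = [4, 2, 4, 2, 5, 1, 4, 2, 5, 1]
print(sus_score(example))  # 85.0
```

Averaging the per-participant scores gives the group means compared across proposals.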
The average SUS scores of A-Ga, A-Gb, B-Ga, and B-Gb are shown in Figure 3. The SUS scores of the four groups of proposals are ranked as B-Ga > B-Gb > A-Gb > A-Ga. The results showed that the SUS scores of the four groups of proposals after optimizing the cardless withdrawal interface all fall into the A Grade. Among them, the average SUS score of proposal A-Ga is 90.56, the average SUS score of proposal A-Gb is 90.83, the average SUS score of proposal B-Ga is 92.22, and the average SUS score of proposal B-Gb is 91.11.

SUS results of original design and four proposals.
The average score for Q5 in the SUS questionnaire (I think the functions of the cardless service are well integrated) is as high as 4.7 points. Most participants said that the interface provided guidance instructions at appropriate steps and that switching between the ATM and mobile online banking was smooth, which helped them complete the task without difficulty. In addition, the average score for Q9 (I am confident that I can use cardless services) is as high as 4.9 points. Interviews revealed that progress bars helped participants recall and confirm completed steps or progress and estimate the time required to complete subsequent steps, which increased their confidence in operating the tasks.
After the optimization, the overall average usability score of the ATM cardless withdrawal interface improved significantly compared to the original interface, and the SUS level rose from B Grade to A Grade. This study further analyzed participants’ individual scores. Four participants rated the original interface as A Grade on the SUS, whereas as many as 29 participants rated the optimized interface as A Grade. The optimized ATM cardless withdrawal interface is therefore more user-friendly in terms of usability and better matched to users’ operating behavior and needs.
“Willingness to Use.”
The statistical results show that the usage intention of the four groups of proposals has significantly improved after operating the interface. The average willingness score ranking after use is A-Ga > A-Gb > B-Gb > B-Ga, and the average willingness growth rate ranking is B-Ga > B-Gb > A-Gb > A-Ga. The average scores and growth rates of the four groups can be found in Table 9. It can be seen from the results that A-Ga’s willingness to use after interface operation is the highest among the four groups of proposals. However, although B-Ga’s usage intention after interface operation is the lowest among the four groups of proposals, it has the highest growth rate.
Comparison of Willingness to Use and Growth Rate Between the Bank’s Existing Interface and the Four Groups.
Note. 7-point Likert scale, ranging from strongly disagree at 1 point to strongly agree at 7 points.
The willingness to use the optimized interface after operation has been significantly improved compared to the original interface. The intention to use the original interface dropped by 34.1%, while the intention to use the optimized interface increased by 94.6%. Therefore, it can be explained that the optimized interface will make users more willing to use cardless withdrawals in the future.
NPS
NPS ratings are divided into three broad categories: Promoters (9–10), Passives (7–8), and Detractors (0–6). NPS ranges from −100% to +100%. NPS > 0 indicates more Promoters than Detractors, while NPS < 0 indicates poor satisfaction, with more Detractors than Promoters (Melnic, 2016). NPS is calculated as the percentage of Promoters minus the percentage of Detractors. Table 10 compares the data from the original interface with the four sets of proposals.
The Comparison of NPS and the Number of Each Category Between the Original Interface and Four Proposals.
Note. NPS = Promoters % − Detractors %.
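The NPS calculation can be sketched directly from the category definitions above. The ratings below are hypothetical, not the study’s data, and the function name `nps` is ours.

```python
def nps(ratings):
    """Net Promoter Score in percent from 0-10 likelihood-to-recommend ratings."""
    promoters = sum(1 for r in ratings if r >= 9)    # ratings of 9-10
    detractors = sum(1 for r in ratings if r <= 6)   # ratings of 0-6
    # Passives (7-8) count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical ratings from ten respondents.
sample = [10, 9, 9, 8, 7, 6, 10, 9, 5, 8]
print(nps(sample))  # 30.0
```

Here five Promoters (50%) and two Detractors (20%) give an NPS of +30%.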
The overall NPS of the optimized interface improved significantly compared to the original interface. Further analysis of participants’ NPS levels shows that the original interface and the optimized interface have 4 and 24 “Promoter” participants, accounting for 40% and 66.7%, respectively. The optimized interface therefore makes users more willing to recommend this cardless cash withdrawal service to their relatives and friends.
In addition, we compared the average NPS ratings of the original and optimized interfaces. The average rating is 7.8 points for the original interface and 8.7 points for the optimized interface. An independent-samples t-test confirmed a significant difference between the two (p = .037 < .05).
Discussion
This study selected the best-performing bank among the three banks with the largest numbers of ATMs in Taiwan as the optimization target. We adjusted the information structure, used visual hierarchy design methods, and added assistance mechanisms to address the problems of the bank’s ATM cardless withdrawal interface. Below, we discuss the impact of visual hierarchy design on the distribution of participants’ attention, the differences between people with high and low experience, and data-driven user experience design.
The Impact of Visual Hierarchy Design on Attention Allocation
Unlike web pages, the ATM interface is operated in a real environment, under the pressure of queuing, with safety concerns, and without human assistance. Reducing users’ operation time and error rate therefore tests how effectively ATM designers allocate and guide user attention through visual design. The optimized proposals of this study use visual hierarchy design to influence the distribution and control of participants’ attention. In task-oriented cognitive walkthroughs, the current goal (goal-driven attention) is undoubtedly the main driver of attention distribution. Therefore, flattening the information architecture hierarchy is not a disadvantage given the functional characteristics of the ATM interface. For example, we moved the cardless withdrawal request portal to the home page, which dramatically reduced the time users spent searching for this feature and the errors they made. Even with a flat information architecture, we still use color to differentiate elements for visual search and recognition on the same page, which helps users distinguish the main function, the Call to Action, and other secondary function buttons at the perception stage. This in turn reduces the user’s time and workload (Figure 4).

The color design of the buttons expresses the information architecture; on the left is the existing interface of the bank, and on the right is the optimized design.
In addition, the “consistency” of the design narrows the learning gap participants face with the new interface. For example, in B, the Z-Layout is designed to match the visual search habits of regular users browsing digital layouts: the eye moves from the title at the top of the page, in a Z-shaped reading order, to the “Call to Action” button in the lower right corner. This draws on “history-driven” attention distribution so that scanning feels more intuitive to participants (Figure 5).

Z-layout reading order.
However, even though we moved the “Apply Now” button to the first page and gave it a special button color, expecting it to draw stimulus-driven attention through physical salience, participants were instead affected by history-driven attention, which triggered the “Banner Blindness” phenomenon in 39% (7) of participants. The application service areas of A and B (the content of the blue box in Figure 6) were misjudged as advertising areas. How to attract attention without being affected by “Banner Blindness” deserves deeper discussion in future visual design research.

The area in the blue box that causes “Banner Blindness.”
In addition to the design of buttons, applying the visual design technique of “chunking” to information is a way to reduce cognitive load. Recoding smaller units of information into larger units (Thalmann et al., 2019) adjusts the user’s visual order at the level of visual perception and reduces the load on working memory at the cognitive level (Allen et al., 2021; Farooqui et al., 2023; Norris & Kalm, 2021). Designers use the visual hierarchy of “chunking” to guide the user’s attention, visual search order, and information processing. This practice proved effective for the eight-digit withdrawal serial number, which users must hold in short-term memory during the task while operating back and forth between the smartphone and the ATM. In the interviews, 39% (14) of participants said that “chunking” helped them memorize and enter the serial number quickly and rhythmically, reducing the memory load of moving back and forth (refer to Figure 7).

Numbers are easy to remember when chunking.
In addition, 50% (nine participants) said they ignored icons and focused mainly on text. Past studies (Kozak et al., 2021) have found that visual memory is regulated indirectly: when text memory and picture memory compete for shared resources, an instruction to remember text reduces the strength of picture memory. However, these participants said in interviews that they chose to read only the words because of the dual pressure of queuing and financial risk.
Human-Centered Design (the Difference Between Low-Experienced and High-Experienced)
One of the interesting findings of this study is that people with high and low experience in using ATMs perform differently on different designs. The “center stage” design of the A layout allows low-experience participants to reduce learning effort and focus on the main operation on each page. The difference in the analysis results is reflected in their mental demand. Low-experience participants had a shorter average operation time and lower mental demand in the A layout, which led to their preference for the A layout design, higher satisfaction, and higher usability ratings. This is also reflected in the design of the guide page in Task 2, where Gb presents the operation one step at a time. Although Gb took low-experience participants longer, their satisfaction with Gb was one point higher than with Ga, and twice as many of them preferred Gb to Ga, although these differences are not statistically significant.
Conversely, for high-experience participants, “time” seemed to be what they cared about most. In Task 1, layout B’s average operation time is less than layout A’s. Even an eight-second gap is reflected in their temporal demand, with a significant difference (A: M = 97.5, B: M = 35.83, p = .012 < .05). Temporal demand is the only load item that differs significantly among high-experience participants. Similarly, the time high-experience participants spent on the guidance page in Task 2 differed significantly. Gb distributes its six steps across six pages, adding to their temporal demand (Ga: M = 49.17, Gb: M = 147.50, p = .009 < .01). Ga instead lists the six steps on the same page and uses visual hierarchy design techniques to enhance the clarity of information, so that high-experience participants’ satisfaction with Ga (M = 6.5) was 1.5 points higher than with Gb. This is directly reflected in their preferences: 11 people chose Ga, and only one chose Gb.
From the above discussion, we can also see that Gb uses the “Slide” pattern to change pages. This progressive disclosure design pattern (Carroll & Carrithers, 1984; Nielsen, 1994; Spillers, 2004) reveals information gradually, providing a better experience for low-experience users and even novices. The “progress bar” designed alongside the progressive disclosure pattern is also worth mentioning. It is the biggest difference between the optimized design of this study and the original bank interface. 69% (25) of the participants believed that the progress bar helped them recall and confirm the completed steps or progress and assess the time needed to complete the next steps, which increased their sense of security during the task. 25% (nine participants) believed that the progress bar helped them understand the steps that needed to be performed and judge whether to continue according to the current situation, which helped reduce their stress during the operation.
Based on the combined results, B-Ga suits high-experience users and A-Gb suits low-experience users. Interface design can be individualized to align with human-centered design concepts, although this is not practiced on current systems. However, given the depth of personal data that banks collect, coupled with the development and application of artificial intelligence, interfaces that adapt to personal experience and habits may soon become practical.
Data-Driven UX Practical Advice
This study adopted scales commonly used in the field of user experience and collected self-reported data from users. In cognitive walkthroughs, the correlations between objectively observed operation time and error rate and post-task workload, satisfaction, preference, and usability deserve designers’ close attention, and interviews can provide further insight. However, judging from the analysis results of this study, we do not recommend relying solely on “usage intention” and “net promoter score,” as is often seen in the industry, to examine customer loyalty. Customer loyalty does not come purely from user experience but may involve other factors unknown to researchers (J. R. Lewis, 2018), such as customer welfare, security, law, and personal information privacy protection (Ryu, 2018). Moreover, these two scales alone cannot reveal differences among users with different levels of experience. Therefore, we suggest that when the industry conducts user experience surveys, it should introduce different experience surveys at different stages of interface design in a timely manner to fine-tune the design. Based on the before-and-after comparison in this study, corporate designers can use the visual design of the interface to improve the usability of products and services and thereby enhance brand loyalty.
Conclusion
After investigating the current situation of ATM cardless services, this study attempted to optimize the better-performing ATM interface design to obtain a better user experience. The verification results show that the optimized design performs better than the bank’s original interface on all data. We used heuristic evaluation to improve the interface design and Faraday’s Visual Hierarchy Model to optimize the design of the visual hierarchy. The user experience survey used task-oriented cognitive walkthroughs, NASA-TLX, SEQ, satisfaction, preference, SUS, a “willingness to use” scale, and NPS. In addition to obtaining task performance and self-reported data and examining the relationships between the data on each scale, we identified differences between high-experience and low-experience users. Finally, we provide advice to the industry on user experience surveys.
Limitations
The experimental phase of this study coincided with the COVID-19 epidemic, so all experiments were conducted remotely, and the conditions of ATMs in the actual environment could not be reproduced in the scenario simulation of the cognitive walkthrough. In addition, because the experiment was conducted online, most of the recruited participants were between 20 and 39 years old. Future work should extend the study to other age groups to understand their operation, willingness, and ideas, and apply the findings to design to help improve the versatility and future development of ATM cardless services.
Footnotes
Appendix 1
Authors’ Contributions
Methodology, writing - review and editing, resources, and supervision: Meng-Cong Zheng. Conceptualization, investigation: Joon Ming Nigell Lay. Methodology, formal analysis, writing - original draft preparation, visualization: Ching-I Chen.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Data Availability Statement
The data supporting this study’s findings are not openly available but are available from the corresponding author upon reasonable request.
