Abstract
This study extends the ACT-R base-level learning equation by incorporating environmental uncertainty, observation duration, and individual learning patterns to better model human memory retrieval in dynamic, multi-object environments. We hypothesize that retrieval efficiency depends not only on frequency and recency of interactions but also on scene entropy, time spent observing, and a learning pattern parameter (α). Using a household task scenario in the AI2-THOR simulator, we analyzed search times as proxies for memory access. Our modified equation integrates entropy to reflect visual uncertainty, scales decay by observation duration, and includes α to capture task-specific learning dynamics. Results from eight participants revealed two distinct retrieval patterns, Increasing–Decreasing and Decreasing–Increasing, reflecting how prior learning influenced search behavior. Participants with shorter search times showed stronger memory retention, corresponding to α = +1. Model evaluation showed that, compared to the original ACT-R equation (Mean Absolute Error [MAE] = .3580, correlation = .3078), the modified version achieved a lower average MAE of .1868 and a higher average correlation of .7776. These results highlight the importance of contextual and temporal factors in memory modeling and support the development of adaptive systems that account for learning patterns in visually complex environments.
Introduction
Simulating human cognition remains a significant challenge. ACT-R, a widely used cognitive architecture, models memory, decision-making, and perception through structured mechanisms (Anderson & Lebiere, 1998; Byrne, 2012; Newell, 1990; Ritter, 2019). It divides knowledge into declarative (facts stored in chunks) and procedural (if-then production rules) forms. A core component of ACT-R is its base-level learning equation, which calculates chunk activation based on frequency and recency (Anderson, 2007; Anderson & Lebiere, 1998). While effective in static settings, this mechanism struggles in dynamic environments where object locations change. Such settings demand flexible retrieval patterns that adapt to uncertainty and evolving context. Although prior extensions to ACT-R have improved retrieval modeling (Kim et al., 2013; Stocco et al., 2010; Tehranchi et al., 2021), modeling uncertainty in multi-object environments remains unresolved.
One crucial yet underexplored factor in memory retrieval is uncertainty, which influences how information is encoded, maintained, and recalled. Retrieval is more effective when uncertainty levels at recall match those during learning, suggesting memory is tuned to environmental unpredictability (Ogasa et al., 2024). Under high uncertainty, people shift from strict recall to adaptive learning, enhancing cognitive flexibility (Nicholas et al., 2022). The brain also monitors uncertainty in working memory to modulate confidence in retrieved content (Yoo et al., 2021), indicating that uncertainty is a core mechanism supporting adaptability and decision-making in complex, dynamic environments. Observation duration also significantly affects memory strength. Shorter viewing times weaken reconsolidation, leading to faster forgetting (Inda et al., 2011), while well-timed retrieval practice improves long-term retention (Kriechbaum & Bäuml, 2023). Longer exposure enhances retrieval-related neural activity, reinforcing connections essential for recall (Bosshardt et al., 2005). Thus, observation time is not incidental but central to how effectively information is retained and accessed. In dynamic environments, where objects appear, disappear, or change locations, entropy provides a useful way to quantify uncertainty. Introduced by Shannon (1948), entropy measures the information content of an event and represents the expected information needed to describe a variable’s state (Thomas & Joy, 2006).
This study investigates how humans learn and retrieve memories while completing household tasks in a dynamic multi-object environment. We hypothesize that retrieval is influenced not only by frequency and recency, but also by entropy, observation duration, and individual learning patterns. Using the AI2-THOR simulator (Kolve et al., 2017), participants performed a sequence of tasks. We analyzed search times as proxies for retrieval and introduced a modified equation that integrates entropy, observation time, and a learning pattern parameter (α).
Method
The tasks were designed in the AI2-THOR simulator to refine the base-level learning equation for more accurate modeling of human behavior. Within this environment, participants interacted with objects via a virtual agent. This setup enabled the analysis of human behavior in multi-object tasks using the recorded interaction data. The study was approved by the Institutional Review Board of (hidden due to the double-blind process; IRB approval number: STUDY00026274). Participants (n = 8; 3 female) were graduate students from (hidden due to the double-blind process).
Experiment Design
To evaluate human learning and memory, we developed a series of household tasks, as shown in Table 1. The experiment consisted of two distinct phases, training and testing, each with a completely different map configuration. During training, participants practiced agent control in AI2-THOR through guided tasks. The testing phase included three tasks: Exploration, Rearrangement, and Main. In Exploration, participants located specific objects by navigating the environment. In Rearrangement, they completed nine subtasks involving object placement based on prior observations, verbally confirming placements to reinforce memory. A three-minute break followed, during which visual input was blocked to promote memory decay. In the final Main task, participants performed four subtasks relying solely on memory, enabling assessment of the ACT-R base-level learning equation in modeling retrieval behavior.
Table 1. Overview of the household tasks used in the study, detailing both the Training Phase (designed to familiarize participants with the simulator through tasks such as cooking sliced bread and cooling a tomato) and the Testing Phase, which is subdivided into the Exploration, Rearrangement, and Main tasks.
The user interface included two main sections (see Figure 1). The right side displayed a live log window showing textual feedback for each executed action (e.g., success, failure, cancellation). The left side presented the agent’s visual perspective. Participants had access to ten action options. For navigation, the actions included “MoveTo,” “LookUp,” and “LookDown.” When the “MoveTo” button was pressed, a list of available locations appeared, and participants could select a destination by typing its two-digit index. The selected location was logged in real time. For object interactions, seven actions were provided: “PickupObject,” “PutObject,” “OpenObject,” “CloseObject,” “ToggleOnObject,” “ToggleOffObject,” and “SliceObject.” When one of these buttons was pressed, a list of interactable objects was displayed, and the participant could select the desired object. To minimize errors, the system prompted for confirmation with the message, “You want to select ‘<OBJECT/LOCATION>’? (Y/N),” and the action was executed only after a “Y” confirmation. Additionally, the “ObjectState” action provided information on the current states of objects in view, while the “GameStatus&Submission” button allowed participants to check the current task and its completion status (see Table 1). Once a task was completed, the next task automatically appeared. Pressing the button at the end of the final task concluded the experiment. All actions, object observations, and timestamps were recorded for later analysis.

Figure 1. The user interface of the experiment. The left side of the screen displays the agent’s current visual perspective within the AI2-THOR simulated environment, while the right side of the screen shows the live log window that provides real-time textual feedback on each executed action (e.g., indicating success, failure, or cancellation).
User Study Procedure
The experiment was conducted in a quiet room with only the researcher and the participant present. Participants were seated in front of a 27-inch monitor and adjusted their position for comfortable keyboard use. They received a briefing on the study’s objectives and procedures and were instructed to read the on-screen instructions provided throughout the study. Participants used only the keyboard during the experiment. The experiment began with the training phase, during which the researcher assisted participants in learning to control the agent. At the start of the Exploration task in the testing phase, the researcher verbally instructed participants to explore the map. In the Rearrangement task, they were asked to remember object locations during placement. No verbal instruction or assistance was provided during the main task to ensure participants relied solely on memory.
Base-Level Learning Equations
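Human memory retrieval in ACT-R is represented by the base-level learning equation, which quantifies accessibility based on frequency and recency (Equation 1):

$$B_i = \ln\left(\sum_{j=1}^{m} t_j^{-d}\right) \qquad (1)$$

where $B_i$ is the base-level activation of chunk $i$, $m$ is the number of presentations of the chunk, $t_j$ is the time elapsed since the $j$th presentation, and $d$ is the decay parameter. To quantify the uncertainty of a scene, we use Shannon entropy (Equation 2):

$$H = -\sum_{k=1}^{n} p_k \log p_k = \log n \qquad (2)$$

where $n$ is the number of interactable objects at a given position (assuming a uniform probability distribution, $p_k = 1/n$). Recognizing that attention may vary based on both the number of objects and the time spent observing a scene, we propose a modified base-level learning equation (Equation 3). Here, $H$ modulates the decay parameter $d$, the decay is scaled by observation duration, and the learning pattern parameter α captures task-specific learning dynamics.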
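As a rough illustration of the two standard ingredients named above, the following Python sketch computes the original ACT-R base-level activation and the Shannon entropy of a uniform distribution. It does not implement the modified equation. The function names, the example lags, and the choice of base-2 logarithms for entropy are assumptions made for this illustration only.

```python
import math


def base_level_activation(lags_s, d=0.5):
    """Standard ACT-R base-level activation: ln(sum of lag ** -d).

    lags_s: times (in seconds) since each past presentation of the chunk.
    d: decay parameter (0.5 is the default value used in the study).
    """
    if not lags_s or any(t <= 0 for t in lags_s):
        raise ValueError("lags_s must be a non-empty sequence of positive times")
    return math.log(sum(t ** -d for t in lags_s))


def uniform_entropy(n_objects):
    """Shannon entropy of a uniform distribution over n_objects outcomes.

    Assumption: entropy is measured in bits (base-2 logarithm).
    """
    if n_objects < 1:
        raise ValueError("n_objects must be at least 1")
    return math.log2(n_objects)


# Hypothetical example: a chunk presented 4 s, 16 s, and 64 s ago,
# in a scene with 8 interactable objects.
activation = base_level_activation([4.0, 16.0, 64.0])
entropy_bits = uniform_entropy(8)
```

With these made-up lags, more recent and more frequent presentations raise the activation, while a scene with more interactable objects has higher entropy.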
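To compare the models’ predictive performance on the search times, we used the retrieval time equation presented by ACT-R (Bothell, 2017; Schunn & Anderson, 1998), given as Equation 4:

$$T_i = F e^{-f A_i} \qquad (4)$$

where $T_i$ is the retrieval time for chunk $i$, $F$ is the latency factor, $f$ is the latency exponent, and $A_i$ is the activation of chunk $i$ (here, its base-level activation).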
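The mapping from activation to retrieval time can be sketched as follows, using the standard ACT-R latency form in which retrieval time decays exponentially with activation. The function name and the parameter values of 1.0 are placeholders chosen for illustration, not values reported in the study.

```python
import math


def retrieval_time(activation, latency_factor=1.0, latency_exponent=1.0):
    """ACT-R style retrieval latency: F * exp(-f * A).

    latency_factor (F) and latency_exponent (f) are placeholder values
    chosen for this sketch.
    """
    return latency_factor * math.exp(-latency_exponent * activation)


# Higher activation yields a shorter predicted retrieval time.
slow = retrieval_time(-1.0)
fast = retrieval_time(1.0)
```

Because the relationship is exponential, equal steps in activation change the predicted time by a constant factor rather than a constant amount.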
Results
We analyzed time data for each action in subtask (1), “Prepare a slice of toasted bread on a plate and bring it to the dining table,” during the Main task of the testing phase. Subtask (1) required participants to interact with three objects, making it the most suitable subtask for capturing dynamic learning patterns. We assumed that learning occurred up to the completion of the Rearrangement task and that performance in the Main task reflected memory-based behavior, since participants relied solely on memory in this phase. Because of the nature of household tasks, many participants navigated the environment and opened closed receptacles (such as cabinets and drawers) when unable to recall the locations of target objects. To assess performance, we measured search time, defined as (1) the interval between checking a subtask goal and interacting with the target object, and (2) the time between finishing one search and starting the next. For example, if the goal was checked at t = 100 s and object interaction began at t = 180 s, the search time would be 80 s. Since all actions were timestamped, we were able to calculate search times for each target object in subtask (1) of the Main task.
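The search-time definition can be illustrated with a small sketch over a timestamped action log. The `LogEvent` record, the log contents, and the simplification that each search starts the moment the previous target is reached are all assumptions for this example. Only the action names come from the interface description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    timestamp_s: float  # seconds since session start
    action: str         # e.g. "PickupObject"
    target: str         # object name, "" when the action has no object


GOAL_CHECK = "GameStatus&Submission"


def search_times(events, targets):
    """Return {target: seconds searched} for targets sought in the given order."""
    ordered = sorted(events, key=lambda e: e.timestamp_s)
    start = next((e.timestamp_s for e in ordered if e.action == GOAL_CHECK), None)
    if start is None:
        raise ValueError("no goal check found in the log")
    result = {}
    for target in targets:
        found = next(
            (e.timestamp_s for e in ordered
             if e.target == target and e.timestamp_s >= start),
            None,
        )
        if found is None:
            raise ValueError(f"no interaction with {target!r} after t={start}")
        result[target] = found - start
        start = found  # assumption: the next search starts immediately
    return result


# Hypothetical log: goal checked at t = 100 s, Bread first touched at t = 180 s.
log = [
    LogEvent(100.0, GOAL_CHECK, ""),
    LogEvent(130.0, "OpenObject", "Cabinet"),
    LogEvent(180.0, "PickupObject", "Bread"),
    LogEvent(230.0, "PickupObject", "Knife"),
    LogEvent(300.0, "PickupObject", "Plate"),
]
times = search_times(log, ["Bread", "Knife", "Plate"])
```

For this made-up log, the search time for "Bread" is 80 s, matching the worked example in the text.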
Figure 2 shows the search times for each target object in subtask (1) of the Main task, plotted for all participants. All participants interacted with the objects in the same order: “Bread,” “Knife,” and “Plate.” Two distinct patterns were observed in the data. The first was an Increasing–Decreasing pattern, where search times increased from “Bread” to “Knife” and then decreased for “Plate” (participants P1, P3, P5, P8). The second was a Decreasing–Increasing pattern, where search times decreased from “Bread” to “Knife” and then increased for “Plate” (participants P2, P4, P6, P7). Participant P4 deviated partially from the Decreasing–Increasing pattern but was closer to it than to the Increasing–Decreasing pattern. To evaluate memory retrieval performance, we normalized the CT for “Knife” in the Rearrangement task by the total Rearrangement task duration. Similarly, the search time for “Knife” in the Main task was normalized by the subtask CT. The ratio of these two normalized values was used to quantify the influence of prior learning on retrieval efficiency. Participants with the Increasing–Decreasing pattern spent at least 1.5 times as large a proportion of time on the object in the Main task as in the Rearrangement task. In contrast, participants with the Decreasing–Increasing pattern spent a proportion no greater than that in the Rearrangement task.
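The pattern labels and the normalized ratio can be expressed in a few lines. This is a sketch under assumptions: the function names and numbers are hypothetical, CT is taken to be a duration in seconds, and the ratio is taken as the Main-task share divided by the Rearrangement-task share.

```python
def classify_pattern(bread_s, knife_s, plate_s):
    """Label a participant by the shape of the three search times."""
    if bread_s < knife_s > plate_s:
        return "Increasing-Decreasing"
    if bread_s > knife_s < plate_s:
        return "Decreasing-Increasing"
    return "Other"


def normalized_ratio(main_search_s, main_subtask_ct_s,
                     rearr_ct_s, rearr_total_s):
    """Ratio of the Main-task share to the Rearrangement-task share.

    Assumption: each share is a time divided by the duration of the
    task (or subtask) in which it occurred.
    """
    main_share = main_search_s / main_subtask_ct_s
    rearr_share = rearr_ct_s / rearr_total_s
    return main_share / rearr_share


# Hypothetical participant: Knife took 50 s of a 200 s subtask in the
# Main task, and 25 s of a 200 s Rearrangement task.
pattern = classify_pattern(30.0, 50.0, 20.0)
ratio = normalized_ratio(50.0, 200.0, 25.0, 200.0)
```

Under the thresholds described in the text, a ratio of at least 1.5 goes with the Increasing–Decreasing pattern and a ratio of at most 1 with the Decreasing–Increasing pattern.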

Figure 2. Search time for each target object in the subtask “Prepare a slice of toasted bread on a plate and bring it to the dining table” of the Main task during the testing phase.
Since search time reflects retrieval efficiency, it can be interpreted through the base-level activation. Activation is inversely related to search time: higher activation corresponds to faster retrieval (i.e., lower search time). We compared model performance between the original ACT-R equation (Equation 1) and a modified version (Equation 3), which incorporates entropy, observation duration, and a learning pattern parameter (α). We used the default value of .5 for the decay parameter (d) in both equations. This modification enables a more fine-grained understanding of how factors such as object salience, spatial configuration, and prior exposure affect memory retrieval and search efficiency. As shown in Figure 3a, the original equation failed to fully capture the observed retrieval patterns. In contrast, Figure 3b shows that the modified equation aligned more closely with empirical data, supporting its enhanced predictive accuracy.

Figure 3. (a) Activation values for each object computed using the conventional base-level activation equation, which yields nearly uniform patterns across targets. (b) Activation values for each object calculated using the modified base-level activation equation, which incorporates entropy-based uncertainty, observation duration, and learning pattern and more closely reflects the variability in memory retrieval observed in participants’ search behavior.
We further evaluated the models’ predictive performance on the calculated retrieval time (Equation 4) using Mean Absolute Error (MAE) and the Pearson correlation, along with its p-value, for each participant, and then averaged the results (see Table 2). The retrieval times were normalized using min–max scaling for the MAE calculation because no parameters were optimized for either the original or the modified equation. The original equation yielded an average MAE of .3580 and an average correlation of .3078. The modified equation achieved a lower average MAE of .1868 and a higher average correlation of .7776, confirming improved predictive accuracy.
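The evaluation metrics can be sketched with the standard library alone. The data below are invented, the helper names are hypothetical, and the sketch assumes that predicted and observed times are each min–max scaled before the MAE is computed. It omits the p-value, which a statistics package such as SciPy (`scipy.stats.pearsonr`) can supply.

```python
import math


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant sequence")
    return [(v - lo) / (hi - lo) for v in values]


def mean_absolute_error(predicted, observed):
    """Average absolute difference between paired values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical participant: observed search times (s) for Bread, Knife,
# Plate, and model-predicted retrieval times in arbitrary units.
observed = [80.0, 50.0, 70.0]
predicted = [1.2, 0.6, 0.9]

mae = mean_absolute_error(min_max_scale(predicted), min_max_scale(observed))
r = pearson_r(predicted, observed)
```

Scaling both series puts them on a common range, so the MAE compares the shape of the predicted and observed patterns rather than their raw units. The Pearson correlation is unaffected by this scaling.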
Table 2. MAE and Pearson correlation (with p-values) for retrieval time prediction using the original and modified equations.
Discussion
Participants showed distinct memory retrieval patterns during the main task, even though they interacted with objects in the same order. The Increasing–Decreasing and Decreasing–Increasing search time patterns reflect differences in how prior learning influenced performance. Those in the Increasing–Decreasing group spent more time locating the “Knife,” suggesting weaker encoding, but were faster with the “Plate.” In contrast, participants in the Decreasing–Increasing group recalled the “Knife” more easily but took longer to find the “Plate,” possibly due to limited attention or environmental complexity during earlier exposure.
These behavioral differences were supported by the modeling results. The original ACT-R equation, based solely on frequency and recency, failed to capture individual variability, as shown in Figure 3a. In contrast, the modified equation, which incorporates entropy, observation duration, and a learning pattern parameter α, more accurately reflected retrieval patterns, as shown in Figure 3b. An α of +1 was associated with efficient retrieval and shorter search times, whereas an α of −1 indicated weaker memory traces. These α values also distinguished differences between the Rearrangement and Main tasks, reflecting variations in retention across participants.
Table 2 provides quantitative support for the effectiveness of the modified equation. On average, the modified equation achieved a lower mean absolute error (MAE = .1868) compared to the original equation (MAE = .3580), indicating improved prediction accuracy. Similarly, the average Pearson correlation with observed behavior increased substantially from .3078 under the original model to .7776 with the modified version. Most participants benefited from the modification, showing either reduced prediction error, improved correlation, or both. For instance, participant P6’s correlation rose dramatically from −.7378 to .9377, and their MAE dropped from .7576 to .1074. Likewise, participant P3’s MAE improved from .0700 to .0276, with a near-perfect correlation of .9970. A few participants, such as P1 and P5, showed slight degradation in performance. For example, participant P1’s correlation decreased from .9552 to .6984, and their MAE increased from .0879 to .2512. Despite these exceptions, the overall trend confirms that the modified equation more accurately models memory retrieval across participants in dynamic environments.
This study has several limitations. First, the sample size was limited to eight participants, which constrains generalizability. Future work should involve a larger and more diverse population to validate the model’s robustness across individuals. Second, entropy was computed under the assumption of a uniform object distribution, which may not reflect real-world environments. Refining entropy estimation using non-uniform or context-aware distributions could improve predictive accuracy. Third, the analysis focused only on one subtask, which may limit the scope of observed retrieval dynamics. Future studies should incorporate more sophisticated and varied tasks to better evaluate memory performance across contexts. Lastly, no parameter optimization was performed for either model. Incorporating individualized parameter tuning may further enhance alignment between predicted activation and observed behavior.
Conclusion
This study extends the ACT-R base-level learning equation by incorporating entropy, observation duration, and a learning pattern parameter to better model memory retrieval in dynamic multi-object environments. The modified equation achieved lower prediction error and stronger alignment with observed behavior, with an average MAE of .1868 and an average correlation of .7776. By capturing distinct retrieval patterns across participants, the model demonstrates the importance of accounting for environmental uncertainty and individual learning dynamics. These findings support the value of context-aware memory models for improving behavioral prediction. Future work should involve a larger and more diverse participant sample, refine entropy estimation using real-world distributions, and explore personalized parameter tuning. This research contributes to the HFES community by advancing cognitive modeling for applications in training, system design, and human-machine interaction.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
