Abstract
Gloss perception depends on several surface properties, but most studies measure these effects only one or two at a time. Here, we test whether a three-dimensional version of maximum likelihood conjoint measurement (3D-MLCM) can be used to capture the combined influence of multiple cues on gloss. Observers judged which of two surfaces looked glossier while specular reflectance, albedo, and bumpiness were varied together. The additive model showed clear and reliable contributions of both albedo and bumpiness in addition to specular reflectance, and model comparisons confirmed that these cues significantly affected gloss judgments. The full model further revealed that these effects change with gloss level: bumpiness strongly influenced perceived gloss at low specular reflectance but had little effect at high gloss levels. These results show that 3D-MLCM provides stable, interpretable measurements and is a practical method for studying complex interactions between visual features that influence visual appearance.
Keywords
How to cite this article
Hansmann-Roth, S. (2026). The influence of specular reflectance, albedo, and shape on perceived gloss: A case for three-dimensional maximum likelihood conjoint measurement (MLCM). i-Perception, 17(2), 1–12. https://doi.org/10.1177/20416695261435363
Introduction
Materials are ubiquitous. Most objects we interact with on a daily basis are made from a particular material that influences our grasping behavior (Baumgartner et al., 2013; Bergmann Tiest & Kappers, 2019; Paulun et al., 2016), the way we walk, or which dress we choose to buy (Silvia et al., 2018). Optical properties (such as matte, glossy, or translucent) and mechanical properties (such as sticky or soft) are two prominent classes of surface properties that we infer mainly through vision. Because the light reaching our eyes reflects a mix of material, lighting, and shape, perception is underconstrained: the visual system has to recover a three-dimensional (3D) world from a two-dimensional (2D) retinal image (see Chadwick & Kentridge, 2015 and Fleming, 2014, 2017 for reviews on gloss perception). Estimating intrinsic material properties therefore requires the visual system to tease apart these different factors. For example, although gloss physically arises from specular reflections, its perception also depends on color, illumination, and surface geometry (e.g., Fleming et al., 2003; Hansmann-Roth & Mamassian, 2017; Kim et al., 2011; Marlow & Anderson, 2015; Nishida & Shinya, 1998; Olkkonen & Brainard, 2010, 2011; Qi et al., 2015; Vangorp et al., 2007). These findings suggest that the visual system uses image cues or heuristics to make rapid, workable inferences about material properties (Fleming, 2014; Storrs et al., 2021). Perceived gloss depends on many midlevel visual cues in the image, such as the size, sharpness, coverage, or contrast of specular highlights (Marlow et al., 2012; Marlow & Anderson, 2013).
These highlight characteristics can vary independently from the strength of the specular reflection and therefore, this often results in variations of perceived gloss while the physical gloss remains the same: darker objects appear glossier, because the contrast between the highlight and the nonhighlight area increases (Hansmann-Roth & Mamassian, 2017; Hunter & Harold, 1987; Pellacini et al., 2000). Frontal illumination results in larger highlights on the surface (Beck & Prazdny, 1981; Berzhanskaya et al., 2005; Marlow et al., 2012), leading to the perception of a glossier material. A bumpier surface produces sharper highlights, and sharper highlights result in perceiving a surface as glossier than blurred highlights (Ho et al., 2008). Understanding the perception of gloss requires studying these interactions between the different material properties and how they affect perceived gloss.
A very promising tool that has been applied to quantify these interactions is maximum likelihood conjoint measurement, or MLCM (Knoblauch & Maloney, 2012). MLCM allows one to simultaneously measure the effects of multiple physical dimensions on visual appearance. In the seminal paper by Ho et al. (2008), the mutual influences of surface roughness/bumpiness and glossiness were examined by applying MLCM, which allowed the authors to quantify how changes in the different material properties affected each other. Strong interactions between perceived gloss and perceived surface roughness were found. Importantly, the insights gained from these MLCM experiments are not restricted to the study of material perception; rather, they generalize to a wide range of perceptual domains, offering a useful framework for understanding how the visual system integrates multiple cues and how secondary features can systematically bias the perception of a primary visual feature (the one that observers are asked to judge). Following Ho et al. (2008), MLCM has been used to study the interaction between lightness and chroma (Rogers et al., 2016); the contributions of background and contour luminance to the watercolor effect (Gerardin et al., 2018); the interactions between facial age and gender (Fitousi, 2021); perceived transparency (Hansmann-Roth & Mamassian, 2025); the influence of the background albedo on perceived gloss (Hansmann-Roth & Mamassian, 2017); and the joint contributions of scattering and absorption to perceived translucency, as demonstrated in a study using real “milky tea” stimuli (Chadwick et al., 2018).
As noted above, perceived gloss depends on several surface properties that are physically independent of specular reflections but interact perceptually, such as albedo and surface roughness. This suggests that gloss cannot be fully captured within a simple 2D space, and that conventional 2D MLCM designs may be insufficient. A more complete description therefore requires modeling gloss within a 3D stimulus space that reflects the independent contributions and potential interactions of these attributes.
To quantify these multidimensional effects, we applied the standard likelihood-based MLCM framework to three stimulus dimensions, allowing simultaneous estimation of how each dimension, and their interactions, shape perceived gloss. This does not involve a new estimation procedure; rather, MLCM can be naturally extended to higher-dimensional stimulus spaces. To our knowledge, such a 3D application has been implemented only once previously, in the context of multielement texture perception (Sun et al., 2021) and not for material properties. The novelty of the present work therefore lies not in the mathematical formulation of the model, but in applying a higher-dimensional MLCM design to quantify the perceptual contributions of gloss, albedo, and bumpiness.
We applied this 3D-MLCM approach in a small observer sample, because the inclusion of three dimensions with multiple feature levels greatly increases the number of required comparisons, exceeding 16,000 trials per observer. Our aim was to evaluate whether the approach is suited to quantify the combined influence of three surface features on perceived gloss.
Materials and Methods
Stimuli
The surface was generated in Wolfram Mathematica 10.2 by placing 20 × 20 spheres with a radius of 0.8 cm on a grid with a spacing of 1 cm. The exact position of each sphere was then jittered by a random value between −0.25 and 0.25 cm in all three directions. Each surface was rendered with four different bump levels, four different albedos, and four different gloss levels (varying the specular reflectance parameters). The bump levels were generated by scaling the height of the surface along the z-axis. A factor of 1 corresponds to perfectly round spheres; the four levels used the weights 0.15, 0.41, 1, and 1.29 (see Figure 1, top row).
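The layout of the sphere centres can be illustrated with a short sketch. This is not the Mathematica code used for the study; the function name, the seed, and the choice of a uniform distribution for the jitter are assumptions made here for illustration only.

```python
import random

# Bump weights reported in the text (1.0 corresponds to round spheres).
BUMP_WEIGHTS = (0.15, 0.41, 1.0, 1.29)


def jittered_sphere_centres(n=20, spacing=1.0, jitter=0.25, seed=0):
    """Return n*n sphere centres (x, y, z) in cm on a jittered grid.

    Assumption: the jitter is drawn independently and uniformly in
    [-jitter, +jitter] for each of the three coordinates.
    """
    rng = random.Random(seed)
    centres = []
    for row in range(n):
        for col in range(n):
            x = col * spacing + rng.uniform(-jitter, jitter)
            y = row * spacing + rng.uniform(-jitter, jitter)
            z = rng.uniform(-jitter, jitter)
            centres.append((x, y, z))
    return centres


centres = jittered_sphere_centres()
print(len(centres))  # number of spheres on the 20 x 20 grid
```

Under these assumptions, each bump level would then be obtained by multiplying the surface height by one of the values in BUMP_WEIGHTS before rendering.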

Figure 1. Overview of a subset of surfaces used in the experiment. Each row shows the four different levels of bumpiness, albedo, and gloss while the other two features are held constant.
Stimuli were then rendered with Mitsuba 0.5.0 using the RenderToolbox4 package for MATLAB (Heasly et al., 2014; see also http://rendertoolbox.org). RenderToolbox acts as an interface to Mitsuba rendering software (Wenzel, 2010; see also www.mitsuba-renderer.org).
Sixty-four stimuli were rendered using the Ward model (Ward, 1992). The Ward model defines surface reflectance by three parameters: ρd controls the diffuse reflection or albedo, ρs controls the strength of the specular reflection, and α controls the spread of the specular lobe. The diffuse reflection was manipulated to obtain four perceptually distinct albedos (ρd = 0.03, 0.1, 0.2, and 0.3), and the spread of the specular lobe was manipulated to simulate different levels of gloss (α = 0.4, 0.2, 0.1, and 0.01), while ρs was held constant at 0.07 (see Figure 1, middle and bottom rows).
All stimuli were illuminated with an environment map (Hallstatt, downloaded from http://dativ.at/lightprobes/). We cut out the rendered objects from their original background and placed them in front of a black background. Images were converted from RGB to gray scale in MATLAB. MATLAB uses the following conversion weights: 0.2989 × R + 0.5870 × G + 0.1140 × B.
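The grayscale conversion is a weighted sum of the three color channels. The sketch below applies the weights quoted above to a tiny image stored as nested lists; the function names and the list-based image representation are illustrative assumptions, not part of the original MATLAB pipeline.

```python
# Conversion weights quoted in the text.
W_R, W_G, W_B = 0.2989, 0.5870, 0.1140


def rgb_to_gray(r, g, b):
    """Weighted sum of the R, G, and B values of one pixel."""
    return W_R * r + W_G * g + W_B * b


def image_to_gray(image):
    """Convert an image given as rows of (r, g, b) tuples to grayscale."""
    return [[rgb_to_gray(*pixel) for pixel in row] for row in image]


# Hypothetical 1 x 3 image: pure red, pure green, pure blue.
example = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]]
gray = image_to_gray(example)
print(gray)
```

Because the green weight is the largest, a pure green pixel maps to a brighter gray value than a pure red or pure blue pixel of the same intensity.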
Procedure
All observers were seated 57 cm away from the monitor inside a dark experimental booth and viewed the stimuli with both eyes. The images did not contain binocular disparities, so from a stereoscopic perspective, the stimuli were consistent with flat objects. Sixty-four surfaces were presented. On each trial two surfaces were presented simultaneously on the left and right side of the screen. Each surface subtended 14° × 14° of visual angle. These surfaces were presented for an unlimited amount of time until the observer responded. Observers were encouraged to select the glossier surface by pressing the left or right arrow key on the keyboard. After pressing the appropriate response key the next trial began.
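The physical size implied by the viewing geometry follows from the standard visual-angle relation, size = 2 · d · tan(θ/2). The sketch below is only a worked check of that relation for the values stated above; the function name is an assumption.

```python
import math


def size_from_visual_angle(angle_deg, distance_cm):
    """Physical extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Values from the text: 14 degrees of visual angle at 57 cm.
stimulus_size_cm = size_from_visual_angle(14.0, 57.0)
print(round(stimulus_size_cm, 2))
```

At a viewing distance of 57 cm, 1 cm on the screen corresponds to roughly 1° of visual angle, so each surface spans approximately 14 cm on the monitor.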
The four gloss, bump, and albedo levels yielded 64 stimuli and therefore 64 × 64 = 4,096 possible pairings, or 4,032 after excluding the 64 self-comparisons. To obtain reliable results, a minimum of three repetitions is required. To balance the presentation on the left and right of the screen, four repetitions were chosen, resulting in a total of 16,128 trials. Observers completed all trials in 18 sessions of 896 trials each. Each session lasted about 45 min, including three breaks.
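The trial counts can be verified by enumerating the design. The sketch below is a bookkeeping check only; the variable names are illustrative, and treating each pair as ordered is an assumption that reproduces the reported total.

```python
from itertools import product

levels = range(1, 5)                       # four levels per dimension
stimuli = list(product(levels, repeat=3))  # (gloss, albedo, bump) triplets

# All pairings of two different stimuli (self-comparisons excluded).
pairs = [(a, b) for a in stimuli for b in stimuli if a != b]

repetitions = 4
sessions = 18
total_trials = len(pairs) * repetitions
trials_per_session = total_trials // sessions

print(len(stimuli), len(pairs), total_trials, trials_per_session)
```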
Data Analysis
To analyze the data, we modeled observers’ judgments within the MLCM framework. The following equations describe the formal perceptual decision model assumed by MLCM—that is, how stimulus dimensions are combined into an internal decision variable that determines choice probabilities. These equations represent the mathematical assumptions of the model rather than a direct claim about the neural implementation of gloss perception.
Two surfaces are presented simultaneously to an observer:
When observers are encouraged to indicate which surface is glossier, an estimate of perceived gloss
If only one of the secondary features (albedo or bumpiness) is additionally contributing to the perceptual estimate, then it is better described by
Both gloss estimates from the two presented surfaces are then compared and the difference is computed:
In the additive model, bumpiness, albedo, and gloss contribute linearly and independently to the perceptual estimate. That is, the contamination of perceived gloss by albedo and/or bumpiness does not depend on the gloss level.
For each gloss, albedo, and bumpiness level, we estimate the parameters
However, from equation (4) we can conclude that any parameter will account equally well if we add or multiply it by a constant c. Therefore,
If perceived lightness and perceived bumpiness of the surface have no additional influence on perceived gloss, we are left with the independent observer model:
For the full model, perceived lightness, bumpiness, and glossiness are modeled with interaction terms, bringing the number of free parameters to 63 that are linked as follows:
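The three nested models described in this section can be summarized compactly. The notation below is an assumption chosen for illustration, following the general form of the MLCM decision model (Knoblauch & Maloney, 2012); it does not reproduce the numbered equations of the article. Here i, j, and k index the gloss, albedo, and bumpiness levels of one surface, l, m, and n index those of the other surface, and ε is zero-mean Gaussian decision noise.

```latex
\begin{align*}
\text{Independent model:}\quad & \psi_{ijk} = \psi^{G}_{i} \\
\text{Additive model:}\quad    & \psi_{ijk} = \psi^{G}_{i} + \psi^{A}_{j} + \psi^{B}_{k} \\
\text{Full model:}\quad        & \psi_{ijk}\ \text{estimated separately for each cell } (i,j,k) \\
\text{Decision variable:}\quad & \Delta = \psi_{ijk} - \psi_{lmn} + \varepsilon,
                                 \qquad \varepsilon \sim \mathcal{N}(0,\sigma^{2})
\end{align*}
```

Under this sketch, the observer judges the first surface as glossier when Δ > 0. Because only differences enter the decision variable, the scale has to be anchored, for example by fixing the lowest level of each dimension to zero; in the full model, anchoring one of the 4 × 4 × 4 = 64 cells leaves the 63 free parameters mentioned above.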
We used the framework from Ho et al. (2008), in which the model makes no assumption about the direction of the effects: estimated parameters can be positive or negative, and the function could also be nonmonotonic. We analyzed our data using the MLCM package (Knoblauch & Maloney, 2012, 2014) in the open-source software R (R Core Team, 2021) to estimate the perceptual scale values and model the contribution of all three features: gloss, albedo, and bumpiness. The 3D implementation does not require any modification of the package or underlying algorithm; the additional stimulus dimension is incorporated directly into the design matrix within the standard MLCM framework. All three models are nested within each other and were tested against each other using a likelihood ratio test of nested models using the
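A likelihood ratio test of nested models compares twice the difference in log-likelihood against a chi-square distribution whose degrees of freedom equal the difference in free parameters. The sketch below is a generic illustration, not the R code used in the study. The log-likelihood values are hypothetical, and the parameter counts of 3 (independent) and 9 (additive) are assumptions based on anchoring the lowest level of each four-level dimension; only the count of 63 for the full model is stated in the text. The closed-form survival function used here is valid for even degrees of freedom only.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(ll_restricted, ll_general, df):
    """Return the test statistic and its p value for two nested models."""
    statistic = 2.0 * (ll_general - ll_restricted)
    return statistic, chi2_sf_even_df(statistic, df)


# Hypothetical log-likelihoods, for illustration only.
N_PARAMS = {"independent": 3, "additive": 9, "full": 63}
df_add = N_PARAMS["additive"] - N_PARAMS["independent"]
stat, p = likelihood_ratio_test(-5200.0, -5150.0, df_add)
print(stat, p < 0.0125)
```

The final line compares the p value against the Bonferroni-corrected threshold of .0125 used for the four observers.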
Because the MLCM models were estimated using maximum likelihood on binary forced-choice data, a traditional coefficient of determination (R²) based on variance decomposition is not applicable. The classical R² assumes a linear model with continuous outcomes and quantifies explained variance, which is not defined for likelihood-based models of categorical responses. To provide an index of model fit, we therefore report McFadden's pseudo-R². This measure quantifies the proportional improvement in log-likelihood of a fitted model relative to a baseline/null model and is commonly used for maximum likelihood frameworks such as logistic regression and conjoint measurement. Importantly, this does not represent variance explained in the traditional sense, but rather the relative improvement in model fit under the likelihood framework. Pseudo-R² values were computed separately for each observer, consistent with our individual-level model fitting. We report the improvement of the additive model relative to the independent model, and the improvement of the full model relative to the additive model. In likelihood-based contexts, McFadden's pseudo-R² values around 0.20 are generally considered to indicate strong model fit, and values above 0.40 are typically interpreted as very strong improvements over the baseline model (McFadden, 1974).
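McFadden's pseudo-R² is defined as one minus the ratio of the fitted model's log-likelihood to the baseline model's log-likelihood. The sketch below shows the computation with hypothetical log-likelihood values; the function name is an assumption, and the numbers are not taken from the study.

```python
def mcfadden_pseudo_r2(ll_model, ll_baseline):
    """1 - LL(model) / LL(baseline); both log-likelihoods are negative."""
    if ll_baseline == 0:
        raise ValueError("baseline log-likelihood must be nonzero")
    return 1.0 - ll_model / ll_baseline


# Hypothetical values: the richer model raises the log-likelihood
# from -100 to -80, a 20% proportional improvement.
example_r2 = mcfadden_pseudo_r2(-80.0, -100.0)
print(round(example_r2, 3))
```

In the comparisons reported here, the baseline is the simpler of the two nested models (the independent model for the additive fit, and the additive model for the full fit).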
Observers
Four students/staff (three females) from the University of Iceland participated in the experiment. All except the author were naïve to the purpose of the study, and all had normal or corrected-to-normal vision. Participants gave written informed consent. All experiments were performed in accordance with the Declaration of Helsinki.
Apparatus
Stimuli were displayed on a 24-inch calibrated LCD monitor (ASUS VX248H; resolution 1,920 × 1,080) with a linearized gamma using MATLAB R2019a and Psychtoolbox-3, running on a desktop PC with Windows 10.
Results
Observers were presented with surfaces that varied in three material properties: specular reflectance, albedo, and bumpiness. They were instructed to judge which of two simultaneously presented surfaces appeared glossier. Figure 2 shows the estimated perceptual scales under the additive model for each observer and averaged across observers. The different colors correspond to the contribution of each material property to perceived gloss. Unsurprisingly, specular reflectance elicited the strongest contribution to perceived gloss. However, both additional properties (albedo and bumpiness) also influenced gloss perception, as indicated by nonzero parameter estimates for those dimensions (see purple and green lines in Figure 2).

Figure 2. Results from the experiment under the additive model. Normalized parameter fits from the additive model for each observer (a–d) and averaged across all observers (e). Each panel plots perceived gloss as a function of specular reflectance (blue), albedo (green), and bumpiness (red). Error bars indicate the standard error of the mean.
We formally tested whether including albedo and bumpiness improved the description of observers’ judgments by comparing the independent and additive models using a likelihood ratio test.
These observations were confirmed by nested hypothesis tests at the Bonferroni-corrected level (α = .0125), comparing the independent model (in which irrelevant dimensions are constrained to zero) with the additive model. For all four observers, the additive model provided a significantly better fit (all observers: p < .001), demonstrating that both albedo and bumpiness systematically bias perceived gloss.
In the additive model, gloss was scaled from 0 (lowest gloss level) to 1 (highest gloss level). For ease of interpretation, this scale was linearly rescaled to 0%–100%. The fitted parameter at level 4 for albedo and bumpiness therefore reflects how strongly moving from the lowest to the highest level of that dimension shifts perceived gloss on the same perceptual scale. Because MLCM estimates perceptual scale values rather than direct physical units, the fitted parameters reflect shifts along an internally derived gloss scale.
For bumpiness, the fitted level 4 parameters were 0.50 (O1), 0.44 (O2), 0.38 (O3), and 0.21 (O4), corresponding to shifts of 50%, 44%, 38%, and 21% of the gloss range, respectively. The corresponding albedo parameters were 0.15 (O1), 0.36 (O2), 0.14 (O3), and 0.12 (O4), corresponding to 15%, 36%, 14%, and 12% of the gloss range. Averaged across observers, lighter surfaces reduced perceived gloss by around 19.25% of the gloss range, whereas bumpier surfaces increased perceived gloss by 38.25%. These values represent shifts on the gloss perceptual scale and do not indicate proportional contributions that sum to 100%. Scale values were monotonic across stimulus levels for all dimensions, confirming consistent influence of the physical manipulations of the surface. (All parameter estimates of the independent and the additive model are provided in the Supplemental materials.)
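The observer averages quoted above follow directly from the four level 4 parameters per dimension. The sketch below simply recomputes them from the values reported in the text; the variable and function names are illustrative.

```python
# Level 4 parameters reported in the text, per observer.
bumpiness = {"O1": 0.50, "O2": 0.44, "O3": 0.38, "O4": 0.21}
albedo = {"O1": 0.15, "O2": 0.36, "O3": 0.14, "O4": 0.12}


def mean_shift_percent(shifts):
    """Mean shift across observers, as a percentage of the gloss range."""
    return 100.0 * sum(shifts.values()) / len(shifts)


print(round(mean_shift_percent(bumpiness), 2))
print(round(mean_shift_percent(albedo), 2))
```

The two printed values correspond to the averaged bumpiness and albedo shifts, expressed on the 0%–100% gloss scale.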
The additive model assumes that the influences of albedo and bumpiness are independent of the specular reflectance of the surface. As previously described, the full model includes additional interaction terms to test for such dependencies. Figure 3 visualizes the perceptual gloss scale values predicted by the full model for all combinations of gloss, albedo, and bumpiness (4 × 4 × 4), averaged across observers. A nested hypothesis test comparing the full model against the additive model (Bonferroni-corrected α = .0125) revealed that the full model provided a significantly better fit for all four observers (all observers: p < .001). This indicates that at least one interaction between the stimulus dimensions significantly contributes to perceived gloss.

Figure 3. Results from the experiment under the full model. Normalized parameter fits from the model separately for each shape. Perceived gloss is plotted against the four albedo levels. Each line corresponds to one of the four gloss levels. Error bars indicate the standard error of the mean.
The pattern shown in Figure 3 suggests that these interactions primarily reflect modulations of perceived gloss through the bumpiness and albedo of the surface. The contribution of bumpiness is nearly absent at high gloss levels, where surfaces appear strongly glossy (flat dark pink curve), whereas at lower gloss levels bumpiness exerts a pronounced influence: the bumpier the surface, the glossier it appears. A similar pattern is observed for albedo, whose influence is strongest at intermediate gloss levels and reduced for very matte or highly glossy surfaces.
To assess the magnitude of improvement beyond the likelihood ratio tests, we computed McFadden's pseudo-R² values for each model comparison. For observer O1, the additive model substantially improved fit relative to the independent model (McFadden's R² = 0.316), and the full model provided an additional improvement over the additive model (McFadden's R² = 0.181). For observer O2, the additive model showed a very strong improvement over the independent model (McFadden's R² = 0.577), whereas the full model provided a smaller additional improvement over the additive model (McFadden's R² = 0.120). For observer O3, the additive model improved fit relative to the independent model (McFadden's R² = 0.205), and the full model provided an additional improvement over the additive model (McFadden's R² = 0.300). For observer O4, the additive model improved fit relative to the independent model (McFadden's R² = 0.209), and the full model provided an additional improvement over the additive model (McFadden's R² = 0.121). Across observers, the pseudo-R² values fall within ranges typically interpreted as indicating moderate to very strong improvements in model fit in likelihood-based models. Together, the likelihood ratio tests and pseudo-R² values indicate that both additional dimensions and their interactions contribute meaningfully to perceived gloss across observers.
While it is well established that both surface shape and albedo influence perceived gloss, the present approach quantifies their contributions simultaneously and reveals how these cues interact with specular reflectance across the full stimulus range. The 3D-MLCM results therefore provide a comprehensive, model-based description of how multiple material properties jointly shape gloss perception, offering a principled way to map the structure of the multidimensional gloss space.
Discussion
The present study used 3D-MLCM to quantify how multiple material properties jointly influence perceived gloss. By varying specular reflectance, albedo, and bumpiness simultaneously, we were able to assess their independent contributions as well as their interactions within a unified psychophysical framework. Across all observers, specular reflectance exerted the strongest influence on perceived gloss, yet both albedo and bumpiness also systematically biased gloss judgments. Across all observers, bumpiness exhibited a consistently stronger influence on perceived gloss than albedo. Notably, this effect diminished at high specular reflectance levels, where surfaces already appeared highly glossy, but the effect remained robust at medium and low gloss levels. Although it is well established that albedo and bumpiness can alter perceived gloss (Hansmann-Roth & Mamassian, 2017; Ho et al., 2008; Marlow et al., 2012; Marlow & Anderson, 2013; Pellacini et al., 2000; Qi et al., 2015; Vangorp et al., 2007), the present study provides the first simultaneous, quantitative assessment of these cue interactions using a fully 3D-MLCM approach.
Several studies have proposed that the visual system relies on image-based cues, particularly the contrast, sharpness, and spatial extent of specular highlights when judging gloss (Beck & Prazdny, 1981; Berzhanskaya et al., 2005; Fleming, 2014; Marlow et al., 2012; Marlow & Anderson, 2013; Olkkonen & Brainard, 2011). According to these accounts, changes in albedo and bumpiness alter the appearance of the highlight structure in ways that systematically bias perceived gloss. Darker surfaces increase highlight contrast, making them appear glossier, whereas lighter surfaces reduce this contrast and therefore diminish perceived gloss. Similarly, increased bumpiness produces sharper yet spatially localized highlights that can be interpreted as increased glossiness. Our results align with this cue-based interpretation. Although specular reflectance remained the dominant determinant of gloss, both albedo and bumpiness contributed consistently across observers, in a way that matches how these properties reshape highlight contrast and sharpness. Importantly, the strong influence of bumpiness at low gloss levels, and its near absence at high gloss levels, fits with the idea that secondary cues are most influential when the primary feature is ambiguous. At high specular reflectance, highlights become sharply defined and highly informative, reducing the influence of other cues.
A key advantage of the 3D-MLCM approach is that it allows the simultaneous quantification of multiple material cues and their interactions, something that traditional one-dimensional or 2D methods cannot achieve. This is particularly important in material perception, where surface properties rarely vary independently and the visual system must integrate multiple cues to infer appearance. The 3D-MLCM framework not only determines how strongly each dimension contributes to perceived gloss, but also captures how these contributions change across different levels of specular reflectance. The additive model revealed robust influences of both albedo and bumpiness on gloss, and the full model showed that these effects are themselves modulated by the gloss level of the surface. Together, these findings highlight the value of explicitly modeling multidimensional cue interactions rather than assuming cue independence. Importantly, the same multidimensional MLCM framework can be applied to any perceptual domain in which multiple features jointly shape appearance, making it a broadly useful tool for vision science beyond material perception.
The 3D-MLCM approach requires a large number of trials due to the combinatorial increase in stimulus comparisons, which constrained the present study to a small observer sample. Despite this, the consistency of patterns across observers and the monotonicity of the estimated scales indicate that the method yields stable and interpretable results at the individual level. Recent work has begun to extend MLCM into higher-dimensional spaces. Sun et al. (2021), for example, applied a 3D-MLCM framework to study texture regularity, which demonstrates that the method is feasible and yields reliable perceptual estimates even with a small sample of observers. Our study contributes to this development by showing how 3D-MLCM can be used to characterize the joint influence of multiple material properties.
Overall, the present work highlights that multidimensional MLCM is an essential methodological step beyond traditional 2D implementations. By extending MLCM into a fully 3D framework, we show how multiple dimensions can be quantified simultaneously, how their contributions compare in magnitude, and how interactions vary across the full physical range. This establishes multidimensional MLCM as a general tool for mapping visual appearance spaces.
Supplemental Material
sj-pdf-1-ipe-10.1177_20416695261435363 - Supplemental material for The influence of specular reflectance, albedo, and shape on perceived gloss: A case for three-dimensional maximum likelihood conjoint measurement (MLCM)
Supplemental material, sj-pdf-1-ipe-10.1177_20416695261435363 for The influence of specular reflectance, albedo, and shape on perceived gloss: A case for three-dimensional maximum likelihood conjoint measurement (MLCM) by Sabrina Hansmann-Roth in i-Perception
Footnotes
Acknowledgments
Part of this work was inspired by a previous discussion with Kenneth Knoblauch, and part of the data was presented during the MLDS and MLCM symposium at the European Conference on Visual Perception (ECVP) in Leuven. The author wishes to thank the organizers of this symposium and all observers for participating in this study.
Author Contribution(s)
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Sabrina Hansmann-Roth was funded by a grant from the Icelandic Research Fund (#239774-051).
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
References
