Abstract
Additively manufactured (AM) parts typically possess high surface roughness (∼5–30 μm) and large surface features, resulting from balling, partial sintering/melting, and staircase effects. However, there are few widely accepted/adopted methodologies for measuring and characterising AM surfaces. This research proposes a practical, reliable, and repeatable methodology for the measurement, analysis and characterisation of surfaces produced via AM. Various line and surface roughness parameters are measured on the top and side faces of AM cubes, using both tactile and optical profilometers. The line-based tactile roughness measurement approach is considerably faster than surface measurements but offers a limited range of accuracy. The study has undertaken an exhaustive analysis to establish the applicability and degree of accuracy of line-measured roughness metrics, when benchmarked with full-surface measurements. An analysis of the limited range of applicability of line measurements is provided within a framework of probabilistic uncertainty analysis, performed on measured roughness data. This study also explores alternative methods for reducing the measurement time required without compromising the reliability of the results, by decreasing the resolution of the scanned data. It is demonstrated that deviations in surface parameters remained within 2% when the resolution was reduced by 50%; consequently, the measurement time was reduced by ∼75%.
Keywords
Highlights
A practical, reliable, and repeatable methodology for the measurement, analysis and characterisation of surfaces produced via additive manufacturing.
A comparative analysis of the AM surface parameters measured by tactile and optical profilometry.
Identifying three distinct roughness ranges and the associated degrees of accuracy with which line measurements can capture the surface roughness in these ranges.
Areal measurements are suitable when describing maximum values of the surface features/asperities or the shape of the surface.
Reduction in measurement time by ∼75% when reducing the device resolution by 50%, with deviations in measured surface parameters remaining within 2%.
Introduction
Background
In recent years a significant uptake of Additive Manufacturing (AM) within industry has been seen, although widespread adoption of AM in the manufacturing sector is yet to be fully realised. 1 The parts produced via high power laser-based metallic AM processes, such as powder bed fusion (PBF) and directed energy deposition (DED), typically suffer from adverse surface integrity issues, such as high surface roughness (5–30 µm Ra), porosity, balling, and tensile residual stresses. Of these, surface roughness is a major concern for the functional and aesthetic properties of AM components, as high roughness can detrimentally affect their tribological characteristics, fatigue life, and corrosion resistance.2,3 As AM has matured, numerous comprehensive studies have been undertaken to improve the AM parts’ surface finish via process parameter optimisation,4,5 and through various post-processing techniques.6,7 The quantification of any surface changes achieved necessitates effective measurement and characterisation of AM surfaces. AM remains relatively immature when compared to traditional manufacturing processes, and while there have been numerous attempts to accurately measure and characterise AM surfaces,8,9 to date there are few comparative studies which effectively measure, analyse, and characterise AM surfaces using different measurement techniques. Therefore, no standard practice for producing accurate and reliable results has been established.
The measurement of AM surfaces is particularly challenging due to their characteristic morphology; high roughness values, tall and high aspect-ratio features, and re-entrant features (overhangs and undercuts). These irregular, complex features present difficulties for different measurement systems and techniques. Furthermore, these features are distributed across the surfaces, and often do not exist in any discernible pattern. Thus, relatively large measurement areas are required to ensure all features are captured effectively by the analysis, increasing measurement times and analysis complexity.
A further challenge faced when defining the morphological features of AM surfaces is to identify appropriate surface parameters that can be used to effectively differentiate one surface from another. Research has shown that skewness can differentiate between top and side surfaces of laser PBF (L-PBF) polyamide-12 samples, with side surfaces typically having negative skewness values whereas the top surfaces exhibited positive skewness. 9 A study has also been undertaken to develop algorithms to detect and segment features from a surface measurement. 10 This has the potential to improve the understanding of the unique structures and characteristics of AM surfaces (e.g., pores and adhered particles).
The challenge of determining appropriate parameters is not confined to AM; however, other applications have the benefit of history and experience to guide them. For example, in tribology, the material ratio (ratio of bearing length to evaluation length) is considered a more crucial parameter for assessing the tribological behaviour of bearing surfaces than the average height of the surfaces. 11 It is therefore imperative to identify which parameters are significant in characterising the specific morphologies present on AM surfaces. Without a standardised methodology, there is the potential for significant inconsistencies in results between different research groups, individuals, and measurement methods, leading to difficulties in comparing and determining the validity of the resulting data. Furthermore, in manufacturing industries, with the increasing prevalence of AM there is a need for a standard procedure for assessing and quantifying the quality of AM surfaces, to enable certification and acceptance of parts between suppliers and customers.
Surface measurement techniques
In modern metrology there are two broad categories of measurement system, dependent on whether the measurement is conducted by direct contact of a probe with the surface (tactile measurement) or without any physical contact between them (non-tactile). Most tactile systems collect two-dimensional (2D) roughness information, which is the intersection of the surface topography with the plane of the stylus motion. A series of these 2D profiles, gathered at regular intervals along the axis normal to the profiles, can be used to reconstruct the corresponding 3D, or areal, surface height maps, an example of which is shown in Figure 1. Tactile measurements remain popular due to the well understood measurement characteristics and relative ease of operation. 12

Example of a surface height map.
Within the non-contact class of profilers, the most common systems use optical sensors to collect an image of a surface. Many optical techniques are available, including confocal microscopy (CM), interferometry, and focus variation (FV). 13 Confocal microscopy utilises incident light passing through a pinhole before illuminating the surface to be measured. The reflected light also passes through a pinhole and the intensity is detected. When the surface is in focus, the detected light intensity is greatest, and it reduces rapidly when the surface is out of focus. The distance between the optics and surface can be modulated to gather information for a range of Z-heights, and when combined with the X-Y traverse, surface height maps can be constructed. 13 Interferometry uses laser light to measure the difference in lengths between two beam paths; this is achieved by splitting the incident laser beam along two paths, the reference length and the test length (distance to sample surface). The beams are recombined and directed to a detector; the phase shift between them is then used to calculate the difference in length, and therefore the height of a point. 14 Finally, FV uses very short focal length optics to photograph a surface at various Z-heights to generate an image stack. Each pixel is then analysed to determine at which point in the stack it is at maximum focus (highest contrast to a neighbouring region of pixels). The height of greatest focus corresponds to the surface height of the pixel.15,16 Comparative studies have been undertaken on the performance of these, and other techniques, for characterising AM surfaces. For example, de Pastre et al. 9 computed the difference in heights between different measurement techniques across a surface, while Senin et al. 17 focussed on the measurement differences around specific features.
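The focus variation principle described above can be illustrated with a minimal sketch. It assumes local grey-level variance as the contrast measure and a synthetic image stack; the function names, window radius and 5 µm step are hypothetical, and commercial instruments use their own, more elaborate focus measures.

```python
import numpy as np

def box_mean(img, radius):
    """Mean over a (2r+1) x (2r+1) window; wrap-around shifts are used for brevity."""
    acc = np.zeros_like(img, dtype=float)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            acc += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return acc / (2 * radius + 1) ** 2

def focus_measure(img, radius=2):
    """Local contrast, taken here as the grey-level variance around each pixel."""
    mean = box_mean(img, radius)
    mean_sq = box_mean(img * img, radius)
    return mean_sq - mean * mean

def height_from_stack(stack, z_positions, radius=2):
    """For each pixel, return the Z position of the frame with the highest local contrast."""
    measures = np.stack([focus_measure(frame, radius) for frame in stack])
    best = np.argmax(measures, axis=0)
    return np.asarray(z_positions)[best]

# Synthetic stack: a random texture whose contrast decays away from the true height.
rng = np.random.default_rng(0)
texture = rng.standard_normal((32, 32))
true_index = np.full((32, 32), 3)
true_index[:, 16:] = 7                      # a step: left half low, right half high
z_positions = np.arange(10) * 5.0           # hypothetical 5 um Z steps
stack = np.stack([
    texture * np.exp(-((k - true_index) / 1.5) ** 2) for k in range(10)
])
heights = height_from_stack(stack, z_positions)
```

Pixels whose window straddles the step (or wraps around the image border) mix two heights, which loosely mirrors the difficulty such systems have near steep features.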
Other areal measurement techniques include conoscopic holography, atomic force microscopy, and elastomeric sensors, whereas 2D imaging methods involve optical and scanning electron microscopy, while volumetric measurement techniques encompass X-ray computed tomography (XCT). 18 Heinl et al. 19 investigated the suitability of various optical surface detection techniques such as FV, fringe projection and CM alongside the established profilometry technique to evaluate the AM surface profiles. FV has been found to be suitable as an areal surface detection method for surface evaluations, although an additional adjustment of the linear profilometer measurement was also suggested. When comparing the capabilities of FV and XCT in measuring AM surfaces, FV was typically found to exhibit a finer resolution than XCT, whereas XCT could capture re-entrant features and displayed no non-measured points, in contrast to FV. 20 Various in-situ process monitoring and metrology techniques (thermocouples, high-speed cameras, pyrometers, photo detectors etc.) have also been implemented for characterising AM parts (defects, pores); however, such systems are largely limited to producing information about the component surface. 21 Different optical measuring methods, such as conoscopic holography, photogrammetric scanning and structured light scanning, have been used by Guerra and Lavecchia 22 to characterise AM freeform artefacts. It was observed that an increase in the cut-off values applied resulted in a decrease in the differences between the measuring instruments, achieving more convergent data. McGregor et al. 23 implemented the XCT method to inspect internal features of polymer AM nozzles. Giganto et al. 24 evaluated the capability of different optical inspection systems commonly used in industry, such as laser triangulation, conoscopic holography and structured light techniques, and assessed the effect of applying filters to digitise metal AM parts. They concluded that filtering did not have a significant influence, regardless of the sensor used. However, the geometrical results were strongly affected by the point-cloud quality. A real-time scanning method using optical scattering tomography has been developed by Orth et al. 25 for in situ defect detection and correction of AM parts. By comparing various AM surface measurement techniques (confocal, interferometry, FV and XCT), Thompson et al. 26 inferred that no one measurement system is superior to the others and that the selection should not be solely based on the measurement performance.
Each of the available technologies has both strengths and drawbacks in AM applications. There is considerable interest in research to use optical profilometry to characterise AM surfaces, due to the direct acquisition of areal data and reduced acquisition time compared with the tactile methods. 13 AM surfaces, however, are challenging to measure optically due to the high aspect ratio of features, large maximum deviation of the surfaces, and their optical properties. 17 AM surfaces often feature regions of high reflectivity, dispersed over a comparatively duller background. As optical metrology systems measure based on the reflected light from a surface, the variation can lead to errors in the calculated height of points, or complete failure to measure points. 27 A contact profilometer can also face difficulties, primarily due to the long measurement times associated with areal measurements by these systems. Tactile surface profilers display other related issues, including the smoothing effect of the stylus tip, risk of losing contact (skipping), limited vertical range, and risk of damaging soft materials (such as polymers and aluminium). 14 Neither technology can evaluate re-entrant features, such as undercuts, which may be present on some AM surfaces. 28
Recent research has successfully utilised several optical systems to evaluate AM surfaces, with some studies quantifying the discrepancy between topography computation.17,26 It has been shown that, for small scale features, the discrepancies in results between measurement technologies can be of a similar magnitude to the features being measured. However, there has been less focus on comparing optical methods with 3D tactile measurement techniques, which often only involved 2D line profiling. 15 For example, de Pastre et al. 9 evaluated surfaces using a stylus-based tactile system and various optical systems, extracting profile data from the acquired areas from which parameters were calculated. It was found that the contact measurements consistently reported mean roughness values (Ra, Rq) lower than those recorded by the optical systems. It is well known that line and area parameters should not be directly compared due to the differences in computation, and the possibility of line scans omitting important features. 27 Therefore, it is imperative to evaluate AM surfaces using 3D areal contact-based methods to ensure confidence that the same features are recorded and all generated surface parameters are directly comparable.
Aim of the research
The nature of surfaces produced by AM is unique. Typically, AM surfaces do not possess a strong lay, as opposed to those generally observed on the surfaces produced via conventional subtractive manufacturing (SM) processes, such as turning, milling, and grinding. Furthermore, the AM surfaces also exhibit high variability on a local scale. Due to these factors, standard two-dimensional (2D) metrology solutions do not capture sufficient data to effectively characterise an AM surface. Conversely, three-dimensional (3D) surface measurement techniques demonstrate an improved capability for capturing the AM topography features; 13 however, 3D techniques typically require greater measurement times to collect and process the data. Furthermore, as AM surfaces are usually much rougher (Ra 5 µm to 15 µm) than their SM counterparts (machined surfaces typically have 0.8 µm to 6.3 µm Ra 14 ), the standard recommended 2D evaluation length (ln) for AM surfaces is much longer (40 mm based on an Ra > 10 µm, compared to under 4 mm for SM surfaces, where Ra < 2 µm). 29 Such a long ln cannot always be acquired, for example on parts with dimensions smaller than ln. Selection of appropriate lateral resolution when using the FV technique has also been recommended by Liu. 20 Thus, it is necessary to propose alternative methods to define and evaluate the right size of the area, when using 3D measurement techniques, to effectively represent the AM surface.
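The dependence of the evaluation length on Ra can be made concrete with a small lookup. The sketch below assumes the commonly quoted ISO 4288 ranges for non-periodic profiles, with ln taken as five sampling lengths; only the 40 mm (Ra > 10 µm) and 4 mm (Ra < 2 µm) values appear in the text above, so the remaining table entries should be checked against the standard. The function name is hypothetical.

```python
def iso4288_lengths(ra_um):
    """Return (cut-off in mm, evaluation length ln in mm) for a non-periodic profile.

    Assumed table: upper Ra bound (um) -> cut-off (mm); ln = 5 x cut-off.
    Values below the smallest tabulated Ra simply fall into the first row.
    """
    table = [
        (0.02, 0.08),
        (0.1, 0.25),
        (2.0, 0.8),
        (10.0, 2.5),
        (80.0, 8.0),
    ]
    for ra_upper, cutoff in table:
        if ra_um <= ra_upper:
            return cutoff, 5 * cutoff
    raise ValueError("Ra is outside the tabulated range")

# A rough AM surface (Ra = 15 um) on a 10 mm cube face:
cutoff_am, ln_am = iso4288_lengths(15.0)
fits_on_cube = ln_am <= 10.0   # False: the recommended length exceeds the part size
```

For a 10 mm cube such as those used in this study, the recommended length for Ra > 10 µm cannot be accommodated on a single face, which is the practical problem the paragraph describes.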
The aim of this research is to develop a practical, reliable, and repeatable methodology for the measurement and analysis of the surface roughness of AM parts. In particular, the study investigates the applicability and degree of accuracy of line-measured roughness metrics, when benchmarked with full-surface measurements. Analysis of the limited range of applicability of line measurements is undertaken within a framework of probabilistic uncertainty analysis performed with the measured roughness data. The study further explores alternative ways to reduce the measurement time without compromising the reliability of the results, by examining the effect of decreasing the resolution of the scanned data.
Methodology
Fabrication of AM specimens
Additively manufactured specimens (10 × 10 × 10 mm³) were fabricated using a Renishaw AM250 L-PBF machine, from gas atomised AlSi10Mg alloy powders, as supplied by Renishaw plc. A set of 27 cubes was produced using a design of experiments, with one replication, involving the following parameters: layer height 25 μm, laser power 200 W, point distance varied between 64 and 96 μm, exposure time varied between 112 and 168 μs, and hatch spacing varied between 64 and 96 μm.
Three of these samples were selected based on having either a visually good or poor surface finish, along with exhibiting a range of features common to AM surfaces (adhered particles, tall peaks, different optical properties, etc.) to allow a thorough evaluation of such features. Examples of these surfaces are shown in Figure 2, where some of the key features can be seen. These specimens will henceforth be referred to as S1, S2, and S3. Measurements of the surface roughness were performed on the top (perpendicular to build direction) and side (along build direction) faces of the samples.

Optical images of three as-built AM parts, showing surfaces with visually (qualitatively) the (a) best, (b) medium, and (c) worst surface topography. The ‘balling’ features are shown in red circles.
Surface measurement procedure
The top and side faces of the selected L-PBF parts were measured and analysed for roughness and topography characteristics using tactile and optical profilometry. Tactile measurements were carried out using a Taylor Hobson Form Talysurf 2 contact-based profilometer to collect both line (2D) (designated T2D) and areal (3D) (designated T3D) datasets. 2D measurements were taken due to their prevalent usage within industry, well-defined characteristics, and short measurement time. For the optical profilometry, two optical sensors were employed for generating areal datasets: one with automated stitching of multiple fields-of-view (FoV) (Alicona G5 InfiniteFocus system, 30 designated OAS) and the other with a manual stitching strategy (Sensofar Smart microscope, 31 designated OMS). Of the three available data capturing options (confocal microscopy, interferometry and FV), only the FV technology was used in the current study, due to its reduced measurement time and low percentage of non-measured points (NM points) when compared to the other two methods. Following surface measurement, the data were stitched (when using OMS), analysed and compared using MountainsMap software by Digital Surf. 32 The measuring instruments and their respective settings are listed in Table 1.
Different surface measurement systems and the settings used for the respective measurements.
The workflow for the areal datasets is shown in Figure 3. A 2.5 µm S-Filter, based on using a 2 µm stylus tip, was used to account for resolution differences between the different systems used. The 2D data were collected according to ISO 1134 and processed in a similar fashion; however, the maximum surface parameter values (e.g., maximum recorded Ra) were used for further analysis, as per the same ISO standard. 33

Processing flow for areal 3D measurements; dotted lines indicate that not all datasets required that operation.
The nominal measured area for 3D datasets (using both tactile and optical methods) was chosen as 8 × 8 mm², to capture as many features from the surfaces as possible, while avoiding edge effects, as reported previously. 34 The OAS system automatically scanned and stitched areas larger than the FoV of the chosen objective and thus the number of stitched regions was set by the system itself. Due to the large number of data points acquired, the OAS system reduces the file size by down-sampling the results, the effect being to reduce the resolution and increase the apparent pixel size on the measured surface. The OMS system required manual imaging of the surface, followed by stitching the resulting array of collected images via the MountainsMap software. To achieve the desired 8 × 8 mm² area using the selected objective lens, an array of 5 × 6 images (i.e., the FoVs of size 1.7 × 1.42 mm²) was used for manual stitching. An additional 10 profile measurements were conducted (T2D), in line with ISO 1134, 33 with 8 mm evaluation length captured from each surface for comparison, arranged across the surfaces as shown in Figure 4.

Orientation of line profiles taken across sample surfaces. Sample edges are shown with dashed lines. All dimensions are in mm.
Localisation of the scanned datasets
When comparing multiple measurements from the same surface, either using different measurement systems or after mechanical testing (e.g., when evaluating wear tracks) it is important to ensure that the same area is being evaluated. In this study, some misalignment in the results was expected when comparing the scanned areas obtained from different measurement devices, and therefore corrections were required.
To localise the different surfaces, first matching features must be identified on the surfaces to act as reference points for the following processes (Figure 5(a)). Next, translations and rotations are applied to the surfaces to align identified features (Figure 5(b)). Finally, the output images are cropped to extract the matching areas (Figure 5(c)). This can be achieved through metrology software, such as MountainsMap, or bespoke algorithms, such as those described by Senin et al. 17 During the localisation process it is useful to designate one dataset to be the reference, to which all others are compared after applying appropriate transformations.

Schematic of the localisation process, (a) identify matching features on the surfaces – denoted by red circles, (b) translate and/or rotate the surfaces so the matched features align, (c) crop the surfaces to leave only the corresponding areas.
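The translate-and-rotate step of the localisation can be sketched as a least-squares rigid fit between matched landmarks. This is a generic SVD-based (Kabsch-type) estimate written for illustration; it is not the MountainsMap localisation operator, and the landmark coordinates and function name below are hypothetical.

```python
import numpy as np

def rigid_transform_2d(src, dst):
    """Least-squares rotation R and translation t such that dst ~ src @ R.T + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centred landmark sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Hypothetical landmark positions (mm) found on the reference dataset ...
src = np.array([[1.0, 1.2], [6.5, 0.8], [3.9, 7.1], [7.2, 6.4]])
# ... and the same features in a second dataset, rotated by 3 degrees and shifted.
theta = np.deg2rad(3.0)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
t_true = np.array([0.20, -0.10])
dst = src @ R_true.T + t_true

R, t = rigid_transform_2d(src, dst)
aligned = src @ R.T + t                      # landmarks mapped onto the second dataset
angle_deg = np.rad2deg(np.arctan2(R[1, 0], R[0, 0]))
```

Once the transform is known, the height maps can be resampled onto a common grid and cropped to their overlapping region; that step is omitted here.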
For this work, datasets were aligned using the localisation operator within MountainsMap. An initial ‘coarse’ alignment was carried out manually, which was necessary because the automated feature is limited to correcting small rotational differences; the automated process was then applied. To ensure the extracted surfaces from all three data sets matched, this process was repeated between each pair of surfaces, applying subsequent localisations to the previously extracted areas.
It is necessary to localise the data to ensure the same features are being analysed in all subsequent comparisons. As can be seen from Figure 6, there are several features present on the original surfaces (a, b) that are beyond the bounds of the cropped surface (c, d). It follows, therefore, that any subsequent parameter calculations would be affected by including these outlying areas, along with any measurement differences that may be present. As the goal of this study is to evaluate the comparative performance of these measurement systems, eliminating variables (such as subtle differences in measured areas) which may affect the results is essential to enable accurate comparisons to be made.

Example of a surface measured using two different measurement systems: (a), (b) before, and (c), (d) after localisation, with landmarks highlighted. (a) and (c) were captured using the optical system with automated stitching, while (b) and (d) with the tactile device. Heights are not affected by localisation; any apparent changes are due to subtle differences in the colour assignments.
Selection of surface parameters
The most common parameter used to evaluate surface roughness in engineering applications is the arithmetic mean height of deviations from the mean-line of a surface (Ra and Sa for 2D and 3D measurements, respectively); however, this alone does not provide much information about the overall form, appearance, or properties of a surface. 35 Two surfaces with the same Ra value can have very different morphologies (shown conceptually in Figure 7) and therefore can exhibit very different mechanical properties. It is therefore necessary to consider other parameters to effectively characterise a surface. 36 In this study, Ra, Sa, Ssq, Sku, Sp, Sv and Sz surface parameters were recorded for each of the samples.
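The point that equal mean roughness can hide very different morphologies can be checked numerically. The sketch below uses the usual discrete forms of the areal height parameters (Sa, Sq, Ssk, Sku, Sp, Sv, Sz) on a levelled height map with no non-measured points; the function name and the two synthetic surfaces are hypothetical and are not measured data.

```python
import numpy as np

def areal_height_parameters(z):
    """Discrete areal height parameters of a levelled height map (same units as z)."""
    z = np.asarray(z, dtype=float)
    dev = z - z.mean()                       # deviations from the mean plane
    sq = np.sqrt(np.mean(dev ** 2))
    sp = dev.max()
    sv = -dev.min()                          # pit depth reported as a positive value
    return {
        "Sa": np.mean(np.abs(dev)),          # arithmetic mean height
        "Sq": sq,                            # root-mean-square height
        "Ssk": np.mean(dev ** 3) / sq ** 3,  # skewness
        "Sku": np.mean(dev ** 4) / sq ** 4,  # kurtosis
        "Sp": sp,                            # maximum peak height
        "Sv": sv,                            # maximum pit depth
        "Sz": sp + sv,                       # maximum height
    }

# Two synthetic surfaces: isolated peaks on a flat plane, and its mirror image (pits).
peaks = np.zeros((50, 50))
peaks[::10, ::10] = 10.0
pits = -peaks

p_peaks = areal_height_parameters(peaks)
p_pits = areal_height_parameters(pits)
```

Both surfaces return the same Sa, Sq and Sz, while the skewness changes sign and Sp and Sv swap, which is why the mean height alone cannot distinguish them.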

Example of two theoretical surfaces with the same arithmetic mean height (Ra).
Post-processing of the scanned datasets
Post-processing of the recorded datasets is important in order to characterise the features of interest, extracting form, waviness, and roughness, as is pertinent to the required functional properties of the parts. Appropriate filtering should also be selected to allow repeatability in measurements between samples, and across measurement processes. The previously described process flow (Figure 3) was used to ensure consistency between different measurement methods and to minimise any errors induced by the processing steps. Levelling was the first step used to remove any errors in fixturing. The next step involved filling up of the NM points on the surfaces through smooth interpolation. Finally, the surfaces were filtered, and parameters computed.
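The first two steps of this flow can be sketched as follows, assuming that levelling is a least-squares plane subtraction and that non-measured points are stored as NaN. The neighbour-averaging fill is a simple stand-in chosen for illustration; the interpolation used by the metrology software is not specified here, and all names are hypothetical.

```python
import numpy as np

def level_plane(z):
    """Subtract the least-squares plane fitted to the measured (non-NaN) points."""
    rows, cols = np.indices(z.shape)
    valid = ~np.isnan(z)
    A = np.column_stack([cols[valid], rows[valid], np.ones(valid.sum())])
    coeffs, *_ = np.linalg.lstsq(A, z[valid], rcond=None)
    plane = coeffs[0] * cols + coeffs[1] * rows + coeffs[2]
    return z - plane

def fill_non_measured(z, max_iter=100):
    """Fill NaN points iteratively with the mean of their measured 4-neighbours."""
    filled = z.copy()
    for _ in range(max_iter):
        missing = np.isnan(filled)
        if not missing.any():
            break
        padded = np.pad(filled, 1, constant_values=np.nan)
        neighbours = np.stack([
            padded[:-2, 1:-1], padded[2:, 1:-1],   # up, down
            padded[1:-1, :-2], padded[1:-1, 2:],   # left, right
        ])
        counts = np.sum(~np.isnan(neighbours), axis=0)
        sums = np.nansum(neighbours, axis=0)
        can_fill = missing & (counts > 0)
        filled[can_fill] = sums[can_fill] / counts[can_fill]
    return filled

# Synthetic example: a tilted flat surface with two adjacent non-measured points.
rows, cols = np.indices((20, 20))
z = 0.01 * cols + 0.02 * rows + 5.0
z[5, 5] = np.nan
z[5, 6] = np.nan

levelled = level_plane(z)              # step 1: remove the fixturing tilt
filled = fill_non_measured(levelled)   # step 2: fill the NM points
```

Filtering and parameter calculation would follow on the filled surface, in the order given above.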
With conventionally manufactured surfaces there are guidelines in ISO standards for the appropriate selection of filters, such as ISO 4288, 29 however previous studies clearly indicate that they do not translate well for analysing AM surfaces. 37 The present work therefore incorporates an assessment of the effect of filtering on the measured surfaces. A robust Gaussian filter was used for its superior performance in terms of reduced waviness profile deviations around step features, when compared with the standard Gaussian filter. 38 ISO 13565-1 suggests using a 0.8 mm λc filter for stratified surfaces, such as those characteristic of AM processes, with a 2.5 mm cut-off also being suitable in some circumstances. 39 This matches with findings by other authors, 37 and observations of maximum feature widths present on the examined surfaces.
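The role of the cut-off can be illustrated with the standard linear Gaussian profile filter, whose weighting function is s(x) = exp(−π (x / (α λc))²) / (α λc) with α = √(ln 2 / π), so that a sinusoid of wavelength λc is transmitted at 50%. The sketch below is this standard filter applied to a 1D profile, not the robust Gaussian filter used in the present work, and it ignores end effects; the function names and the test profile are hypothetical.

```python
import numpy as np

ALPHA = np.sqrt(np.log(2.0) / np.pi)   # gives 50 % transmission at the cut-off

def gaussian_waviness(profile, dx, cutoff):
    """Waviness (mean line) from a standard Gaussian filter; dx and cutoff in mm."""
    half = int(np.ceil(cutoff / dx))           # truncate the kernel at +/- one cut-off
    x = np.arange(-half, half + 1) * dx
    weights = np.exp(-np.pi * (x / (ALPHA * cutoff)) ** 2)
    weights /= weights.sum()                   # normalise the discrete kernel
    return np.convolve(profile, weights, mode="same")

def roughness_profile(profile, dx, cutoff):
    """Roughness = primary profile minus the Gaussian mean line."""
    return profile - gaussian_waviness(profile, dx, cutoff)

# Test profile: unit-amplitude sinusoid whose wavelength equals the 0.8 mm cut-off.
dx = 0.001                                     # 1 um sampling interval
x = np.arange(8000) * dx                       # 8 mm long profile
profile = np.sin(2.0 * np.pi * x / 0.8)

waviness = gaussian_waviness(profile, dx, cutoff=0.8)
roughness = roughness_profile(profile, dx, cutoff=0.8)
centre = slice(2000, 6000)                     # stay away from the end regions
```

Away from the profile ends, half of the amplitude is assigned to waviness and half to roughness, so features near the cut-off wavelength are split between the two; this is one reason the choice between 0.8 mm and 2.5 mm matters for AM surfaces.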
Analysis of lay
To confirm whether there was a lay present on the surfaces, both the line profiles and areal scans were analysed. The areal data were analysed in the MountainsMap software using the ‘Texture Direction’ operator, which provides graphs of the relative strengths of texture directions. Any large peaks on the plot correspond to a predominant direction of the surface lay (an example is shown in Figure 8 and Table 2). Of greater interest is the “isotropy” value generated. In this context, isotropy is described as “the higher the percentage value the more the surface resembles itself in every direction”. 40

An example result obtained from the ‘texture direction’ operator.
The three directions presented correspond to the three longest peaks on the radial graph in Figure 8.
The line scans were separated based on orientation across the surface (grouped as parallel lines, shown in Figure 4) and the mean width of profile elements (PSm) was calculated from the primary profile of each scan. The average value in each direction was calculated, and the range between the maximum and minimum values was compared for each surface. This was used to calculate an “isotropy” value (γ), where a high isotropy means the surface resembles itself in every direction (i.e., no directionality), while a lower value implies the surface only resembles itself in a few directions, which indicates that the surface has strong directionality or lay.
Selected resolution/magnification
A key consideration in industry is the time required to complete tasks that could be considered as manufacturing, quality control/assurance, or verification operations. Therefore, it was decided to evaluate whether it is possible to reduce the time required for measuring the surface topography of AM samples. With both tactile and optical profilometers, the use of higher resolutions during measurement allows a greater level of detail of the features to be captured, however this also requires more time to collect and process the data. Thus, it is important to determine a suitable trade-off between resolution and measurement time, whilst still retaining the accuracy of the measurements taken. To emulate the influence of reducing the resolution of area scans an operator within MountainsMap was used to down-sample the surface files. For the T3D measurement only the number of lines along Y was reduced, as the resolution in X does not significantly change the measurement time. Similarly, the OMS results were resampled in both X and Y to emulate the reduced spatial resolution of lower magnification objectives. In both cases, the resampling was performed before filtering of the surfaces. The parameters were then calculated and compared with those determined from the full resolution images. The relative difference in surface parameters obtained was then used to evaluate the influence of resolution on the results.
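The resampling comparison can be sketched as below, assuming the down-sampling simply keeps every n-th line (Y only, as for T3D) or every n-th point in both directions (as for OMS) before any filtering. The synthetic surface is random data used only to exercise the code, and the function names are hypothetical.

```python
import numpy as np

def sa(z):
    """Arithmetic mean height of a height map."""
    return float(np.mean(np.abs(z - z.mean())))

def downsample(z, step_y=1, step_x=1):
    """Keep every step_y-th line along Y and every step_x-th point along X."""
    return z[::step_y, ::step_x]

def relative_difference_percent(reference, value):
    """Relative deviation of a parameter from its full-resolution value, in percent."""
    return 100.0 * abs(value - reference) / abs(reference)

def points_retained(step_y=1, step_x=1):
    """Fraction of the original data points that remain after down-sampling."""
    return 1.0 / (step_y * step_x)

rng = np.random.default_rng(1)
surface = rng.standard_normal((400, 400))      # stand-in for a measured height map

sa_full = sa(surface)
sa_lines = sa(downsample(surface, step_y=2))             # fewer scan lines only
sa_both = sa(downsample(surface, step_y=2, step_x=2))    # lower resolution in X and Y

dev_lines = relative_difference_percent(sa_full, sa_lines)
dev_both = relative_difference_percent(sa_full, sa_both)
kept_both = points_retained(step_y=2, step_x=2)          # 0.25 of the points remain
```

Halving the resolution in both directions leaves a quarter of the points; if acquisition time scaled with the number of points (an assumption made here, not a measured relationship), that would correspond to a reduction of roughly 75%.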
A comparative analysis of line and areal measurements
A comparative study of the line and areal surface measurements was undertaken. The objective was to gain insight into whether representative metrics from line measurements accurately approximate the true surface roughness of the surface under consideration. The benchmark roughness value chosen for this study is the OMS areal measurement results taken on the top faces of all 27 AM cubes, which are taken to represent the true roughness of the surface. This is justified since the areal measurements provide a high-fidelity estimate of the roughness. The areal measurements are data-rich (due to the chosen fine scan resolution), time consuming and resource intensive compared to obtaining a selection of line measurements on the same surface. A study of the accuracy of roughness metrics predicted from line measurements is required, in order to establish the applicable range of roughness metrics over which this rapid and cost-effective method can be employed to obtain a robust estimate of surface roughness.
This comparative analysis also focuses on uncertainty quantification for the prediction of roughness metrics from the measurement dataset. This is done to ensure that predictions of line roughness metrics reflect the epistemic uncertainty due to limited data and measurement noise, are sufficiently robust, and avoid overfitting.
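One simple way to express the uncertainty arising from a small number of line measurements is a bootstrap interval on their mean. The sketch below is offered only as an illustration of the idea: the probabilistic framework used in this study is not detailed at this point, the percentile bootstrap is a substituted technique, and the ten Ra values are hypothetical numbers rather than measured data.

```python
import numpy as np

def bootstrap_mean_interval(values, n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of a small sample."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample the measurements with replacement and record each resample mean.
    idx = rng.integers(0, values.size, size=(n_resamples, values.size))
    means = values[idx].mean(axis=1)
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(means, [tail, 1.0 - tail])
    return float(values.mean()), float(lower), float(upper)

# Hypothetical Ra values (um) from ten line profiles on one surface.
ra_lines = [8.1, 9.4, 7.6, 10.2, 8.8, 9.1, 7.9, 11.0, 8.5, 9.7]
ra_mean, ra_low, ra_high = bootstrap_mean_interval(ra_lines)

# A benchmark areal value can then be compared against the interval.
sa_benchmark = 9.3                               # hypothetical areal Sa (um)
benchmark_inside = ra_low <= sa_benchmark <= ra_high
```

The width of the interval reflects how much the mean could move with a different set of ten lines, which is the limited-data effect referred to above.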
Measurement results and discussion
Visual inspection of AM surfaces
The selected surfaces of the as-built AM parts (S1, S2, and S3) were inspected visually, to judge the roughness and identify larger features present. It was noted that the top faces had a greater lustre than side faces and appeared to have longer wavelength undulations. Additionally, the top face of S1 had several pronounced peaks, with a height and diameter of around 0.5 mm. In contrast, the side faces seemed to have shorter wavelength components, together with a duller appearance. The side faces appeared reasonably homogeneous, with evenly distributed adhered particles and consistent feature heights.
It was realised that the top and side surfaces showed considerable macroscopic differences, and thus required different techniques to measure and process the surface data. Due to the higher reflectivity of the top surfaces greater care was necessary when setting the illumination level (brightness) of the optical systems, while the larger individual features could cause the movement of the tactile measuring probe to go out of range.
Qualitative comparison of scanned surfaces
Once the scanned data were collected, levelled, and had NM points filled, the height maps were inspected visually. For each resulting image, strong asperities were identified. An example of this is shown in Figure 9 for the S1 top surface. Features were identifiable across all measuring devices; however, these were less apparent on the side surfaces due to the more uniform distribution of asperity heights.

Top surface scans of sample S1 taken by the three measurement techniques: (a) tactile (T3D), (b) optical with automated stitching (OAS) and (c) optical with manual stitching (OMS), together with the strong asperity features (balling) identified.
Likewise, low points (valleys and pits) were identified, as shown in Figure 10. It can be seen that the appearance of the valleys is more prominent in the image produced by T3D, whereas the optical measurement techniques presented equivalent features with reduced depth and less clearly defined boundaries.

Top surface scans of sample S3 taken by the three measurement techniques: (a) tactile (T3D), (b) optical with automated stitching (OAS) and (c) optical with manual stitching (OMS), together with the prominent valleys identified.
The limitations of tactile profilers when measuring deep valleys with steep local slopes, which can result in shallower measured depths than the actual values (as reported in literature17,26), were not apparent in the present study. The presented results imply that the features present on these samples are of suitably low aspect ratio to avoid inducing such errors when using the tactile system. Conversely, the working principle of FV systems involves capturing images of the surface using a very short focal length (in the µm range) and comparing the contrast of pixels at each level in the Z-stack. 16 The reduced depths and diffuse boundaries of the deep valleys, as obtained by the optical systems (Figures 10(b) and 10(c)), were possibly due to insufficient illumination, causing the focus detection algorithm to be less effective in these regions. This effect was subsequently mitigated through careful set-up of the illumination used during measurements, although it could not be entirely eliminated due to a necessary trade-off with over-exposing asperities.
Features were also identified which may cause erroneous results, depending on the measurement technology. The most obvious examples were observed when using OMS, where artefacts of the stitching process are present, often with clearly defined boundaries visible between individual images. A representative image is provided in Figure 11, where a grid of height discontinuities can be seen, aligning with the image boundaries. To minimise this effect, the offsets used were kept consistent, and a relatively large overlap of adjacent images (0.25 mm/0.15 mm along X/Y respectively) was used, although the effect still remained. From analysis of the computed parameters, it was noted that the values were comparable to those calculated from T3D and OAS (discussed further in section 3.4). It is therefore inferred that the stitching artefacts did not significantly affect the accuracy of results.

A representative image of scanned dataset from optical profilometer with manual stitching. Stitching artefacts indicated by dashed lines.
It is well documented that optical profilometers are sensitive to the optical properties of a surface, for example the reflectivity. 17 With AM surfaces, there are often regions of high reflectivity dispersed across a less reflective background. These optical characteristics can cause errors in the height calculations, or failure to measure the height of a point. One instance of this is seen in the top surface of sample S1, where the OAS device presents low points in the middle of asperities, such as the black region in the enlarged area of Figure 12. These incorrect points are likely due to the surface scattering the incident light, or to the reflected light over-exposing the sensor; both result in the system being unable to calculate the point of maximum focus accurately. It is possible to remove these outliers during data processing using software, once identified, but it is necessary to be cautious, ensuring that features such as adhered particles and asperities of high aspect ratio are not unintentionally removed in the process. Due to these limitations, and as outliers were not present on all measurements, this operator was not used in this work.

(a) Erroneous points present in the scanned image taken by optical profiler with automated stitching (OAS), and (b) the equivalent surface scanned by tactile device (T3D).
The visual inspection of these surfaces demonstrated the necessity to take a high level of care when measuring AM surfaces, in order that reliable results are generated. Sufficient time must be allotted when setting up, to assess the surface for features that could interfere with the measurement (e.g., high asperities, different optical properties) and to reduce the need to modify the set up after unacceptable measurements. The heights of asperities are especially important with tactile measurements where the probe has a limited Z-range, and going beyond this would cause a measurement to fail, and risk damaging the apparatus.
Filtering of the scanned datasets
As discussed in Section 2.5, the surface data were levelled using the least squares method and matching areas were extracted before filtering. Due to the localisation process reducing the available FoV, it was necessary to manage end effects when applying the 2.5 mm nesting index. MountainsMap employs a proprietary technique to achieve this. Additionally, the surfaces were also filtered with a 0.8 mm nesting index, with end effect management. The relative difference (ε) between nesting indices for each parameter on each surface was calculated for each measurement technique using equation 1 (presented using Sa as an example, with the subscripts identifying the nesting index used).
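The comparison between nesting indices can be sketched as follows. This is only an illustration: the normalisation is an assumption (here the difference is taken relative to the 0.8 mm value), the function name is hypothetical, and the Sa values are made-up numbers rather than measured data.

```python
def relative_difference(value_a: float, value_ref: float) -> float:
    """Percentage difference of value_a relative to value_ref.

    Assumed form: |a - ref| / |ref| * 100. The choice of the reference
    (denominator) is an assumption, not taken from the paper.
    """
    if value_ref == 0:
        raise ValueError("reference value must be non-zero")
    return abs(value_a - value_ref) / abs(value_ref) * 100.0


# Hypothetical Sa values (µm) for one surface, one per measurement technique
sa_at_2p5_mm = {"T3D": 9.0, "OAS": 9.4, "OMS": 9.2}
sa_at_0p8_mm = {"T3D": 8.0, "OAS": 8.2, "OMS": 8.1}

# Relative difference per technique, then the average across techniques
epsilon = {
    name: relative_difference(sa_at_2p5_mm[name], sa_at_0p8_mm[name])
    for name in sa_at_2p5_mm
}
mean_epsilon = sum(epsilon.values()) / len(epsilon)
print(epsilon, mean_epsilon)
```

The same function applies unchanged to the other parameters (Sq, Ssk, Sku, Sp, Sv, Sz), provided the same reference nesting index is used throughout.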
The average values were then found for each surface parameter (see Figure 13(a)). It is apparent that with the change in the nesting index, Ssk, Sku and Sa are affected in greater proportions (with average differences of 16.3%, 15.1%, and 12.5%) than the parameters that typically have higher magnitudes from the mean-line (average differences of Sp: 4.4%, Sv: 6.1%, and Sz: 3.3%). Figure 13(b) shows the influence of nesting indices based on measurement system and surface. It reveals that the top face surface parameters are more strongly influenced (9.5% to 19.8%) with the change in the nesting index than those of the side faces (2.4% to 6.4%). This implies that different nesting indices may need to be chosen for surfaces oriented differently. As can be seen from Figures 13(a) and 13(b), the three measurement systems used have differing sensitivity to variations in the nesting index, however the same overall trends can be observed (e.g., Ssk and top surfaces are most strongly influenced). The differences are thought to be due to the small variations in the calculated height maps generated by each system. The trend of the S2_top sample data was similar to that of the S1_top and S3_top samples. As S1 and S3 were the ‘best’ and ‘worst’ surfaces visually, the data from these two samples are presented fully, but the S2_top data are omitted.

(a) Average difference for each surface parameter, (b) average difference on each specimen's surface, when using different nesting indices (0.8 mm and 2.5 mm).
The scarcity of guidance in the standards for appropriate nesting indices, along with a lack of consensus within literature 28 is hampering surface texture assessment within AM. While there are comprehensive studies focussing on improving the AM surface finish, and correlating roughness to various mechanical properties, it is imperative that a consistent surface measurement methodology is adopted, so that results can be compared directly. From the results of this study, it is apparent that the prescribed filter parameters are not appropriate, due to the highly variable nature of AM surfaces. From these results it is implausible to prescribe a nesting index that will be suitable for all L-PBF surfaces (let alone AM in general). This suggests that, when investigating the general roughness (medium to large scale wavelengths) of a surface or a part, a nesting index can be chosen based on the maximum dimensions of features on a surface. In the present case, this implies that a nesting index of 0.8 mm would be suitable. However, if the micro-scale roughness is of primary concern (such as when optimising the weld tracks or melt pool), a smaller nesting index would be required, to focus on features of an appropriate scale. As it is likely that different studies will employ different nesting index values, it is imperative that the entire measurement post-processing stream (e.g., levelling operator, S-filter and L-filter values) is reported alongside the results, to enable meaningful comparisons to be made. 28
Computed surface parameters
The arithmetic mean height parameters (Sa, Ra) are an average measure over a surface or profile, and therefore are not strongly influenced by single features that are much higher or lower than the general profile. It is therefore expected that the Sa value be robust to changes between measurement devices. Figure 14 shows the Sa values calculated from the three different measurement systems (T3D, OAS, OMS), along with the maximum Ra value from the same surfaces measured using T2D. It can be seen from Figure 14 that the Sa values obtained for a single sample vary by 2.1% to 8.2% across the different measurement systems. By comparison, the variance between the Ra values and related Sa values remains small at 0.8% to 3.0%, with the exception of sample S1, where the difference is much higher (8.5% for side face and 37.0% for top face). This is due to distinct large surface features (see Figure 9) that were not captured by the line scans.

Areal arithmetic mean height (Sa) values calculated from surface data captured with different areal measurement techniques and maximum profile arithmetic mean height (Ra) values calculated from tactile profile measurements (0.8 mm nesting index).
Similar trends were observed for Sq values, with the results for the different measurement systems showing an average discrepancy of between 2.0% and 4.3%. However, the Rq values were consistently lower than the average Sq values, with much larger differences (5.8% to 47.9% on S1 side and top faces respectively). This is likely due to the line measurements missing extreme features, which are weighted higher by the calculation of root mean squared parameters.
For both the arithmetic mean height (Sa, Ra) and the root mean square height (Sq, Rq), the values are greater when filtered using a 2.5 mm cut-off, with the areal results following a similar pattern (highest on the top face of sample S1, and lowest on sample S3). However, the T2D profile results also show greater variation than at the 0.8 mm cut-off (between 33% and 50% on samples S1 and S2). The T2D profile results remain comparatively close for sample S3 (under 20% difference with respect to the average areal values). This re-affirms that the choice of nesting index is of great importance when analysing surface roughness, and that results can only be compared when the same nesting index is used. Furthermore, from these results it is inferred that 2D profile measurements are only comparable when using a 0.8 mm cut-off; however, a more extensive evaluation would be necessary to make a more generalised assessment.
The skewness (Ssk) values showed variation between 3.5% (S3 side face) to 17.3% (S2 side face) when comparing across the measurement systems, whereas the kurtosis values (Sku) had lower discrepancies of 2.5% to 6.7% between the measurement systems (graph not presented) and were comparable to the Sa and Sq results. The similar Sku values obtained across all systems imply that the different measurement systems describe the shapes of asperities similarly.
In summary, the most commonly applied measurement parameter, Sa or Ra, shows variations of less than 8.2% between measurement techniques. However, to fully characterise a surface, other parameters should also be considered. 35 The choice of parameters should be based on the desired functionality of the surface (e.g., Ssk is important for bearing applications).
Analysis of lay and isotropy
Line profile results (T2D)
The average PSm values recorded in each of the four T2D trace orientations (from Figure 4) were calculated and compared. PSm was chosen as it describes the average distance between profile elements, and, as can be seen from Figure 15(a), PSm will change if the trace is taken along, across, or at an angle to the lay. The isotropy (γ) was calculated using equation 2, where a low Isotropy value indicates a surface with high directionality (a predominant lay). Figure 15(b) shows the calculated isotropy values for each surface, and it can be seen that the surfaces did not show strong directionality, with a minimum isotropy of 68.5% (and therefore the greatest directionality) on the top face of sample S1 when considering the data obtained from T2D measurements.

(a) Effect of measuring at different angles with respect to a lay (predominant lay is ‘across page’ as displayed) from ISO 1134, 33 (b) isotropy (γ) values calculated by MountainsMap (areal measurements) and using mean width of profile elements (PSm) on the profile measurements.
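As an illustration of how a directionality measure can be derived from PSm values, the sketch below takes the ratio of the smallest to the largest mean PSm across the trace orientations. This ratio is an assumed form chosen by analogy with texture aspect ratio measures and is not necessarily the expression used in equation 2. The orientation angles and PSm values are hypothetical.

```python
def isotropy_from_psm(psm_by_direction: dict) -> float:
    """Isotropy (%) as min(PSm) / max(PSm) over the trace orientations.

    Assumed definition: 100% means PSm is identical in every direction
    (no lay); low values mean PSm depends strongly on direction.
    """
    values = list(psm_by_direction.values())
    if not values or min(values) <= 0:
        raise ValueError("PSm values must be positive and non-empty")
    return min(values) / max(values) * 100.0


# Hypothetical mean PSm (mm) for four trace orientations (degrees)
mean_psm = {0: 0.40, 45: 0.36, 90: 0.30, 135: 0.38}
gamma = isotropy_from_psm(mean_psm)
print(f"isotropy = {gamma:.1f}%")
```

With this definition, a surface with a strong lay gives a small PSm across the lay and a large PSm along it, so the ratio falls well below 100%.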
Areal results
The area scans were analysed in MountainsMap, as described in Section 2.6, and the calculated isotropy values were compared. The operator was applied to the S-F surface (surface with micro-roughness and form removed) to eliminate any influence of the L-operator (roughness filter). From Figure 15(b) it can be seen that the OMS results for samples S2 and S3 (49.5, 49.7, and 45.3%) are between 20 and 40 percentage points lower than the values calculated from the other two measurement systems, likely due to the stitching artefacts discussed in Section 3.2. It can also be seen that the range of results (omitting OMS for S2 and S3) was between 79.1% and 81.2% (2 percentage points) on S1's top face, and 64.8% and 78.7% (14 percentage points) on S3's top face. Furthermore, when comparing these values to the line results (discussed in section 3.5.1) the two methods provide similar results with the greatest difference being on the top face of sample S1 (66% and 79 to 81% for profile and area scan respectively). While not directly comparable due to differences in how they are computed, it indicates the methods discussed are a reasonable approach to identifying and quantifying the directionality of a surface.
The isotropy results obtained from the line and areal scans indicate that the surfaces did not have a prominent lay, although they were not fully regular in appearance. Some authors suggest that layered manufacturing could influence the produced parts’ surface textures; 41 however, the 25 µm AM layer height used in the current study was significantly smaller than the spatial wavelengths of the measured roughness, which were in the order of hundreds of microns. Therefore, at this scale/resolution, when using tactile systems to measure AM surfaces, the orientation of the samples with respect to stylus motion is not of paramount importance. The effect of lay on the characteristics of the surface may become more significant when investigating smaller features (such as weld ripples).
Choice of resolution
Tactile Y-spacing
For the T3D measurements, Y-spacings were varied as 10 µm, 20 µm, 25 µm, 40 µm, 50 µm, and 100 µm. The average differences in the calculated parameters were consistent across these Y-spacing values for most parameters, however the range and maximum deviations were greater when using 50 µm and 100 µm spacings. This deviation highlights a problem with reducing the resolution of a measurement, especially with tactile measurements, where some very tall or deep features may be missed, and therefore affect the calculation of ‘higher magnitude’ parameters (e.g., Sz, S10z, Sp, Sv etc.).
Optical magnification
The optical measurement resolutions are determined by the magnification of the objective used, with magnifications as low as 2.5× available for some systems. 31 Table 3 shows the FoV and pixel size for some objectives of interest for this application. To replicate lower resolutions, the number of pixels in the surface file captured with a 10× objective was reduced to 50% and 25% of the original, to approximate 5× and 2.5× objectives respectively. For a true representation, the Z (height) resolution should also have been reduced; however, the software is only capable of reducing this to pre-defined levels, none of which are representative of the quoted resolutions of the different objectives. The Z resolution was, therefore, not changed. Furthermore, the Sa values for the surfaces are over an order of magnitude greater than the Z resolution, and therefore the Z resolution is considered less significant when evaluating AM surfaces.
Specification of different magnification objectives and resultant measurement size for optical system with manual stitching.
FoV and Pixel size from manufacturer specifications. 31
It was observed that reducing the magnification did not strongly influence the ‘averaged’ parameters (Sa, Sq, Ssk, Sku) at either of the investigated magnifications. The parameters that quantify maximum (or minimum) values (Sz, Sp, Sv, S10z) were also not greatly affected at 5× magnification; however, the difference was considerable when using 2.5× magnification, with maximum deviations from the baseline measurements of up to 7.7%. The reason for this increased range and average deviation could be that the lower resolution missed some of the localised features used in these calculations. As this deviation is confined to the lowest magnification, it is recommended to use magnifications of 5× or greater when obtaining these measurements for AM surfaces. Furthermore, the time reduction is even more significant than in the tactile measurements, as the time taken is a function of the number of images required, which is in turn a squared function of the FoV, and also influences the computation time required when processing the data.
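The effect of a reduced lateral resolution on averaged and extreme-value parameters can be explored with a short sketch. Everything here is an assumption made for illustration: the height map is synthetic, the resolution is reduced by simple decimation (keeping every n-th point), which may differ from the resampling performed in the analysis software, and a single narrow asperity is placed deliberately off the decimation grid.

```python
import math
import random


def sa(height_map):
    """Arithmetic mean height: mean absolute deviation from the mean plane."""
    values = [z for row in height_map for z in row]
    mean_z = sum(values) / len(values)
    return sum(abs(z - mean_z) for z in values) / len(values)


def sz(height_map):
    """Maximum height: highest point minus lowest point."""
    values = [z for row in height_map for z in row]
    return max(values) - min(values)


def decimate(height_map, step):
    """Keep every `step`-th point along both axes (assumed resampling scheme)."""
    return [row[::step] for row in height_map[::step]]


random.seed(0)
n = 200
# Synthetic surface (µm): long-wavelength undulation plus random roughness
surface = [
    [5.0 * math.sin(x / 15.0) + 3.0 * math.cos(y / 11.0) + random.gauss(0.0, 1.0)
     for x in range(n)]
    for y in range(n)
]
surface[101][57] += 60.0  # one narrow asperity on an odd row, lost when decimated

full_points = n * n
for step in (1, 2, 4):
    reduced = decimate(surface, step)
    fraction = len(reduced) * len(reduced[0]) / full_points
    print(f"step {step}: {fraction:.2%} of points, "
          f"Sa = {sa(reduced):.2f} µm, Sz = {sz(reduced):.2f} µm")
```

Halving the linear resolution keeps one quarter of the points, which is consistent with a time saving of roughly 75%. In this synthetic case the averaged parameter barely moves, while the maximum height depends on whether an isolated feature happens to fall on a retained sample.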
A comparative analysis of line and areal measurements
Figure 16 shows the line (T2D) and areal (OMS) measurement results. Notably, the X-axis gives the benchmark OMS surface roughness measurements against which the areal surface scan (Sa) measures are compared. Hence the ‘Sa reference’ (red line) in Figure 16(a) is a 45° line, which serves to highlight the deviation of the T2D line measurements from the areal surface roughness measurements. The line roughness measurements are obtained on a particular surface using a set of 10 line measurements as shown in Figure 4. Figure 16 contains Ra and Sa which can be written mathematically as

(a) The measured surface roughness data and comparison between line (Ra) and areal (Sa) measurements, (b) the lognormal distribution of the surface roughness using the line and areal measurements.
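For reference, the two quantities compared in Figure 16 can be sketched in discrete form, assuming the standard definitions: Ra is the mean absolute deviation of a profile from its mean line, and Sa is the mean absolute deviation of a height map from its mean plane. The small height map below is hypothetical and is assumed to have already been levelled and filtered.

```python
def ra(profile):
    """Discrete Ra: mean absolute deviation of a profile from its mean line."""
    mean_z = sum(profile) / len(profile)
    return sum(abs(z - mean_z) for z in profile) / len(profile)


def sa(height_map):
    """Discrete Sa: mean absolute deviation of a surface from its mean plane."""
    values = [z for row in height_map for z in row]
    mean_z = sum(values) / len(values)
    return sum(abs(z - mean_z) for z in values) / len(values)


# Hypothetical levelled height map (µm); each row is treated as one line trace
height_map = [
    [0.0, 2.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 4.0, 0.0, 4.0],
]

line_ra = [ra(row) for row in height_map]
print("Ra per line:", line_ra)
print("mean Ra:", sum(line_ra) / len(line_ra))
print("max Ra:", max(line_ra))
print("Sa:", sa(height_map))
```

Each line is referenced to its own mean line, whereas Sa is referenced to a single mean plane, so a line that misses the roughest region of the surface reports a lower value than the areal parameter.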
Hence the variance of the absolute deviations of surface roughness can be written, following from equations 3 and 4, as
The parameters
This lognormal distribution has been used to calculate the 95% confidence bounds shown in Figure 16(b) with both the line and surface measurements. It can be seen that the line measurements underestimate the surface roughness compared to the high-fidelity areal roughness measurements. The variability of the line roughness measurements is obtained by fitting a lognormal distribution on each of the 10 measured lines over a surface and creating an ensemble of samples taken from each of the 10 distributions. Mathematically,
The mean (
The distribution of the line measurements for each of the 27 measured surfaces is shown in Figure 17. For each of the measured surfaces, the mean and the 95% prediction intervals around the mean value are given in the legends. Given that the line measurements are all positive, a kernel distribution with a normal kernel and positive support is fitted to the 10 line-measured Ra values. It can be seen that the line scans provide a mean surface roughness value (blue line) which in most cases underestimates the areal roughness measurement values (i.e., Ra < Sa). Hence it is necessary to investigate an appropriate statistical measure of the line scans which provides a better approximation of the areal roughness values (taken as the high-fidelity benchmark values).
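A lognormal description of a set of line measurements, of the kind used for the 95% bounds, can be sketched as follows. This is a simplified illustration: the parameters are estimated from the logarithms of the data (one common fitting approach, assumed here), and the ten Ra values are hypothetical.

```python
import math
import statistics


def lognormal_fit(samples):
    """Estimate lognormal parameters (mu, sigma) from positive samples."""
    if any(s <= 0 for s in samples):
        raise ValueError("lognormal fitting requires positive samples")
    logs = [math.log(s) for s in samples]
    return statistics.fmean(logs), statistics.stdev(logs)


def lognormal_interval(mu, sigma, level=0.95):
    """Central interval of the fitted lognormal distribution."""
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.exp(mu - z * sigma), math.exp(mu + z * sigma)


# Hypothetical Ra values (µm) from 10 line traces on one surface
ra_values = [8.1, 7.6, 9.3, 8.8, 7.9, 10.2, 8.4, 9.0, 7.4, 8.6]

mu, sigma = lognormal_fit(ra_values)
lower, upper = lognormal_interval(mu, sigma)
median = math.exp(mu)
print(f"median = {median:.2f} µm, 95% interval = [{lower:.2f}, {upper:.2f}] µm")
```

Working in log space guarantees that the interval stays positive, which is the same motivation as the positive support chosen for the kernel fit.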

Distribution of surface roughness with line measurements on the set of 27 specimens with varying roughness.
In order to investigate this, the overall trend of deviation of the line measurements, Ra (along with 95% bounds), from its Sa counterparts is presented in Figure 18. The figure presents three scenarios in which the 10-line measurement data on each of the 27 surfaces are used to find a best-fit estimate:
Figure 18(a): the best-fit linear regression model with the full set of measured line data points (i.e., 27 × 10).
Figure 18(b): the best-fit linear regression model with the mean of the measured line data points on each of the 27 surfaces.
Figure 18(c): the best-fit linear regression model with the maximum of the measured line data points on each of the 27 surfaces.
The following observations can be made from these fits:
the line scans generally underestimate the surface roughness, except in low surface roughness regimes.
the maximum value of the line measurements on each surface has more uncertainty (and hence a higher risk of outliers), and therefore often overestimates the roughness values.
the mean value of the line scans provides a more robust estimate of roughness, and discrepancies of line-measured roughness values should be evaluated based on this metric.

Regression fit for different sets and summary statistic of linear measurements on the set of 27 specimens with varying roughness.
The measured discrepancies between line and areal roughness measurements, as shown in Figure 18, however, do not account for the uncertainty in the data. The following discussion highlights why it is important to take this into consideration for this analysis.
There are various sources of uncertainty in the dataset used to compare the line and areal measurements in the above discussion. These include, amongst others, measurement noise, calibration error and insufficient data. To account for these in the modelling, a Gaussian process (GP) stochastic surrogate model is introduced to fit the roughness data. GP gives a non-parametric way of fitting a model to data, and is conditioned using a training dataset
GP training is undertaken within a Bayesian framework from which the posterior distributions are inferred on the outputs. The full mathematical details of the GP training are beyond the scope of this work, but readers are referred to the following literature.42,43 The trained GP, conditional on the training dataset
The results, with the GP trained on the line measured roughness dataset, are presented in Figure 19. The training data for the GP (blue dots) consist of the mean of the 10-line measurements on each of the 27 surfaces, as already presented in Figures 17 and 18. The figure also gives the 80% and 95% credible intervals (CI) for the predicted roughness values. The uncertainty in the predicted roughness stems from the various sources of uncertainty, including measurement noise, data variance and uncertainty due to lack of data in certain regions. It is important to note that the variance shown in Figure 19 is the uncertainty around the mean prediction and not the variance around the measured surface roughness,

Stochastic Gaussian process model for surface roughness measurements with quantified uncertainty, accounting for measurement noise and epistemic uncertainty due to lack of data.
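The way a GP widens its credible interval where data are sparse can be illustrated with a minimal sketch. This is not the model used in the study: the kernel (squared exponential), the fixed hyperparameters, and the training data are all assumptions, and the hyperparameters are not inferred in a Bayesian framework as described in the text.

```python
import math


def rbf(a, b, length, amplitude):
    """Squared-exponential covariance between two scalar inputs."""
    return amplitude ** 2 * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_predict(x_train, y_train, x_new, length=8.0, amplitude=10.0, noise=0.5):
    """Posterior mean and standard deviation at x_new (hyperparameters assumed)."""
    y_mean = sum(y_train) / len(y_train)
    centred = [y - y_mean for y in y_train]
    k = [[rbf(a, b, length, amplitude) + (noise ** 2 if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    k_star = [rbf(x_new, a, length, amplitude) for a in x_train]
    mean = y_mean + sum(ks * al for ks, al in zip(k_star, solve(k, centred)))
    reduction = sum(ks * v for ks, v in zip(k_star, solve(k, k_star)))
    variance = rbf(x_new, x_new, length, amplitude) - reduction
    return mean, math.sqrt(max(variance, 0.0))


# Hypothetical training data: areal benchmark (µm) against mean line Ra (µm),
# with no observations between 25 and 35 µm
x_train = [5, 7, 9, 11, 13, 15, 18, 21, 24, 36, 40]
y_train = [5.1, 6.8, 8.7, 10.2, 11.6, 12.9, 14.8, 16.5, 18.0, 23.5, 25.0]

mean_in, sd_in = gp_predict(x_train, y_train, 12.0)
mean_gap, sd_gap = gp_predict(x_train, y_train, 30.0)
print(f"at 12 µm: {mean_in:.2f} ± {1.96 * sd_in:.2f} (95% CI)")
print(f"at 30 µm: {mean_gap:.2f} ± {1.96 * sd_gap:.2f} (95% CI)")
```

The interval at 30 µm is wider than at 12 µm because no training points lie nearby, which mirrors the behaviour described for regions with a lack of data.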
Figure 19 also contains the data of the areal roughness measurements (obtained with T3D, OAS and OMS) on which another GP was fitted. The 95% CI around these areal measurements (grey box) reflects the uncertainty due to lack of data in certain regions (especially in the 25–35 µm range). The comparison between the line measurements and the areal measurements shows that there are three distinct regions where the different statistical measures of the line roughness measurements can represent the ‘true’ areal surface roughness values:
up to 10 µm: the true mean surface roughness is captured within the 80% CI of the mean line measured values.
between 10–15 µm: the mean surface roughness is captured within the 90–97.5% confidence band of the mean line measured values.
beyond 15 µm: the mean surface roughness is underestimated by the line measurements and the measured quantiles of the line values cannot be reliably used to represent the actual surface roughness.
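The three regions above lend themselves to a simple lookup, sketched below. The function name and return strings are hypothetical, and two assumptions are made: the thresholds are applied to the mean line-measured roughness, and the boundary values of 10 µm and 15 µm are treated as belonging to the lower region.

```python
def line_measurement_regime(mean_ra_um: float) -> str:
    """Return the applicability regime for a mean line-measured roughness (µm)."""
    if mean_ra_um < 0:
        raise ValueError("roughness must be non-negative")
    if mean_ra_um <= 10.0:
        return "areal mean captured within the 80% CI of the line mean"
    if mean_ra_um <= 15.0:
        return "areal mean captured within the 90-97.5% band of the line mean"
    return "line measurements underestimate; areal measurement required"


for value in (6.0, 12.5, 22.0):
    print(f"{value} µm: {line_measurement_regime(value)}")
```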
It is important to understand the distribution of the measured profile distances from the mean line

Comparison of surface roughness distribution for line and areal measurements using Gaussian process fitting of the measured data.
Thus, it is important to establish an upper bound beyond which the line measurements of surface roughness cannot be used as representative values of true surface roughness. The distribution of the roughness, as shown in Figure 20, along with the probabilistic measure of the uncertainty around the mean surface roughness Ra determined using a set of line measurements as discussed in Figure 19, highlight that 15 µm is approximately the upper limit for Ra-assisted predictions of surface roughness.
Proposed methodology
Based on the findings of this study, it is observed that there is a good correlation between the measured Sa and the average Ra value up to a certain limit, beyond which the values diverge. In particular, the uncertainty analysis of the data suggests that, for parameters such as the arithmetic mean height or RMS height, measurements obtained using 2D techniques can be suitable, with an upper limit of ∼15 µm for Ra-assisted prediction of roughness. 2D line scans also allow considerably faster data collection. For determining other measurement parameters, the differences between 2D and 3D techniques are much larger, and therefore 3D measurements are necessary.
The data acquisition should follow the general process of visually inspecting surfaces for regions that are likely to be difficult to measure, then measuring the surface following machine specific guidance and relevant published standards. For 3D measurements collected with tactile instruments, this study has identified that AM surfaces do not present a significant lay, and therefore orientation of profiles is not significant. Similarly, for 2D systems, ≥ 10 individual measurements should be taken at various orientations across the surface, and the maximum calculated value should be considered to be the value of that parameter (ISO 1134 33 ), in order to achieve comparable results to 3D measurement systems (within the upper limit obtained from this study).
Furthermore, the post-processing of surface data, including the order of post-processing operations, and filter settings used, should be reported along with calculated parameters, to enable meaningful comparisons to be made between datasets. This is vital for effective quality control and quality assurance of AM parts within high-value industries.
Conclusions and future work
This study has developed a proposed methodology for measuring the surface characteristics of metal AM components. Three aluminium samples were measured using three different areal profilometers, and one 2D profiling system. The differences in the results obtained between measurement devices were quantified through a comparison of standard roughness parameters. Along with the measurement device, the nesting indices and measurement resolutions used were also varied.
From observations of the measured height maps, it was seen that all of the areal systems evaluated effectively captured the surface morphologies. Some discrepancies were, however, observed, such as erroneous points from the optical systems, due to the challenging optical properties of AM surfaces, and artefacts on the height maps resulting from the manual stitching process, corresponding to the boundaries of the individual measurements. The differences between measurement systems were quantified through comparisons of the calculated parameters. All three areal systems gave comparable values, with some surfaces being more challenging to measure than others. However, when comparing the areal measurements with the faster, line-measured roughness values, it was observed that the areal measurements, due to greater coverage, captured more features (such as peaks) and localised characteristics than the line measurements. Thus, the line measurements underestimate the surface roughness statistics, and this discrepancy is significantly higher when considering surfaces with higher roughness. The study has identified three distinct roughness ranges and the associated degrees of accuracy with which line measurements can capture the surface roughness in these ranges. The findings are important because line-based roughness estimates are faster and less expensive to obtain, and this study establishes their suitability and operational range when benchmarked with more expensive and time-consuming areal measurements.
For line profile measurements, it was found that the peak spacing parameter from the primary profile (PSm) can be used to estimate the surface isotropy. The values calculated from line profile measurements agreed with those from the areal data, and therefore this approach is justified. The lowest isotropy calculated was 64%, implying that no surfaces had a strong directionality.
From a comparison of the calculated roughness parameters, it was found that measurements on the top faces (as-built) were more sensitive to the choice of nesting index than the side faces of the components. At the two different nesting indices included in this study, calculated parameters differed by between 9.5%–19.8% and 2.4%–6.4% for the top and side faces respectively. Therefore, different strategies are likely to be required based on surface orientation during manufacture. From these results, it is suggested that an L-filter nesting index can be chosen based on the maximum dimension of features observed on the surfaces of interest. This is a valid approach for evaluating larger-scale roughness, as opposed to roughness on the same scale as the layer heights or weld tracks. Further research is required into the filtration of AM surfaces for roughness measurements, in terms of filter types and clearer guidance on nesting indices. Another key consideration is the time required to measure a surface. One way of reducing the measurement time is through a reduction in measurement resolution. This was emulated through software and the calculated roughness parameters were used to quantify the effects. It was found that parameters could be maintained within 2% of the base value with approximately 75% reduction in the time required for measurement and processing of the results.
The natural continuation of the work presented in this paper will involve further measurement and characterisation of specimens built from other materials and manufacturing techniques, such as sintering, extrusion and directed energy deposition-based processes. Furthermore, more in-depth investigations are required to evaluate different filtering options, nesting indices and filter types, at different scales of interest. These will enable improved understanding of how to correctly evaluate the surface roughness of AM parts, enabling more widespread adoption of these technologies in industry. It is also important to combine these with determinations of the uncertainties surrounding the measurement, arising from both the data acquisition and the post-processing operations applied, to fully understand the limitations of these techniques.
Footnotes
Acknowledgements
The authors wish to thank Dr Franck Lacan of Cardiff University for the manufacture of the AM samples used, along with Dr Samuel Bigot and Dr Alastair Clarke, also from Cardiff University, for the provision of measurement equipment and the analysis software used. Special thanks are also due to Prof Stefan Dimov and Dr Pavel Penchev for giving access to the Alicona FocusVariation device at the School of Engineering of the University of Birmingham. Finally, Prof Richard Leach, formerly from the University of Nottingham, provided invaluable insight and guidance on the measurement concepts used in this research.
Credit author statement
Ben Mason: Investigation; Data curation; Formal analysis; Writing – original draft
Abhishek Kundu: Supervision; Formal analysis; Writing – review & editing
Michael Ryan: Supervision; Writing – review & editing
Rossitza Setchi: Supervision; Writing – review & editing
Debajyoti Bhaduri: Conceptualization; Funding acquisition; Project administration; Supervision; Writing – review & editing
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by the EPSRC Doctoral Training Programme.
