Abstract
This paper presents an alternative approach to implementing a stereo camera configuration for SLAM. The approach suggested implements a simplified method using a single RGB-D camera sensor mounted on a maneuverable non-holonomic mobile robot, the KCLBOT, used for extracting image feature depth information while maneuvering. Using a defined quadratic equation, based on the calibration of the camera, a depth computation model is derived base on the HSV color space map. Using this methodology it is possible to build navigation environment maps and carry out autonomous mobile robot path following and obstacle avoidance. This paper presents a calculation model which enables the distance estimation using the RGB-D sensor from Microsoft .NET micro framework device. Experimental results are presented to validate the distance estimation methodology.
Introduction
One of the fundamental problems in mobile robot autonomous navigation is map decomposition and building. Thrun [1] in his 1997 paper talks about two major mapping methods for autonomous robots, which are grid-based and topological methods. Building maps as a mobile robot traverses a navigation environment, while using a partially composed map to maintain localization is better known as Simultaneous Localization and Mapping (SLAM). The most common methods of building localization maps are using occupancy grids [2], feature building [3], and raw sensor telemetry [4]. In Elfes [2] 1989 paper, an occupancy grid method is presented using multi sensors to build multidimensional random fields that maintain stochastic estimates of an occupancy state. In Leonard [3] et al.'s 1992 paper on feature building for dynamic maps for autonomous mobile robots, a critical point is made on sensor problems coming from noise and importance on the validation of sensor measurements. This paper uses a similar approach to that presented by Fang [4] et al, by using a raw sensor approach acquiring telemetry information via a monocular vision sensor. The approach presented by this paper uses a similar method, utilizing raw sensor telemetry via an inexpensive off the shelf RGB-D camera. An RGB-D camera is a projective sensor that measures the bearing of image features and returns an image with depth telemetry, normally using an HSL or HSV color map. Using the depth information from the RGB-D sensor, it is possible to build a navigation map, which in turn can be used for mapping, sensing, localizing, and environment modeling.
The focus of the map building and localization methodology is for small form factor mobile robots using the Microsoft .NET Micro Framework [5], where the map building and localization methods are programmed using the limitations of the Windows Presentation Foundation (WPF) [6]. The methodologies were tested on Netdruino Plus [7] and the GHI ChipworkX Development System [8].
The motivation for developing these methodologies stems from the platform limitations of OpenKinect [9] and Robot Operating System (ROS) [10] on .NET Micro Framework microcontrollers. An alternative to the RGB-D camera is the implementation of a stereo camera configuration [11], which utilizes two CMOS RGB cameras, to produce 3D reconstructions of the environment in its visual range. The idea behind using a single RGD-D camera is to simplify the stereo camera approach, and reduce the computation time to extract depth information from an image and use this image for SLAM.

The Netduino Plus: A Microsoft .Net Micro Framework Microcontroller
A Maneuverable Non-holonomic Mobile Robot
The KCLBOT [12] is a non-holonomic two wheeled autonomous maneuverable mobile robot. The mobile robot is built around the specifications for ‘Micromouse Robot’ and the ‘RoboCup’ competition. These specifications contribute to the mobile robot's form, factor and size. This mobile robot holds a complex electronic system to support on-line path planning, self-localization, and even SLAM, which is made possible by the on-board sensor array. The mobile robot is loaded with eight Robot-Electronics SRF05 ultrasonic rangers [13]. The drive system of the mobile robot is supported by Nubotics WC-132 WheelCommander Motion Controller [14] and two
The combination of hardware electronics and drive mechanics makes the KCLBOT, as represented in Fig. 2, an ideal platform for the self-localization methodology.
In the maneuverable classification of mobile robots [15], the vehicle is defined as being constrained to move in the vehicle's fixed heading angle. For the vehicle to change maneuver configuration, it needs to rotate about itself. As the vehicle traverses on a two-dimensional plane, both left and right wheels follow a path that moves around the instantaneous center of curvature at the same angle, which can be defined as, and thus the angular velocity of the left and right wheel rotation is deduced as follows:

The KCLBOT: A Non-holonomic Maneuverable Mobile Robot.
where L is the distance between the centers of the two rotating wheels, and the parameter iccr is the distance between the mid-point of the rotating wheels and the instantaneous center of curvature. Using the velocities (1) and (2) of the rotating left and rights wheels,
Using (3) and (4), two singularities can be identified. When
When the wheels on the mobile robot rotate, the quadrature shaft encoder returns a counter tick value; the rotation direction of the rotating wheel is given by a positive or negative value returned by the encoder. Using the numbers of tick counts returned, the distance travelled by the rotating left and right wheel can be deduced in the following way:
where Lticks and Rticks represent the number of encoder pulses counted by the left and right wheel encoders, respectively, since the last sampling and D is defined as the diameter of the wheels. With resolution of the left and right shaft encoders, Lres and Rres, respectively, it is possible to determine the distance travelled by the left and right rotating wheel, dL and dR. This calculation is shown in (5) and (6).
In the field of robotics, holonomicity [16] is demonstrated as the relationship between the controllable and total degrees of freedom of the mobile robot, as presented by the mobile robot configuration in Fig. 3. In this case, if the controllable degrees of freedom are equal to the total degrees of freedom, then the mobile robot is to be defined as holonomic. Otherwise, if the controllable degrees of freedom are less than the total degrees of freedom, it is non-holonomic. The maneuverable mobile robot has three degrees of freedom, which are its position in two axes and its orientation relative to a fixed heading angle. This individual holonomic constraint is based on the mobile robot's translation and rotation in the direction of the axis of symmetry and is represented as follows:

The KCLBOT in X, Y, and Z Planar Environment.
where Xc and Yc are Cartesian-based coordinates of the mobile robot's center of mass, and ϕ describes the heading angle of the mobile robot, which is referenced from the global x-axis. To conclude, (7) presents the pose of the mobile robot. The mobile robot has two controllable degrees of freedom, which control the rotational velocity of the left and right wheels and, adversely with changes in rotation, the heading angle of the mobile robot is affected. These constraints are as follows:
where
By using the quadrature shaft encoders that accumulate the distance travelled by the wheels, a form of position can be deduced by deriving the mobile robot's (x, y) Cartesian position and the maneuverable vehicle's orientation ϕ, with respect to time. The derivation starts by defining and considering s(t) and ϕ(t) to be function of time, which represents the velocity and orientation of the mobile robot, respectively. The velocity and orientation are derived from differentiating the position form as follows:
The change in orientation with respect to time is the angular velocity ω, which is defined in (4) and can be specified as follows:
When (12) is integrated, the mobile robot's angle orientation value ϕ(t) with respect to time is achieved. The mobile robot's initial angle of orientation ϕ(0) is written as ϕ0 and is given as follows:
The velocity of the mobile robot is equal to the average speed of the two wheels and this can be incorporated into (10) and (11), as follows:
The next step is to integrate (14) and (15) to the initial position of the mobile robot, which is as follows:
Equations (16) and (17) describe the mobile robot's position, where x(0) = x0 and y(0) = y0 are the mobile robot's initial positions. The next step is to represent (13), (16) and (17) in terms of the distances that the left and right wheels have traversed, which are defined by dR and dL. This can be achieved by substituting
Implementing (18), (19), and (20) provides a solution to the relative position of a maneuverable mobile robot. This is a possible solution to the self-localization problem but is subject to accumulative drift of the position and orientation with no method of re-alignment. The accuracy of this method is subject to the sampling rate of the data accumulation, such that if small position or orientation changes are not recorded, the position and orientation will be erroneous.
A RGB-D Image Capture using Microsoft's Kinect™ Sensor
In early November 2010, Microsoft released the Kinect™ sensor, which has the capability of RGB and RGB-D image capture. Exploiting the pre-released drivers for this device, it is possible to export and manipulate both the RGB and RGB-D images for mapping, sensing, locating and modeling for SLAM.
The Kinect™ depicted in Fig. 4, which is mounted on the KCLBOT, is built around PrimeSense's PS1080 SoC [2]. The PS1080 acts as a controller for the IR light source in order to build the input image with IR light coding depth information. The light coding in this case is mapped on HSV [17]. The device is configured such that a standard CMOS image sensor receives the projected IR light source and transforms the image to produce a depth image. The sensor has a field of view of 58° Horizontal (H), 40° Vertical (V), and 70° Diagonal (D). The sensor allows for a maximum image depth size of VGA, which is a resolution of 640×480 and has maximum image throughput frame rate of 60fps. Finally, the operational range of the sensor is limited to a minimum of 0.8m and to a maximum of 3.5m.

The Kinect™ Mounting Configuration on the KCLBOT.
As depicted in Fig. 5, the sensor is able to acquire RGB and RGB-D images, where the RGB-D images are used for environment building.

RGB and RGB-D Sensor Images.
Before the RGB-D images can be used for environment map building, the depth information from the image needs to be computed. To achieve this, a function needs to be derived to convert the RGB color values to Hue. Once the Hue value is acquired, it can be used to compute relative distance estimation. Using the model presented by Joblove and Greenberg [17], the following equation is implemented.
Here (21), the formula specifies the transpose of Hue value, using
The conversion function used to convert the transpose of Hue (21) to the Hue value needed, is as follows.
Now that a conversion for Hue from RGB has been defined, a function to convert the Hue values to distance is required. To achieve this, sample images are captured from the sensor, using a static object with a known distance from the sensor.
Using the images presented in Fig. 6, the average RGB values from the static object are extracted and converted to a Hue valuation.

Sample RGB-D Distance Estimation Images.
The values required to build a Hue to distance function are presented in Table 1.
Sample RGB-D Distance Estimation Results
Using Table 1, a distance estimation function is presented by (23).
Using the quadratic equation that describes distance as a function of Hue, a behavior graph is produced, and presented in Fig. 7. Based on this graph, when distance is less than 80cm or greater than 350cm, the slope of the curve changes more significantly. This conforms to the manufacturer's specifications on the limitations of distance estimation above and below these distance ranges.

HSV Exponential Mapping of Distance vs. Hue.
To validate the distance estimation equation presented in (23), the RGB-D sensor is used to capture 220 image samples of a static object between a distance range of 80 to 300cm, with sample steps on 1cm. This selection was made based on the behavior of the curve presented in Fig. 7. At the same time as the image is captured, an empirical measurement of distance is taken. The empirical measurement has a tolerance of 0.5mm error. The images are processed on the Microsoft .NET Micro Framework microcontroller and stored on a database on the microcontroller board, for analysis.
Statistical analysis of experimental data
Statistical analysis of experimental data
The results presented in Table 2 show statistical analyses of the 220 samples taken from the microcontroller. The mean value is relatively low and for more accuracy because the data might be skewed the median value is more useful and is comparably low. The confidence intervals which represents two standard deviations from the mean, equivalently present a low error rate.
The histogram in Fig 8 demonstrates that the error is roughly normally distributed between −0.3 to 0.2 and shows some anomaly below −0.3.

The Frequency vs. Error Histogram
The boxplot presented in Figure 9 visually shows the distance from 0 to the mean, which is −0.067. It also presents the interquartile range, which is 0.248 and minimum (−0.39) and the maximum (0.198) values.

Boxplot of the experiment data
The Q-Q plot depicted in Figure 10 presents the performance of the observed values against the expected values. This demonstrates that the errors are approximately normally distributed centrally, with anomalies at both tail ends.

Normal Q-Q plot of error
Using the statistical analysis presented in the previous section, a non-parametric test is required to validate the practical effectiveness of the distance estimation equation (23). The ideal analysis test for a non-parametric independent one sample set of data is the Kolmogorov-Smirnov test [18] for significance.
The empirical distribution function
Where
The Kolmogrov-Smirnov statistic [18] for a cumulative distribution function
Where
Under the null hypothesis that the sample originates from the hypothesized distribution
Where
If
It should also be noted that the asymptotic power of this analysis test is 1.
For the experimental data presented in this paper the null hypothesis is that the distribution of error is normal with a mean value of −0.06732 and a standard deviation of 0.14766. Based on significance level of 0.05, a significance of 0.44 is returned using the one sample Kolmogorov-Smirnov test (28). The strength of the returned significance value allows us to retain the null hypothesis and say that the distribution of error is normal with a mean value of −0.06732 and a standard deviation of 0.14766.
Before the autonomous mobile robot can complete any path planning or path following tasks, it requires sufficient information about the environment that it will be traversing. To provide the mobile robot with this information, the detail from the RGB-D camera is used to make a plot of the terrain, mapping the unobstructed space the mobile robot can utilize.
Before the RGB-D image can be used, the noise, resolved as black pixels in the range of #E4ElC0h to #FFFFFFh, needs to be removed from the image. This is achieved by converting the RGB-D image to gray scale [20]. This process is carried out to protect natural colors in the #E4ElC0h to #FFFFFFh range. In the RGB color model, a color image can be represented by the following intensity function:
In (29),
The equation that converts a color pixel to a gray scale pixel is presented below (25).
After the image has been converted to gray scale, as depicted in Fig. 11, the black pixels are filtered out of the image.
Once the image has been stripped of the black noise pixels, as depicted in Fig. 12, the color detail is required for mapping the traversable environment.
The RGB-D depth color information from Fig. 11 is remapped onto the gray scale filtered image; the result is presented in Fig. 13.

Conversion from RGB-D to Gray Scale.

Conversion from RGB-D to Gray Scale.

Remapping Color on Filtered Gray Scale Image.
The input matrix (31) is multiplied by the rotation matrix [21] from (32), where θ = 90°, to produce a topological view of the traversable space.
Using the first two components of (33), a 2-dimensional traversable space plot is produced, where the
The generation of a topological view of the available navigation space, as presented in Fig. 14, provides the mobile robot with sufficient information for path planning and obstacle avoidance.

Environment Map Generation from RGB-D Image.
Using a simple Bezier path [22], the KCLBOT is instructed to traverse a defined path and capture sample images using the RGB-D camera. Using the mobile robot's ability to self-localize, implementing (18), (19), and (20), while capturing and superimposing the converted images, using (33), an environment map is constructed.
The initial results presented in Fig. 15, provide the mobile robot with detailed information of the environment and it can easily estimate the Euclidean distance to all obstacles in range.

Environment Map Generation from Path Following.
Using the depth information from equation (34) and mapping depth on the pixels of the image in Fig 13 it is possible to build a 3D reconstruction of the environment, which is demonstrated in Fig 16.

Single sample 3D image reconstruction
The introduction of the affordable off-the-shelf RGB-D camera sensor, which captures images with depth information mapped in the HSV range, makes SLAM possible with minimal computation. This is validated by the experimental results presented in Fig. 11. The only concern of note at this stage is the tolerance levels presented by the distance estimation behavior graph in Fig. 6. The limitation is that obstacles 0.8m away from the sensor will not be resolved accurately. This is only a concern for navigation in small spaces but if the navigation area is large and open spaces are available, the sensor will prove to be a valuable tool for SLAM.
The experiment results and statistical analysis is valuable for anyone wanting to adopt a similar monocular sensor approach. The retention of the null hypothesis validates the use of the sensor and the performance is acceptable for small form factor mobile robots requiring environment mapping.
