Skin color model adaptation under varying lighting conditions

Abstract

Skin region detection is crucial for face recognition, hand tracking, and motion detection. In the detection process, a skin color model is usually required to confine the distribution of skin colors. However, skin color models are sensitive to lighting conditions. Skin segmentation under varying lighting conditions produces poor results. This article presents a skin detection procedure for human–computer interaction sessions under varying lighting conditions. The proposed method requests a skin sample from the user to estimate the color temperature of the light source. Then, the color temperature is used to correct the skin sample. At the subsequent step, the mean of the corrected skin sample is utilized to adapt the skin color model. Finally, the adapted skin color model is employed to segment skin regions in the video stream. Tests using the proposed method and some adaptive skin detection algorithms have been conducted. Statistical data show that the proposed method is superior to color constancy methods and the Gaussian mixture model in skin region segmentation. The proposed method improves the true positive rate by more than 13% in segmenting skin regions of a database. Its true positive rate is 20% better if real-life images are used as test data.

Keywords

Skin color model, human–computer interaction, computer vision, image processing

Introduction

Skin region segmentation is an important procedure in numerous applications, including hand tracking, motion detection, human–computer interaction (HCI), and face recognition.¹ Several skin detection methods and skin color models have been proposed, for example, the Gaussian model of Yang et al.¹ and the threshold model presented in Wang and Yuan.² These skin detection algorithms rely on skin color models to classify skin and non-skin pixels. Under fixed and well-established lighting conditions, these skin detection methods produce decent results. However, Kakumanu et al.³ show that the distribution of skin colors is influenced by the color of the light source. No individual skin color model is suitable for segmenting skin regions under changing lighting conditions. To overcome this problem, some researchers proposed to perform color constancy to adjust the input images prior to the skin segmentation process.⁴ Other researchers designed mixture skin models to accommodate the varying lighting conditions.³,⁵

The success of the color constancy–based methods depends on two key factors: an accurate estimation of the light source color and an effective skin model which can produce decent skin classification results under the canonical lighting condition. These two factors should be carefully designed and perfectly coupled. Otherwise, the skin detection procedure will not be successful.⁴ Furthermore, the computational costs constitute another problem. The color constancy process must be conducted for each image frame, and thus, the cumulative computing costs become expensive. On the other hand, a mixture skin color model comprises multiple skin models adapting to different lighting conditions. Skin classification is determined by the weighted decision of the outcomes of these individual models. Ideally, a mixed skin model is more suitable for segmenting skin regions under changing lighting conditions. Nonetheless, a mixed skin model requires a lengthy training process to acquire the best combination weights for the individual models.³,⁵ Once the light source is changed, the whole learning process has to be repeated.

In this article, we propose an adaptive skin detection algorithm which works effectively under changing lighting conditions. The Gaussian model deduced in Yang et al.¹ and Wu and Ai⁶ is utilized as the basic skin color model. An adaptation procedure is conducted to transform this skin color model prior to the skin detection process. Once the model has been transformed, it is used to detect skin regions in the video stream directly. The proposed procedure does not require any training process at the preprocessing stage. It does not rely on any color constancy computation at run time either. It is easy to tune and its performance is superior to the color constancy–based methods and the procedure using a mixed color model. Figure 1 shows the improvement achieved by the proposed method in an HCI application.⁷ The raw image is displayed in Figure 1(a). The skin regions detected using the approach of Yang et al.¹ and Wu and Ai⁶ are rendered in Figure 1(b), while the detected skin regions produced by the proposed method are illustrated in Figure 1(c). (The skin pixels are rendered in white.) Since the color of the light source is yellowish, the colors of the user’s skin are shifted and the Gaussian model–based method produces poor results. On the other hand, the proposed method is not affected by the changed light color and it still produces high-quality skin regions.

Figure 1.

Skin detection example: (a) the raw image, (b) skin detection using the Gaussian model of Wu and Ai,⁶ and (c) skin detection using the proposed method.

The rest of this article is organized as follows: The Gaussian skin color model and a fundamental color correction method are presented in section “Basic skin color model and color temperature.” The algorithm for estimating the light source color is described in section “Light source color calculation.” Section “Adaptation of the skin color model” presents the adaptation procedure of the Gaussian skin color model, together with the technical principles and the experiments that verify the procedure. Test results using various skin detection methods are given and analyzed in section “Test results and analysis.” The conclusion and future work are given in section “Conclusion and future work.”

Basic skin color model and color temperature

To accommodate changing lighting conditions, we construct an adaptation process to adjust the Gaussian skin color model before conducting skin segmentation. This section introduces the Gaussian skin color model and the basic color correction method that is used both for estimating the color temperature of the light source and for adapting the skin color model.

Gaussian skin color model

We select the Gaussian model proposed in Yang et al.¹ and Wu and Ai⁶ as the skin color model for classifying skin pixels and non-skin pixels. In the model, the skin color distribution in the C_rC_b color space is modeled by an elliptical Gaussian probability density function

p(c) = \frac{1}{2π \sqrt{|Σ|}} e^{-\frac{1}{2} (c - μ)^{T} Σ^{-1} (c - μ)}

(1)

here, c is the color vector of the input pixel, and µ and Σ are the mean vector and covariance matrix of the C_r and C_b components of human skin colors, respectively. To speed up the computation, we do not evaluate the probability in equation (1). Instead, the squared Mahalanobis distance from c to µ is used for skin color classification. The input pixel is classified as a skin pixel if the following condition is satisfied

d (c) = (c - μ)^{T} Σ^{- 1} (c - μ) \leq β

(2)

In our implementation, β is set to 2.105. The parameter β was decided by experiments carried out in Ueng and Chen.⁷ The mean and the covariance matrix are retrieved from Wu and Ai.⁶
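As a rough illustration of equation (2), the following Python sketch classifies a single (C_r, C_b) vector by its squared Mahalanobis distance. The mean and covariance values are placeholders chosen for illustration only; they are not the parameters reported in Wu and Ai.⁶ The function names are likewise hypothetical.

```python
# Placeholder model parameters in the CrCb plane (illustrative assumptions,
# NOT the values reported in Wu and Ai).
MU = (150.0, 110.0)                      # mean vector (Cr, Cb)
SIGMA = ((160.0, 20.0), (20.0, 100.0))   # symmetric covariance matrix
BETA = 2.105                             # threshold of equation (2)


def mahalanobis_sq(c, mu=MU, sigma=SIGMA):
    """Return (c - mu)^T Sigma^{-1} (c - mu) for a 2-D color vector."""
    (a, b), (_, d) = sigma               # sigma = [[a, b], [b, d]]
    det = a * d - b * b
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = c[0] - mu[0], c[1] - mu[1]
    # Closed-form inverse of a symmetric 2x2 matrix applied to (dx, dy).
    return (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det


def is_skin(c, mu=MU, sigma=SIGMA, beta=BETA):
    """Classify a (Cr, Cb) vector as skin when d(c) <= beta."""
    return mahalanobis_sq(c, mu, sigma) <= beta
```

Because p(c) in equation (1) decreases monotonically with d(c), thresholding d(c) selects the same pixels as thresholding the probability, without evaluating the exponential.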

Color correction equation

In the proposed method, color correction is required in the estimation of the light source color and in the transformation of the Gaussian skin color model. This computation is performed in the red, green, and blue (RGB) space. Assuming that the color temperature of the light source is T_f, the RGB components of T_f are calculated at first. Then, the RGB channels of the pixels of the skin sample S are transformed using the transformation method of Agarwal et al.⁸

[\begin{matrix} R' \\ G' \\ B' \end{matrix}] = [\begin{matrix} \frac{R_{W}}{R_{T}} & 0 & 0 \\ 0 & \frac{G_{W}}{G_{T}} & 0 \\ 0 & 0 & \frac{B_{W}}{B_{T}} \end{matrix}] [\begin{matrix} R \\ G \\ B \end{matrix}]

(3)

here, R_W, G_W, and B_W are the RGB values of white color, and R_T, G_T, and B_T are the RGB components of the color temperature T_f.
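The diagonal transform of equation (3) can be sketched in a few lines of Python. The sketch assumes 8-bit channels with white at (255, 255, 255) and clips the result to 255; both choices are assumptions made here, not details given in the text. The RGB components of the color temperature are passed in by the caller, since their computation is not specified in this section.

```python
WHITE = (255.0, 255.0, 255.0)  # assumed reference white (R_W, G_W, B_W)


def correct_pixel(rgb, light_rgb, white=WHITE):
    """Apply the diagonal transform of equation (3) to one RGB pixel."""
    corrected = []
    for value, light, ref in zip(rgb, light_rgb, white):
        if light <= 0:
            raise ValueError("light source components must be positive")
        # Scale the channel by white/light, then clip to the 8-bit range
        # (the clipping is an assumption of this sketch).
        corrected.append(min(255.0, value * ref / light))
    return tuple(corrected)


def correct_sample(sample, light_rgb, white=WHITE):
    """Correct every pixel of a skin sample given as a list of RGB tuples."""
    return [correct_pixel(pixel, light_rgb, white) for pixel in sample]
```

For a white light source the transform is the identity, while a yellowish light source (a weak blue component) amplifies the blue channel of the sample.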

Light source color calculation

Since the light source color affects the distribution of skin colors, we use the light source color to adapt the skin color model in this work. We represent light source colors using color temperatures so that colors of artificial and natural light sources can be systematically enumerated. In this representation method, a color temperature is expressed in Kelvin (K). Color temperatures of ordinary light sources range from about 1800 K (candle flame) up to 16,000 K (blue sky). We extend this range and assume that color temperatures of all possible light sources are between 1000 and 35,000 K. We search for the color temperature of the light source within this range.

Light source color temperature estimation

Assuming that the color temperature of the light source is T_f, the algorithm for computing T_f is shown in Figure 2. The detailed steps are described as follows:

1. Input the skin sample S , the Gaussian skin color model M , and a threshold α;

2. Initialize the trial color temperature by setting $T = 1000 K$ ;

3. Correct S to produce S_k using the RGB values of T and equation (3);

4. Classify S_k using M according to equation (2);

5. If the detection rate R_T ≥ α, record T;

6. $T = T + Δ T, Δ T = 100 K$ ;

7. If $T \leq 35,000 K$ , go to step 3;

8. T_f = the average of the recorded T.

Figure 2.

Flowchart of the color temperature estimation method. This iterative algorithm searches and computes the color temperature of the light source.

This algorithm comprises two stages. At the first stage, it searches all feasible color temperatures of the light source. At the second stage, the color temperature, T_f, is computed by averaging the feasible color temperatures. To verify whether a trial color temperature T is feasible, we use T to correct the skin sample S . Then, the resultant skin sample S_k is classified using the skin color model M . If the detection rate is high (R_T ≥ α), T is a feasible color temperature and it is recorded. To avoid an exhaustive search, T is increased by 100 K between two consecutive iterations. The value of α affects the estimation of T_f. If α is too high, few feasible color temperatures will be found and the estimation is biased. If its value is too low, many color temperatures will pass the test and T_f may be wrongly computed. In this work, α is set to 0.6 after conducting numerous experiments.
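The two-stage search described above can be sketched as follows. This is a minimal Python outline, not the authors' C implementation: `correct` and `detection_rate` are hypothetical callables standing in for the correction of equation (3) and the classification of equation (2), and returning `None` when no temperature is feasible is a choice made for this sketch.

```python
def estimate_color_temperature(sample, correct, detection_rate,
                               alpha=0.6, t_min=1000, t_max=35000, step=100):
    """Average all trial temperatures whose corrected sample is detected well.

    correct(sample, t)   -> corrected sample S_k for trial temperature t
    detection_rate(s_k)  -> fraction of pixels in s_k classified as skin
    Both callables are hypothetical hooks supplied by the caller.
    """
    feasible = []
    t = t_min
    while t <= t_max:
        s_k = correct(sample, t)              # stage 1: test each trial T
        if detection_rate(s_k) >= alpha:
            feasible.append(t)                # T is feasible, record it
        t += step
    if not feasible:
        return None                           # no trial temperature passed
    return sum(feasible) / len(feasible)      # stage 2: average
```

With the default settings the loop evaluates 341 trial temperatures, so the skin sample is corrected and classified 341 times per adaptation.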

Skin sample fetching

To fetch the skin sample S , an augmented reality (AR) interface is created on the screen. The user is asked to cover a small rectangle shown on the screen using one of his or her hands, as shown in Figure 3. The pixels inside the rectangle constitute the skin sample S .
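Collecting the pixels inside the on-screen rectangle amounts to a simple crop. The sketch below assumes the frame is stored as a list of rows of (R, G, B) tuples and that the rectangle is given by its top-left corner and size; the function name and this image layout are illustrative assumptions.

```python
def crop_sample(image, left, top, width, height):
    """Return the pixels inside a rectangle as a flat list.

    image is a list of rows; each row is a list of (R, G, B) tuples.
    """
    rows = image[top:top + height]
    return [pixel for row in rows for pixel in row[left:left + width]]
```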

Figure 3.

User offers the skin sample S via an AR user interface. Pixels inside the green rectangle are regarded as skin pixels.

Example of color temperature estimation

An experiment was carried out to show the distribution of feasible color temperatures in the color temperature spectrum. Four different light sources were used in the experiment. The recorded feasible color temperatures are rendered as red dots in Figure 4. Under these lighting conditions, the feasible color temperatures fall in a small continuous range of the spectrum, though the width of the range varies. This test justifies using the mean of the feasible color temperatures as the resultant color temperature T_f.

Figure 4.

Distributions of feasible color temperatures (rendered in red dots) of four different light sources.

Adaptation of the skin color model

After the color temperature of the light source, T_f, has been computed, the adaptation process is conducted to transform the skin color model. First, the RGB components of T_f are calculated and the RGB channels of the skin sample S are transformed using equation (3) to produce a corrected skin sample S_f . Then, S_f is mapped into the C_rC_b space and its mean $μ_{f}$ is computed. Finally, the mean of the Gaussian skin color model, $μ$ , is modified by

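μ = μ + (μ_{s} - μ_{f})

(4)

here, $μ_{s}$ and $μ_{f}$ are the means of the C_rC_b color vectors of S and S_f , respectively. Because S_f approximates the skin colors under the canonical light source, the offset $μ_{s} - μ_{f}$ is the displacement that the current light source imposes on the skin colors, and adding it moves the model toward the colors observed in the uncorrected video stream. The detailed steps are depicted in Figure 5.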
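The two sample means that enter equation (4) can be computed as in the following Python sketch. The text does not state which RGB-to-YC_bC_r conversion is used; the full-range BT.601 (JPEG) coefficients below are an assumption, and the function names are illustrative.

```python
def rgb_to_crcb(rgb):
    """Convert an 8-bit RGB pixel to (Cr, Cb).

    Uses full-range BT.601 (JPEG) coefficients, which is an assumption
    of this sketch rather than a detail given in the article.
    """
    r, g, b = rgb
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    return (cr, cb)


def mean_crcb(sample):
    """Mean (Cr, Cb) vector of a non-empty list of RGB pixels."""
    if not sample:
        raise ValueError("sample must contain at least one pixel")
    crcb = [rgb_to_crcb(pixel) for pixel in sample]
    n = len(crcb)
    return (sum(c[0] for c in crcb) / n, sum(c[1] for c in crcb) / n)
```

Calling `mean_crcb` on S and on S_f yields the two means used in equation (4).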

Figure 5.

Flowchart of the skin model adaptation method. The color temperature of the light source is used to transform the skin color model.

Rationale of the skin color model adaptation

In the proposed method, the Gaussian color model is translated by the difference of the means of S and S_f . The rationale behind the transformation is illustrated in Figure 6. The colors satisfying the criterion of equation (2) form the ellipse A in the C_rC_b space. Under the canonical lighting condition, the colors of S constitute the ellipse B , which is contained in A . Thus, the Gaussian model is suitable for classifying skin pixels. As the light source changes, B migrates to a new position, denoted as D in Figure 6. Since D is outside A , the skin detection rate would deteriorate significantly if the Gaussian model were not transformed. Hence, we translate A to C using equation (4). The color set of the skin sample is then again contained in the Gaussian model and the detection rate can be recovered.
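The geometric argument can be checked numerically with a toy example. In the Python sketch below, the model parameters and the sample colors are invented for illustration; `mean_canonical` plays the role of the center of B and `mean_observed` the center of D. Only the translation of the mean is modeled, and the covariance matrix is left unchanged, as in the text.

```python
def inside_ellipse(c, mu, sigma, beta=2.105):
    """True when c satisfies the criterion of equation (2)."""
    (a, b), (_, d) = sigma
    dx, dy = c[0] - mu[0], c[1] - mu[1]
    return (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / (a * d - b * b) <= beta


def translate_mean(mu, mean_canonical, mean_observed):
    """Shift mu by the displacement from the canonical to the observed sample mean."""
    return (mu[0] + mean_observed[0] - mean_canonical[0],
            mu[1] + mean_observed[1] - mean_canonical[1])


def detection_rate(colors, mu, sigma, beta=2.105):
    """Fraction of colors that fall inside the model ellipse."""
    hits = sum(1 for c in colors if inside_ellipse(c, mu, sigma, beta))
    return hits / len(colors)


# Invented numbers: model A, and a sample cluster that has drifted to D.
MU_A = (150.0, 110.0)
SIGMA = ((160.0, 20.0), (20.0, 100.0))
CLUSTER_D = [(125.0, 130.0), (127.0, 128.0), (123.0, 132.0)]
MU_C = translate_mean(MU_A, (150.0, 110.0), (125.0, 130.0))
```

With these numbers, `detection_rate(CLUSTER_D, MU_A, SIGMA)` is 0.0, whereas `detection_rate(CLUSTER_D, MU_C, SIGMA)` is 1.0, which mirrors the move from ellipse A to ellipse C.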

Figure 6.

Adaptation of the Gaussian skin color model: (a) Under the canonical lighting condition, ellipse A , the color set of the original Gaussian model, contains ellipse B , the color set of the skin sample. As the light source changes, B migrates to D and is no longer within A and (b) After the adaptation, A is shifted to C , which contains D , so the color set of the skin sample is covered by the model again.

An experiment has been conducted to support our theory. The result is displayed in Figure 7. We collected skin samples of 16 people under an unknown lighting condition. The colors of these skin samples in the C_rC_b space are rendered in red dots. Then, colors satisfying the Gaussian model of equation (2) are exhaustively searched and rendered as blue dots. The image shows that the colors of the skin sample are mostly outside the color set of the Gaussian model. Hence, the detection rate is low. Using the adaptation method, the color set of the adapted Gaussian model is translated and illustrated in green dots. As the image shows, a significant part of the colors of the skin samples is contained in the transformed Gaussian model. Therefore, the detection rate will be improved.

Figure 7.

An experimental result of the adaptation: ellipse D —the color set of 16 skin samples; ellipse A —the color set of the original Gaussian model; and ellipse C —the color set of the adapted Gaussian model.

Test results and analysis

Several tests have been conducted to study the effectiveness of the proposed method and other skin detection methods. The test system is a desktop computer equipped with a 2.74 GHz CPU, a web-camera, and 4 GB of main memory. The resolution of all test images is fixed to 640 × 480 pixels. The results are presented and analyzed in this section.

Homogeneous test

In the first test, a user is asked to offer a skin sample to adjust the Gaussian model. Then, the adapted Gaussian model is utilized to segment skin regions of the user in a video stream. One of the results is displayed in Figure 1. This test indicates that the proposed method maintains a high skin detection rate under a varying lighting condition.

Inhomogeneous test

To further study the effectiveness of the adaptation method, another test is carried out to segment skin regions in a multi-user video stream. In the test, a user is asked to offer a skin sample to adjust the skin color model. Then, the adapted skin color model is used to segment skin regions of the user and other users. Two results are shown in Figures 8 and 9. The skin sample is offered by the user appearing on the left side of the top image of Figure 8. The skin region segmentation results generated by the original Gaussian model and the proposed method are shown in Figure 8(b) and (c), respectively. The results show that the adapted skin color model is effective in detecting skin regions of both users.

Figure 8.

Skin detection in a multi-user environment. (a) The skin sample is offered by the left user. Skin detection results using (b) the original Gaussian skin color model and (c) the proposed method.

Figure 9.

Skin detection in a multi-user environment. (a) The skin sample is not from any of the users. The results using (b) the Gaussian model and (c) the proposed method.

Then, the user who offered the skin sample is replaced by a third user, as shown in the top image of Figure 9. The results of the skin detection using the proposed method are shown in Figure 9(c). Although the user offering the skin sample is absent from the scene, the detection rate does not decrease. This test indicates that a skin sample from a person with a similar skin color can be used to adapt the skin model. Thus, in an HCI application, the adaptation can be performed by an operator prior to the normal session without bothering the actual users.

Tests using the hand gesture recognition database

Two extra experiments are carried out to compare the proposed method with other adaptive skin detection methods. In the first experiment, the hand gesture recognition (HGR) database of Kawulok et al.⁹ is used as the test data. The database contains 1558 hand gesture images and the ground truth skin region masks. Therefore, detection rates can be easily computed. The hand gesture images are segmented using various skin detection methods. The average detection rate of each testing method is reported in this section for comparison.

The skin detection methods can be divided into two categories. The first category comprises methods relying on skin color models to classify skin pixels. They include the Gaussian model of Wu and Ai,⁶ the Gaussian mixture model (GMM) described in Jones and Rehg,⁵ and the proposed method. The skin detection methods of the second category first utilize color constancy methods to correct the images. Then, they employ the Gaussian model of Wu and Ai⁶ to segment skin regions. The embedded color constancy methods include the general gray world method proposed in Gijsenij and Gevers,¹⁰ the shade of gray method (SGM) of Finlayson and Trezzi,¹¹ the gray edge method of Van De Weijer et al.,¹² and the MAX RGB method of Gijsenij et al.¹³ We modified the gray edge method by taking the second-order derivatives as edges and created the second gray edge method.¹¹ We implemented all of these detection methods except ours using MATLAB libraries to speed up the programming task and to avoid coding errors. The proposed method is implemented purely in C, without any special image processing library. The detection rate is measured by the true positive rate (TPR) and the true negative rate (TNR). Detection rates of all the test methods are listed in Table 1.

Table 1.

Skin detection rates of the HGR database.

Algorithms	TPR (%)	TNR (%)
Original Gaussian model	63.3	96.6
Gaussian mixture model	61.8	96.7
General gray world	42.8	97.1
Shade of gray	62.8	96.6
Gray edge	62.9	96.5
Second gray edge	63.3	96.6
MAX RGB	30.6	97.6
Proposed method (α = 0.6)	76.1	97.1

TPR: true positive rate; TNR: true negative rate; RGB: red, green, and blue.
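For reference, the two rates follow the standard definitions TPR = TP / (TP + FN) and TNR = TN / (TN + FP), evaluated per pixel against the ground truth mask. The Python sketch below assumes both masks are flattened into equally long lists of booleans (True for skin); this layout and the function name are illustrative assumptions.

```python
def tpr_tnr(predicted, truth):
    """Return (TPR, TNR) for two equally sized flat boolean masks."""
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    positives = sum(1 for t in truth if t)      # TP + FN
    negatives = len(truth) - positives          # TN + FP
    # A rate is undefined (NaN) when the ground truth lacks that class.
    tpr = tp / positives if positives else float("nan")
    tnr = tn / negatives if negatives else float("nan")
    return tpr, tnr
```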

The test results show that the proposed method produces the best TPR. The TNRs of all the testing methods are high. The MAX RGB method has the highest TNR. However, its TPR is the lowest of all the methods (30.6%). The test images include different backgrounds. Color constancy–based methods may fail to find the true light source color to correct the pixels. Even if the estimation is accurate, the corrected skin pixels may not match the Gaussian skin color model. Thus, their TPRs are low. This phenomenon suggests that the coupling of the color constancy methods and the skin color model requires extra training efforts. In this test, color constancy alone did not improve the skin detection rate.

In this test, the Gaussian model is slightly superior to the GMM (a TPR of 63.3% versus 61.8%). The GMM needs a training phase to obtain the best weights for combining individual Gaussian models. The weights reported in Jones and Rehg⁵ may not be optimal for this database. Therefore, its performance is decreased.

Tests using real data

In the following experiment, we invited 20 male and female Asians to participate in the test. Their ages range from 20 to 54 years. In total, 37 images are produced under light sources of 6500 and 2700 K color temperatures. The images are captured using a desktop computer equipped with a Logitech C310 web-camera. The white balance functionality of the camera is turned off after the initialization stage of the computer system.

The detection rates of the testing algorithms are measured by TPR and shown in Table 2. The results show that the proposed method generates the best TPR value. It outperforms the other methods by at least 20 percentage points. The best color constancy–based method is the MAX RGB method, whose TPR is 67.4%. The backgrounds of the real test data are more complex than those of the HGR database. The reflectance from the background influences the estimation of the light source color. The colors of the skin pixels corrected by the color constancy methods may deviate from the distribution of skin colors of the Gaussian model. Thus, the detection rate deteriorates for the color constancy–based methods.

Table 2.

Skin detection rates using the real data set.

Algorithms	TPR (%)
Original Gaussian model	53.7
Gaussian mixture model	58.2
General gray world	53.0
Shade of gray	53.0
Gray edge	51.1
Second gray edge	53.7
MAX RGB	67.4
Proposed method (α = 0.6)	87.6

TPR: true positive rate; RGB: red, green, and blue.

The GMM outperforms the Gaussian model in this experiment, so the usage of multiple skin models pays off. However, since no training is carried out for the real data set, the weights of individual Gaussian models are not optimized. The performance of this statistical method is still inferior to the proposed method.

Computational cost analysis

To compare the speeds of the testing methods, the Windows GetProcessTimes API is used to measure the CPU time consumed in the skin detection process of a live video stream. The averaged costs are measured in seconds and listed in Table 3. Adapting the skin color model costs roughly 5 to 156 times the CPU time that a color constancy–based method needs to correct one input image; for the shade of gray and MAX RGB methods, the factor is more than 50. However, the color constancy–based methods have to correct pixel colors for each frame. Thus, the cost of color constancy occurs in each frame. On the other hand, the proposed method adapts the skin color model only once in the entire HCI session unless the light source is changed. If the HCI session lasts more than about 156 frames, the proposed method is faster than every color constancy–based method in Table 3; against the shade of gray and MAX RGB methods, the break-even point is about 55 frames.

Table 3.

Computational costs of the testing methods.

Algorithms	Cost (s)
General gray world	0.01092
Shade of gray	0.0316
Gray edge	0.2028
Second gray edge	0.3276
MAX RGB	0.0312
Proposed method (α = 0.6)	1.7001

RGB: red, green, and blue.
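The break-even points implied by Table 3 can be worked out with a short calculation. The Python sketch below assumes that the listed costs are the only difference between the methods, that is, a one-time adaptation cost for the proposed method versus a per-frame correction cost for each color constancy method; the per-frame classification cost is taken to be the same for all of them.

```python
import math

# Per-frame correction costs in seconds, copied from Table 3.
COSTS = {
    "General gray world": 0.01092,
    "Shade of gray": 0.0316,
    "Gray edge": 0.2028,
    "Second gray edge": 0.3276,
    "MAX RGB": 0.0312,
}
ADAPTATION_COST = 1.7001  # one-time cost of the proposed method (Table 3)


def break_even_frames(per_frame_cost, one_time_cost=ADAPTATION_COST):
    """Smallest frame count whose accumulated per-frame cost exceeds the one-time cost."""
    return math.floor(one_time_cost / per_frame_cost) + 1


BREAK_EVEN = {name: break_even_frames(cost) for name, cost in COSTS.items()}
```

Under these assumptions the break-even points are 156 frames for the general gray world method, 54 for shade of gray, 55 for MAX RGB, 9 for gray edge, and 6 for second gray edge.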

Conclusion and future work

In this article, an adaptive skin detection method is presented. Its effectiveness in skin detection is also tested and discussed. In the proposed method, the skin sample offered by a user is used to estimate the color temperature of the light source. The resultant color temperature is used to adapt the basic Gaussian model for skin classification. Experimental results reveal that the adaptation process greatly improves the skin detection rate under changing lighting conditions. The proposed method outperforms the basic Gaussian model, the GMM, and well-known color constancy–based methods. Furthermore, the test results show that the skin sample of a third party can be used to tune the skin color model without deteriorating the performance of the proposed method.

In this work, most of the test data, apart from the HGR database, are collected from Asian users. Further tests are required to include people of various ethnic backgrounds. Another potential improvement is to automatically fetch the skin sample and tune the skin model at run time when the lighting condition changes. A third topic of interest is to apply the adaptation process to different color models. Currently, we use the Gaussian skin color model defined in the C_rC_b space as the basic skin model. Khan et al.¹⁴ showed that selecting the right color space can significantly improve skin detection. Thus, future studies are needed to apply the proposed adaptation process to skin models in different color spaces, for example, the normalized rg-space mentioned in Kakumanu et al.³ In the proposed method, only one skin color model is adapted and used for skin detection. Skin regions under shadows, for example, the neck, may be classified as non-skin regions. Naji et al.¹⁵ created four skin color clusters in the hue, saturation, and value (HSV) color space for segmenting skin regions under non-uniform lighting conditions. Their work inspires us to employ multiple skin color models to handle self-shadowing in our future research.

Footnotes

Academic Editor: Stephen D Prior

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partly supported by the Ministry of Science and Technology, Taiwan.

References

1. Yang, Waibel. Skin-color modeling and adaptation. In: Proceedings of the Asian conference on computer vision, Hong Kong, China, 8–10 January 1998, pp.687–694. London: Springer.

2. Wang, Yuan. A novel approach for human face detection from color images under complex background. Pattern Recogn 2001; 34: 1983–1992.

3. Kakumanu, Makrogiannis, Bourbakis. A survey of skin-color modeling and detection methods. Pattern Recogn 2007; 40: 1106–1122.

4. Lee, Hwang, Jun BM. Analyzing color constancy method for recovering skin color under colored illuminants. J Korean Inst Commun Inf Sci 2011; 36: 621–628.

5. Jones, Rehg JM. Statistical color models with application to skin detection. Int J Comput Vision 2002; 46: 81–96.

6. Wu, Ai. Face detection in color images using AdaBoost algorithm based on skin color information. In: Proceedings of the 1st IEEE international workshop on knowledge discovery and data mining, Adelaide, SA, Australia, 23–24 January 2008, pp.339–342. New York: IEEE.

7. Ueng, Chen GZ. Vision based multi-user human computer interaction. Multimed Tools Appl 2016; 75: 10059–10076.

8. Agarwal, Abidi, Koschan. An overview of color constancy algorithms. J Pattern Recognit Res 2006; 1: 42–54.

9. Kawulok, Grzejszczak, Nalepa. Database for hand gesture recognition, http://sun.aei.polsl.pl/~mkawulok/gestures/ (accessed February 2016).

10. Gijsenij, Gevers. Color constancy by local averaging. In: Proceedings of the IEEE image analysis and processing workshops, Modena, 10–13 October 2007, pp.171–174. New York: IEEE.

11. Finlayson, Trezzi. Shades of gray and colour constancy. In: Proceedings of the color and imaging conference, Scottsdale, AZ, November 2004.

12. Van De Weijer J, Gevers, Gijsenij. Edge-based color constancy. IEEE T Image Process 2007; 16: 2207–2214.

13. Gijsenij, Gevers, Van De Weijer J. Computational color constancy: survey and experiments. IEEE T Image Process 2011; 20: 2475–2489.

14. Khan, Hanbury, Stottinger. Color based skin classification. Pattern Recogn Lett 2012; 33: 157–163.

15. Naji, Zainuddin, Jalab HA. Skin segmentation based on multi pixel color clustering models. Digit Signal Process 2012; 22: 933–940.