Two-Layer Hidden Markov Model for Human Activity Recognition in Home Environments

Abstract

Activities of Daily Living (ADLs) are the activities that an individual carries out in everyday life. Recognition of ADLs is a key element in building intelligent and pervasive environments. We propose a two-layer HMM to build an ADL recognition model that maps low-level binary sensor data to high-level activities. We used sensors embedded in appliances or objects to obtain object-use sequence data as well as object name, type, interaction time, and location. In the first layer, we use the location of the object-use sensors to predict the activity class; in the second layer, we use the object-use sequence data to determine the exact activity. We compare the proposed model with other activity recognition models on three real datasets. The results show that the proposed model achieves significantly better recognition performance than the other models.

1. Introduction

Activity recognition is an important task for many ubiquitous computing applications. In a home environment, activity recognition systems observe the resident's activities and can remind users to perform missed activities or complete actions, which makes it possible to provide automated services. In a hospital environment, an activity recognition system can remind a doctor or nurse to perform certain tests before operating. In a factory environment, it can monitor the activities of the workers and encourage them to act more safely. Activity recognition systems use several types of sensors, such as microphones and video cameras, RFID readers, wearable sensors, and sensors embedded in appliances (or objects), to determine the state of the physical world. Based on that state, they observe the behavior of persons and, if necessary, take actions in response. Activity recognition based on microphones and video cameras is complex to implement because it requires processing multidimensional data, and it may violate the user's privacy [1–6]. Wearable sensors are effective for recognizing primitive sequences of movements (walking, sitting, standing, etc.) but make it difficult to recognize ADLs (showering, grooming, preparing meals, etc.) [7–9]. Home users perform an activity by interacting with appliances (or objects) at a nearby location at a given time. Information about appliance or object use can be obtained by attaching embedded sensors to them [10–13]. Embedded sensors are inexpensive, invisible, and nonintrusive, with no impact on regular life activities. Tapia et al. showed that simple binary sensors have solid potential for solving the ADL recognition problem in the home [11]. Binary sensors can also be applied to human-centric problems such as health and elder care [14–16]. Van Kasteren et al. use binary sensors and motion sensors to perform ADL recognition in a house setting [17].
However, sensor-based activity recognition is challenging because the input is inherently noisy [18]. In this context, temporal probabilistic reasoning and machine learning approaches are very effective. Several object-use based models have been used to recognize ADLs from binary sensors, such as Bayesian Networks [11], Conditional Random Fields (CRF) [19], Hidden Markov Models (HMM) [17], and Hidden semi-Markov Models [17]. In [20], the authors proposed Triplet Markov Chains (TMC) to deal with nonstationary HMMs by introducing an auxiliary process governing the regime switching of the hidden process.

The location of a person in an environment can provide important context information for activity recognition [21, 22]. A location-based model can provide a high-level abstraction of an activity. For example, the bedroom is for sleeping, the bathroom for showering, and the kitchen for preparing meals. However, the same location can be used for different activities: showering, brushing teeth, and toileting are all bathroom activities, and preparing meals, taking meals, and dishwashing are all kitchen activities. Modeling based on location information alone is therefore not sufficient for recognizing ADLs. Our aim is to build an activity recognition model that maps low-level binary sensor data to high-level activities. We combine the location of the object-use sensors with the object-use data to overcome the limitations of each approach. Using sensors embedded in appliances or objects, we can obtain object-use data as well as object name, type, interaction time, and location. We propose a two-layer Hidden Markov Model (HMM) for recognition of ADLs. In the first layer, we use the location of the object-use sensors to predict the activity class; in the second layer, we use the object-use sequence data to determine the exact activity within the selected class. Because the first layer selects one activity class, the second layer only considers the activities (variables) related to that class, which reduces the time complexity of the model. We evaluate and compare the activity recognition performance of the proposed model on three fully annotated real-world datasets generated by Van Kasteren [23]. In our experiments, the 2-layer HMM outperforms the other activity recognition models, and using location data together with object-use data yields a significant increase in recognition performance.

The rest of the paper is organized as follows. Section 2 presents an overview of the data used in this study. Section 3 describes the proposed model in detail. Section 4 provides the experimental settings and the results obtained, and Section 5 gives the conclusion and future work.

2. Dataset

We evaluate the recognition performance of the proposed model and compare it with naïve Bayes, Conditional Random Field (CRF), and Hidden Markov Model (HMM) on three fully annotated real-world datasets generated by Van Kasteren et al. Wireless sensor networks are used to observe the behavior of inhabitants inside the houses by collecting binary temporal data. The wireless network nodes are equipped with various kinds of sensors: reed switches to measure whether doors and cupboards are open or closed, pressure mats to measure sitting on a couch or lying in bed, mercury contacts to detect the movement of objects, passive infrared (PIR) sensors to detect motion in a specific area, and float sensors to detect the toilet being flushed. Each wireless network node sends an event to the base station when the state of a digital input changes or when an analog input crosses a predefined threshold. An overview of the datasets is presented in Table 1.

Table 1

Details of the Kasteren datasets.

	House-A	House-B	House-C
Setting	Apartment	Apartment	House
Rooms	3	2	6
Duration	25 days	15 days	20 days
Sensors	14	23	21
Activities	10	13	16

Time series data is discretized into a set of time slices of constant length $Δ t$. A sensor event for time slice t is denoted by $x_{i}$, indicating whether sensor i fired at least once between time t and time $t + Δ t$, with $x_{i} \in \{0, 1\}$. If N sensors are installed in a home, then the observation vector for each time slice is $X_{t} = {[x_{1}, x_{2}, \dots, x_{N - 1}, x_{N}]}^{T}$. Each time slice corresponds to a single data instance. The class of each data instance is the activity label of the corresponding time slice. The activity at time slice t is denoted by $y_{t}$. The task of a classifier is to find a mapping between a sequence of observations $Θ = \{X_{1}, X_{2}, \dots, X_{T - 1}, X_{T}\}$ and a sequence of labels $Y = \{y_{1}, y_{2}, \dots, y_{T - 1}, y_{T}\}$ over a total of T time slices, as shown in Figure 1.

Figure 1

Temporal relation between sensor reading $x_{i}$ and time intervals $Δ t$ .
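The discretization described above can be illustrated with a minimal Python sketch. It assumes that sensor firings arrive as (timestamp in seconds, sensor index) pairs; the function name `discretize` and the event format are hypothetical and are not taken from the datasets.

```python
from typing import List, Tuple


def discretize(events: List[Tuple[float, int]],
               num_sensors: int,
               num_slices: int,
               delta_t: float = 60.0) -> List[List[int]]:
    """Build one binary observation vector X_t per time slice.

    X[t][i] is 1 if sensor i fired at least once in
    [t * delta_t, (t + 1) * delta_t), and 0 otherwise.
    """
    X = [[0] * num_sensors for _ in range(num_slices)]
    for timestamp, sensor in events:
        t = int(timestamp // delta_t)  # index of the slice containing the event
        if 0 <= t < num_slices:
            X[t][sensor] = 1
    return X


# Hypothetical firings of three sensors over four 60 s slices.
events = [(5.0, 0), (42.0, 0), (61.0, 2), (185.5, 1)]
X = discretize(events, num_sensors=3, num_slices=4)
```

Two firings of the same sensor inside one slice produce a single 1, which matches the "fired at least once" definition.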

The activities or labels considered were not the same for all datasets. Across the three datasets, nineteen different ADLs were used as labels: “Brushing teeth”, “Breakfast”, “Dinner”, “Drink”, “Dressing”, “Dishwashing”, “Eating”, “Leaving house”, “Lunch”, “Medication”, “Preparing Breakfast”, “Preparing Lunch”, “Preparing Dinner”, “Relax”, “Showering”, “Sleeping”, “Shaving”, “Snack”, and “Toileting”. If the activity in a time interval does not match any labeled activity, the activity of that time interval is labeled “others”. Table 2 shows the percentage (%) of instances in each activity class of the datasets used.

Table 2

Instances of each activity class in the Kasteren datasets, measured in percentage (%).

Activity	Kasteren-A	Kasteren-B	Kasteren-C
Breakfast	0.3	0.3	—
Brushing teeth	0.3	0.5	0.3
Dinner	1	1	—
Drink	0.1	0.1	0.1
Dressing	—	0.4	0.4
Dishwashing	—	0.5	—
Eating	—	—	3
Leaving house	50.2	51	46.5
Lunch	—	—	—
Medication	—	—	0.1
Others	13	7	5.1
Preparing Breakfast	—	2	2
Preparing Lunch	—	—	3
Preparing Dinner	—	4	3
Relax	—	—	8
Showering	0.8	0.6	0.6
Sleeping	33.5	32	27
Shaving	—	—	0.1
Snack	0.1	—	0.1
Toileting	0.7	0.6	0.7

3. Proposed Model

We propose a 2-layer HMM for recognition of ADLs. In the first layer, we use the location information of the used objects and select the activity group with the maximum joint probability. In the second layer, we consider the object-use data and select one activity from the selected group. Activities are grouped according to the location of the objects used to perform them. Figure 2 shows an example of the grouping of activities. Sensors are also classified based on the location of the objects involved in each activity group. Figure 3 presents an example of the classification of sensor locations for each activity group.

Figure 2

Grouping of the activities according to the location of sensor installed with different objects.

Figure 3

Classification of the location of sensor installed with different objects for each activity group.

We consider $Y = \{y_{1}, y_{2}, \dots, y_{m}\}$ as the set of activities, where m is the total number of activities, and these activities are divided into p groups $G = \{G_{1}, G_{2}, \dots, G_{p}\}$ with $G_{i} \subset Y$ for all i. Let $X = \{x_{1}, x_{2}, \dots, x_{n}\}$ be the set of installed sensors and $L = \{l_{1}, l_{2}, \dots, l_{p}\}$ the set of locations in an environment where n sensors are deployed in p locations; then the locations of the sensors are $L_{X} = \{l_{x_{1}}, l_{x_{2}}, \dots, l_{x_{n}}\}$ with $l_{x_{i}} \in L$. Therefore $Θ_{L} = \{L_{X_{1}}, L_{X_{2}}, \dots, L_{X_{T}}\}$ represents the sequence of locations of the observed sensors associated with the sensor observation sequence $Θ = \{X_{1}, X_{2}, \dots, X_{T}\}$ up to time T. Each group corresponds to a location.
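As an illustration of this notation, the sketch below derives a location observation from a sensor vector. The sensor-to-location assignment and the encoding of $L_{X_{t}}$ as one binary flag per location are assumptions made for the example; the text does not fix a concrete encoding.

```python
# Hypothetical deployment: n = 4 sensors spread over p = 3 locations.
LOCATIONS = ["bedroom", "bathroom", "kitchen"]
SENSOR_LOCATION = ["kitchen", "kitchen", "bathroom", "bedroom"]  # l_{x_i} for each sensor i


def location_observation(x_t):
    """Map a binary sensor vector X_t to one flag per location.

    A location flag is 1 if at least one sensor installed at that
    location fired during the time slice (assumed encoding).
    """
    active = {SENSOR_LOCATION[i] for i, value in enumerate(x_t) if value}
    return [1 if loc in active else 0 for loc in LOCATIONS]


def location_sequence(theta):
    """Map the sensor sequence Theta to the location sequence Theta_L."""
    return [location_observation(x_t) for x_t in theta]


theta_l = location_sequence([[1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 0]])
```

Because each group corresponds to a location, the first layer only needs this coarser sequence to choose among the p groups.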

3.1. First Layer of HMM

In the first layer, the classifier uses the location information of the observed sensors to classify the group of activities. The first layer of this HMM is characterized by $λ_{s 1} = \{π, A, B\}$, where π is the initial state distribution, A is the transition probability matrix, and B is the observation probability matrix for this layer. We assume each group of activities $G_{l}$ is a hidden state and the location information of the observed sensor $l_{x_{k}}$ is the observation for the first layer of the HMM. The graphical representation of the first layer of the HMM is shown in Figure 4. The initial state distribution is denoted by

\begin{matrix} π_{i} = P (G_{1} = i), 1 \leq i \leq p, \end{matrix}

(1)

where p is the number of activity groups (hidden states). The state transition probability distribution represents the probability of transition from state i to state j and is represented by

\begin{matrix} a_{i, j} = P (G_{t} = j ∣ G_{t - 1} = i), 1 \leq i, j \leq p . \end{matrix}

(2)

The observation probability distribution indicates the probability that state j generates observation $l_{x_{k}}$:

\begin{matrix} b_{j} (k) = P (l_{x_{k}} ∣ G_{t} = j), 1 \leq j \leq p, 1 \leq k \leq n . \end{matrix}

(3)

The hidden state $G_{t}$ at time t depends only on the previous hidden state $G_{t - 1}$. The observation variable $l_{X_{t}}$ at time t depends only on the hidden variable $G_{t}$. The goal is to find the joint probability distribution:

\begin{matrix} P (G, Θ_{L}) = \prod_{t = 1}^{|Θ_{L}|} P (G_{t} ∣ G_{t - 1}) P (l_{X_{t}} ∣ G_{t}) . \end{matrix}

(4)

The observation distribution $P (l_{X_{t}} ∣ G_{t})$ represents the probability that the activity group $G_{t}$ would generate n distinct observation symbols. We apply the naïve Bayes assumption to the observation distribution:

\begin{matrix} P (l_{X_{t}} ∣ G_{t} = j) = \prod_{k = 1}^{|L_{X}|} P (l_{x_{k}} ∣ G_{t} = j) . \end{matrix}

(5)

Figure 4

The graphical representation of the HMM used in the first layer of the activity model.
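To make (4) and (5) concrete, the following sketch evaluates the log of the joint probability for a given state sequence. It assumes that each observation component is binary with a per-state Bernoulli parameter `mu[j][k]`, and that the factor for t = 1 in (4) is the initial distribution $π_{i}$; both are illustrative assumptions, and all parameter values are made up.

```python
import math


def log_emission(obs, mu_j):
    """Naive Bayes factorization as in (5): independent components given the state.

    mu_j[k] is the assumed probability that component k equals 1 in state j.
    Parameters must lie strictly between 0 and 1 to keep the logs finite.
    """
    return sum(math.log(m if o else 1.0 - m) for o, m in zip(obs, mu_j))


def log_joint(states, observations, pi, A, mu):
    """Log of the joint probability in (4) for one fixed state sequence."""
    lp = math.log(pi[states[0]]) + log_emission(observations[0], mu[states[0]])
    for t in range(1, len(states)):
        lp += math.log(A[states[t - 1]][states[t]])
        lp += log_emission(observations[t], mu[states[t]])
    return lp


# Hypothetical two-group model with two observation components.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
mu = [[0.9, 0.1], [0.2, 0.7]]
lp = log_joint([0, 1], [[1, 0], [0, 1]], pi, A, mu)
```

Working in log space avoids numerical underflow when the product in (4) runs over thousands of time slices.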

3.2. Second Layer of HMM

In the second layer, the classifier uses the sensor data to classify the individual activity. A separate HMM is used for each activity group $G_{k}$. Let $λ_{s 2} = \{λ_{s 2}^{1}, λ_{s 2}^{2}, \dots, λ_{s 2}^{p}\}$ be the set of all HMMs in the second layer, in which each HMM is characterized by $λ_{s 2}^{θ} = \{π_{i}^{θ}, A^{θ}, B^{θ}\}$, where θ is the index of the HMM to be used, with $1 \leq θ \leq |G|$. The first layer calculates the joint probability distribution and selects the most probable group; the index of that group determines which second-layer HMM is used to classify the individual activity. The graphical representation of the second layer of the HMM is shown in Figure 5. We assume each activity in a group $G_{k} = \{y_{1}^{k}, y_{2}^{k}, \dots, y_{q}^{k}\}$ is a hidden state, where $1 \leq q \leq m$, and the sensor data $X_{t}$ at time t are the observations for the second-layer HMM. $Θ^{θ} = \{X_{1}^{θ}, X_{2}^{θ}, \dots, X_{T}^{θ}\}$ is the observed sensor sequence, where θ is the index of the second-layer HMM and $X_{1}^{θ} \in X$. The initial state distribution ($π_{i}^{θ}$) is denoted by

\begin{matrix} π_{i}^{θ} = P (y_{1}^{θ} = i), 1 \leq i \leq q, \end{matrix}

(6)

where q is the number of activities (hidden states) in the group. The state transition probability distribution represents the probability of transition from state i to state j and is represented by

\begin{matrix} a_{i, j}^{θ} = P (y_{t}^{θ} = j ∣ y_{t - 1}^{θ} = i), 1 \leq i, j \leq q . \end{matrix}

(7)

The observation probability distribution indicates the probability that state j generates observation $x_{k}$:

\begin{matrix} b_{j}^{θ} (k) = P (x_{k}^{θ} ∣ y_{t}^{θ} = j), 1 \leq j \leq q, 1 \leq k \leq n . \end{matrix}

(8)

The hidden state $y_{t}^{θ}$ at time t depends only on the previous hidden state $y_{t - 1}^{θ}$. The observation variable $X_{t}^{θ}$ at time t depends only on the hidden variable $y_{t}^{θ}$. The goal is to find the joint probability distribution:

\begin{matrix} P (Y^{θ}, Θ^{θ}) = \prod_{t = 1}^{|Θ^{θ}|} P (y_{t}^{θ} ∣ y_{t - 1}^{θ}) P (X_{t}^{θ} ∣ y_{t}^{θ}) . \end{matrix}

(9)

The observation distribution $P (X_{t}^{θ} ∣ y_{t}^{θ})$ represents the probability that the activity $y_{t}^{θ}$ would generate n distinct observation symbols. We apply the naïve Bayes assumption to the calculation of the observation distribution:

\begin{matrix} P (X_{t}^{θ} ∣ y_{t}^{θ} = j) = \prod_{k = 1}^{|X^{θ}|} P (x_{k}^{θ} ∣ y_{t}^{θ} = j) . \end{matrix}

(10)

Figure 5

The graphical representation of the HMM used in the second layer of the activity model.
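The way the selected group narrows the second layer can be sketched as follows. The group table is hypothetical (it loosely mirrors a bathroom group and a toilet group), the sensor indices are invented, and restricting $X_{t}$ to the sensors of the selected group is our reading of $X_{t}^{θ}$ rather than a detail stated in the text.

```python
# Hypothetical group table: candidate activities and sensor indices per group.
GROUPS = {
    "bathroom": {"activities": ["Brushing teeth", "Showering"], "sensors": [0]},
    "toilet": {"activities": ["Toileting"], "sensors": [1, 2]},
}


def candidate_activities(group):
    """Hidden states of the second-layer HMM chosen by the first layer."""
    return GROUPS[group]["activities"]


def second_layer_observation(group, x_t):
    """Keep only the components of X_t that belong to the selected group."""
    return [x_t[i] for i in GROUPS[group]["sensors"]]


x_t = [1, 0, 1]
states = candidate_activities("toilet")
obs = second_layer_observation("toilet", x_t)
```

Each second-layer HMM therefore works with q states and a short observation vector instead of all m activities and all n sensors.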

3.3. Learning the Model Parameters

The model parameters are learned from training samples. To estimate the parameters from the observation sequences, a Baum-Welch (BW) based algorithm is used [24, 25]. In the BW method, for a given sequence of observations, an HMM with n states, and an initial model $λ^{0}$, the forward-backward (FB) algorithm computes the expected number of state transitions and state emissions based on the current model (E-step) and then reestimates the model parameters (M-step) using the reestimation formulas during each iteration. After each iteration of the E-step and M-step, the likelihood of the observations increases until it converges to a stationary point. The reestimation formulas are as follows.

The probability of being in state $G_{i}$ at time t and state $G_{j}$ at time $t + 1$ , given the model $λ_{s 1}$ and the observation sequence $Θ_{L}$ , can be estimated using forward (α) and backward (β) variables as follows:

\begin{matrix} ξ_{t} (i, j) = \frac{α_{t} (i) a_{i j} b_{j} (O_{t + 1}) β_{t + 1} (j)}{\sum_{i = 1}^{N} \sum_{j = 1}^{N} α_{t} (i) a_{i j} b_{j} (O_{t + 1}) β_{t + 1} (j)} . \end{matrix}

(11)

The probability of being in state $G_{i}$ at time t, given the observation sequence $Θ_{L}$ and the model $λ_{s 1}$, can be estimated as follows:

\begin{matrix} γ_{t} (i) = \sum_{j = 1}^{N} ξ_{t} (i, j) . \end{matrix}

(12)

First layer of HMM ($λ_{s 1} = \{{\bar{π}}_{i}, \bar{A}, \bar{B}\}$):

\begin{matrix} {\bar{π}}_{i} = γ_{1} (i) = \text{expected number of times in state } G_{i} \text{ at } t = 1, \\ {\bar{a}}_{i, j} = \frac{\sum_{t = 1}^{T - 1} ξ_{t} (i, j)}{\sum_{t = 1}^{T - 1} γ_{t} (i)} = \frac{\text{expected number of transitions from } G_{i} \text{ to } G_{j}}{\text{expected number of transitions from } G_{i}}, \\ {\bar{b}}_{j} (k) = \frac{\sum_{t = 1}^{T} γ_{t} (j) δ_{O_{t}, l_{x_{k}}}}{\sum_{t = 1}^{T} γ_{t} (j)} = \frac{\text{expected number of times in } G_{j} \text{ observing } l_{x_{k}}}{\text{expected number of times in } G_{j}} . \end{matrix}

(13)

Similarly, the probability of being in state $y_{i}^{θ}$ at time t and state $y_{j}^{θ}$ at time $t + 1$, given the model $λ_{s 2}^{θ}$ and the observation sequence Θ, can be estimated using (11). Moreover, the probability of being in state $y_{i}^{θ}$ at time t, given the observation sequence Θ and the model $λ_{s 2}^{θ}$, can be estimated using (12).

Second layer of HMM ($λ_{s 2}^{θ} = \{{\bar{π}}_{i}^{θ}, {\bar{A}}^{θ}, {\bar{B}}^{θ}\}$, $1 \leq θ \leq p$):

\begin{matrix} {\bar{π}}_{i}^{θ} = γ_{1} (i) = \text{expected number of times in state } y_{i}^{θ} \text{ at } t = 1, \\ {\bar{a}}_{i, j}^{θ} = \frac{\sum_{t = 1}^{T - 1} ξ_{t} (i, j)}{\sum_{t = 1}^{T - 1} γ_{t} (i)} = \frac{\text{expected number of transitions from } y_{i}^{θ} \text{ to } y_{j}^{θ}}{\text{expected number of transitions from } y_{i}^{θ}}, \\ {\bar{b}}_{j}^{θ} (k) = \frac{\sum_{t = 1}^{T} γ_{t} (j) δ_{O_{t}, x_{k}^{θ}}}{\sum_{t = 1}^{T} γ_{t} (j)} = \frac{\text{expected number of times in } y_{j}^{θ} \text{ observing } x_{k}^{θ}}{\text{expected number of times in } y_{j}^{θ}} . \end{matrix}

(14)
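One iteration of the reestimation in (11) to (14) can be sketched in plain Python for a single HMM with discrete observation symbols. This is an unscaled textbook version intended for short sequences only (long sequences need scaling or log-space arithmetic); the function names and the toy parameters are assumptions, not the authors' implementation.

```python
def forward(obs, pi, A, B):
    """alpha[t][i]: probability of obs[0..t] and being in state i at time t."""
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, len(obs)):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                      for j in range(N)])
    return alpha


def backward(obs, A, B):
    """beta[t][i]: probability of obs[t+1..] given state i at time t."""
    N, T = len(A), len(obs)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    return beta


def baum_welch_step(obs, pi, A, B):
    """One E-step and M-step; returns new parameters and the old likelihood."""
    N, T, K = len(pi), len(obs), len(B[0])
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    likelihood = sum(alpha[T - 1])  # equals the double sum in the denominator of (11)
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    # gamma_t(i); for t < T this equals the sum over j of xi_t(i, j), as in (12)
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(N)] for t in range(T)]
    new_pi = list(gamma[0])
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) / sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) / sum(gamma[t][j] for t in range(T))
              for k in range(K)] for j in range(N)]
    return new_pi, new_A, new_B, likelihood


# Hypothetical toy model: 2 states, 2 symbols, one short observation sequence.
obs = [0, 1, 1, 0, 0, 1, 0, 0]
pi0, A0, B0 = [0.5, 0.5], [[0.6, 0.4], [0.3, 0.7]], [[0.8, 0.2], [0.3, 0.7]]
pi1, A1, B1, like0 = baum_welch_step(obs, pi0, A0, B0)
_, _, _, like1 = baum_welch_step(obs, pi1, A1, B1)
```

Comparing `like1` with `like0` shows the property stated above: the likelihood of the observations does not decrease from one iteration to the next.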

3.4. Inferring the Best State Sequence

To infer the best sequence of groups as well as activities, we used the Viterbi algorithm, which has been successfully applied with HMMs to solve many activity recognition problems [24]. The Viterbi algorithm finds the best sequence efficiently using dynamic programming, discarding suboptimal paths at each time step. For N states and T time steps its cost grows as $N^{2} T$, far below that of direct enumeration of all state sequences. Inference is done in a simultaneous manner: the groups and the activities are estimated together using the Viterbi algorithm.
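A minimal log-space Viterbi decoder for a single HMM with discrete symbols is sketched below. It is a generic textbook version under the assumption that all probabilities are strictly positive; it does not reproduce the simultaneous two-layer inference, and the toy parameters are invented.

```python
import math


def viterbi(obs, pi, A, B):
    """Return the most probable hidden state sequence for obs.

    pi[i], A[i][j], and B[i][k] must be strictly positive so that
    their logarithms are finite.
    """
    N = len(pi)
    delta = [math.log(pi[i]) + math.log(B[i][obs[0]]) for i in range(N)]
    backpointers = []
    for t in range(1, len(obs)):
        new_delta, pointers = [], []
        for j in range(N):
            # Best predecessor of state j; all other paths into j are discarded.
            best_i = max(range(N), key=lambda i: delta[i] + math.log(A[i][j]))
            pointers.append(best_i)
            new_delta.append(delta[best_i] + math.log(A[best_i][j]) + math.log(B[j][obs[t]]))
        delta = new_delta
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(N), key=lambda i: delta[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical "sticky" two-state model with informative emissions.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.9, 0.1], [0.1, 0.9]]
best_path = viterbi([0, 0, 1, 1], pi, A, B)
```

The two nested loops over states inside the loop over time give the $N^{2} T$ growth mentioned above.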

4. Experimental Setup and Results

We evaluate the performance of the proposed 2-layer HMM in recognizing ADLs and compare it with three well-known classifiers, naïve Bayes, CRF, and HMM, using three real datasets. Sensor data were segmented into time slices of length $Δ t = 60$ seconds, giving a total of 36,000 time slices for “Kasteren-A”, 21,600 for “Kasteren-B”, and 28,800 for “Kasteren-C”. The activities and sensors are grouped by location as tabulated in Table 3. The sensor networks deployed in the home environment generate a raw data stream. The sensor data streams are presented in three different feature representations [10].

Table 3

Details of group information for Kasteren datasets: activity classification based on location data.

Group information	House-A	House-B	House-C
Number of location groups	5	4	6
Name of location group	Bedroom, bathroom, front door, kitchen, and toilet	Bedroom, bathroom, front door, and kitchen	Bedroom, bathroom, front door, kitchen, living room, and toilet
Bedroom group activity list	Sleeping	Sleeping, dressing	Sleeping, dressing
Bathroom group activity list	Brushing teeth, showering	Brushing teeth, showering, and toileting	Brushing teeth, showering, and shaving
Front door group activity list	Leaving house	Leaving house	Leaving house
Kitchen group activity list	Breakfast, dinner, drinking, and snack	Breakfast, dinner, drinking, dishwashing, preparing breakfast, and preparing dinner	Drinking, dishwashing, eating, preparing breakfast, preparing lunch, preparing dinner, and snack
Living room group activity list	—	—	Relax
Toilet group activity list	Toileting	—	Toileting
Bedroom group sensor list	Door	Door, dresser, pressure mat	Door, dresser, and pressure mat
Bathroom group sensor list	Door	Door, toilet flush	Door, shower tap, and sink
Front door group sensor list	Front door	Front door	Front door
Kitchen group sensor list	Microwave, refrigerator, freezer, kitchen cupboard (cup), kitchen cupboard (plates), kitchen cupboard (various), and dishwasher	Microwave, refrigerator, freezer, kitchen sink, kitchen cupboard (various), kitchen cupboard (plates), kitchen drawer (cutlery), and dishwasher	Microwave, refrigerator, freezer, kitchen cupboard (cup), kitchen cupboard (plates), kitchen cupboard (various), and dishwasher
Living room group sensor list	—	—	Couch pressure mat
Toilet group sensor list	Door, toilet flush	—	Door, toilet flush

Raw. This feature uses the sensor data directly as it was collected from the sensor network. The value is 1 when the sensor fires and 0 otherwise.

Change Point (CP). This feature indicates when a sensor changes value. The value is 1 when a sensor state goes from zero to one or vice versa and 0 otherwise.

Last-Fired (LF). This feature indicates which sensor fired last. The sensor that changed state last keeps the value 1 and changes to 0 when another sensor changes state.

We accommodate three distinct features, Raw, Change Point (CP), and Last-Fired (LF), and four combined features: Raw + CP, Raw + LF, CP + LF, and Raw + CP + LF. The combined feature representation is a concatenation of the feature matrices. Figure 6 shows the three distinct features.

Figure 6

Feature representations of (a) Raw, (b) Change Point, and (c) Last-Fired.
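The three representations and their concatenation can be sketched as below. Two details are assumptions of this example, not statements from the text: every sensor is taken to be 0 before the first time slice, and when several sensors change in the same slice the one with the highest index is treated as the last to fire.

```python
def change_point(raw):
    """1 where a sensor value differs from the previous time slice."""
    prev = [0] * len(raw[0])  # assumed initial state: all sensors off
    out = []
    for x in raw:
        out.append([int(a != b) for a, b in zip(x, prev)])
        prev = x
    return out


def last_fired(raw):
    """1 for the sensor that changed state most recently, 0 for all others."""
    n = len(raw[0])
    prev, last, out = [0] * n, None, []
    for x in raw:
        changed = [i for i in range(n) if x[i] != prev[i]]
        if changed:
            last = changed[-1]  # assumed tie-break: highest sensor index
        out.append([int(i == last) for i in range(n)])
        prev = x
    return out


def combine(*features):
    """Concatenate feature vectors slice by slice, e.g. CP + LF."""
    return [sum((f[t] for f in features), []) for t in range(len(features[0]))]


raw = [[0, 0], [1, 0], [1, 0], [0, 1], [0, 1]]
cp = change_point(raw)
lf = last_fired(raw)
cp_lf = combine(cp, lf)
```

With two sensors, the combined CP + LF vector has four components per time slice.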

The data were split into a test and training set using a “leave one day out” approach in which one day of sensor data is used for testing and the remaining days are used for training. The process is repeated for each day and the average performance is measured. The performance of the proposed 2-layer HMM is evaluated using precision, recall, and F-measure. F-measure data were expressed as a mean ± standard deviation and significance was analyzed using Student's t-tests. Statistical significance was considered as $P < 0.05$ . F-measure, precision, recall, and accuracy can be calculated using the confusion matrix shown in Table 4:

\begin{matrix} \text{F-measure} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} . \end{matrix}

(15)

Precision, recall, and accuracy are defined as

\begin{matrix} \text{Precision} = \frac{1}{N} \sum_{i = 1}^{N} \frac{\text{TP}_{i}}{\text{TI}_{i}}, \\ \text{Recall} = \frac{1}{N} \sum_{i = 1}^{N} \frac{\text{TP}_{i}}{\text{TT}_{i}}, \\ \text{Accuracy} = \frac{\sum_{i = 1}^{N} \text{TP}_{i}}{\text{Total}} . \end{matrix}

(16)
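The measures in (15) and (16) can be computed from a confusion matrix as in the following sketch, where rows are true labels and columns are inferred labels. The 2 × 2 matrix is a made-up example, and the code assumes every class has at least one true and one inferred instance so that no division by zero occurs.

```python
def evaluate(cm):
    """Class-averaged precision and recall, F-measure, and accuracy.

    cm[i][j] counts time slices with true class i and inferred class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = [cm[i][i] for i in range(n)]
    tt = [sum(cm[i]) for i in range(n)]                       # total of true labels
    ti = [sum(cm[r][i] for r in range(n)) for i in range(n)]  # total of inferred labels
    precision = sum(tp[i] / ti[i] for i in range(n)) / n
    recall = sum(tp[i] / tt[i] for i in range(n)) / n
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = sum(tp) / total
    return precision, recall, f_measure, accuracy


# Hypothetical two-class confusion matrix.
precision, recall, f_measure, accuracy = evaluate([[8, 2], [1, 9]])
```

Because precision and recall are averaged over classes, a rare activity counts as much as a frequent one such as “Sleeping”, which plain accuracy does not guarantee.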

The recognition accuracy of each activity is shown in a separate confusion matrix for each of the three datasets. The highest accuracy for each activity is achieved with the “Change Point + Last-Fired” feature. Tables 5–7 show the confusion matrices of the 2-layer HMM using the “Change Point + Last-Fired” feature for the three datasets.

Table 4

Confusion matrix showing the true positives (TP), total of true labels (TT), and total of inferred labels (TI) for each class of activity.

	Inferred
True	1	2	3	⋯	N
1	TP₁	$ε_{12}$	$ε_{13}$	⋯	$ε_{1 N}$	TT₁
2	$ε_{21}$	TP₂	$ε_{23}$	⋯	$ε_{2 N}$	TT₂
3	$ε_{31}$	$ε_{32}$	TP₃	⋯	$ε_{3 N}$	TT₃
⋮	⋮	⋮	⋮	⋮	⋮	⋮
N	$ε_{N 1}$	$ε_{N 2}$	$ε_{N 3}$	⋯	TP$_{N}$	TT$_{N}$
	TI₁	TI₂	TI₃	⋯	TI$_{N}$	Total

Table 5

Confusion matrix of two-layer HMM using “Change Point + Last-Fired” feature for House-A of Kasteren dataset.

		Predicted activity										Recall
		Breakfast	Brushing teeth	Dinner	Drinking	Leaving house	Others	Sleeping	Showering	Snack	Toileting	Recall
Actual activity	Breakfast	75	0.0	3.0	5.7	0.0	6.5	0.0	5.5	4.3	0.0	75.0
	Brushing teeth	0.0	75.5	0.0	0	0	10.2	0	7.4	0	6.9	75.5
	Dinner	6.6	0.0	78	7.6	0	7.8	0	0	0	0.0	78.0
	Drinking	4.5	0.0	5.3	78	0	11	0	0	1.2	0.0	78.0
	Leaving house	0.0	0.0	0.0	0	85	15	0	0	0	0.0	85.0
	Others	5.3	5.5	5.8	5.2	5.8	50	5.7	5.6	5.6	5.5	50.0
	Sleeping	0.0	0.0	0.0	0	0	19.5	80.5	0	0	0.0	80.5
	Showering	0.0	7.6	0.0	0	0	7.2	0	77	0	8.2	77.0
	Snack	6.5	0.0	5.7	5.6	0	10.2	0	0	72	0.0	72.0
	Toileting	0.0	6.0	0.0	0	0	9.7	0	7.3	0	77	77.0

	Precision	76.61	79.80	79.75	76.39	93.61	33.99	93.38	74.90	86.64	78.89

Table 6

Confusion matrix of two-layer HMM using “Change Point + Last-Fired” feature for House-B of Kasteren dataset.

		Predicted activity													Recall
		Breakfast	Brushing teeth	Dinner	Drinking	Dressing	Leaving house	Others	Preparing Breakfast	Preparing Dinner	Sleeping	Showering	Toileting	Using dishwasher	Recall
Actual activity	Breakfast	75	0.0	4.3	4.7	0.0	0.0	6.5	3.5	3.0	0.0	0.0	0.0	3.0	75.0
	Brushing teeth	0.0	75.5	0.0	0.0	0.0	0.0	10.2	0.0	0.0	0.0	7.4	6.9	0.0	75.5
	Dinner	6.6	0.0	78	7.6	0.0	0.0	7.8	0.0	0.0	0.0	0.0	0.0	0.0	78.0
	Drinking	4.5	0.0	3.3	78	0.0	0.0	8.2	2.0	2.0	0.0	0.0	0.0	2.0	78.0
	Dressing	0.0	0.0	0.0	0.0	80	0.0	10	0.0	0.0	10.0	0.0	0.0	0.0	80.0
	Leaving house	0.0	0.0	0.0	0.0	0.0	85	15	0.0	0.0	0.0	0.0	0.0	0.0	85.0
	Others	5.3	5.5	5.4	5.2	0.0	4.8	60	0.0	0.0	4.7	4.6	4.5	0.0	60.0
	Preparing Breakfast	5.0	0.0	4.0	3.0	0.0	0.0	8.0	65	10	0.0	0.0	0.0	5.0	65.0
	Preparing Dinner	3.0	0.0	5.0	3.0	0.0	0.0	8.0	10	66	0.0	0.0	0.0	5.0	66.0
	Sleeping	0.0	0.0	0.0	0.0	9.5	0.0	10.0	0.0	0.0	80.5	0.0	0.0	0.0	80.5
	Showering	0.0	7.6	0.0	0.0	0.0	0.0	7.2	0.0	0.0	0.0	77	8.2	0.0	77.0
	Toileting	0.0	6.0	0.0	0.0	0.0	0.0	9.7	0.0	0.0	0.0	7.3	77	0.0	77.0
	Using dishwasher	3.0	0.0	4.0	2.0	0.0	0.0	5.0	5.0	5.0	0.0	0.0	0.0	76.0	76.0

	Precision	73.24	79.80	75.00	75.36	89.38	94.65	36.23	76.02	76.74	84.55	79.96	79.71	83.51

Table 7

Confusion matrix of two-layer HMM using “Change Point + Last-Fired” feature for House-C of Kasteren dataset.

		Predicted activity																Recall
		Brushing teeth	Drinking	Dressing	Eating	Leaving house	Medication	Others	Preparing Breakfast	Preparing Lunch	Preparing Dinner	Relax	Sleeping	Showering	Snack	Shaving	Toileting	Recall
Actual activity	Brushing teeth	72	0.0	0.0	0.0	0.0	0.0	12	0.0	0.0	0.0	0.0	0.0	5.0	0.0	6.0	5.0	72
	Drinking	0.0	80	0.0	5.0	0.0	0.0	9.0	2.0	0.0	2.0	0.0	0.0	0.0	0.0	0.0	2.0	80
	Dressing	0.0	0.0	75	0.0	0.0	0.0	10	0.0	0.0	0.0	8.0	7.0	0.0	0.0	0.0	0.0	75
	Eating	0.0	0.0	0.0	75	0.0	0.0	6.0	5.0	5.0	5.0	0.0	0.0	0.0	4.0	0.0	0.0	75
	Leaving house	0.0	0.0	0.0	0.0	85	0.0	15	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	85
	Medication	0.0	0.0	0.0	0.0	0.0	85	10	0.0	0.0	0.0	0.0	5.0	0.0	0.0	0.0	0.0	85
	Others	3.0	3.0	3.0	3.0	3.0	4.8	50	3.0	3.0	3.0	2.0	4.7	4.5	3.0	4.5	2.5	50
	Preparing Breakfast	0.0	5.0	0.0	5.0	0.0	0.0	8.0	65	6.0	6.0	0.0	0.0	0.0	5.0	0.0	0.0	65
	Preparing Lunch	0.0	5.0	0.0	5.0	0.0	0.0	8.0	6.0	65	6.0	0.0	0.0	0.0	5.0	0.0	0.0	65
	Preparing Dinner	0.0	5.0	0.0	5.0	0.0	0.0	8.0	6.0	6.0	65	0.0	0.0	0.0	5.0	0.0	0.0	65
	Relax	0.0	0.0	0.0	0.0	0.0	0.0	5.0	0.0	0.0	0.0	85	10	0.0	0.0	0.0	0.0	85
	Sleeping	0.0	0.0	5.0	0.0	0.0	0.0	9.5	0.0	0.0	0.0	5	80.5	0.0	0.0	0.0	0.0	80.5
	Showering	6.0	0.0	0.0	0.0	0.0	0.0	8.0	0.0	0.0	0.0	0.0	0.0	76	0.0	5.0	5.0	76
	Snack	0.0	5.0	0.0	5.0	0.0	0.0	10	5.0	5.0	5.0	0.0	0.0	0.0	65	0.0	0.0	65
	Shaving	6.0	0.0	0.0	0.0	0.0	0.0	10	0.0	0.0	0.0	0.0	0.0	6.0	0.0	72	6.0	72
	Toileting	4.0	0.0	0.0	0.0	0.0	0.0	9.0	0.0	0.0	0.0	0.0	0.0	5.0	0.0	5.0	77	77

	Precision	79.12	77.67	90.36	72.81	96.59	94.65	26.67	70.65	72.22	70.65	85	75.09	78.75	74.71	77.83	78.97

The average F-measure values for the three datasets are tabulated in Tables 8–10. Rows in the tables correspond to the distinct and combined features, whereas columns give the experimental results for each activity recognition model. The highest F-measure value is achieved with the “Change Point + Last-Fired (CP + LF)” feature. F-measure data are expressed as a mean ± standard deviation and significance is analyzed using Student's t-test. Statistical significance is considered as $P < 0.05$.

Table 8

F-measure expressed in percentage (%) for House-A of Kasteren datasets.

Feature	Naïve Bayes	CRF	HMM	2-layer HMM
Raw	48 ± 16	50 ± 18	50 ± 16	60 ± 13
Change Point (CP)	50 ± 17	65 ± 15	71 ± 14	72 ± 10
Last-Fired (LF)	58 ± 15	60 ± 14	62 ± 13	67 ± 12
Raw + CP	55 ± 16	62 ± 16	55 ± 15	71 ± 11
Raw + LF	54 ± 14	63 ± 14	69 ± 13	72 ± 12
CP + LF	65 ± 16	70 ± 16	72 ± 15	76 ± 13
Raw + CP + LF	60 ± 18	68 ± 14	68 ± 14	70 ± 12
Average	56 ± 16	63 ± 15	64 ± 16	70 ± 12

Table 9

F-measure expressed in percentage (%) for House-B of Kasteren datasets.

Feature	Naïve Bayes	CRF	HMM	2-layer HMM
Raw	49 ± 10	52 ± 13	50 ± 13	60 ± 12
Change Point (CP)	52 ± 8	60 ± 17	60 ± 11	71 ± 10
Last-Fired (LF)	55 ± 11	55 ± 12	55 ± 12	68 ± 13
Raw + CP	59 ± 12	62 ± 18	57 ± 9	67 ± 14
Raw + LF	60 ± 8	65 ± 12	56 ± 12	66 ± 13
CP + LF	66 ± 12	68 ± 13	70 ± 15	76 ± 11
Raw + CP + LF	62 ± 9	67 ± 15	68 ± 8	72 ± 11
Average	58 ± 10	61 ± 14	59 ± 11	69 ± 12

Table 10

F-measure expressed in percentage (%) for House-C of Kasteren datasets.

Feature	Naïve Bayes	CRF	HMM	2-layer HMM
Raw	40 ± 10	50 ± 13	52 ± 10	62 ± 12
Change Point (CP)	58 ± 8	60 ± 14	70 ± 8	72 ± 13
Last-Fired (LF)	57 ± 12	58 ± 16	65 ± 11	70 ± 11
Raw + CP	55 ± 10	55 ± 17	60 ± 9	65 ± 12
Raw + LF	58 ± 9	57 ± 15	66 ± 8	68 ± 13
CP + LF	60 ± 13	65 ± 13	72 ± 12	74 ± 13
Raw + CP + LF	55 ± 11	63 ± 18	68 ± 11	71 ± 11
Average	55 ± 10	63 ± 15	65 ± 10	69 ± 12

The experimental results for House-A of Kasteren datasets are similar to House-B of Kasteren datasets. The 2-layer HMM achieves the best F-measure value for all datasets using distinct feature “Change Point (CP)” and combined feature “Change Point + Last-Fired (CP + LF).” The average performance of each activity model for all datasets is shown in Figure 7. From Figure 7 we can conclude that the 2-layer activity model outperforms the other activity models.

Figure 7

Performance comparison with the existing systems.

Time complexity arises from calculating the joint probability of each state sequence with the observed series of events. For example, an HMM with N states needs $N^{2}$ state transition probabilities and $2 N$ output probabilities (assuming all the outputs are binary), and has $N^{2} L$ time complexity to derive the probability of an output sequence of length L [26]. Time complexity becomes unmanageable for realistic problems because the number of possible hidden state sequences is typically extremely high. We proposed the 2-layer HMM to reduce the time complexity of the probabilistic model.

Using the 2-layer HMM, we can reduce the number of hidden states as well as the number of symbols in each hidden state. A comparison of the time complexity of the HMM and the 2-layer HMM is tabulated in Table 11. We calculate the output sequence length L by multiplying the number of symbols emitted in the corresponding layer by the number of 60 s time slices in one day (1,440). In House-A of the Kasteren dataset, the number of hidden states and the number of observation symbols in each state for the HMM are 10 and 14, respectively. The time complexity can then be calculated as $N^{2} L = 10^{2} \times 14 \times 1440$ = 2,016,000. For the 2-layer HMM we consider 5 hidden states and 14 observation symbols in the first layer. The second layer is composed of 5 sublayers, and the numbers of hidden states and observation symbols for each sublayer are 1 and 1, 2 and 1, 1 and 1, 4 and 7, and 1 and 2, respectively. The time complexity for the 2-layer HMM can be calculated as $N^{2} L = (5^{2} \times 14 + 1^{2} \times 1 + 2^{2} \times 1 + 1^{2} \times 1 + 4^{2} \times 7 + 1^{2} \times 2) \times 1440$ = 676,800. Similarly, the time complexity of the HMM and the proposed 2-layer HMM can be calculated for House-B and House-C of the Kasteren datasets; the results are tabulated in Table 11.

Table 11

Comparison of time complexity ($N^{2} L$) for different models.

Dataset	HMM	2-layer HMM
House-A	2,016,000	676,800
House-B	5,597,280	1,002,240
House-C	7,741,440	1,644,480

5. Conclusion

We have presented a new approach to recognizing ADLs using binary sensors in home environments. We proposed a 2-layer HMM that uses location information in the first layer and object used information in the second layer. We used three real datasets to evaluate the recognition performance of the proposed model and compared it with other activity recognition models. For evaluation we used the F-measure, a measure that treats the correct classification of each class as equally important. F-measure data were expressed as mean ± standard deviation, and significance was analyzed using Student's t-test. Statistical significance was set at $P < 0.05$. The comparison results show that combining location data with object used data can lead to significantly better performance in real-world activity recognition. This work demonstrates that ADLs recognition can be done effectively by a binary sensor network deployed in home environments. In the future, machine learning schemes can be applied to learn the model parameters.

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Industrial Strategic Technology Development Program (10041788, Development of Smart Home Service based on Advanced Context-Awareness) funded by the Ministry of Trade, Industry & Energy (MI, Korea).

References

1. Duong; Phung; Bui; Venkatesh. Efficient duration and hierarchical modeling for human activity recognition. Artificial Intelligence, vol. 173, no. 7-8, pp. 830–856, 2009. doi:10.1016/j.artint.2008.12.005. Scopus 2-s2.0-62549089105.

2. Kellokumpu; Pietikäinen; Heikkilä. Human activity recognition using sequences of postures. Proceedings of the IAPR Conference on Machine Vision Applications (MVA ’05), Tsukuba, Japan, May 2005, pp. 570–573.

3. Aggarwal, J. K.; Cai. Human motion analysis: a review. Computer Vision and Image Understanding, vol. 73, no. 3, pp. 428–440, 1999. doi:10.1006/cviu.1998.0744. Scopus 2-s2.0-0033099637.

4. Polana; Nelson. Detecting activities. Journal of Visual Communication and Image Representation, vol. 5, no. 2, pp. 172–180, 1994. doi:10.1006/jvci.1994.1016. Scopus 2-s2.0-0038759056.

5. Ward, J. A.; Lukowicz; Tröster; Starner, T. E. Activity recognition of assembly tasks using body-worn microphones and accelerometers. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1553–1567, 2006. doi:10.1109/tpami.2006.197. Scopus 2-s2.0-33750113404.

6. Sminchisescu; Kanaujia; Metaxas. Conditional models for contextual human motion recognition. Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV ’05), Beijing, China, October 2005, IEEE, pp. 1808–1815. doi:10.1109/iccv.2005.59. Scopus 2-s2.0-33745893539.

7. Huynh; Schiele. Towards less supervision in activity recognition from wearable sensors. Proceedings of the 10th IEEE International Symposium on Wearable Computers (ISWC ’06), Montreux, Switzerland, October 2006, pp. 3–10.

8. Lara, Ó. D.; Labrador, M. A. A survey on human activity recognition using wearable sensors. IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1192–1209, 2013. doi:10.1109/surv.2012.110112.00192. Scopus 2-s2.0-84881311778.

9. Bao; Intille, S. S. Activity recognition from user-annotated acceleration data. In: Ferscha; Mattern (eds.), Pervasive Computing, Lecture Notes in Computer Science, vol. 3001, pp. 1–17. Berlin, Germany: Springer, 2004. doi:10.1007/978-3-540-24646-6_1.

10. Van Kasteren, T. L. M.; Englebienne; Kröse, B. J. A. Human activity recognition from wireless sensor network data: benchmark and software. Activity Recognition in Pervasive Intelligent Environments, Atlantis Ambient and Pervasive Intelligence, vol. 4, pp. 165–186. Atlantis Press, 2011. doi:10.2991/978-94-91216-05-3_8.

11. Tapia, E. M.; Intille, S. S.; Larson. Activity recognition in the home using simple and ubiquitous sensors. Pervasive Computing, Lecture Notes in Computer Science, vol. 3001, pp. 158–175. Berlin, Germany: Springer, 2004. doi:10.1007/978-3-540-24646-6_10.

12. Philipose; Fishkin, K. P.; Perkowitz; Patterson, D. J.; Fox; Kautz; Hähnel. Inferring activities from interactions with objects. IEEE Pervasive Computing, vol. 3, no. 4, pp. 50–57, 2004. doi:10.1109/MPRV.2004.7. Scopus 2-s2.0-10944269122.

13. Fatima; Fahim; Lee, Y.-K.; Lee. Analysis and effects of smart home dataset characteristics for daily life activity recognition. Journal of Supercomputing, vol. 66, no. 2, pp. 760–780, 2013. doi:10.1007/s11227-013-0978-8. Scopus 2-s2.0-84888014281.

14. Wilson, D. H.; Atkeson. Simultaneous tracking and activity recognition (STAR) using many anonymous, binary sensors. Pervasive Computing, Lecture Notes in Computer Science, vol. 3468, pp. 62–79. Berlin, Germany: Springer, 2005. doi:10.1007/11428572_5.

15. Cook, D. J.; Schmitter-Edgecombe. Assessing the quality of activities in a smart environment. Methods of Information in Medicine, vol. 48, no. 5, pp. 480–485, 2009. doi:10.3414/ME0592. Scopus 2-s2.0-65749083852.

16. de Ipiña, D. L.; de Sarralde, I. D.; García-Zubia. An Ambient assisted living platform integrating RFID data-on-tag care annotations and twitter. Journal of Universal Computer Science, vol. 16, no. 12, pp. 1521–1538, 2010.

17. Van Kasteren, T. L. M.; Noulas; Englebienne; Kröse, B. J. Accurate activity recognition in a home setting. Proceedings of the Conference on Autonomous Agents and Multiagent Systems (AAMAS ’07), Honolulu, Hawaii, USA, May 2007, pp. 1–9.

18. Palmes; Pung, H. K.; Xue; Chen. Object relevance weight pattern mining for activity recognition and segmentation. Pervasive and Mobile Computing, vol. 6, no. 1, pp. 43–57, 2010. doi:10.1016/j.pmcj.2009.10.004. Scopus 2-s2.0-75049085494.

19. Vail, D. L.; Veloso, M. M.; Lafferty, J. D. Conditional random fields for activity recognition. Proceedings of the Conference on Autonomous Agents and Multiagent Systems (AAMAS ’07), Honolulu, Hawaii, USA, May 2007, pp. 1–8, article 235.

20. Lanchantin; Lapuyade-Lahorgue; Pieczynski. Unsupervised segmentation of randomly switching data hidden with non-Gaussian correlated noise. Signal Processing, vol. 91, no. 2, pp. 163–175, 2011. doi:10.1016/j.sigpro.2010.05.033. Zbl 1203.94037. Scopus 2-s2.0-77956177375.

21. Fox. Location-based activity recognition. Proceedings of the 30th Annual German Conference on Advances in Artificial Intelligence (KI ’07), Osnabruck, Germany, September 2007, Springer.

22. Zhu; Sheng. Motion- and location-based online human daily activity recognition. Pervasive and Mobile Computing, vol. 7, no. 2, pp. 256–269, 2011. doi:10.1016/j.pmcj.2010.11.004. Scopus 2-s2.0-79955910724.

23. Van Kasteren, T. L. M. Datasets for Activity Recognition. https://sites.google.com/site/tim0306/datasets

24. Rabiner, L. R. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, 1989. doi:10.1109/5.18626. Scopus 2-s2.0-0024610919.

25. Khreich; Granger; Miri; Sabourin. A survey of techniques for incremental learning of HMM parameters. Information Sciences, vol. 197, pp. 105–130, 2012. doi:10.1016/j.ins.2012.02.017. Scopus 2-s2.0-84859102158.

26. Bilmes, J. A. What HMM can do. IEICE Transactions on Information and Systems, vol. E89-D, no. 3, pp. 1–24, 2006. Technical Report of University of Washington, Department of Electrical Engineering, Number UWEETR-2002-0003.