Modelling,monitoring and evaluation to support automatic engineering process management

Abstract

Process management is considered to be an essential approach to improve the performance of an enterprise. The process of an engineering project is considered to be a formalised workflow accompanied by a set of decisions. With decisions being made by taking account of information from various sources, the operation and management of modern engineering projects has to deal with increasing amounts of dynamic and changing project information. Understanding and interpreting this information for use in process management can generate challenges in practice. This might be caused by constraints of time and resource, the distributed structure of the information and a lack of modelled domain knowledge. To address these challenges, the research described in this paper focuses on techniques that support automation of the process management of engineering projects, from a data-driven perspective. The research includes elements of process modelling, monitoring and evaluation of such projects, through a proposed automatic process analysis system. The proposed system works with live and historical data. Within this paper, the design and implementation of the system is described. The use of techniques such as autonomic computing, data mining and KM technologies are shown, and the system functionality is demonstrated through the use of a dataset from an aerospace organisation.

Keywords

Aerospace in-service autonomic computing process modelling process evaluation process visualisation

Introduction

The process of an engineering project should be a formalised workflow accompanied by a set of decisions that are made by taking into account various pieces of information, such as project objectives, required resources and the dynamics of the working environment. Process management has been considered as an essential approach to improving the performance of an enterprise.¹ In recent research, various process management models have been proposed covering the fields of process modelling, process reuse, complexity identification, process standardisation and optimisation.^2,3

However it has become clear that the operation and management of modern engineering projects needs to deal with increasing volumes of dynamic information.⁴ The related actions may involve receiving, understanding and interacting with this information via various project partners. It will also require the creation of solutions to interpret, articulate, clarify and make decisions based on this the dynamic information. Hence, understanding this dynamic project information and reusing the information for process management purposes generates challenges in practice. The challenges can be caused by:

constraints of time and resource;

distributed structure of information;

lack of modelled domain knowledge.

The details of these challenges are discussed below.

Constraints of time and resource: As manufacturing in the globalised environment faces intensive competition, the need to maintain a high level of profitability, reduce operation/management cost and improve time/resource efficiency become critical requirements for engineering companies.^5,6 To support the related decision-making tasks, the essential information generated at each project stage needs to be captured and assessed collectively. The detailed analysis of this information on a detailed level could enable decision makers to have a comprehensive understanding about project process evolutions and characteristic changes, such as the interactions between project actors, the dependencies between project components, and the performance changes regarding different project stages.

However, such information is typically considerable, especially from large-scale projects or projects with complex processes. Thus, due to the conflict between the amount of information and the limitation of time/human resource, human-centred decision-making struggles to fully take account of all the necessary information. This also may mean that the decision-making and related problem solving tasks may have an outcome bias in areas where the project actor has specific knowledge and expertise, rather than an overall project perspective.

Distributed structure of information: From an organisational perspective, the globalisation of product design, manufacturing and services enables the partners of collaborative engineering projects to be located in various regions/countries.^7,8 As a result, the operation and management of such engineering projects is critically dependent upon distributed communications, together with large amounts of digitalised documents and dynamically generated workflows.^9,10 In practice, the management of distributed data is still not straightforward or well-supported. There are a number of unsolved issues in system design, data storage, data integration and transaction management.^11,12 Therefore, the distributed project information, together with the decentralised project structure, could significantly increase the complexity of the project process and the difficulty of the project operation/management. This could prevent project actors from having awareness of potential problems throughout the project lifecycle.

Lack of modelled domain knowledge: The shortage of experienced labour has become a serious issue in various industrial sectors, including aerospace, automotive and other engineering sectors.^13–15 The high rate of retirement and turnover of experienced staff can fundamentally change the structure of established departments, decrease their average expertise level and operational efficiency. To avoid these negative impacts, knowledge management (KM) has been suggested as a solution.^16,17 However, using top-down methods to model the knowledge in large organisations, even in small-to-medium-sized enterprises, remains a challenge. This is often due to the KM process itself being complicated and time/resource consuming.^18,19

According to recent research, solving these three key challenges is considered to be critical for improving the capability of information management, KM, and workflow management for collaborative engineering projects.^10,15,20 It is these issues that drive the research work described in this paper.

The purpose of the paper

In order to automate the process management of an engineering project, the following requirements are considered in this paper.

Model the operational process from the project data and define process normality.

In real time monitor the process and ascertain the level of process normality.

Utilise the captured/modelled domain knowledge to facilitate process management and its related analytical tasks.

Based on these requirements, a system is proposed that integrates three main functionalities: process modelling, process monitoring and process evaluation. The design and implementation of the system and its related analytical approaches apply autonomic computing, data mining and KM technologies.

To demonstrate the creation and the evaluation of the proposed system, case studies from the aerospace service sector (departments that provide in-service support) are shown. In-service departments for aircraft play a vital and increasingly important role in delivering services such as modifications and upgrades, maintenance and emergency repairs to airline operators of all types. In addition, they have the opportunity to collect feedback from the stakeholders, for example, airline operators, contractors and specialist suppliers. The information contained in the feedback can then be used to improve aircraft design in the future.

As is typical of many collaborative engineering projects, in-service projects contain complex processes with asynchronous and synchronous collaborations.²¹ Their execution processes need to deal with various constraints in terms of time, budget and resource. In the manufacturing environment, the process management of in-service projects still relies on human-centred decision-making approaches. This decision-making requires the understanding, modelling and representation of large amounts of project information. Consequently, the monitoring and evaluation of multiple processes on a real-time basis is a challenge for the in-service departments. This challenge becomes a common issue in the process management of engineering projects today. To demonstrate that the proposed automatic process analysis system (APAS) can address such a challenge, the research presented in this paper utilises information and knowledge from 396 in-service case studies, to research the design, implementation and generalisation of the system and the underpinning techniques utilised.

The remainder of this paper is organised as follows. The section ‘Related work’ reviews the work in the fields of process management, autonomic computing and data mining in manufacturing. The section ‘Automatic process analysis system (APAS)’ introduces the proposed APAS, and the technical details of the proposed approaches to be used in the APAS, are described in the section ‘Automatic approaches to process management’. The experimental results are evaluated in the section ‘Evaluation and experiment’, and the section ‘Conclusions’ provides the conclusions from the analysis.

Related work

Process management

Process management aims to improve the efficiency and effectiveness of organising project activities, together with facilitating the understanding of inter-relationships among the activities.²² As an extension of workflow management, it involves various information technologies, specific knowledge and associated data to support the design, enactment, management and analysis of operational processes.^23,24

According to recent research, various process management models have been developed and applied in different sectors to solve their practical problems. Benner and Tushman²⁵ revealed that the activities of process management have associations with the increase of innovations and the share of innovations. This demonstrates that process management can be used to enhance the technical innovations, and broaden the existing knowledge of organisations. Pino, García and Piattini²⁶ indicated that the use of process management in software engineering projects is an effective way to improve the project management, documentation management, requirements change management, process establishment, configuration management and requirements elicitation. de Mast et al.²⁷ proposed a conceptual framework of process management for healthcare projects, by considering the micro processes, tasks and resources from the project workflow, which is used to improve the process efficiency, resource management and organisational performances. Rao et al.²⁸ introduced a model that integrates ontologies and knowledge maps with process re-engineering approaches, and it is used to improve the efficiency of business processes. Lerner et al.²⁹ proposed an approach to handling exception patterns of modelled process, which could capture the relationship between exception handling tasks and the normative process.

Due to the dynamics in manufacturing environments, the definition and technical specification of process management are difficult to be consolidated and formalised. As a result, the creation or selection of suitable process management models remains a challenge.³⁰ It is also challenging to apply the existing process management models within manufacturing processes. At times, the application of such models is considered less important by project actors, for example, a model could be applied at the early stage of a project, and then intentionally or unintentionally ignored at the following stages.³¹ To improve the usability of process management models, researchers have shown that it is necessary to reduce human intervention. Hence, computer-aided technologies need to be integrated. In the automotive industry, information-technology-based support has been used to support process management and support process development.³² The technologies have also been used to support process planning in collaborative manufacturing and product lifecycle management.³³

A number of gaps become clear from this review, in particular that there are only piecemeal attempts at solutions, that it is important to reduce human intervention and that the time-based aspect has been largely overlooked.

Autonomic computing

Information systems in modern manufacturing have increasingly sophisticated structures, as they need to include more components, deal with distributed resources, process heterogeneous data, and support multiple users with varying levels of access. These factors make the development, configuration and management of such systems become more difficult. Researchers from IBM first revealed that the operational and managerial mechanisms of complex information systems have certain degrees of similarity with the human biological system.³⁴ For example, a biological system such as the autonomic nervous system can manage essential body functions autonomously, such as the monitoring of heartbeats, maintenance of blood sugar levels and body temperature, without any effort from the human. This type of self-management feature is considered particularly useful for improving the autonomy of complex information systems.³⁵

On the basis of this concept, autonomic computing is designed as an automatic approach that simulates the autonomic nervous system. This could be used to integrate self-management functionalities into information systems, that is, to control the function of computation- and system-operation-related tasks without human intervention.

Autonomic computing systems are typically referred to as any information system that involves autonomic computing. In its self-management process, the system needs to sense the temporal status of each internal component and changes in the condition of the external environment. It then takes appropriate actions based on the sensed information. The control loop in the autonomic computing system covers the aspects of self-configuration, self-optimisation, self-healing and self-protection.³⁴ In general, self-configuration means the system can automatically adjust itself according to pre-designed high-level guidance; self-optimisation means the system can constantly monitor and adjust itself during operation, continually improve its performance and efficiency; self-healing means the system has the capability of handling failures caused by itself; and self-protection means the system can automatically anticipate and prevent attacks or failures caused by external sources.³⁶

To implement complex tasks, the autonomic computing system integrates various types of components, including software, hardware, services or combinations of them. Each system component is treated as a single autonomic element that is designed to automatically perform certain behaviours. The autonomic element contains a communication mechanism that enables it to communicate with other elements in the system. Based on different component combinations, the system could have the capability to perform tasks in different manufacturing environments. For a particular environment, the key behaviours of the autonomic elements need to be pre-defined according to the environment condition and user requirements, and then organised into hierarchical structures. The functionality of the system is therefore dependent upon the internal behaviour of each autonomic element, and the relationships between the autonomic elements.^35,37

Recently, various types of autonomic computing systems have been proposed and applied in different fields. Kim et al.³⁸ proposed an autonomic model to manage the application workflow in hybrid computing infrastructure. Here, the system and application states are monitored and then the applications and resources are adapted to respond to the changes of environment. Caton and Rana³⁹ stated that large-scale and multi-user information management systems, such as cloud services, could utilise autonomic management to improve their reliability and the predictability of resource management. Fallon and O’Sullivan⁴⁰ integrated semantic technologies into an autonomic model, and then applied the model to manage end-user service quality. These ideas have been used in the creation of the approaches used by the research reported in this paper.

Data mining in manufacturing

In modern manufacturing, more and more information management systems have started to integrate advanced analytical approaches from data mining and machine learning fields, in order to address issues on information understanding and reusing, and to fulfil information needs about process management and evaluation. Data mining is an important tool to support daily information management tasks and discover the knowledge from manufacturing databases.⁴¹ It is also considered to be the fundamental tool for developing more advanced information management systems. For example, it can be used to achieve functions including predictive maintenance, fault detection, quality control and customer relationship management.^42,43 Under the scope of data mining, various analytical technologies have been proposed, such as classification, clustering, pattern identification/extraction, feature selection/modelling and visualisation. By using these approaches, raw data from engineering projects can be automatically organised and analysed, enabling project actors to gain a more comprehensive understanding of project characteristics without excessive effort.⁴⁴

Recent research has demonstrated that data mining is playing an important role in supporting information management in manufacturing. Gröger et al.⁴⁵ revealed that data mining can be used to optimise workflow-based business processes and generate decision rules/trees for process analysis. Shi et al.⁴⁶ proposed an approach integrating data mining with domain knowledge, which automatically identifies temporal changes of an engineering project process from workflow related data. Shi et al.⁴⁷ also revealed that patterns contained in project documentation are particularly useful for the planning of similar projects. Meanwhile, identified patterns can be directly transferred into a knowledge base for future reuse.

Data generation of an engineering project is accompanied by the project operation process; hence the information contained in project data can be used to reconstruct the actual project process. Data mining has been considered to have the capability to discover meaningful patterns in the data, and also to identify relations between the patterns.

On this basis, data mining has been selected as suitable to be used as the virtual sensor and information provider in the autonomic process management system, which could automatically identify and extract process-related information from project data.

Automatic process analysis system (APAS)

In modern manufacturing, collaborative engineering projects are concurrent, heterogeneous, and constrained. In general, ‘concurrent’ implies that, to fit the demands of changing markets and consumers, the project teams need to simultaneously implement multiple tasks/projects on different product/service lines; ‘heterogeneous’ implies that projects could require different processes due to their different types, priority settings and technical requirements; ‘constrained’ implies that the projects are performed under constraints, for example, time pressures, limitations of humans or other types of resources.

To automate the process management for collaborative engineering projects, two research challenges are identified.

How to improve the manageability and understandability of project information.

How to reduce human intervention in data analysis and decision-making-related tasks.

To address these challenges, the proposed APAS integrates data mining technologies, natural language processing and knowledge bases. Knowledge bases are used as the high-level guidances to perform analytical tasks. Project actors, for example, engineers and managers, are able to add new knowledge to the knowledge bases at any time, or to edit the existing concepts in the knowledge bases if needed. These behaviours could integrate the dynamics of environments into the knowledge bases.

The system encompasses three main phases, that is, the modelling phase, the monitoring phase and the evaluation phase (see Figure 1). These phases are executed in a pre-defined order, each having specific functionalities. For example, the modelling phase focuses on processing the captured project data, identifying process features from the data and selecting additional features from modelled knowledge bases; the monitoring phase categorises modelled processes by using clustering/classification methods, segment such processes based on pre-defined time intervals and compares the similarity between ongoing processes and segmented processes; the evaluation phase measures the process normality, and generates interactive visualisations for modelled processes.

Figure 1.

The main phases of the system.

The components included in these phases and the technical details are introduced as follows.

Modelling phase : as shown in Figure 2, the components involved in this phase include data processing, knowledge mapping, feature modelling and process modelling. At the initial stage, the distributed project data is captured and integrated from different sources, including those related to communication, workflow and environment. The integrated data is treated as the system input, and then processed by the analytical modules, including metadata analysis, content analysis and semantic analysis. For the analysis, certain types of knowledge bases, for example, lexical ontologies, are applied to support the semantic analysis module. This module is used to identify and extract semantic features, for example, important terminologies or phrases, from content-related data such as project descriptions, objectives or related communications. In the knowledge mapping module, the extracted information is used to select domain-specific knowledge bases to facilitate further analysis, such as feature modelling and process modelling. The feature identification module aims to identify process-related features from the data, for example, activity names, named entities, document names, department information, client information, etc. In the feature selection module, the identified process-related features are filtered/weighted based on the selected domain-specific knowledge bases, and then the most important features are passed to the next step, activity mapping. In the activity mapping module, each identified activity is treated as a high-level indication of process-related features, which can be used to map/link related features together. In the end, the sequencing module converts the features with their timestamps into feature-based sequences (modelled process).

Figure 2.

Modelling phase.

Monitoring phase : as shown in Figure 3, the components involved in this phase include process visualisation and process monitoring. In this phase, an interactive interface for process visualisation is provided to the project actors, enabling them to browse multiple modelled processes simultaneously. Process monitoring contains two modules, clustering and classification, which are used to categorise the modelled processes into groups. The clustering module applies the unsupervised approach to categorise the modelled processes into groups by measuring the similarities/dissimilarities between the processes. In other words, the processes in the same group need to have a certain level of similarity with each other. This module aims to assist the project actors through quick access to projects with similar processes. The classification module applies the supervised approach, by using a training set to discriminate the category a process belongs to. The training set contains a collection of labelled processes, and each label corresponds to a pre-defined category. The processes together with their labels from the training set are used to train the classifier in how to categorise unlabelled processes. During the classification, a specific label will be assigned to an unlabelled process, if the process has certain level of similarity with the labelled processes. With this module, when a new project takes place, it will be automatically categorised into the appropriate category. This module aims to help the project actors effectively compare the new/current project process with other similar projects.

Figure 3.

Monitoring phase.

Evaluation phase : as shown in Figure 4, the components involved in this phase include process modelling, content retrieval, process segmentation, normality measure and process visualisation. In the initial stage, the data of ongoing projects is processed by the process modelling module. The content retrieval module then automatically retrieves similar projects from the dataset based on the identified process features. Based on the elapsed time of ongoing projects, the processes of retrieved projects are segmented by the process segmentation module. Next, the process normality of ongoing projects is measured by considering the sequence similarity between the ongoing project process and retrieved project processes. The outputs from the analysis generated by the normality measure module enable project actors to ascertain whether the ongoing project is not behaving in a normal fashion.

Figure 4.

Evaluation phase.

In the APAS, the analytical modules are designed to be automated. To achieve their functionalities, approaches based on data mining and machine learning are required. The technical details of these approaches are introduced in the following section.

Automatic approaches process management

To deal with different projects, a critical requirement is to handle their heterogeneous data in a uniform way, ensuring that the information contained in such data can be represented consistently. It then enables both computers and humans to use the same approach for understanding and evaluating the process structures and project characteristics.

In the APAS, a hybrid representation is proposed and used to fulfil the requirement. The core consists of feature-based vectors and sequences, which are created by considering two types of information, that is, semantic and process-related. Accordingly, five analytical approaches are used to achieve the automatic creation of these vectors and sequences. These approaches include feature identification, feature selection, process sequencing, process similarity measure and a process normality measure.

Feature identification and selection

Using $p_{i}$ to denote a project, $p_{i} = {M, C}$ denote its related dataset, where M is the set of metadata, and C is the set of content information. The metadata contains the bottom-level description and attributes-related information of the project data itself, for example, the file version, creator, issued by, modified date, storage path, etc. The content information contains descriptive information, semantic information and knowledge-related information about the project, for example, the defined objectives, encountered problems, applied solutions, contained communications, issued documentations, involved technical details, etc.

A general knowledge base $K_{G}$ contains terminologies, entities and general concepts related to the common characteristics of projects. Let $F_{G}$ denote a set of semantic features in $K_{G}$ , and $f_{G, m}$ denote a single feature, where $f_{G, m} \in F_{G}$ and $F_{G} = (K_{G} \cap M) \cup (K_{G} \cap C)$ . The feature vector of project $p_{i}$ is

v_{i} = [f_{G, 1}, f_{G, 2}, \dots, f_{G, m}]^{T}

(1)

where m is the total number of semantic features identified from the project data.

For a process-related knowledge base $K_{P}$ , contains process-related concepts, activities, document types, and their related dependencies and weightings. Let $F_{P}$ denote a set of process-related features in $K_{P}$ , and $f_{P, n}$ denote a single feature, where $f_{P, n} \in F_{P}$ and $F_{P} = K_{P} \cap C$ . If a process sequence containing multiple features, the index of a single feature is determined by the timestamp of the feature. For example, $f_{P, i}$ and $f_{P, j}$ are the features in one sequence, $t s_{P, i}$ and $t s_{P, j}$ denote their timestamps respectively, and the indexes of these features satisfy $i < j$ , if the timestamps of them satisfy $t s_{P, i} < t s_{P, j}$ . According to these, the process sequence of the project $p_{i}$ is represented as

s_{i} = \sum_{n = 1}^{| F_{P} |} w_{n} \cdot f_{P, n}

(2)

where n is the index of feature $f_{P, n}$ ; $| F_{P} |$ is the size of feature set $F_{P}$ ; and $w_{n}$ is a weighting function, $w_{n} \in [0, 1]$ .

Time-based sequence/vector segmentation

In practice, a project process is usually dynamic over time, thus adding a time dimension onto the data representation is necessary. This time dimension will enable the investigation of process evolution for both ongoing (i.e. uncompleted) and completed projects, as well as the comparison of process similarity between them. For this purpose, time-based sequence/vector segmentation is proposed and applied to convert a modelled process into multiple sub-processes based on time intervals.

For a given timestamp ts, it can be used to indicate the absolute time, for example, xth hour/day/step, or the relative time, for example, x% of the project timeline T. Let $F_{P}'$ denote a process-related feature set with ts, for its contained features, their timestamps need to satisfy $t s_{P, n} < ts$ . According to this, the sub-sequence of project $p_{i}$ is represented as

\begin{matrix} s_{i}' & = \sum_{n = 1}^{| {F'}_{P} |} {(w_{n} \cdot f_{P, n})}_{ts}, ts \in τ \\ τ & = {\sum_{j = 1}^{| seg |} (j \cdot \frac{T}{| seg |})}, j \in [1, | seg |] \end{matrix}

(3)

where $τ$ is the set of time intervals regarding the timeline, and $| seg |$ is the number of segmentations of the timeline.

Similarly, using $F_{G}'$ denote a general feature set with ts, for its contained features, the timestamps need to satisfy $t s_{G, n} < ts$ . According to this, the sub-vector of project $p_{i}$ is represented as

v_{i}' = [f_{G, 1}, f_{G, 2}, \dots, f_{G, | {F'}_{G} |}]^{T}

(4)

where $| F'_{G} |$ is the size of feature set $F'_{G}$ .

In this segmentation process, $F_{P}'$ and $F_{G}'$ are the truncated sets being converted from the original sets $F_{P}$ and $F_{G}$ . The contained features in these truncated sets are determined/adjusted by the setting of ts. For example, if ts is the last date of T, then $F_{P}' = F_{P}$ and $F_{G}' = F_{G}$ , which means that the segmented sequence and vector are equal to the original ones; if ts is equal to $10 %$ of the timeline T, then $F_{P}'$ and $F_{G}'$ will only contain the features that are generated within the $10 %$ of project progress.

Process similarity measure

The similarity measure of proposed data representation considers semantic similarity and sequence similarity. In the monitoring phase, project processes with certain levels of combined similarity are categorised into the same group. The similarity measure process is described as: given a threshold $ε \in (0, 1)$ , the processes of project $p_{i}$ and $p_{j}$ are considered as similar, if and only if $si m_{sem} (p_{i}, p_{j}) + si m_{seq} (p_{i}, p_{j}) ⩾ ε$ .

To measure the semantic similarity, the vector space model is applied. For a set of projects, all of the contained semantic features from them are used to construct a feature space. The similarity between any two projects is then measured based on the angle between their related feature vectors in the feature space. For the given feature vectors $v_{i}$ and $v_{j}$ , the similarity is

si m_{sem} (v_{i}, v_{j}) = \frac{v_{i} \cdot v_{j}}{∥ v_{i} ∥ ∥ v_{j} ∥}

(5)

where $∥ v_{i} ∥$ and $∥ v_{j} ∥$ are the $L^{2}$ norm of $v_{i}$ and $v_{j}$ .

To measure the sequence similarity, the Levenshtein distance is applied.⁴⁶ Let $| s_{i} |$ and $| s_{j} |$ denote the sequence length, the similarity between them is calculated by using

{sim}_{seq} (s_{i}, s_{j}) = {\begin{matrix} 1 - \frac{{dis}_{Lev} (s_{i}, s_{j})}{\min (| s_{i} |, | s_{j} |)}, & if {dis}_{Lev} (s_{i}, s_{j}) ⩽ φ \cdot \min (| s_{i} |, | s_{j} |) \\ 0, & otherwise \end{matrix}

(6)

where $min (| s_{i} |, | s_{j} |)$ is the minimum length of $s_{i}$ and $s_{j}$ ; $φ$ is a threshold, $φ \in (0, 1)$ .

By considering both semantic similarity and sequence similarity, the normalised similarity of projects $p_{i}$ and $p_{j}$ is measured by

sim (p_{i}, p_{j}) = α \cdot si m_{sem} (v_{i}, v_{j}) + β \cdot si m_{seq} (s_{i}, s_{j})

(7)

where $α, β$ are the adjustable weights, $α, β \in [0, 1]$ , $α + β = 1$ . These two variables are used to adjust the similarity measure process. For example, if $α = 0$ , the similarity between two projects is determined only by their content; if $β = 0$ , the similarity between two projects is determined only by their processes. An empirical setting of them is $α = β = 0.5$ , which equally considers the effects from both content and processes.

Normality measure

In the evaluation phase, process normality is used to evaluate whether the project process is on the right track during its operation.⁴⁶ In this paper, the normality measure of modelled processes is implemented on two levels, that is, micro-level and macro-level.

Micro-level measure : for a given process-related feature, there could be different prior and posterior adjacent features in different project processes. The distribution of process-related features is mainly determined by two factors, feature property and project type. For the given feature, its own property determines its occurrence probability; the type of its related project determines its adjacent features. For example, for a process-related feature, named issue repair instruction, of in-service projects, the prior features and posterior features, together with their occurrence probabilities, are shown in Table 1 and Table 2 respectively.

Table 1.

List of prior features with probabilities.

Prior Feature	Feature	Probability
Request repair instruction		58.56%
Receive damage report	Issue	14.38%
Request approval sheet	Repair instruction	9.70%
Request damage report		5.82%
Issue technical disposition		2.74%
…		…

Table 2.

List of posteriori features with probabilities.

Feature	Posteriori Feature	Probability
	Send repair instruction	46.15%
Issue	request damage report	7.69%
Repair instruction	Issue approval sheet	13.19%
	Receive damage report	12.09%
	Issue technical disposition	3.37%
	…	…

Review of the data in these tables shows that certain feature combinations have higher occurrence probabilities, such as request repair instruction and issue repair instruction (58.56%), receive damage report and issue repair instruction (14.38%), etc. However, some feature combinations have lower or zero occurrence probabilities, such as issue technical disposition and issue repair instruction (2.74%), issue approval sheet and issue repair instruction (≈0%). In general, if a project process mainly contains feature combinations with high occurrence probabilities, the process could be considered to be normal; if the project process excessively contains feature combinations with low or zero occurrence probabilities, the process could be considered to be less normal.

In this micro-level measure, for a given dataset, the possible adjacent features of each single feature need to be identified, and the occurrence probabilities of the adjacent features need to be calculated accordingly. For a single feature $f_{P, x}$ and a dataset D, $\Pr (f_{P, x})$ represents the feature occurrence probability in the dataset, and $F_{x, D}$ represents a set of adjacent features of $f_{P, x}$ that are identified from the dataset. For a project containing the feature combination ( $f_{P, x}, f_{P, x + 1}$ ), its micro-level normality is measured by

no r_{mic} (f_{P, x}, f_{P, x + 1}) = {\begin{matrix} \Pr (f_{P, x + 1}), & if f_{P, x + 1} \in F_{x, D} \\ - 1, & otherwise \end{matrix}

(8)

According to this equation, if a modelled process containing ( $f_{P, x}, f_{P, x + 1}$ ), but the feature $f_{P, x + 1}$ never appears behind $f_{P, x}$ in the given dataset, then the normality score of this process will be decreased during the process evaluation.

Macro-level measure : the macro-level measure considers the entire project process, instead of the feature combinations. Fundamentally, given a modelled process, this approach aims to calculate the number of other processes from the dataset that are similar to the given process.

Let $s_{i}$ denote the entire sequence of a modelled process, $S_{i, D}$ is the set of other similar sequences contained in the dataset. The macro-level normality of the sequence is measured by,

no r_{mac} (s_{i}) = {\begin{matrix} | S_{i, D} | \cdot \sum_{j = 1}^{| S_{i, D} |} & si m_{seq} (s_{i}, s_{j}) \\ 0, & if \frac{| S_{i, D} |}{| D |} ⩽ φ \end{matrix}

(9)

where $| S_{i, D} |$ is the size of $S_{i, D}$ , that is, the number of similar sequences of $s_{i}$ in dataset D; $| D |$ is the total number of sequences contained in the dataset, and $φ$ is a threshold, $φ \in (0, 1]$ , which is used to adjust the process of macro-level measure. For example, when $φ = 0.05$ , the sequence $s_{i}$ will be considered to be normal, if and only if its similar sequences take more than $5$ % proportion in the dataset.

Evaluation and experiment

This section demonstrates the functionalities of APAS and evaluates the analytical modules of the system through the use of a dataset from an aerospace organisation. The data set contained 396 in-service projects that were implemented between 2013 and 2014. The data of each project included workflow, technical and communication-related information. All of the information items had identifiable timestamps. Knowledge about general operation process and feature weightings was captured from subject-matter experts in the departments. It is then used as the knowledge base to support related analytical modules within the APAS. The evaluation comprised three main parts, i) data processing and feature modelling, ii) process modelling and monitoring, and iii) process evaluation.

Data processing and feature modelling

The data of in-service projects included various file types, for example, database-related (.sql), image-related (.jpg, .tiff, scanned PDF files) and text related (.pdf, .doc, .txt). These types of data are common across a lot of organisations and industries. Some of the image files also contained textual information, for example, scanned PDFs, drawings or images with annotations. To utilise the information contained in such files, optical character recognition was used to pre-process some of the data. After this, all the textual data was processed using natural language processing. The processing method involved tokenisation, stop words removing and stemming, which filtered out terms with less importance, and re-organised the data into a more structured format.

Semantic analysis with lexical ontologies is used to extract semantic features from the pre-processed data. The main reason for using lexical ontologies is to address the polysemy and synonymy problems. In general, polysemy means that one term could have different concepts, and synonymy means that one concept could be described by different terms. For example, ‘fuselage’ and ‘airframe’ have a similar semantic meaning but in different forms. For information items containing ‘fuselage’ and ‘airframe’ respectively, they could be considered to have identical concepts by the project actors immediately. However, they could be considered to have different concepts by the system if only term-based analysis is applied. Eliminating disambiguated word senses is recognised as essential to improve the accuracy of content analysis.⁴⁸ For information management systems in manufacturing, correctly distinguishing the semantic meaning of project data is critical, as it can directly affect the way of organising unstructured information, discovering knowledge and sharing information between people/departments.¹⁰

After initial processing, the project data is converted into a set of semantic features. Table 3 shows the semantic features being extracted from the operator’s damage report (ODR). As shown in the table, the semantic features are the essential terms contained in the report content, and the importance of a feature is indicated by the feature occurrence. The semantic features with their occurrences describe the project details, service requirements and technical requirements. For example, the features from the table could describe the affected locations, for example, ‘wing’, ‘top skin’, ‘rear rib’, or service types, for example, ‘cracking, ‘corrosion’.

Table 3.

Semantic features of the operator’s damage report (ODR).

Term	Occurrence	Term	Occurrence
top	102	damage	37
skin	87	threshold	35
rear	81	location	34
rib	79	j1	32
hole	75	fh	31
crack	72	face	31
fastener	62	rh	30
issue	59	structure	27
wing	56	fit	25
fatigue	55	bay	25
flange	49	justification	25
stress	47	material	25
inspection	47	load	24
corrosion	43	…	…

Besides the semantic features, the process-related features also need to be identified and extracted. Table 4 shows a list of process-related features that are extracted from the content and metadata of ODR. In the feature modelling module, named entity recognition and information extraction are applied to perform this task. Named entity recognition could automatically identify named entities from the data, and the identified entities include people, organisations, dates, locations, etc. Information extraction is applied to detect the relations/dependencies between identified entities. With the assistance of pre-defined rule sets and knowledge bases, information extraction also identifies information with more-specific properties, including aircraft types, part names, serial numbers, document reference numbers and contact information.

Table 4.

Identified process-related features of the operator’s damage report (ODR).

‘feature_name’:	‘project_actors’: [	‘activated_date’: [
‘ODR’,	P1,	‘2013-04-15’,
‘feature_id’:	P2,	‘2013-04-16’,
‘1’,	],	‘2013-04-22’,
‘feature_property’:	‘departments’: [	],
‘outgoing’,	D1,	‘document_refs’: [
‘file_type’: [	D2,	‘7057597x/0x4’,
‘text’,	],	‘7057597x/0x9’,
‘image’,	‘location’: [	‘7057597x/0x4’,
],	L1,	‘7057597x/0x2’,
‘aircraft_type’:	L2,	],
‘A320-XX’,	],	…

From a general information-management point of view, converting heterogeneous data into appropriate features could facilitate content analysis and feature modelling. It enables the system to integrate with different analytical technologies, including content retrieval, content-based clustering/classification and sequential pattern mining. Moreover, the selection and utilisation of knowledge bases will be performed more efficiently and effectively by further improving the automation and performance of the analytical modules.

Process modelling and monitoring

Once the data processing and feature identification has been completed, each project process is then represented by a set of combined features, that is, semantic features and process-related features. The semantic features need to be converted into feature vectors. Table 5 shows the feature vectors that are generated from part of the dataset. In this table, each row is the feature vector of one project process.

Table 5.

Sample of semantic feature vectors.

	skin	lh	multisite	lightning	rear	upper	rh	panel	surface	rib	…
P1	0.087	0	0.113	0	0	0	0.087	0.087	0	0	…
P2	0.066	0	0	0	0.087	0	0	0	0	0.066	…
P3	0.080	0	0.105	0	0	0	0	0.080	0	0.080	…
P4	0.066	0.087	0	0	0	0	0	0.066	0.066	0.066	…
P5	0	0	0	0.210	0	0	0.080	0	0	0	…
P6	0.075	0	0	0	0.098	0.098	0	0.075	0.075	0	…
P7	0.075	0	0	0	0	0.098	0	0.075	0.075	0.075	…
P8	0.070	0	0	0	0	0	0.070	0	0.070	0	…
P9	0	0.123	0	0	0	0	0	0	0	0	…
P10	0.087	0	0	0	0	0.113	0	0	0	0	…

To generate the sequence, process-related features are applied. Certain features may have higher importance than others, if they have higher correlations with the key activities in the project process. The information contained in these features is sufficient to represent the project process, thus the process modelling module only needs to take into account certain features, instead of the entire feature set. It also means that some features with less importance should be removed deliberately. This could improve the robustness and rationality of the process modelling module, as well as eliminate possible interferences caused by irrelevant information. Based on knowledge captured from the in-service departments, 15 types of process-related features are considered, and some of them are shown in Table 6.

Table 6.

List of process-related features.

Feature
OM: outgoing message
AW: answer
S&:F stress & fatigue
ODR: operator’s damage report
IM: incoming message
RDAS: repair/design approval sheet
TD: technical disposition
DRG: drawing
RI: repair instruction
…

For any given dataset, the process modelling module analyses its contained project information recursively, and generates a separate feature set for each project, whilst the timestamp of each feature is identified. The considered features (in Table 6) with timestamps are then used to create the process sequences. Figure 5 shows a visualisation of some modelled processes. In this visualisation, each row is a modelled project process. Each Tx indicates a single feature, and the features contained in a sequence are arranged in chronological order. This visualisation helps project actors to review multiple modelled processes simultaneously, which enables them to efficiently gain general understanding of the process structure.

Figure 5.

Sample of process sequences.

Moreover, Figure 6 shows another visualisation of the modelled processes based on an interactive interface. Comparing to the previous one, this visualisation shows the explicit relationship between each feature and its timestamp. It enables the project actors to have a drill-down view of each modelled process at the document or content level, which aims to help them to understand the inner structure of the processes, and the dependencies of the features. For example, EMAIL indicates the initialisation of a task; LINEITEM indicates the type of action being required by the task implementation; and ATTACHMENT indicates the actual documents or data being generated during the task implementation. All the features are then organised into groups based on their timestamps, which can be zoomed into or zoomed out of by using the navigation tabs on the interface. Both listed visualisations (Figures 5 and 6) are included in the process visualisation module.

Figure 6.

An interactive interface for visualising the modelled processes.

Process evaluation

This evaluation aims to investigate the process evaluation module through two scenarios, that is, unsupervised process evaluation and supervised process evaluation.

Unsupervised process evaluation

The process evaluation module contains automatic mechanisms of sequence analysis and process normality identification. In the unsupervised scenario, advanced domain knowledge, labelled or pre-classified data are not necessary for performing sequence analysis and normality identification. The process normality of projects is measured based on the sequence structure and the dataset structure. For the given dataset, the sequence similarity of each project is calculated by using the proposed approach, and then a similarity matrix is generated accordingly (as shown in Table 7). The data shown in the matrix is from Figure 5 (due to space limitations, only the first 10 entries are displayed in the matrix). To simplify it, sequence similarity values less than 0.5 are filtered out.

Table 7.

Sequence similarity and normality.

	P1	P2	P3	P4	P5	P6	P7	P8	P9	P10
P1		0	0	0.619	0	0	0.546	0	0.523	0.557
P2	0		0.719	0	0	0	0.662	0	0	0.593
P3	0	0.719		0	0	0	0.585	0	0	0
P4	0.620	0	0		0	0	0.508	0	0	0
P5	0	0	0	0		0	0	0	0	0
P6	0	0	0	0	0		0	0	0	0
P7	0.546	0.662	0.585	0.508	0	0		0	0	0.585
P8	0	0	0	0	0	0	0		0	0
P9	0.523	0	0	0	0	0	0	0		0.557
P10	0.557	0.593	0	0	0	0	0.585	0	0.557
M-Nor.	8.980	5.919	2.606	2.254	0.000	0.000	11.425	0.000	2.160	9.168

In the matrix, projects P2 and P3 are considered to be the most similar, with the highest similarity value 0.7185. According to their sequence structures shown in Figure 5, some common patterns can be identified in both of them. For example, in the early project stage, pattern {T1, T9} occurred; in the middle project stage, pattern {T1, T9, T13, T15, T1} occurred; and in the late project stage, patterns {T9, T11} and {T15, T13, T15, T9} occurred. By examining their content-related information, both projects are related to an identical aircraft model, same damage location and service type. Moreover, projects P5, P6 and P8 are all considered to be unique, as their similarity values are the lowest ones in the matrix. According to their sequence structures, few common patterns are contained in their processes when comparing them with the others. By examining their content-related information, project P5 is the only one having lightning-related damage, therefore its process should naturally be different from others. Project P6 is a new type of damage that has never occurred in the past; a special service procedure therefore needs to be designed for it, leading to its process being different from the others. Project P8 has a different damage location, that is, its damage is on the ‘lower surface’ whereas the others are on the ‘top surface’, hence a different process is required.

Comparing process normality for large numbers of modelled processes by the use of such detailed information is not always realistic, due to limited time and restricted access to the information in practice. The macro-level normality is considered to be a fast approach to comparing multiple processes more efficiently. It takes into account of both the similarity values and the quantities of similar projects. The macro-level normality of a modelled process is in proportion to the cumulative similarity of its related projects, in conjunction with the number of its related projects. For example, project P7 has 5 related projects, and its cumulative similarity is equal to $(0.546 + 0.662 + 0.585 + 0.508 + 0.585) = 2.886$ , thus its macro-level normality is $2.886 \times 5 = 14.430$ . In general, for two projects having identical cumulative similarity, the one with the most similar projects will be considered to have higher normality; for two projects having the same number of similar projects, the one with the higher value of cumulative similarity will be considered to have higher normality.

Supervised process evaluation

In the supervised scenario, the process evaluation module assesses the characteristics of new projects by using labelled data. In practice, the departments may already have labelled/annotated data of completed projects. By the use of past experience and modelled knowledge, more labelled data can be created automatically using data mining technologies. In this evaluation, six labels are assigned to the projects in the dataset, according to specific rules summarised by knowledge experts of the departments. That projects have identical labels means they are similar to each other in terms of their contained process-related features. Next, 10-fold cross-validation is applied to divide the dataset into a training set (containing 90% data) and a test set (containing 10% data). Afterwards, some classification approaches, that is, naive Bayes, artificial neural network (ANN), support vector machine (SVM) and random forest (RF), are applied to classify the test set based on the training set. The objectives of this evaluation include, i) testing whether the labelled projects can be used to evaluate new projects (the label of test data are invisible to these classifiers, thus the test data can be treated as ‘new projects’ in the evaluation), ii) testing whether the data representation proposed in this paper has the capability to work with standard classification approaches. The results of this evaluation are shown in Table 8.

Table 8.

Results of process classification.

	Naive Bayes	ANN	SVM	RF
Precision	0.792	0.933	0.859	0.915
Recall	0.794	0.932	0.841	0.916
F-measure	0.790	0.932	0.830	0.915

Overview

According to the results from the table, the classification approaches, especially ANN and RF, have good performance classifying unlabelled project processes. This indicates that new project processes can be automatically classified based on the labelled processes with appropriate accuracy. Meanwhile, the results also demonstrate that the proposed data representation can work with standard classification approaches with no issues. Moreover, the evaluation in the unsupervised scenario confirms that the process with low normality scores can be automatically identified by applying the proposed approach. All these results show that the process evaluation module in the system has the capability to evaluate the project processes.

Conclusions

The proposed system APAS in this paper aims to support the automation of process management for collaborative engineering projects. The main functionalities of the system include process modelling, process monitoring and process evaluation. To automate the system functionalities, related analytical approaches have been proposed, including feature modelling, process sequencing, process similarity measures and process normality identification. Using the proposed approaches, the system converts project data into processes that can be modelled, in the form of feature vectors and sequences. It is then possible to perform automatic analysis on them. Thus the approach and the associated system enables project actors to monitor and evaluate multiple project processes simultaneously, whilst giving them unique understanding of process evolution, project characteristic changes and environment dynamics.

The evaluation shows that the system with the proposed approaches have the capability to work with real engineering data. It also shows that the proposed data representation for engineering projects can be integrated with various data mining technologies, implying that the functionality of this process management system can be customised and extended.

Further work includes the development of a knowledge capture module. This module is required to be integrated with the knowledge bases. The main functionalities include the detection of common knowledge of historical projects, and the reduction of human intervention for knowledge base creation. In addition, the design of a formal knowledge structure for aerospace in-service is under consideration, and the evaluation of such knowledge structure is also required.

Footnotes

Acknowledgements

The authors would like to thank the industrial collaborators and their engineers for their input and support on this project.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research reported in this paper is funded by the Engineering and Physical Sciences Research Council (Grant Number EP/K014196/1).

References

Haddar

Makni

Abdallah

HB.

Literature review of reuse in business process modeling. Software Syst Model 2014; 13: 975–989.

Schäfermeyer

DKM

Rosenkranz

Holten

The impact of business process complexity on business process standardization. Bus Inf Syst Eng 2012; 4: 261–270.

Shen

Wang

Hao

Agent-based distributed manufacturing process planning and scheduling: A state-of-the-art survey. IEEE Trans Syst Man Cybern Part C Appl Rev 2006; 36: 563–577

Eppler

Mengis

The concept of information overload: A review of literature from organization science, accounting, marketing, MIS, and related disciplines. Inf Soc 2004; 20: 325–344.

Al-Najjar

Alsyouf

Enhancing a company’s profitability and competitiveness using integrated vibration-based maintenance: A case study. Eur J Oper Res 2004; 157: 643–657.

Airbus. Less maintenance, less costs. FAST magazines 2002.

Nieusma

Riley

Designs on development: Engineering, globalization, and social justice. Eng Stud 2010; 2: 29–59.

Abdalla

HS.

Concurrent engineering for global manufacturing. Int J Prod Econ 1999; 60: 251–260.

Carey

Culley

Weber

2013. Establishing key elements for handling in-service information and knowledge. In: International Conference on Engineering Design, Seoul, South Korea, 19–22 August 2013, vol. 6, pp.11–20. The Design Society.

10.

Xie

Culley

Weber

. Applying context to organize unstructured information in aerospace industry. In: 18th international conference on engineering design, Lyngby/Copenhagen, Denmark, 11–15 August 2011, vol. 6. Design Information and Knowledge.

11.

Agrawal

El Abbadi

Antony

et al . Data management challenges in cloud computing infrastructures. In: Databases in networked information systems. Aizu-Wakamatsu, Japan, 29–21 March 2010, pp.1–10. Springer.

12.

Deelman

Chervenak

. Data management challenges of data-intensive scientific workflows. In: 8th IEEE international symposium on cluster computing and the grid, Lyon, France, 19–22 May 2008, pp.687692. IEEE.

13.

Dychtwald

Erickson

Morison

. Workforce crisis: How to beat the coming shortage of skills and talent. Brighton, Massachusetts: Harvard Business Press, 2013.

14.

Lewin

Massini

Peeters

Why are companies offshoring innovation? The emerging global race for talent. J Int Bus Stud 2009; 40: 901–925.

15.

Weber

Dauphin

Fuschini

et al . Expertise transfer: A case study about knowledge retention at Airbus. 13th international conference on concurrent enterprising, Sophia-Antipolis, 4–7 June 2007, pp.329–338.

16.

Lindner

Wald

Success factors of knowledge management in temporary organizations. Int J Project Manage 2011; 29: 877–888.

17.

Rosemann

vom Brocke

. The six core elements of business process management. In: Handbook on Business Process Management 1. Berlin, Germany, 2015, pp.105–122. Springer.

18.

Hislop

Donald

. 2013. Knowledge management in organizations: A critical introduction. Oxford: Oxford University Press, Oxford.

19.

Hörisch

Johnson

Schaltegger

Implementation of sustainability management and company size: A knowledge-based view. Bus Strategy Environ 2014; 24: 765–779.

20.

Scheer

Nüttgens

. ARIS architecture and reference models for business process management. Berlin, Germany: Springer, 2000.

21.

Vianello

Xie

Ahmed-Kristensen

et al . Handling of in-service support: Comparison of two case studies from complex industries. In: 11th international design conference, Dubrovnik, Croatia, 17–20 May 2010.

22.

Weske

. Business process management: Concepts, languages, architectures. Berlin, Germany: Springer Science & Business Media, 2012.

23.

de Medeiros

AKA

Van der Aalst

Pedrinaci

. Semantic process mining tools: Core building blocks. In: Proceedings of ECIS 2008, Galway, Ireland, 9–11 June 2008, 96pp.

24.

Van Der Aalst

. Business process management: A comprehensive survey. ISRN Software Eng 2013; 37.

25.

Benner

Tushman

Process management and technological innovation: A longitudinal study of the photography and paint industries. Administrative Sci Q 2002; 47: 676–707.

26.

Pino

García

Piattini

Software process improvement in small and medium software enterprises: A systematic review. Software Qual Control 2008; 16: 237–261.

27.

de Mast

Kemper

Does

RJMM

et al . Process improvement in healthcare: Overall resource efficiency. Qual Reliab Eng Int 2011; 27: 1095–1106.

28.

Rao

Mansingh

Osei-Bryson

KM.

Building ontology based knowledge maps to assist business process re-engineering. Decis Support Syst 2012; 52: 577–589.

29.

Lerner

Christov

Osterweil

et al . Exception handling patterns for process modeling. IEEE Trans Software Eng 2010; 36: 162–183.

30.

RKL

. A computer scientist’s introductory guide to business process management (BPM). Crossroads 2009; 15: 4:11–4:18.

31.

Gnatz

Deubler

Meisinger

et al . Towards an integration of process modeling and project planning. In: Proceedings of the 26th international conference on software engineering, Edinburgh, UK, 3–28 May. Institution of Engineering and Technology. 2004, pp.22–31.

32.

Müller

Herbst

Hammori

et al . IT support for release management processes in the automotive industry. In: 4th international conference on business process management, Vienna, Austria, 5–7 September 2006, pp.368–377.

33.

Ming

Yan

Wang

et al . Collaborative process planning and manufacturing in product lifecycle management. Comput Ind 2008; 59(2–3): 154–166.

34.

Kephart

Chess

DM.

The vision of autonomic computing. Comput 2003; 36: 41–50.

35.

Sterritt

Parashar

Tianfield

et al . A concise introduction to autonomic computing. Adv Eng Inf 2005; 19: 181–187.

36.

IBM. An architectural blueprint for autonomic computing. IBM White Paper, 2006.

37.

Huebscher

McCann

JA.

A survey of autonomic computing—degrees, models, and applications. ACM Comput Surv 2008; 40: 7.

38.

Kim

El-Khamra

Rodero

et al . Autonomic management of application workflows on hybrid computing infrastructure. Sci Program 2011; 19(2–3): 75–89.

39.

Caton

Rana

Towards autonomic management for cloud services based upon volunteered resources. Concurrency Comput Pract Experience 2012; 24: 992–1014.

40.

Fallon

O Sullivan

. Aesop: A semantic system for autonomic management of end-user service quality. In: IFIP/IEEE international symposium on integrated network management, Ghent, Belgium, 27–31 May 2013, pp.716–719. IEEE.

41.

Choudhary

Harding

Tiwari

MK.

Data mining in manufacturing: A review based on the kind of knowledge. J Intell Manuf 2009; 20: 501–521.

42.

Harding

Shahbaz

Kusiak

et al . Data mining in manufacturing: A review. J Manuf Sci Eng 2006; 128: 969–976.

43.

Köksal

Batmaz

Testik

. A review of data mining applications for quality improvement in manufacturing industry. Expert Syst Appl 2011; 38: 13448–13467.

44.

Wang

Applying data mining to manufacturing: The nature and implications. J Intell Manuf 2007; 18: 487–495.

45.

Gröger

Niedermann

Mitschang

Data mining-driven manufacturing process optimization. Proc World Congr Eng 2012; 3: 4–6.

46.

Shi

Gopsill

Newnes

et al . A sequence-based approach to analysing and representing engineering project normality. In: IEEE international conference on tools with artificial intelligence, Limassol, Cyprus, 10–12 November 2014, pp.967–973.

47.

Shi

Gopsill

Snider

et al . Towards identifying pattern in engineering documents to aid project planning. In: 13th international design conference, Cavtat - Dubrovnik, Croatia, 19–22 May 2014.

48.

Agirre

Edmonds

. Word sense disambiguation: Algorithms and applications, volume 33. Berlin, Germany: Springer Science Business Media, 2007.