Abstract
Dunhuang is a unique art treasure and a world heritage site. To organise and manage Dunhuang cultural heritage resources, this article studies the classification of Dunhuang murals from different dynasties and explores their topic distribution characteristics and evolution rules. First, image features are extracted through the scale-invariant feature transform (SIFT) and the Canny and scale-invariant feature transform (CSIFT), a visual dictionary is generated through the k-means clustering algorithm, and the term frequency–inverse document frequency (TF-IDF) vector is calculated and combined with the colour feature vector extracted in the hue, saturation and value (HSV) colour space. Second, Dunhuang mural images are collected and a support vector machine (SVM) classifier is built. Finally, knowledge graph-based topic maps are constructed, and graph theory is introduced to analyse the topic distribution and evolution of Dunhuang murals across dynasties. The results show that Dunhuang murals of different dynasties can be effectively classified from their visual features by the bag of words, HSV and support vector machine (BOW_HSV_SVM) model. The topic maps reveal the topic distribution characteristics and evolution rules of Dunhuang murals across the dynasties.
Keywords
1. Introduction
As a treasure of culture and art, Dunhuang murals record the history, religious beliefs, customs and other aspects of life in different dynasties. The development of digital humanities changes and renews traditional humanities research methods. From the perspective of digital humanities, digital Dunhuang provides convenience for cultural tourism, scientific research and spiritual civilisation construction. Drawing on information science, management science and computer science, this article analyses the contents and characteristics of Dunhuang mural resources, mines their associative relationships and topic evolution rules, and facilitates their organisation, management and application.
Cultural heritage, the product of society and history, provides vivid materials for current scientific research, and is of great significance for tracing the development of human civilisation and cultural exchange. Therefore, UNESCO, Europeana and other cultural organisations have launched cultural heritage digitalisation projects [1]. In the field of digital humanities, taking advantage of modern information technologies to promote the storage and utilisation of digital resources, and to strengthen the protection of cultural heritage is the theme of development. Various digital technologies have been applied to Dunhuang murals, such as colour restoration technology [2], three-dimensional laser scanning technology [3] and three-dimensional reconstruction technology [4]. At the current stage, a large number of cultural heritage digital resources have been accumulated, and their characteristics and rules need to be researched to improve the efficiency of resources management and knowledge mining.
Dunhuang cultural heritages, running through many dynasties, are the crystallisation of Chinese culture, religion and art. The contents of Dunhuang murals, representing the cultural and artistic ideas at that time, are the accumulations of history. To explore the topic evolution of Dunhuang murals through image classification, the SIFT (scale-invariant feature transform), CSIFT (Canny and scale-invariant feature transform) and HSV (hue, saturation and value) are used to extract image features, and a visual dictionary is generated through k-means. Under the assumption that there are differences in painting styles, colour matchings, lines and shapes of Dunhuang murals, the BOW_HSV_SVM (bag of words, HSV and support vector machine) classification model is constructed to identify the dynasties based on the low-level features of images. To further research the differences and correlations of the contents of Dunhuang murals with the alternation of dynasties, topic maps are constructed and graph theory (GT) is introduced to analyse the topic distribution and evolution of Dunhuang murals based on the high-level semantics of images. This research is conducive to image annotation [5,6], semantic representation [7,8] and knowledge discovery, and it promotes the inheritance, sharing and exchange of cultural heritage.
This article is organised as follows. Section 2 reviews the related theories and technologies. Section 3 describes the methodology of image classification and the construction of topic maps of Dunhuang murals. Section 4 presents the experimental results and findings, and Section 5 discusses them. Section 6 draws the conclusion and outlines future plans.
2. Related work
Dunhuang murals, which are created by strictly trained folk painters, reflect the faith and spirit of the central government. Therefore, it is of great significance to excavate the semantic information of Dunhuang murals for the study of the development of history, culture and the thought of ancient people. However, it requires a lot of manpower and complicated work to label and interpret Dunhuang murals. Furthermore, some murals are still difficult to understand due to the lack of historical documents. Therefore, there is a strong need for scientific and quantitative methods to facilitate obtaining and analysing intrinsic information from Dunhuang murals.
In recent years, technology-aided painting analysis has become a research hotspot. For example, fractal techniques were employed to analyse Pollock’s drip paintings, which created a precedent for using computational means to analyse the painting arts [9]. Besides, first-order and high-order wavelet statistics were used for painting art identification [10]. Gabor wavelet filter was used to analyse the brush strokes of Van Gogh paintings [11]. With the advent of the big data era, the large-scale collections of paintings can be used for modelling and analysing to discover the general rules of paintings within a historical period. Computational and statistical algorithms based on mathematics have been increasingly used for the analysis of painting art [12]. There is an opportunity for studying Dunhuang cultural heritage through modern research paradigms and advanced technologies.
Under the assumption that Dunhuang murals of different dynasties differ in painting topics and visual features, this article collects and labels Dunhuang mural images, constructs a classifier based on a machine learning method, identifies the murals of different periods, and then introduces GT to analyse the topic evolution rules based on topic maps.
2.1. Dunhuang murals
Dunhuang murals, which contain a great deal of cultural, artistic, religious and political information, have remarkable humanistic characteristics. The Dunhuang Grottoes are internationally recognised cultural heritage, especially the Mogao Grottoes, which were listed as a world heritage site in 1987 [13]. The Dunhuang Grottoes are huge and exquisite mural art palaces and historical galleries with a history of nearly 1600 years, and Dunhuang murals are regarded as magnificent libraries on the wall [14]. Depicting various aspects of medieval politics, economy, art, religion, ethnic relations and daily activities in Western China, Dunhuang murals are important academic treasures that provide precious resources for Dunhuang studies [15]. Dunhuang mural images differ from ordinary images in that they contain semantic information related to artistic creative intentions and Buddhist culture. The styles of Dunhuang murals can be described in four aspects: topics, layouts, pattern elements and colours. Each Dunhuang mural image reflects the Buddhist artistic topics of a specific era.
The characters, historical documents and manuscripts of the murals can help us identify the dynasties in which the murals were created. Where these are absent, the painting styles and image features can serve the same purpose. When one dynasty replaced another, the styles and colours of paintings changed. For example, slender ladies were very popular in the Sui Dynasty, while fuller figures were admired in the later Tang Dynasty. Therefore, the fairies and ladies in Tang Dynasty murals are plumper than those in Sui Dynasty murals. At the same time, the painting colours in the Tang Dynasty were richer. In addition, the clothes, currencies, architecture and farm tools also differ. With the development of the digital humanities, research in computer science, information science, philology and the humanities constantly crosses and merges, introducing new theories and technologies for information organisation, resource management and knowledge discovery. Therefore, machine learning technologies are used to analyse the visual coding of Dunhuang mural images and describe the contents, styles and characteristics of murals. Recognising the dynasties of Dunhuang murals from differences in image visual features facilitates the organisation and management of resources, especially image interpretation, annotation and archiving.
As a world cultural heritage, Dunhuang murals have remarkable humanistic characteristics. The digitisation of Dunhuang cultural heritage has preserved precious digital resources for cultural inheritance and humanities studies. However, it is difficult for users to understand Dunhuang culture through murals, which makes the value of Dunhuang cultural heritage hard to be fully reflected, hindering the spread and exchange of culture. These precious murals can help scientific researchers study human allusions and semantic contents in different dynasties, such as ‘玄奘西游 (Xuan Zang Journey to the West)’, ‘飞天 (Flying Apsaras)’ and ‘涅槃 (Nirvana)’. The information resource management of Dunhuang murals not only pays attention to the collection, storage and security protection of resources but also emphasises the effective utilisation of resources and knowledge discovery. Revealing the topic evolution of Dunhuang murals is conducive to the in-depth understanding and the maximisation of the value of cultural heritage.
2.2. The BOW and image classification
An image classifier is constructed to identify the dynasties of Dunhuang murals, which is conducive to revealing their topic evolution rules. Because of its strong interpretability, the BOW (bag of words) model is selected to extract and represent image visual features. The BOW model is derived from text processing, and Sivic and Zisserman [16] applied it to the computer vision domain. In text processing, texts are mapped to a dictionary through text segmentation, and word vectors are constructed for text calculation. Similarly, for images, a visual dictionary is constructed through a clustering algorithm (such as k-means [17] or k-medoids [18]), and the features of an image are mapped to it to generate a vector that represents the image. After the distances between vectors are calculated (for example, the Euclidean, Manhattan or Hamming distance), images can be divided into different categories. The core of the BOW model is to cluster the image feature descriptors into a visual dictionary and then count the occurrences of visual words among the image features to generate word vectors [19]; that is, BOW-based image vectors reflect zero-order statistical information of image features in the visual dictionary [20].
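The dictionary construction and quantisation steps described above can be sketched as follows. This is a minimal illustration in pure Python, not the implementation used in this article: the function names (`kmeans`, `bow_histogram`), the toy two-dimensional descriptors and the fixed iteration count are assumptions made for readability, whereas real SIFT descriptors have 128 dimensions.

```python
import math
import random
from collections import Counter

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; the k centroids form the visual dictionary."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Each centroid becomes the mean of its cluster; an empty cluster keeps its centroid.
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def bow_histogram(descriptors, centroids):
    """Count how many descriptors of one image are assigned to each visual word."""
    counts = Counter(nearest(d, centroids) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(centroids))]

# Toy descriptors pooled from all training images (hypothetical values).
pooled = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
vocabulary = kmeans(pooled, k=2)
# One image with three descriptors is represented by a 2-bin histogram.
histogram = bow_histogram([(0.2, 0.1), (10.5, 10.2), (9.8, 10.1)], vocabulary)
```

The histogram length equals the number of visual words, which is why the dictionary size directly sets the dimension of the image vector.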
Currently, there are extensive studies on the BOW model. As one of the feature encoding algorithms in the aspect of image retrieval and text processing, the BOW model can be effectively applied to image classification [21,22] and object recognition [23,24]. Zou et al. [25] proposed an SVM-based image classification method through an improved BOW model and combined it with context information to generate middle-level semantic features. Zhou et al. [26] proposed the hierarchical bag of words (HBoW) for large-scale image retrieval. Wang et al. [27] researched image retrieval of Chinese herbal medicine plants in Changbai Mountain based on BOW. Yang et al. [28] used BOW to construct content-based computed tomography (CT) image retrieval of focal liver lesions. Yang et al. [29] proposed an enhanced BOW framework for a large-scale remote sensing image search for users.
Image classification is often applied to location recognition [30], plant identification [31], medical diagnosis [32], building recognition [33] and face recognition [34]. The process of image classification consists of extracting image features, calculating the distances among features and distinguishing images of different categories according to certain division rules. Representative machine learning classification models include SVM [35], AdaBoost (AB) [36], gradient boosting decision tree (GBDT) [37], Random Forest (RF) [38], Extra-Trees (ET) [39], Decision Trees (DT) [40] and linear discriminant analysis (LDA) [41]. The significance of image classification is to use image feature coding and classifiers to recognise the categories of images, thereby reducing human subjective judgements and greatly saving the cost of manual identification. Therefore, assuming that the Dunhuang murals of different dynasties are distinguishable, it is feasible to identify their dynasties with a trained classifier. Compared with the convolutional neural network (CNN) [42], BOW and HSV [43] are more explainable and efficient for Dunhuang murals with rich contents and distinctive colours. Histogram of oriented gradient (HOG)-SVM [44] suits images with obvious texture features rather than Dunhuang mural images, which have faded and blurred after long oxidation. Thus, BOW and HSV are used to describe image features, and classifiers with excellent performance are selected for Dunhuang mural image classification.
2.3. Knowledge graph-based topic maps and GT
To clearly describe the relationships among objects, a knowledge graph uses the language of graphs to visually model them. Knowledge graph – which is very intuitive, direct and efficient – removes the intermediate process to simplify problems and avoids losing valuable information. Therefore, the knowledge graph-based topic maps of Dunhuang murals are constructed to reveal the topic distribution characteristics and evolution rules. Any complex network relationship among entities can be analysed by GT. Therefore, GT is used to describe and analyse the relationship among Dunhuang mural topics.
The concept of knowledge graph was first proposed by Google in 2012. It is regarded as a large-scale knowledge base composed of a large number of entities and relationships among them [45]. In recent years, as a semantic network, knowledge graph has been widely used in natural language processing [46], intelligent question-answering systems [47], intelligent recommendation systems [48] and other fields. Knowledge graph, together with big data and deep learning, has become the core driving force for artificial intelligence development [49,50]. The topic distribution characteristics and evolution rules can be revealed through the knowledge graph, which promotes knowledge mining and discovery. Topic map is used to intuitively describe the distribution and association of topics, and it has been widely used in public emergencies [51], especially in the management and control of online public opinion [52]. This article labels the topics of Dunhuang mural images and constructs topic maps based on knowledge graph to visualise the topic distribution.
GT is a branch of combinatorial mathematics closely related to group theory, matrix theory and topology, and its emerging developments include extremal graph theory (EGT) [53], random graph theory (RGT) [54], algebraic graph theory (AGT) [55] and quantitative graph theory (QGT) [56]. The graph is the main research object of GT. A graph G = (V, E), composed of a set of vertices V and a set of edges E that each connect two vertices, is used to describe specific relationships among certain things. GT originated from the famous problem of the seven bridges of Königsberg, which was solved by Euler in 1736 [57]; thus, Euler is generally regarded as the founder of GT [58]. At present, GT has a wide range of applications in the knowledge discovery of paintings [59], reliability assessment [60], pattern recognition [61] and artificial intelligence [62]. In this article, GT is introduced to analyse the network structure of topic maps to reveal the topic distribution characteristics and evolution rules, enabling researchers to understand Dunhuang murals deeply.
3. Methodology
The painting styles and topics contained in Dunhuang murals reflect social life, economic development, spiritual civilisation and artistic level in different dynasties. There are few Dunhuang murals with date labels in the grottoes, and some of them are controversial. Considering the massive digital resources and the high cost of resource management, the classification algorithms are used to identify the dynasties of the murals. Then, the topic distribution characteristics and evolution rules of the murals are analysed. In this section, the image classification framework and the knowledge graph-based topic maps for Dunhuang murals are constructed.
3.1. Dunhuang mural image classification based on BOW_HSV_SVM
Except for some dated grottoes such as Mogao, the dates of the murals in other grottoes are still controversial. The dates of murals are conjectured by experts according to their painting styles and contents, but this method lacks efficiency and rigour, especially in the absence of relevant historical documents. Therefore, a machine learning-based classification model is constructed to identify the creation date of Dunhuang murals efficiently. To verify the classification effect of this model, Dunhuang mural images are collected and divided into a training set (80%) and a test set (20%). The constructed image classification framework of Dunhuang murals is shown in Figure 1.
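The 80%/20% division can be illustrated with a short sketch. Whether the split was stratified by dynasty is not stated here, so the per-dynasty stratification below, the function name `stratified_split` and the fixed random seed are assumptions chosen only to keep each dynasty represented in both sets.

```python
import random
from collections import defaultdict

def stratified_split(samples, train_ratio=0.8, seed=42):
    """Split (image_id, dynasty) pairs into train and test sets, dynasty by dynasty."""
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        cut = round(len(group) * train_ratio)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test

# Hypothetical image identifiers with two dynasty labels, 50 images each.
samples = [(f"img{i}", "Sui" if i % 2 else "Peak Tang") for i in range(100)]
train_set, test_set = stratified_split(samples)
```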

Figure 1. Image classification framework of Dunhuang murals.
The image classification framework of Dunhuang murals based on the BOW_HSV model and machine learning algorithm comprises three modules: (1) the extraction of image features and the construction of the BOW_HSV model, (2) the construction and training of classifiers and (3) the evaluation of the Dunhuang mural image classification. The processes are as follows:
Step 1: local features of images can be extracted through SIFT [63], speeded up robust features (SURF) [64], principal component analysis-scale invariant feature transform (PCA-SIFT) [65], binary robust invariant scalable keypoints (BRISK) [66], oriented FAST and rotated BRIEF (ORB) [67], KAZE [68] and so on. In this article, the classic image feature extraction algorithm SIFT, which uses a spatial pyramid and difference-of-Gaussian filtering to quickly approximate the extreme points in the Laplacian of Gaussian space, is chosen to extract the local features of Dunhuang mural images. Because SIFT detects only a few feature points in blurred images and edge-smoothed targets, the Canny edge detector [69,70] is introduced to accurately locate sharp intensity changes and highlight object boundaries. After the edges of targets and damaged parts of a Dunhuang mural image are detected, the keypoints of these fine and smooth edges are extracted through SIFT to generate CSIFT descriptors. These are integrated with the SIFT descriptors so that both the features of the grey image and the edges of objects are described, which improves the performance of BOW in classification. The k-means clustering algorithm is used to cluster the image features and generate the visual dictionary. Based on the visual dictionary, the vector of each image is calculated by weighting every visual word with TF-IDF (term frequency–inverse document frequency) [71]. Considering that Dunhuang mural images are rich in colours whereas SIFT operates on a grey image, HSV [72] is used to extract colour features. The final feature descriptors of Dunhuang mural images are shown in Figure 2.

Figure 2. Feature descriptors of Dunhuang mural images: (a) original image, (b) SIFT descriptors, (c) HSV descriptors, (d) Canny edge detection, (e) CSIFT descriptors and (f) final feature descriptors.
Step 2: different image classifiers for Dunhuang murals are trained. Image features are extracted to represent visual attributes of murals, and classification rules of the selected classifiers can be learned in the process of data training. Classifiers include SVM, AB, GBDT, LDA, RF, ET and DT. Then, the best image classifier is selected to identify the dynasties of Dunhuang murals.
Step 3: the performance of image classifiers for Dunhuang murals is evaluated. The collected Dunhuang mural images are divided into a training set and a test set (4:1). Precision, Recall, F1-score and Accuracy are selected as metrics for evaluating classification effects. In Table 1, TP denotes a positive instance that is correctly predicted to be positive, and TN denotes a negative instance that is correctly predicted to be negative. FN denotes a positive instance that is wrongly predicted to be negative, and FP denotes a negative instance that is wrongly predicted to be positive. Recall, Precision, F1-score and Accuracy are calculated from these four counts.
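In terms of TP, TN, FP and FN, the standard definitions of the four metrics can be written as below. This is a reconstruction for the two-class case; how the scores are averaged over the dynasties is not specified here and is left open.

```latex
\begin{align}
\text{Recall}    &= \frac{TP}{TP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{F1-score}  &= \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \\
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN}
\end{align}
```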
Table 1. Confusion matrix of Dunhuang mural image classification.
3.2. The construction of knowledge graph-based topic maps
GT provides a powerful framework for studying the components of networks and their interactions. It has a wide range of applications in fields including physics, biology, sociology and information systems [73]. The traditional indexes describing the characteristics of network topology include node degree, closeness and betweenness centrality [59]. The knowledge graph can intuitively reflect entities and their relationships in the form of graphs. According to the resource description framework (RDF), a fact can be expressed as a triple in the form of (head, relation, tail) or (subject, predicate, object). A knowledge graph is represented as G = {E, R, F}, where E, R and F are the sets of entities, relations and facts, and a fact is a triple (h, r, t) ∈ F. The combination of knowledge graph and topic labels is used to construct topic maps of Dunhuang murals, which can vividly depict topic distribution characteristics, providing a new perspective for the study of the topic evolution and the cultural and artistic changes of Dunhuang murals in different dynasties. The topic map is G_topic = {(topic_i, relation_ij, topic_j)}, where G_topic represents the topic (node) and the relation (edge) set. When relation_ij is 1, there is an edge linking topic_i with topic_j; when relation_ij is 0, there is no edge between topic_i and topic_j. Therefore, Dunhuang cultural heritage resources can be represented and linked through the knowledge graph [74]. Furthermore, GT is introduced to analyse topic distribution characteristics and correlations according to the network topology structures of topic maps. The topic evolution of Dunhuang murals can reveal the content transition of Dunhuang cultural heritage from the perspective of semantics.
The Dunhuang murals of the different dynasties are annotated with topic semantic labels. Gephi software is used to construct topic maps that reveal the relationships among topics and their distribution. In G_topic, each topic is a node, and an edge between two nodes indicates that the two topics are related (relation_ij = 1). In topic maps, the size of a node represents the count of the topic, and the thickness of an edge represents the weight of the relation. With the construction of topic maps, the topic distribution characteristics and evolution rules of Dunhuang murals are analysed. The contents of Dunhuang murals are complex, and the differences and correlations among them are hard to discover, yet they are of great significance for revealing cultural exchange, spiritual trends and dynasty transitions. Therefore, the topic maps of Dunhuang murals are constructed and the topic distribution characteristics and evolution rules are analysed through GT, which is conducive to the resource organisation and knowledge mining of Dunhuang murals. Understanding the topic distribution characteristics and evolution rules also supports image annotation, recommendation, retrieval and topic mining of Dunhuang cultural heritage.
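A topic map of this kind can be assembled from labelled images in a few lines. The sketch below is illustrative only: it assumes that two topics are related when they are labelled on the same mural image and that the node count is the number of images carrying the topic. Neither rule is specified above, and the function name `build_topic_map` is hypothetical. The resulting node and edge tables have the shape that a tool such as Gephi takes as input.

```python
from collections import Counter
from itertools import combinations

def build_topic_map(labelled_images):
    """labelled_images: one list of topic labels per mural image."""
    node_count = Counter()   # drives node size
    edge_weight = Counter()  # drives edge thickness
    for topics in labelled_images:
        unique = sorted(set(topics))
        node_count.update(unique)
        for a, b in combinations(unique, 2):
            edge_weight[(a, b)] += 1
    # Every stored pair corresponds to a triple with relation_ij = 1.
    triples = [(a, 1, b) for (a, b) in edge_weight]
    return node_count, edge_weight, triples

# Hypothetical labels for three mural images.
images = [
    ["Flying Apsaras", "Bodhisattva"],
    ["Flying Apsaras", "Bodhisattva", "Nirvana"],
    ["Nirvana"],
]
node_count, edge_weight, triples = build_topic_map(images)
```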
4. Experiments and results
The experiments can be divided into the image classification of Dunhuang murals based on BOW_HSV and machine learning, and the topic evolution analysis of Dunhuang murals based on topic maps. The effects of classifiers are evaluated through Dunhuang mural images, and topic maps are constructed through the Gephi software.
4.1. Dunhuang mural image classification
This article collects 1750 images of Dunhuang murals in Northern Wei, Western Wei, Northern Zhou, Sui Dynasty, Initial Tang, Peak Tang, Middle Tang, Late Tang, Five Dynasties and Song, Western Xia and Yuan from the complete collection of Dunhuang frescoes in China [75]. In the experiment, 80% training set and 20% test set are employed to train and test the classifiers. RF, AB, DT, ET, LDA, GBDT and SVM are tested and compared, and then the classifier with the best classification effect is selected to classify Dunhuang murals. Besides, the classic SIFT and CSIFT are used to extract image features, the k-means clustering algorithm is used to generate a visual dictionary, and the TF-IDF vectors are calculated and combined with HSV colour vectors as the input of the classification model.
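The TF-IDF weighting of visual-word histograms can be sketched as follows. The exact TF-IDF variant is not given here, so the smoothed inverse document frequency and the length-normalised term frequency below are assumptions, as is the function name `tfidf_vectors`.

```python
import math

def tfidf_vectors(histograms):
    """histograms: one list of visual-word counts per image."""
    n_images = len(histograms)
    n_words = len(histograms[0])
    # Document frequency: number of images in which each visual word occurs.
    df = [sum(1 for h in histograms if h[w] > 0) for w in range(n_words)]
    # Smoothed IDF, so a word present in every image keeps a positive weight.
    idf = [math.log((1 + n_images) / (1 + df[w])) + 1 for w in range(n_words)]
    vectors = []
    for h in histograms:
        total = sum(h) or 1  # avoid division by zero for an image without descriptors
        vectors.append([(h[w] / total) * idf[w] for w in range(n_words)])
    return vectors

# Two toy images described over a three-word visual dictionary.
vectors = tfidf_vectors([[2, 0, 2], [1, 1, 0]])
```

Visual words that occur in few images receive a larger weight than words that occur in all of them, which is the intended effect of the IDF term.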
Figure 3 illustrates that the combination of SIFT, CSIFT and HSV is superior to any single feature extraction method in the classification of Dunhuang murals. The dimension of an image's feature vector depends on the number of visual words, which affects the classification effect. The classification effects of different classifiers are shown in Figure 3, where the number of descriptor points of the SIFT algorithm is set to 200, and the coefficient of the combination of TF-IDF and HSV feature vectors is the inverse of the ratio of their dimensionality.
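The colour descriptor and its combination with the TF-IDF vector can be sketched as below. Two points are assumptions rather than details given in this article: the 8 × 3 × 3 binning of the HSV histogram, and the reading of "the inverse of the ratio of their dimensionality" as weighting each block by the other block's share of the total dimension, so that the longer vector does not dominate. The function names are hypothetical.

```python
import colorsys

def hsv_histogram(pixels, bins=(8, 3, 3)):
    """Normalised HSV histogram of (r, g, b) pixels with components in 0..255."""
    hb, sb, vb = bins
    hist = [0] * (hb * sb * vb)
    n = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # min(...) keeps a value of exactly 1.0 inside the last bin.
        hi = min(int(h * hb), hb - 1)
        si = min(int(s * sb), sb - 1)
        vi = min(int(v * vb), vb - 1)
        hist[(hi * sb + si) * vb + vi] += 1
        n += 1
    return [c / n for c in hist] if n else hist

def combine_features(tfidf_vec, hsv_vec):
    """Concatenate both blocks, each weighted by the other block's share of dimensions."""
    d_bow, d_hsv = len(tfidf_vec), len(hsv_vec)
    w_bow = d_hsv / (d_bow + d_hsv)  # the longer block gets the smaller weight
    w_hsv = d_bow / (d_bow + d_hsv)
    return [w_bow * x for x in tfidf_vec] + [w_hsv * x for x in hsv_vec]

# Toy image: three pure red pixels and one pure blue pixel.
colour_vec = hsv_histogram([(255, 0, 0)] * 3 + [(0, 0, 255)])
combined = combine_features([1.0] * 4, [1.0] * 2)
```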

Figure 3. The classification effect of SVM with different feature extraction methods.
Figure 4 shows that the classification effect of SVM is better than that of the other classifiers when the number of visual words is in [100,500]. Therefore, the SVM classifier is appropriate for classifying Dunhuang murals. Furthermore, as shown in Figure 5, the performance of the SVM improves as the number of descriptor points set in SIFT increases. It is worth mentioning that BOW_HSV_SVM performs well when the number of descriptor points of SIFT is more than 300. All these experimental results illustrate that the Dunhuang murals of different dynasties are distinguishable in visual features.

Figure 4. The performance of different classifiers when the number of visual words is different. (a) Precision, (b) Recall, (c) F1-score and (d) Accuracy.

Figure 5. The performance of BOW_HSV_SVM when the number of descriptor points of SIFT is in [50,500]. (a) Precision, (b) Recall, (c) F1-score and (d) Accuracy.
4.2. The analysis on the topic evolution of Dunhuang murals
To uncover the differences and correlations of the contents of Dunhuang murals with the alternation of dynasties, knowledge graph-based topic maps are constructed and analysed. After the Dunhuang mural images are collected, topics are labelled by five professional researchers. Gephi 0.9.2 [76], a network analysis and data visualisation software package, is used to construct the topic maps and analyse the topic distribution and evolution of Dunhuang murals. The Force Atlas layout [77] is adopted to optimise the layout of nodes. The topic map is G_topic = {(topic_i, relation_ij, topic_j)}, where G_topic represents the topic (node) and the relation (edge) set, and relation_ij is in {0, 1}. In this graph, a node represents a topic, the size of the node represents the count of the topic and the thickness of an edge represents the weight of the relation.
According to Figures 6 and 7, some topics span different dynasties, for example, ‘飞天 (Flying Apsaras)’, ‘菩萨 (Bodhisattva)’, ‘藻井 (Raised Ceiling)’, ‘供養人 (Supporting Beings)’ and ‘说法 (Teaching Buddhist)’, which represent the beliefs, religious figures and activities, and inevitable painting elements. The topic distribution of Dunhuang murals in different dynasties has different degrees of concentration (the larger a dynasty node is, the more topics are connected to that dynasty, and vice versa). In the topic maps of Dunhuang murals, the average degree is 1.25, the average weighted degree is 2.355 and the graph density is 0.002. The degree of a topic is the number of its connections with other topics and indicates its importance. The density represents the concentration degree of topics, indicating the richness of topics in the current dynasty: the lower the density is, the more dispersed the topics are. A reasonable explanation for a low density is that people were more open-minded, and there were more cultural exchanges and integration in that dynasty.
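The three summary statistics can be computed directly from a weighted edge list. The sketch below assumes a directed graph and the convention, understood here to be the one Gephi uses for directed graphs, that the average degree is the number of edges divided by the number of nodes; the function name and the toy edges are hypothetical.

```python
from collections import Counter

def graph_stats(nodes, weighted_edges):
    """weighted_edges: dict {(source, target): weight} of a directed graph."""
    n = len(nodes)
    m = len(weighted_edges)
    avg_degree = m / n
    avg_weighted_degree = sum(weighted_edges.values()) / n
    # Density: existing edges over the n * (n - 1) possible directed edges.
    density = m / (n * (n - 1)) if n > 1 else 0.0
    return avg_degree, avg_weighted_degree, density

# Toy map: one dynasty node pointing to three topic nodes.
nodes = ["Sui", "Flying Apsaras", "Bodhisattva", "Nirvana"]
edges = {("Sui", "Flying Apsaras"): 3, ("Sui", "Bodhisattva"): 2, ("Sui", "Nirvana"): 1}
avg_degree, avg_weighted_degree, density = graph_stats(nodes, edges)

# Out-degree and in-degree per node, as used to scale nodes in the two map views.
out_degree = Counter(source for source, _ in edges)
in_degree = Counter(target for _, target in edges)
```

Under this convention the density equals the average degree divided by the number of nodes minus one, so a sparse map with many topics yields a very small density.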

Figure 6. The topic map of Dunhuang murals measured by out-degree.

Figure 7. The topic map of Dunhuang murals measured by in-degree.
Figure 8 shows that there are differences in the topic distribution in different dynasties, which are closely related to the political situation of the central government, cultural differences, religious development, the evolution of artistic ideas and so on. The topic distribution is displayed to discover the characteristics of the contents of Dunhuang murals. According to the co-occurrence topics, ‘说法 (Teaching Buddhist)’ is a daily religious activity with great significance, ‘飞天 (Flying Apsaras)’ represents that humans expect to get the ability to fly and explore the mystery of the universe, and the ‘菩萨 (Bodhisattva)’ represents a kind of powerful immortals. There are different topics representing the combination of nations and cultural exchanges in each dynasty, such as ‘回鹘供養人 (Uighur Supporting Beings)’, ‘蒙古族供養人 (Mongolian Supporting Beings)’, ‘张骞出使西域 (Zhang Qian Journey to the Western Regions)’ and ‘玄奘西游 (Xuan Zang Journey to the West)’. In the prosperous Tang Dynasty, the topics containing ‘文殊變 (Manjusri Sutra)’, ‘普贤變 (Samantabhadra Sutra)’, ‘涅槃经變 (Nirvana Sutra)’, ‘弥勒经變 (Maitreya Buddha Sutra)’ and ‘金刚经變 (Vajrac-chedika-prajnaparamita)’ are a series of grand topics with a comprehensive narrative. Furthermore, these murals are composed of rich figures, ceremonious religious activities, bright colours and smooth lines, which illustrate the fact that the Tang Dynasty is prosperous and powerful. Therefore, the economic and social development, and the artistic level of a dynasty can be interpreted according to the Dunhuang murals.

Figure 8. The topic maps of Dunhuang murals in different dynasties. (a) Northern Wei, (b) Western Wei, (c) Northern Zhou, (d) Sui, (e) Initial Tang, (f) Peak Tang, (g) Middle Tang, (h) Late Tang, (i) Five Dynasties and Song and (j) Western Xia and Yuan.
Based on the topic maps in Figure 8, the average degree, average weighted degree and graph density are calculated. The results are shown in Table 2, and the topic evolution of Dunhuang mural images is shown in Figures 9 and 10.
Table 2. The description of topic maps of different dynasties.

Figure 9. Average degree and average weighted degree of topic maps in different dynasties.

Figure 10. Density distribution of different topic maps of Dunhuang murals.
According to the topic maps, the topics of Dunhuang murals in the Northern Wei and the Middle Tang are more concentrated. The topics of the Northern Wei Dynasty focused on ‘须摩提女请佛 (Magadha Sangmo Inviting the Buddha)’, ‘供养菩萨 (Supporting Bodhisattva)’, ‘降魔變 (Victory over Mara Sutra)’ and ‘胁侍菩萨 (Bodhisattva Attendant)’, and those of the Middle Tang Dynasty focused on ‘观无量寿经變 (Amitabha Sutra-illustration)’, ‘弥勒经變 (Maitreya Buddha Sutra)’, ‘报恩经變 (Returning Favour Sutra)’ and ‘涅槃经變 (Nirvana Sutra)’. By contrast, the topics of Dunhuang murals in the other dynasties are more scattered, for there were plenty of cultural exchanges and collisions, especially in the Late Tang and the Five Dynasties and Song. As transition periods between the Tang Dynasty and the Song Dynasty, the Late Tang and the Five Dynasties faced severe separatism with different provincial systems. Political and cultural unity broke down, and the position of Buddhism was shaken, while other religions began to rise. Thus, the topics of Dunhuang murals no longer focused only on Buddhism but also described politics and diverse cultural characteristics, for example, ‘张议潮统军出行 (Zhang Yichao taking a Military Trip)’ and ‘驛使騎 (Couriers Riding the Horse)’ in the Late Tang, and ‘回鹘公主供养 (Supporting Uighur Princess)’, ‘各族王子 (Princes of Various Nationalities)’ and ‘曹议金出行 (Cao Yijin taking a Trip)’ in the Five Dynasties and Song. Therefore, topic maps of Dunhuang murals reveal the characteristics and evolution of multi-ethnic culture and reflect the changes in religious belief. With the alternation of dynasties, the differences in customs and in agricultural and economic background can be seen clearly in the topic maps.
5. Discussion
The BOW_HSV model is used to extract and represent the visual features of Dunhuang mural images, and classifiers are constructed on these features. Because the Canny edge detector locates and highlights object boundaries, it enables SIFT to extract features from regions with smooth edges and from damaged parts. Furthermore, HSV is integrated to compensate for the lack of colour information in SIFT feature extraction. According to Figure 4, SVM achieves the best classification performance, with LDA very close behind. Generally, the larger the number of visual words, the more comprehensive the image features and the higher the dimension of the vectors describing the images. The AB classifier is not well adapted to the classification of high-dimensional feature vectors. As the number of visual words increases, the time and space complexity of the calculation also increase significantly. Furthermore, Figure 5 illustrates that adjusting the number of descriptor points set in the SIFT algorithm can improve the performance of the classifiers.
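The way a bag-of-visual-words vector and a colour vector are combined into one feature vector can be sketched as below. This is a simplified illustration under stated assumptions rather than the implementation used in this article: the local descriptors are assumed to be already quantised into visual-word indices (for example by k-means), TF-IDF uses the common form tf × log(N/df), the HSV feature is a per-channel histogram, and the names `tfidf_vectors`, `hsv_histogram`, the bin count and all toy data are hypothetical.

```python
import colorsys
import math
from collections import Counter


def tfidf_vectors(word_ids_per_image, vocab_size):
    """TF-IDF over visual words; each image is a list of visual-word indices."""
    n_images = len(word_ids_per_image)
    doc_freq = Counter()
    for ids in word_ids_per_image:
        doc_freq.update(set(ids))        # count images containing each word
    vectors = []
    for ids in word_ids_per_image:
        counts = Counter(ids)
        total = len(ids) or 1
        vectors.append([
            (counts[w] / total) * math.log(n_images / doc_freq[w])
            if counts[w] else 0.0
            for w in range(vocab_size)
        ])
    return vectors


def hsv_histogram(rgb_pixels, bins=4):
    """Concatenated, normalised H, S and V histograms of (r, g, b) pixels."""
    hist = [0.0] * (3 * bins)
    for r, g, b in rgb_pixels:
        hsv = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        for channel, value in enumerate(hsv):
            # Clamp value == 1.0 into the last bin of its channel.
            hist[channel * bins + min(int(value * bins), bins - 1)] += 1
    total = len(rgb_pixels) or 1
    return [x / total for x in hist]


# Toy data: three "images", a vocabulary of three visual words (hypothetical).
word_ids = [[0, 0, 1], [1, 2], [1, 2, 2, 2]]
pixels = [
    [(255, 0, 0), (255, 0, 0)],
    [(0, 255, 0), (0, 0, 255)],
    [(0, 0, 255), (0, 0, 255)],
]
bow = tfidf_vectors(word_ids, vocab_size=3)
features = [b + hsv_histogram(p) for b, p in zip(bow, pixels)]
```

Each row of `features` has one TF-IDF entry per visual word followed by the colour bins, which also shows why a larger visual dictionary directly increases the dimension of the vector passed to a classifier such as SVM.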
Apart from SIFT, CSIFT and HSV, deep learning can also be used for image feature extraction and classification [78]. Deep learning models include AlexNet [79], VGG16 [80], ResNet50 [81], Xception [82] and InceptionV3 [83]. With such models, the image features of Dunhuang murals can be extracted and the images classified based on them. However, deep learning models place higher demands on equipment, and their computational complexity is higher than that of traditional machine learning models. The results of this article show that machine learning can achieve satisfactory classification of Dunhuang murals. This also shows that there are indeed differences in the visual features of Dunhuang murals in different dynasties. On the basis of the Dunhuang mural image classification, the topic maps of Dunhuang murals are constructed with Gephi, and the topic distribution and evolution of Dunhuang murals are displayed and analysed.
Based on the low-level features of images, the BOW_HSV_SVM classification model is constructed to identify the dynasties of Dunhuang murals. Based on the high-level semantics of images, topic maps are constructed and GT is introduced to analyse the topic distribution and evolution of Dunhuang murals, which reveals the differences and correlations of contents as the dynasties alternate. The topics of Dunhuang murals are labelled, and knowledge graph-based topic maps are used to visualise the topic distribution and evolution. In the topic maps, the size of a node represents the frequency of the topic, so the most frequent topics of each dynasty can be extracted. These frequent topics can be regarded as the hot topics of Dunhuang murals in different dynasties. The top 10 of them in each dynasty are extracted and shown in Table 3. As indispensable components and hot topics of Dunhuang murals throughout the dynasties, ‘飞天 (Flying Apsaras)’, ‘菩萨 (Bodhisattva)’, ‘说法 (Teaching Buddhist)’, ‘供養人 (Supporting Beings)’, ‘佛 (Buddha)’ and ‘藻井 (Raised Ceiling)’ play important roles, and their ThemeRiver [84] visualisation is shown in Figure 11. The ThemeRiver figure reflects the evolution of these topics across dynasties.
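The extraction of hot topics and of the per-dynasty series that a ThemeRiver-style figure draws can be sketched as follows. It is only an illustration under assumptions: each dynasty is represented by a flat list of topic labels, the series is the share of each topic within a dynasty, and the function names `top_topics` and `topic_series` as well as the toy annotations are invented and are not data from this article.

```python
from collections import Counter


def top_topics(labels_by_dynasty, k=10):
    """Return the k most frequent topic labels (with counts) per dynasty."""
    return {
        dynasty: Counter(labels).most_common(k)
        for dynasty, labels in labels_by_dynasty.items()
    }


def topic_series(labels_by_dynasty, topics):
    """Return, for each topic, its share of labels in each dynasty (in order)."""
    series = {topic: [] for topic in topics}
    for labels in labels_by_dynasty.values():
        counts = Counter(labels)
        total = len(labels) or 1
        for topic in topics:
            series[topic].append(counts[topic] / total)
    return series


# Invented toy annotations, listed in chronological order of the dynasties.
toy_labels = {
    "Northern Wei": ["Bodhisattva", "Bodhisattva", "Flying Apsaras", "Buddha"],
    "Mid-Tang": ["Buddha", "Flying Apsaras", "Flying Apsaras",
                 "Flying Apsaras", "Bodhisattva"],
}
hot = top_topics(toy_labels, k=2)
series = topic_series(toy_labels, ["Flying Apsaras", "Bodhisattva", "Buddha"])
```

In a ThemeRiver-style plot, each list in `series` would set the thickness of one topic's band from one dynasty to the next.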
The top 10 topics of Dunhuang murals.

The evolution of the six representative topics of Dunhuang murals in different dynasties.
Because the collected images cover only part of the magnificent Dunhuang murals, the evolution rules of topics found here are not comprehensive. A larger data set may yield better results and generate greater research value. Apart from the temporal dimension, the spatial dimension (different caves and locations) may offer a new perspective for studying the topic distribution and evolution of Dunhuang murals in different dynasties, which could inspire interesting discoveries in the future. The proposed method and the introduced theory provide a novel view for exploring the topic distribution and evolution of Dunhuang murals and for discovering further knowledge, which helps us to perceive and understand this precious cultural heritage more deeply.
6. Conclusion and future work
This article studies the classification of Dunhuang murals in different dynasties and explores their topic distribution characteristics and evolution rules, which is conducive to the organisation and knowledge discovery of Dunhuang cultural heritage resources. The experimental results illustrate that Dunhuang murals in different periods can be effectively classified through BOW_HSV_SVM. With GT introduced, the differences and correlations of their contents as the dynasties alternate are revealed through topic maps.
The limitation of this research lies in the amount of image data of Dunhuang murals. The topic distribution and evolution of Dunhuang murals are analysed in the temporal dimension, but the spatial dimension is also worth exploring. Therefore, the next step of the research work is to identify the locations and caves of Dunhuang murals, and then mine their topic distribution characteristics and evolution rules in the spatial dimension. This research can make it easier for public cultural institutions to carry out image archiving, annotation, topic retrieval and recommendation. It is believed that more valuable research perspectives can be derived from the contributions of this article, and the treasure contained in Dunhuang murals is waiting to be discovered.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This research was supported by the National Natural Science Foundation of China (grant no. 71673203) and the Key Research Institutes of Philosophy and Social Science by Ministry of Education, PR China (grant no. 16JJD870003).
