Abstract
To address the shortcomings of traditional water level prediction methods, such as insufficient information mining ability and the unclear mechanism of heuristic algorithms, this paper proposes, for the first time, a water level prediction method that fuses blockchain technology with a long short-term memory (LSTM) network. The method builds a combined blockchain and LSTM model and directly uploads monitoring data, such as inlet and outlet water flow and water level, to predict the water level, which avoids the secondary error introduced by indirect flow calculation. A flow compensation strategy is also proposed for the first time: monitoring data with large deviations are compensated accordingly, reducing the prediction error at the source. The results show that, after adopting the compensation strategy, the combined Blockchain-LSTM model has the smallest prediction error of the models compared, with an MAE of 0.290 and an RMSE of 0.490. Its high prediction accuracy and practicability provide technical support for real-time scheduling of the South-to-North Water Diversion Reservoir.
Introduction
Through the use of modern technology, Smart Water Conservancy [1, 2] can counter the adverse effects of traditional water conservancy projects, improve the utilization efficiency of water resources, and reduce water waste. Its management mode and system cover the whole process, from the water source through the pump station to the end water users, and enable centralized management and supervision from start to finish. Through the deep integration of new technologies and water conservancy services, a thorough perception network covering hydrology, water resources [3], and other fields can be established to realize the intelligent perception of water conservancy facilities and provide a scientific basis for decision-making, scheduling, and command of the South-to-North Water Diversion Project [4, 5]. At present, cloud computing technology [6] and big data technology [7, 8] provide an important technical guarantee for the construction site management of water conservancy projects. They can effectively handle large-capacity storage, fully exploit the application value of various types of data, and lay a solid foundation for the construction of Water Conservancy Projects [9].
In this paper, we combine, for the first time, a water level prediction model with blockchain technology to tag the historical water level data of the South-to-North Water Diversion Project and store it on-chain, which ensures that the data is completely transparent and greatly improves the prediction accuracy of the results. For the first time, a flow compensation strategy is proposed to compensate monitoring data with large deviations, reducing the prediction error at the source. To solve the problems of vanishing gradients [10] and long-sequence information memory, Hochreiter [11] proposed the LSTM neural network model in 1997. LSTM neural networks can handle not only nonlinear mapping relationships between multiple variables but also time series data very well. In view of the large deviation between water consumption and water supply and the disputes that arise from manual tracing, this paper proposes to trace the water consumption data of each participant in each period using the traceability and auditability of the blockchain, and to predict future data, in order to alleviate false reporting by participants and build trust among them. To improve the prediction effect, the experiment uses the combined Blockchain-LSTM model, which makes full use of the temporal variation of the water level and also considers the flow measurement function of the water level [12].
In the process of water resources trading in the South-to-North Water Diversion Project [13], it is very important to ensure the security and credibility of water-related data and to prevent malicious attacks, such as data tampering, by any party. Blockchain opens up new possibilities here. Blockchain technology [14] overcomes the defects of the traditional centralized system structure. It offers the security characteristics of decentralization, trustlessness, anonymity, and tamper resistance, and it can achieve distributed and efficient consensus in a large-scale network environment and establish a secure and trusted data storage system. Blockchain technology also enables large-scale trusted distributed computing power [14, 15] through smart contract mechanisms [16, 17]. Therefore, blockchain can build a foundation of trust at low cost in the context of multi-stakeholder participation, aiming to reshape the social credit system. Blockchain technology supports data integrity requirements and can be applied to all stages of data acquisition, transmission, storage, and identification. During the data acquisition and transmission phases, data encapsulation and signature techniques are often employed to ensure data integrity, and packet loss recovery techniques [18] are used during transmission. The combination of blockchain and data security can reduce the risk of centralization in data sharing while allowing mutually distrusting parties to maintain a secure, credible, and immutable public ledger, which has broad application value. Blockchain-based decentralization methods [19] have received widespread attention in recent years. With the rapid development of blockchain, people have begun to apply it to finance [20], agriculture [21], healthcare [22], the construction industry [23], the electric vehicle industry [24], and other fields, enabling the circulation and sharing of data across multiple fields.
In light of current approaches to establishing credible data storage and validation between mutually untrusting parties [25], this project combines, for the first time, the functionality of blockchain with the LSTM methodology to tag and predict water use through the combination of blockchain and data security. The research in this paper provides a good basis for decision-making in the South-to-North Water Diversion Central Route Project. An optimal combined Blockchain-LSTM prediction model is established, which helps to solve the problems of data misreporting and malicious tampering in the water allocation process of the project, helps to save water resources, and provides a flow compensation strategy for the party that incurs losses, reducing the risk of friction among the participants. It thereby promotes the progress of water automation and intelligence.
Materials and methods
Experimental model establishment
In this paper, a model based on the water diversion situation of the water plants is first built, as shown in Fig. 1. The water level of the XiaoHeLiu pumping station is relatively stable, and the amount of water allocated to Waterworks 1 and 2 is also fixed. In practice, the water flow data at the beginning and end of each water plant are inconsistent, for example because of water loss during transportation, so the relationship between the water flow of each water sub-plant and the total water distributed by the pumping station cannot be described by direct mathematical operations. Because human factors or monitoring equipment failures can corrupt the readings, once the end data of one water plant has been determined, an LSTM model is built to predict the flow data of the other water plant and verify whether its data are correct. According to the specific situation and the predicted data, flow compensation is performed for the affected water plant to meet its water demand.

Pumping Station Water Distribution.
A flow measurement system based on image processing is installed in the sump, which is used to measure the amount of water flowing out of the sedimentation tank. An image-processing water level gauge is installed in the sedimentation tank, which is used to measure the flow rate of the triangular thin-walled weir. As shown in Fig. 2, there are 22 water collection tanks in a water plant, and the width of the water tank

Water Plant 1 Catchment Tank.
The flow compensation strategy is a method for correcting the systematic error in the values measured by different flow meters, and it is of great significance in flow measurement. Most of the systematic error of a flow detection device is caused by changes in fluid properties and measurement conditions (such as fluid composition, flow range, temperature, and pressure).
In practice, when the working conditions of the fluid deviate from the state on which the flow measurement is based, systematic errors of varying degrees arise and can even make the flow measurement data meaningless. The greater the change in working conditions, the greater the error of the flow measurement system. Therefore, a compensation strategy must be used wherever accurate measurement is required, or wherever the working conditions fluctuate widely and frequently.
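As a hedged illustration of such a correction, the sketch below compensates a volumetric flow reading for a change in fluid density. It assumes a meter whose indicated flow scales with the square root of the inverse density, as for differential-pressure devices. The function name `compensate_flow`, the density values, and the square-root relation are illustrative assumptions and are not taken from this work.

```python
import math


def compensate_flow(indicated_flow, design_density, actual_density):
    """Correct a volumetric flow reading for a density change.

    Assumption: the meter's indicated volumetric flow scales with
    sqrt(1 / density), so the reading is rescaled by
    sqrt(design_density / actual_density).
    """
    if design_density <= 0 or actual_density <= 0:
        raise ValueError("densities must be positive")
    return indicated_flow * math.sqrt(design_density / actual_density)


# Hypothetical reading: calibrated at 998.2 kg/m^3, operating at 992.2 kg/m^3.
corrected = compensate_flow(100.0, 998.2, 992.2)
```

When the real-time density equals the design density, the reading is returned unchanged. The correction grows as the working conditions drift from the calibration state.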
Flow rate formula is:
Compensation strategy formula is:
Compensation: replace the value of real time density
The tank flow rate is measured by an image processing flow meter; the water level is measured by the water level gauge of the image processing method. The final egress flow
Blockchain technology architecture
This work designs a water consumption labeling and compensation system framework based on blockchain and LSTM, as shown in Fig. 3. The architecture is divided into a client and a server. The server is a blockchain system service that includes mechanism modules such as smart contracts, consensus mechanisms, and ledger mechanisms. At every stage of the water supply and water use process, a block records the same ledger. The ledger can only be appended to and cannot be modified. The entire network is maintained through a consensus mechanism without human intervention, and each block contains a unique timestamp to further ensure the security of the information. The client consists of the water supplier and the water user. The water supplier can enter the server backend to create water consumption entries and record the information of each water consumption. The water user records information about each water use and the height of the water level in the storage device through the server. Usage can be verified by all parties, which minimizes false reporting by participants.

Water Consumption Labeling and Compensation System Framework.
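The append-only, timestamped ledger described above can be sketched in a few lines. This is a single-process illustration under stated assumptions, not the system built in this work: the class name `WaterLedger`, the record fields, and the use of SHA-256 are all hypothetical, and no consensus mechanism or smart contract is modelled.

```python
import hashlib
import json
import time


class WaterLedger:
    """Hypothetical append-only ledger of water supply and use records."""

    def __init__(self):
        self._blocks = []

    @staticmethod
    def _digest(payload):
        # Serialize with sorted keys so the same payload always hashes the same.
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def append(self, party, volume_m3, water_level_m, timestamp=None):
        prev_hash = self._blocks[-1]["hash"] if self._blocks else "0" * 64
        payload = {
            "index": len(self._blocks),
            "timestamp": time.time() if timestamp is None else timestamp,
            "party": party,
            "volume_m3": volume_m3,
            "water_level_m": water_level_m,
            "prev_hash": prev_hash,
        }
        block = dict(payload, hash=self._digest(payload))
        self._blocks.append(block)
        return block

    def verify(self):
        """Return True if every block still matches its hash and its predecessor."""
        prev_hash = "0" * 64
        for block in self._blocks:
            payload = {k: v for k, v in block.items() if k != "hash"}
            if block["prev_hash"] != prev_hash:
                return False
            if self._digest(payload) != block["hash"]:
                return False
            prev_hash = block["hash"]
        return True


ledger = WaterLedger()
ledger.append("supplier", 1200.0, 3.42, timestamp=1.0)
ledger.append("waterworks_1", 600.0, 3.40, timestamp=2.0)
ok_before = ledger.verify()

ledger._blocks[0]["volume_m3"] = 900.0  # simulate a falsified record
ok_after = ledger.verify()
```

Any party holding a copy of the ledger can rerun `verify`. A record altered after the fact no longer matches its stored hash, which is the property that discourages false reporting.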
The blockchain system consists of network layer, data layer, consensus layer, control layer and application layer. The blockchain can be seen as a state machine’s state transition process, which is:
where
In the above formula description,
Figure 4 shows the process of the blockchain converting the original message data into a hash value through a compression function:

Blockchain data conversion hash value.
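A minimal sketch of the message-to-hash conversion follows. SHA-256 from the Python standard library is used as an assumption, since the text does not name the hash function. The helper `fingerprint` and the sample messages are hypothetical.

```python
import hashlib


def fingerprint(message: str) -> str:
    """Return the SHA-256 digest of a message as a hexadecimal string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


short = fingerprint("water_level=3.42")
long_ = fingerprint("water_level=3.42;" * 1000)  # much longer input
changed = fingerprint("water_level=3.43")        # one digit altered
```

Both `short` and `long_` are 64 hexadecimal characters long regardless of the input size, and altering a single digit of the message yields a different digest.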
LSTM neural networks have been applied successfully to time series data processing and analysis in different fields. Therefore, to overcome the shortcoming of traditional time series models, which overemphasize the time series itself, and to account for the influence of external conditions on the groundwater level [26], this paper applies the LSTM neural network to the study of water flow prediction [27] and introduces a number of auxiliary variables in order to obtain reasonable and effective prediction results.
To address the “vanishing gradient” and “exploding gradient” problems of RNNs, Hochreiter et al. proposed the LSTM in 1997. It combines short-term and long-term memory through gate control and thereby alleviates these problems to a certain extent.
Let

LSTM Modular Construction.
The LSTM network realizes the protection and control of information through three “gates’’:
The forget gate determines what information unit
The input gate determines how much information is added to the unit. It is determined by the information of sigmoid and the information of tanh, and the unit state is updated jointly with the forget gate. The input gate steps are:
The output gate, which determines which part of the current unit state information is output, is still completed through sigmoid and tanh. The output gate steps are:
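The three gates can be illustrated with a single LSTM time step written in plain Python. This sketch follows the commonly used textbook formulation (forget, input, and output gates computed from the concatenation of the previous hidden state and the current input). The parameter names `W_f`, `W_i`, `W_c`, `W_o` and the toy values are assumptions and may not match the notation used in this paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Return W @ vector + b for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """Advance one LSTM cell by one time step."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    # Forget gate: how much of the previous cell state to keep.
    f_t = [sigmoid(a) for a in affine(params["W_f"], params["b_f"], z)]
    # Input gate and candidate state: how much new information to add.
    i_t = [sigmoid(a) for a in affine(params["W_i"], params["b_i"], z)]
    g_t = [math.tanh(a) for a in affine(params["W_c"], params["b_c"], z)]
    # Output gate: which part of the cell state is exposed as output.
    o_t = [sigmoid(a) for a in affine(params["W_o"], params["b_o"], z)]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy parameters: hidden size 1, input size 2, all weights and biases zero.
params = {name: [[0.0, 0.0, 0.0]] for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: [0.0] for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([0.5, -0.2], [0.0], [1.0], params)
```

With all-zero parameters every gate evaluates to 0.5 and the candidate state to 0, so half of the previous cell state is kept and nothing new is added.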
Based on the characteristics of time series data and the principle of simplified recurrent neural network design, the framework of the Blockchain-LSTM combined forecasting model is shown in Fig. 6. It can be roughly divided into six parts: block header, block body, input layer, hidden layer, model training, and output layer. Here, the hash value refers to the “data fingerprint” produced by a hash function (also known as a digital digest), a short, random-looking string that represents an input message of any length.

Structure of Blockchain-LSTM combined forecasting model.
Block header and block body: Each block is divided into two parts, the block header and the block body. The block header stores the hash value of the previous block, the hash value of the current block, the random number, the timestamp, and the Merkle root. The block body stores all the actual data.
Input layer: Segment and normalize the original variable time series set to meet the network input requirements. That is, during data preprocessing, the data of all variables (water_level, flow_1, flow_2, and flow_3) are arranged into CSV files at 30-minute intervals, which are read during training.
Hidden layer: The LSTM unit structure shown in Fig. 5 is used to update and optimize the parameters.
Output layer: Output prediction results, denormalization, validation errors.
Model training: The model uses the Adam optimization algorithm [29, 30] to update the network weights.
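The input-layer and output-layer steps (normalization, segmentation, and denormalization) can be sketched as follows. Min-max scaling and a fixed sliding window are assumed here because the text does not specify the normalization method or the window length. The readings and helper names are hypothetical.

```python
def min_max_fit(values):
    """Return the (min, max) pair used for scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return lo, hi


def normalize(values, lo, hi):
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(values, lo, hi):
    return [v * (hi - lo) + lo for v in values]


def make_windows(series, window):
    """Split a series into (inputs, target) pairs with a sliding window."""
    pairs = []
    for start in range(len(series) - window):
        pairs.append((series[start:start + window], series[start + window]))
    return pairs


# Hypothetical water_level readings at 30-minute intervals (metres).
water_level = [3.40, 3.42, 3.45, 3.43, 3.41, 3.44]
lo, hi = min_max_fit(water_level)
scaled = normalize(water_level, lo, hi)
samples = make_windows(scaled, window=3)   # three past readings predict the next
restored = denormalize(scaled, lo, hi)     # back to metres for error validation
```

The scaling bounds should be fitted on the training data only and reused for validation data, so that the errors reported after denormalization are in the original units.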
This model combines Blockchain with LSTM neural networks for the first time to tag and predict water level through the combination of blockchain and data security.
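The block header and block body described above can be sketched as follows. SHA-256, the pairwise Merkle construction that duplicates the last hash on odd levels, and the record strings are assumptions made for illustration. The helper names `merkle_root` and `build_block` are hypothetical, and the random number is fixed rather than mined.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(records):
    """Fold the hashes of all records pairwise into a single root hash."""
    if not records:
        return sha256_hex(b"")
    level = [sha256_hex(r.encode("utf-8")) for r in records]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the last hash
        level = [sha256_hex((level[i] + level[i + 1]).encode("utf-8"))
                 for i in range(0, len(level), 2)]
    return level[0]


def build_block(prev_hash, records, timestamp, nonce=0):
    """Assemble a block: the header summarizes the body, the body holds the data."""
    header = {
        "prev_hash": prev_hash,
        "timestamp": timestamp,
        "nonce": nonce,
        "merkle_root": merkle_root(records),
    }
    header_bytes = "|".join(str(header[k]) for k in sorted(header)).encode("utf-8")
    header["hash"] = sha256_hex(header_bytes)
    return {"header": header, "body": list(records)}


records = ["water_level=3.42", "flow_1=12.5", "flow_2=11.8"]
block = build_block("0" * 64, records, timestamp=1700000000)
tampered_root = merkle_root(["water_level=9.99", "flow_1=12.5", "flow_2=11.8"])
```

Because the Merkle root in the header depends on every record in the body, changing any stored reading changes the root and therefore the block hash.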
Results presentation
In this paper, the combined Blockchain-LSTM model is used to make predictions. The flow rate of water plant_1 is named flow_1, the flow rate of water plant_2 is named flow_2, the flow rate of the pumping station is named flow_3, the water level of the pumping station is named water_level, and the variable introduced by the compensation strategy is named compensation_flow. We select four different combinations of variables as input sets; after model training, each set produces its own comparison chart of predicted and true values. The four combinations of variables are group
In order to demonstrate that the Blockchain_LSTM combined model in Chapter 2 outperforms the prediction of the single LSTM model, we input the four sets of variables into the single LSTM model and the Blockchain_LSTM combined model for training, respectively. The result plots of the single LSTM model correspond to Fig. 7, and the result plots of the Blockchain_LSTM combined model correspond to Fig. 8.

Comparison of prediction results under different variables with LSTM.

Comparison of prediction results under different variables with Blockchain_LSTM.
To quantify the prediction effect and evaluate the advantages and disadvantages of the Blockchain_LSTM model, the mean absolute error (MAE), the root mean square error (RMSE), and the correlation coefficient R are introduced as the evaluation indicators of the model. Compared with RMSE, MAE is more robust to outliers because it weights the error of each point linearly rather than squaring it. Smaller MAE and RMSE values indicate a better prediction. The correlation coefficient R reflects the degree of correlation between the predicted and actual values; the closer R is to 1, the higher the correlation. The calculation formulas are as follows:
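A minimal sketch of the three indicators, using their standard definitions (R is taken to be the Pearson correlation coefficient, which is an assumption). The sample values are hypothetical and chosen so that one large error shows how RMSE reacts more strongly than MAE.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical values: three exact predictions and one error of 2.0.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
mae_value = mae(actual, predicted)      # 0.5
rmse_value = rmse(actual, predicted)    # 1.0
r_value = pearson_r(actual, predicted)
```

The single large error doubles the RMSE relative to the MAE, which illustrates why the two indicators are reported together.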
The water level information in front of the pump station at the current time is often affected by many factors, such as the section water level at one or more previous times in a certain area upstream and downstream, the pump station flow, and the flow difference between pump stations. Table 1 shows the comparison of the prediction errors of LSTM single model and Blockchain_LSTM combined model in four different sets of variable sets as input sets. (The first four rows
Comparison of errors in prediction results for different group of variables
From Table 1, it can be seen that the prediction error of the Blockchain_LSTM combination model is minimized when variable group
Figure 9 shows the error comparison of water level prediction using different combined models. Two combined models, Blockchain_BP and Blockchain_SVM, are selected for comparison with the Blockchain_LSTM model.

Comparison of errors in prediction results of different combination models.
It can be concluded that the Blockchain-LSTM combined model predicts better than Blockchain_BP and Blockchain_SVM.
A private chain based on Ethereum is built to record water level data. Although the chain is transparent, it cannot guarantee that the data measured by the sensors are free of error, so ensuring the reliability of the data source is very important. Online flow detection based on image processing is a promising choice for future intelligent flow measurement. It will have good application prospects if its feature extraction and anti-interference abilities are further improved, especially when the flow state is relatively stable and for open channel flow monitoring. Representative machine learning algorithms, namely the BP neural network, the support vector machine, and the LSTM model, are used to predict and analyze the future development trend of hot topics in the field of water conservancy engineering. The experimental results show that the Blockchain-LSTM combined prediction model is more accurate and more stable than the other single prediction models. The support vector machine is not as good as the LSTM model, but it is better than the BP neural network, which is based on empirical risk minimization. These results are intended to provide an empirical basis and a reference for later prediction research.
No single method is universally applicable. Achieving the desired result depends on the ability to identify and solve problems. Modeling small data sets tends to be harder and more thought-provoking than modeling big data. For deep learning models, we strongly recommend gaining a general understanding of the meaning and principles of the model, and even deriving or implementing components such as the gradient descent algorithm and the loss function by hand; otherwise, it will be difficult to solve real problems.
Declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Funding
Funding support was provided by China Construction Seventh Engineering Bureau.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Acknowledgments
We would like to express our heartfelt thanks to the sponsor of the South-to-North Water Transfer Middle Line Project for its financial and data support. We also thank the professors who offered valuable suggestions and the colleagues who helped us. Their help made it possible to complete this research, and the experience and suggestions we gained along the way were both encouraging and useful.
