Abstract
Green and energy-efficient maritime transport has become a strategic imperative under tightening decarbonization mandates by the International Maritime Organization (IMO). However, current ship energy efficiency optimization (EEO) frameworks often decouple fuel consumption prediction from operational decision-making, limiting real-time adaptability and integrated control. To address this gap, this study proposes a high-resolution collaborative framework that couples a Transformer-LSTM-A prediction model with a voyage-segmented NSGA-III multi-objective optimizer, which solves the four-objective, segment-level speed-and-trim optimization subject to practical stability and operating limits. The prediction model incorporates temporal attention mechanisms and multivariate inputs enhanced by variational mode decomposition (VMD) to forecast fuel consumption rates (FCR), which then guide the segment-wise optimization of ship speed and trim. The optimization simultaneously minimizes fuel consumption, CO2 emissions, and the Energy Efficiency Operational Indicator (EEOI), while embedding soft constraints on voyage distance to preserve navigational feasibility. A real-world case study demonstrates the effectiveness of the proposed approach, achieving reductions of 4.76% in FCR, 3.04% in CO2 emissions, and 1.50% in EEOI. These results support the framework’s potential for intelligent maritime energy management, offering a scalable pathway toward low-carbon ship operations aligned with global regulatory targets.
Keywords
Introduction
Green and energy-efficient shipping has become a critical imperative in the face of escalating environmental concerns and regulatory pressures within the global maritime sector. 1 As of 2023, the shipping sector emitted approximately 798 million tons of CO2, about 2.3% of total anthropogenic CO2 emissions, and was responsible for nearly 3% of global greenhouse gas (GHG) emissions overall, a 20% increase over the past decade driven by expanding trade volumes and fleet activity. 2 In response to the growing environmental footprint of maritime transport, the International Maritime Organization (IMO) adopted a revised GHG Strategy in 2023, setting forth a phased decarbonization pathway.3–5 The strategy aims to reduce total GHG emissions by at least 20% by 2030 and at least 70% by 2040, relative to 2008 levels, and to reach net-zero GHG emissions by or around 2050. 2 Consequently, under increasingly stringent emission regulations, the optimization of ship fuel efficiency has become a pivotal element in supporting sustainable maritime operations and maintaining the long-term competitiveness of the shipping industry. 6
In alignment with these decarbonization mandates, recent research has focused extensively on ship energy efficiency optimization (EEO) techniques encompassing three primary components: route planning, speed adjustment, and trim control.7–10 Among these, operational control strategies are widely recognized as the most viable short-term measures due to their relatively low implementation costs and high responsiveness. 11 Nevertheless, developing accurate and adaptive optimization frameworks remains a critical challenge, particularly under the influence of highly variable sea states and dynamic operational environments. 12 The construction of high-fidelity predictive models and the integration of robust multi-objective optimization algorithms have become essential for enabling real-time decision-making and achieving sustainable vessel performance.13,14
As a foundation for energy efficiency optimization, current research on ship fuel consumption prediction models is generally categorized into three types: White Box Models (WBMs), Black Box Models (BBMs), and Gray Box Models (GBMs). 15 WBMs rely on empirical resistance formulations and CFD-based performance tables, but their generalization is often poor under dynamic operational conditions. 16
Within the black-box modeling paradigm, Bal Beşikçi et al. developed an artificial neural network (ANN) model using inputs such as ship speed, engine RPM, and trim to predict fuel consumption, demonstrating good performance across varying sea conditions. 17 Liu et al. proposed a TCN-GRU-MHSA hybrid deep learning model that achieved 96.04% prediction accuracy under complex wave environments, proving its advantage in multi-scale temporal feature learning. 18 Lee et al. constructed a deep feedforward network (DFN) incorporating navigational and meteorological inputs, which outperformed traditional regression methods in terms of RMSE and MAE. 19 In addition, Wang et al. used a LASSO-based regression framework to select relevant features and improve interpretability in multi-factor fuel consumption prediction tasks. 20
For gray box models, Fan et al. (2025) developed a hybrid model integrating navigation dynamics and data-driven learning to address prediction under sparse sampling conditions. 21 Han et al. (2024) constructed a semi-physical model that fused propulsion equations with neural network outputs, balancing model transparency with learning flexibility. 22 Their results indicated better extrapolation under partial-data conditions compared to pure data-driven models. Zwart et al. further enhanced this concept by incorporating real-time correction mechanisms using onboard sensor feedback. 23
To overcome limitations in basic black-box models, LSTM-based architectures have been extensively studied due to their ability to capture long-term dependencies in ship operational sequences.18,24 However, their limited capacity to model global interactions has motivated the development of hybrid attention-based architectures. Li et al. proposed an Evolutionary Attention-based LSTM (EA-LSTM) model that uses a genetic algorithm-inspired strategy to optimize attention weights, enhancing temporal prediction accuracy and outperforming traditional LSTM and attention-LSTM models. 25 Ren et al. proposed a Kolmogorov-Arnold attention-driven hybrid LSTM-Transformer model that integrates spatial, temporal, and input attention mechanisms to enhance long-term sequence modeling, achieving superior prediction accuracy in complex hydrological environments. 26 Bao et al. proposed a collaborative Transformer-LSTM framework that significantly enhanced prediction accuracy by jointly modeling long-term dependencies and short-term variations using attention-based mechanisms. 27 Han et al. proposed a parallel Transformer-LSTM model that integrates macro-level traffic and micro-level behavioral features, showing improved generalization across varying operational contexts through effective temporal-spatial learning. 28 Wang et al. proposed a self-attention-based LSTM (SA-LSTM) model for ship fuel consumption prediction, achieving up to 13% reduction in RMSE and 12% in MAPE compared to conventional LSTM, demonstrating superior performance across multiple voyage segments. 29
Similarly, Nascimento et al. developed a transformer-based deep neural network integrated with wavelet decomposition for multivariate wind speed and energy forecasting, showing that wavelet-based feature augmentation significantly improved prediction accuracy across multiple meteorological time series while reducing training time. 30 Hu et al. proposed a hybrid forecasting model integrating variational mode decomposition, sparse auto-encoder, and high-order fuzzy cognitive mapping, which significantly improved forecast accuracy and generalization in nonlinear time series prediction scenarios. 31 Fu et al. introduced a hybrid LSTM-Transformer model for multivariate sea surface temperature forecasting, demonstrating that integrating LSTM’s temporal modeling with Transformer’s attention mechanisms yields improved robustness and accuracy across various seasonal and geographic conditions. 32
Parallel to developments in prediction modeling, Deb and Jain introduced NSGA-III to address many-objective problems using reference-point-guided selection and improved diversity metrics. 33 Wang et al. proposed a multi-objective opposition-based marine predator optimization algorithm that integrates elite hierarchy and opposition learning mechanisms, achieving superior performance in balancing prediction accuracy and stability in probabilistic forecasting tasks. 34 Hamed employed NSGA-III for hull form optimization and reported better design robustness under varied loading conditions. 35 Li et al. developed a collaborative GA-LSTM and NSGA-III framework to co-optimize ship speed and trim, achieving a 4.54% reduction in fuel consumption and 153.2 tons of CO2 savings over a reference voyage. 36 Ge et al. proposed a hybrid LFOA-NSGA-III algorithm combining fruit fly optimization and matter-element modeling, which improved convergence and diversity in interval many-objective problems such as UAV path planning. 37 Zhou et al. applied NSGA-III to optimize the anchor cable system of a net-cage group under varying hydrodynamic conditions, demonstrating the algorithm’s effectiveness in handling high-dimensional constraints and maintaining sample diversity through a dual-mode spatial intersection screening method. 38 Xu et al. developed an improved NSGA-III algorithm incorporating a comprehensive adaptive penalty scheme, demonstrating enhanced performance in many-objective optimization problems. 39 In addition, Zhang et al. proposed an MOEA/D variant called MOEA/D-RWV, which introduces a weight vector resetting mechanism based on DBSCAN clustering and principal component analysis (PCA) to address multiobjective problems with discontinuous Pareto fronts. Their method effectively redistributes solutions by reconfiguring weight vectors according to the geometric structure of the evolving front, thereby improving convergence and diversity on irregular and disconnected Pareto fronts. 40 Vargas-Santiago et al. applied a constraint-handling NSGA-III framework to facility placement under disruption uncertainty, demonstrating faster convergence and improved solution diversity in multi-objective optimization settings. 41 Building on these advancements, Ranjan et al. introduced a threshold-based constrained θ-NSGA-III algorithm that integrates constraint-domination rules with angular selection strategies to address disconnected and narrow feasible regions in many-objective spaces, enabling more refined control over convergence and diversity balance. 42
Hybrid block-oriented models have also advanced in adjacent domains. Li et al. presented a DLSTM-based Wiener identification scheme that disentangles linear and nonlinear dynamics and achieves superior predictive and control performance on PMSM benchmarks. 43 In integrated electricity–heat planning, a CNN–BiLSTM–Attention forecaster fuses spatial–temporal features with attention and reports lower RMSE/MAE/MAPE than single-model and other hybrid baselines. 44 For renewable forecasting, a DBN–Hammerstein architecture cascades a DBN nonlinear block with an ARX linear block and attains higher wind-power prediction accuracy than ELM/NFN/LSTM comparators. 45 Complementary work includes NFN ARX Hammerstein identification that uses designed multi-signals to decouple and learn the nonlinear and linear subsystems, maritime EEOI/FEEMI-based energy-consumption analyses, and ANP fuzzy Arctic-risk assessment; together, these studies underscore the growing use of hybrid block-oriented models and multi-criteria decision-making in marine cyber-physical applications.46–49
Across safety-critical cyber-physical domains, recent studies have explicitly cast conflicting design goals as multi-objective evolutionary searches, including stealth-versus-impact false-data injection in AC power grids solved via NSGA-II and multi-objective defenses for federated intrusion detection. These studies support the use of evolutionary Pareto optimization when objectives are non-convex and antagonistic. 50 Zeng et al. cast federated ICS backdoor defense as a mixed-variable multi-objective problem and solved it with NSGA-II to obtain Pareto-optimal clustering-combination defenses that outperform single-strategy and classic baselines. 51 In our setting, FCR, CO2, and EEOI conflict with schedule and route-deviation constraints; the objective landscape is non-convex, non-differentiable, and relies on costly simulations and data, making weighted-sum or gradient-based methods brittle. To obtain a set of Pareto solutions rather than a single trade-off, we adopt evolutionary Pareto search; for three or more objectives, NSGA-III’s reference-vector guidance typically yields more uniform and better-covered fronts than NSGA-II, facilitating knee-point discovery and decision support.
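The notion of a Pareto set that motivates this choice can be illustrated with a minimal dominance filter. The sketch below is not NSGA-III itself (it omits reference points, niching, and variation operators); it only shows the non-dominance test that NSGA-II and NSGA-III both build on. The candidate values and the column meaning (fuel, CO2, EEOI) are hypothetical and chosen for illustration only.

```python
import numpy as np

def dominates(a: np.ndarray, b: np.ndarray) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one (minimization)."""
    return bool(np.all(a <= b) and np.any(a < b))

def pareto_front(points: np.ndarray) -> np.ndarray:
    """Return the row indices of the non-dominated points."""
    keep = []
    for i, p in enumerate(points):
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i):
            keep.append(i)
    return np.array(keep)

# Hypothetical candidates; columns are (fuel, CO2, EEOI), all minimized.
candidates = np.array([
    [10.0, 31.0, 5.0],
    [11.0, 30.0, 5.2],
    [12.0, 33.0, 5.5],  # worse than the first row in every objective
    [10.5, 30.5, 4.9],
])
front = pareto_front(candidates)
print(front)
```

A weighted sum would return only one of the surviving rows for a given weight vector, whereas the filter keeps every non-dominated trade-off for later decision support.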
Collectively, these studies underscore the value of multi-scale feature fusion and explicit linear–nonlinear treatment, and the importance of evolving the prediction and optimization components together within an integrated maritime energy efficiency framework. However, none of them addresses segment-level, four-objective maritime optimization or a tightly coupled predictor with a voyage-segmented NSGA-III, as analyzed here.
Despite substantial progress in ship energy efficiency research, existing frameworks still face critical limitations. Many models treat fuel consumption prediction and operational optimization as disjoint tasks, leading to inconsistent and suboptimal decision-making. Traditional LSTM-based predictors often lack attention mechanisms, limiting their ability to capture long-range dependencies and adapt to nonstationary marine conditions. Optimization efforts frequently prioritize fuel consumption alone, overlooking essential environmental indicators such as CO2 emissions and EEOI. Moreover, static voyage-level strategies are commonly used, lacking the segment-wise adaptability needed for dynamic sea states and real-time responsiveness. This study addresses these challenges through an integrated, high-resolution framework that couples a Transformer-LSTM-A prediction model with an NSGA-III-based segment-level optimizer. The proposed approach improves prediction accuracy, control granularity, and environmental performance, contributing to the development of intelligent and sustainable ship energy management. Beyond the GA-LSTM + NSGA-III collaborative scheme of Li et al., our framework unifies a VMD-enhanced Transformer–LSTM–Attention predictor with a voyage-segmented, four-objective NSGA-III and supplements the literature with optimizer benchmarks, strengthening both methodological novelty and empirical generality. The main contributions of this study are summarized as follows:
This study proposes a Transformer-LSTM-A prediction model to accurately forecast fuel consumption rates by incorporating temporal attention mechanisms and decomposed multivariate input features through variational mode decomposition (VMD).
A voyage-segmented NSGA-III optimization framework is constructed to jointly optimize ship speed and trim with the goal of minimizing fuel consumption, CO2 emissions, and EEOI, while embedding soft constraints on voyage distance and stability.
A real-world case study is carried out to validate the proposed framework, demonstrating its effectiveness in reducing energy consumption and improving emission performance relative to baseline and conventional models.
The remainder of this paper is organized as follows: Section 2 describes the dataset and preprocessing procedures. Section 3 presents the Transformer-LSTM-A prediction model. Section 4 details the NSGA-III multi-objective optimization strategy. Section 5 reports experimental results and comparative analyses. Finally, Section 6 concludes the study and outlines future research directions.
Data acquisition and preprocessing of ship operational efficiency
Data acquisition
The dataset used in this study was obtained from an ocean-going cargo vessel operating on two international voyages (Figure 1(a)). The primary analysis focuses on the first voyage from Zhoushan, China to Maranhão, Brazil. This long-haul route spans multiple climate zones and covers over 13,000 nautical miles, involving complex sea conditions and dynamic operational states (Figure 1(b) and (c)).

Wind field feature distribution along the vessel’s voyage trajectory: (a) vessel voyage track, (b) wind direction distribution, and (c) wind speed variation.
High-resolution navigational and engine data were collected using the ship’s integrated onboard monitoring system, which includes anemometers, GPS receivers, shaft power meters, engine load monitors, and fuel flow meters. All parameters were recorded every 10 min, resulting in a high-frequency, multivariate time series dataset. The operational data sample includes time-stamped environmental, navigational, and fuel-related variables such as relative wind speed and direction, main engine RPM, speed over ground, shaft power, engine load, shaft torque, mean draft, distance through water, trim, and fuel consumption rate. This structured time-series dataset forms the foundation for subsequent feature construction and supervised learning.
Data analysis and preprocessing
In real-world maritime operations, raw measurement data collected from onboard sensors are often subject to various sources of noise and distortion. Harsh environmental conditions at sea, transient mechanical faults, and intrinsic sensor inaccuracies can result in missing values, outliers, and nonstationary fluctuations. These irregularities significantly degrade the quality of training data and can mislead learning-based prediction models, particularly those relying on temporal correlations and continuous sampling. Therefore, prior to model construction, a comprehensive data analysis and preprocessing process is required to ensure feature consistency, suppress noise, and extract meaningful temporal structures for downstream learning tasks.
Correlation-based feature screening was conducted using Pearson analysis between Fuel Consumption Rate and candidate variables, including environmental, propulsion, and hydrodynamic parameters. As shown in Figure 2(a), FCR exhibited strong positive correlations with Main Engine Load, Shaft Power, Shaft Torque, and RPM, while Trim and Mean Draft showed negative correlations. These findings support informed feature selection, mitigating overfitting and dimensional redundancy in subsequent modeling.
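The screening step can be sketched as follows. This is a minimal illustration on synthetic data, not the study's dataset; the variable names, the synthetic relationship between load, trim, and FCR, and the 0.3 selection threshold are all assumptions made for the example.

```python
import numpy as np

def pearson_with_target(features: dict, target: np.ndarray) -> dict:
    """Pearson correlation coefficient between each candidate feature and the target series."""
    return {name: float(np.corrcoef(values, target)[0, 1])
            for name, values in features.items()}

# Synthetic stand-in data (assumed): FCR rises with engine load and falls with trim.
rng = np.random.default_rng(0)
load = rng.uniform(40.0, 90.0, size=200)
trim = rng.uniform(-1.0, 1.0, size=200)
fcr = 0.02 * load - 0.3 * trim + rng.normal(0.0, 0.05, size=200)

corr = pearson_with_target({"engine_load": load, "trim": trim}, fcr)

# Hypothetical screening rule: keep features whose absolute correlation exceeds 0.3.
selected = [name for name, r in corr.items() if abs(r) >= 0.3]
print(corr, selected)
```

Pearson analysis only captures linear association, so a feature with a weak coefficient may still carry nonlinear information; the threshold is best treated as a first-pass filter.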

(a) Pearson correlation heatmap between operational features and fuel consumption rate (FCR). (b) Time-frequency analysis of fuel consumption via variational mode decomposition.
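Signal decomposition with VMD. To enhance the temporal representation of FCR and mitigate the impact of nonstationary fluctuations, Variational Mode Decomposition (VMD) was applied. VMD is a signal processing technique that adaptively decomposes a signal into a finite number of band-limited intrinsic mode functions (IMFs), each corresponding to a specific frequency band. Let $f(t)$ denote the FCR series. In its standard formulation, VMD solves the constrained variational problem

$$\min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$

Where: $u_k(t)$ is the $k$-th mode (IMF), $\omega_k$ is its center frequency, $K$ is the number of modes, $\delta(t)$ is the Dirac distribution, $*$ denotes convolution, and $j$ is the imaginary unit. The residual (Res) reported alongside the IMFs is the part of $f(t)$ not captured by the $K$ extracted modes.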
The decomposed results are shown in Figure 2(b). This decomposition separates high-frequency noise (IMF1–5) from smoother trend components (IMF6–7 and Res), thereby enabling the model to learn multiscale temporal patterns more effectively. Without adequate feature fusion across VMD-derived components and operational covariates, the predictor underrepresents multiscale patterns, which in this study manifests as enlarged residual envelopes, heavier error tails, and delayed transient tracking.
In Figure 2(b), the horizontal axis represents the discrete time index of the recorded fuel consumption rate (FCR) series, expressed in sampling intervals of 10 min, which correspond to the temporal resolution of the vessel’s integrated monitoring system. This axis captures the chronological evolution of operational conditions over the observed voyage segment, thereby enabling the identification of transient fluctuations and long-term trends in FCR. The vertical axis denotes the center frequency of each decomposed intrinsic mode function (IMF) obtained through Variational Mode Decomposition (VMD). Frequencies are expressed in normalized units relative to the Nyquist frequency, reflecting the oscillatory rate of energy variation in the original signal. High-frequency components (upper region of the axis) are associated with rapid, short-term fluctuations arising from transient environmental perturbations or mechanical dynamics, while low-frequency components (lower region) correspond to gradual variations driven by voyage-scale hydrodynamic resistance changes, engine load adjustments, and trim optimization strategies. The color scale encodes the instantaneous amplitude of the signal within each time–frequency bin, normalized to unit variance, where warmer colors indicate higher local energy concentrations. This coordinated representation allows for a multi-resolution inspection of fuel consumption behavior, facilitating the isolation of noise-dominated high-frequency IMFs from physically meaningful low-frequency patterns that are critical for predictive modeling and operational decision-making.
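Outliers and missing values were replaced using spline interpolation to preserve temporal continuity. All variables were then normalized using z-score standardization to eliminate scale bias:

$$z = \frac{x - \mu}{\sigma}$$

Where $x$ is the raw value of a variable, $\mu$ and $\sigma$ are the mean and standard deviation of that variable, and $z$ is the standardized value.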
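The two preprocessing steps can be sketched as below. Note one deliberate substitution: the study uses spline interpolation, whereas this sketch uses linear interpolation (`np.interp`) as a simpler stand-in so that it depends on NumPy only. The function names and the toy series are assumptions for illustration.

```python
import numpy as np

def fill_gaps(series: np.ndarray) -> np.ndarray:
    """Fill NaN entries from valid neighbours (linear stand-in for the spline used in the study)."""
    x = np.arange(series.size)
    valid = ~np.isnan(series)
    return np.interp(x, x[valid], series[valid])

def zscore(series: np.ndarray) -> np.ndarray:
    """Standardize a series to zero mean and unit (population) standard deviation."""
    return (series - series.mean()) / series.std()

# Toy series with one missing sample (assumed data).
raw = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
filled = fill_gaps(raw)
scaled = zscore(filled)
print(filled, scaled)
```

In practice the mean and standard deviation would be computed on the training partition only and reused for validation and test data, to avoid leaking information across partitions.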
The Transformer-LSTM-A-based ship fuel consumption prediction model
Ship fuel consumption is governed by complex, interrelated operational and environmental factors, making accurate estimation of instantaneous Fuel Consumption Rate (FCR) under real-world maritime conditions particularly challenging. Issues such as data sparsity, sensor noise, and delayed measurements hinder reliable assessment. Consequently, robust prediction models are essential for effective fuel monitoring and emission control. Recent advances include artificial neural networks, support vector regression, and black-box models like LSTM, which demonstrate strong capabilities in capturing nonlinear temporal dependencies.
LSTM networks are widely used for modeling sequential data due to their capability to capture long-term dependencies. However, they are prone to vanishing or exploding gradients, limiting their effectiveness over extended sequences. To address these issues, this study introduces a hybrid Transformer-LSTM-A model that combines the global attention mechanism of the Transformer encoder with LSTM’s temporal memory. This architecture enhances predictive accuracy for fuel consumption under complex, multivariate voyage conditions.
Model overview and functional structure
Originally developed for natural language processing (NLP), the Transformer architecture is employed here for its ability to model global dependencies via self-attention and to process time steps in parallel, which is advantageous for structured or decomposed inputs. In contrast, LSTM excels at capturing local temporal patterns, and its gated memory remains stable over the short input windows used here. Integrating LSTM as the decoder allows the model to leverage both global context and temporal stability.
Figure 3(a) illustrates the overall structure of the proposed Transformer-LSTM-A model. The model receives as input a feature matrix that integrates operational parameters and decomposed historical fuel consumption sequences. Multivariate operational inputs, including shaft RPM, SOG, and trim, are collected over four consecutive time steps to form temporal feature sequences. Concurrently, 20 historical observations of fuel consumption rate are extracted and subjected to Variational Mode Decomposition. The sequence is decomposed into K IMFs and one residual component, which together capture both high-frequency variations and low-frequency trends, thereby enhancing the model’s ability to learn multiscale temporal patterns. The model is composed of three main subsystems: (1) input feature construction and decomposition, (2) a Transformer-based encoder, and (3) an LSTM-based decoder with a regression output layer. These modules are sequentially connected to enable an end-to-end prediction of short-term future fuel consumption rates. The input stage begins with the collection of raw operational signals across multiple time steps. Each sample contains four consecutive historical time points of multivariate input features, represented as:

$$X_t = [x_{t-3},\, x_{t-2},\, x_{t-1},\, x_t]$$

where $x_\tau \in \mathbb{R}^{8}$ is the vector of operational features recorded at time step $\tau$.

(a) The overall structure of the model. (b) LSTM decoder internal gate structure and forward flow.
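The historical FCR sequence is passed through a Variational Mode Decomposition (VMD) module to extract multiple frequency-specific modes. VMD adaptively decomposes the signal into K = 7 intrinsic mode functions (IMFs) and one residual component. The decomposition isolates underlying periodicities and dynamic behaviors of fuel consumption, enabling the model to distinguish between stable trends and transient noise components. The result is denoted as:

$$\{u_1(t),\, u_2(t),\, \ldots,\, u_7(t),\, r(t)\}$$

where $u_k(t)$ is the $k$-th IMF and $r(t)$ is the residual component.
The next step involves feature combination and embedding. The original operational inputs (four time steps of eight features, 32 values in total) are combined with the 20 historical FCR values to form a 52-dimensional input vector.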
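The input construction can be sketched as follows, using the dimensions reported for the model (four time steps of eight features plus 20 historical FCR values, with a five-step forecast horizon). The function name, the index convention (whether the history window includes the current step), and the toy arrays are assumptions; the VMD step is omitted here because the way the decomposed components are arranged inside the 20 historical values is not specified.

```python
import numpy as np

SEQ_LEN, N_FEATURES, N_HISTORY, HORIZON = 4, 8, 20, 5

def build_sample(ops: np.ndarray, fcr: np.ndarray, t: int):
    """Build one (input, target) pair for a window ending at time index t (assumed convention)."""
    window = ops[t - SEQ_LEN + 1 : t + 1].reshape(-1)  # 4 x 8 operational values -> 32
    history = fcr[t - N_HISTORY + 1 : t + 1]           # 20 most recent FCR values
    target = fcr[t + 1 : t + 1 + HORIZON]              # next 5 FCR values
    return np.concatenate([window, history]), target

# Toy data (assumed): 100 time steps of 8 operational features and an FCR series.
ops = np.arange(100 * N_FEATURES, dtype=float).reshape(100, N_FEATURES)
fcr = np.arange(100, dtype=float)
x, y = build_sample(ops, fcr, t=30)
print(x.shape, y)
```

Keeping the target strictly after index t ensures that no future FCR value leaks into the input window.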
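To facilitate global pattern learning and contextual interaction among the features, the model applies positional encoding to the 52-dimensional input, resulting in position-aware embeddings. These are fed into a Transformer encoder block. Multi-head self-attention, configured with $h = 4$ heads and a key/query dimension of $d_k = 8$, computes the contextual relationship between all feature dimensions via:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

Where $Q$, $K$, and $V$ are the query, key, and value matrices obtained from learned linear projections of the position-aware embeddings, and $d_k$ is the key dimension.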
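A minimal NumPy sketch of multi-head scaled dot-product self-attention with four heads and a key dimension of 8 is given below. The token layout (four tokens of width 32), the random weights, and the function names are assumptions for illustration; the study's MATLAB implementation and its exact embedding layout are not reproduced here.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, n_heads):
    """x: (T, d_model); w_q/w_k/w_v: (d_model, n_heads * d_k); w_o: (n_heads * d_k, d_model)."""
    T = x.shape[0]
    d_k = w_q.shape[1] // n_heads
    # Project, then split the last axis into heads: (n_heads, T, d_k).
    q = (x @ w_q).reshape(T, n_heads, d_k).transpose(1, 0, 2)
    k = (x @ w_k).reshape(T, n_heads, d_k).transpose(1, 0, 2)
    v = (x @ w_v).reshape(T, n_heads, d_k).transpose(1, 0, 2)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_k)  # (n_heads, T, T)
    weights = softmax(scores, axis=-1)                # each row sums to 1
    heads = weights @ v                               # (n_heads, T, d_k)
    concat = heads.transpose(1, 0, 2).reshape(T, n_heads * d_k)
    return concat @ w_o, weights

rng = np.random.default_rng(0)
T, d_model, h, d_k = 4, 32, 4, 8  # assumed token layout
x = rng.normal(size=(T, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, h * d_k)) * 0.1 for _ in range(3))
w_o = rng.normal(size=(h * d_k, d_model)) * 0.1
out, attn = multi_head_self_attention(x, w_q, w_k, w_v, w_o, n_heads=h)
print(out.shape, attn.shape)
```

Dividing the scores by the square root of the key dimension keeps their variance roughly independent of that dimension, which prevents the softmax from saturating.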
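Upon obtaining the global context-aware representation from the Transformer encoder, the vector sequence is passed to the LSTM decoder to further model the temporal evolution and generate predictive outputs. The decoder is composed of four stacked LSTM cells that sequentially update their internal hidden state $h_t$ and cell state $c_t$. At each step, the forget gate, input gate, and candidate state are computed as:

$$f_t = \sigma(W_f [h_{t-1}, z_t] + b_f), \quad i_t = \sigma(W_i [h_{t-1}, z_t] + b_i), \quad \tilde{c}_t = \tanh(W_c [h_{t-1}, z_t] + b_c)$$

Where $z_t$ is the encoder output fed to the decoder at step $t$, $\sigma(\cdot)$ is the sigmoid function, and $W$ and $b$ denote learnable weight matrices and bias vectors.
The new cell state is updated by element-wise gating:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$

The output gate then determines the contribution of the memory to the hidden state, formulated as

$$o_t = \sigma(W_o [h_{t-1}, z_t] + b_o), \quad h_t = o_t \odot \tanh(c_t)$$

These hidden states are concatenated and transformed by a fully connected regression layer, which produces the predicted fuel consumption rates for the next p = 5 time steps:

$$\hat{y} = W_y [h_1; h_2; h_3; h_4] + b_y, \quad \hat{y} \in \mathbb{R}^{5}$$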
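The decoder's gating and the regression head can be sketched in NumPy as below, assuming the standard LSTM cell equations. The input width, random weights, and helper names are assumptions; only the hidden size (32), the number of steps (4), and the horizon (5) follow the configuration reported for the model.

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(z_t, h_prev, c_prev, W, b):
    """One standard LSTM update. W: (4H, H + D), rows stacked as [forget, input, candidate, output]."""
    H = h_prev.size
    gates = W @ np.concatenate([h_prev, z_t]) + b
    f = sigmoid(gates[0:H])          # forget gate
    i = sigmoid(gates[H:2 * H])      # input gate
    g = np.tanh(gates[2 * H:3 * H])  # candidate state
    o = sigmoid(gates[3 * H:4 * H])  # output gate
    c = f * c_prev + i * g           # element-wise cell update
    h = o * np.tanh(c)               # gated hidden state
    return h, c

rng = np.random.default_rng(1)
D, H, STEPS, HORIZON = 32, 32, 4, 5  # D (input width) is an assumption
W = rng.normal(size=(4 * H, H + D)) * 0.1
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
hidden = []
for z_t in rng.normal(size=(STEPS, D)):  # stand-in for the encoder outputs
    h, c = lstm_step(z_t, h, c, W, b)
    hidden.append(h)

# Fully connected head over the concatenated hidden states -> 5-step forecast.
W_y = rng.normal(size=(HORIZON, STEPS * H)) * 0.1
y_hat = W_y @ np.concatenate(hidden)
print(y_hat.shape)
```

Because the hidden state is the product of a sigmoid gate and a tanh, each of its entries stays strictly inside (-1, 1), which keeps the input to the regression layer bounded.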
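The model is trained using backpropagation with the mean squared error (MSE) as the loss function:

$$\mathcal{L}_{\mathrm{MSE}} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2$$

Where $N$ is the number of samples, and $y_i$ and $\hat{y}_i$ are the actual and predicted fuel consumption rates of the $i$-th sample.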
The architecture begins by enhancing the input through decomposition and sequential context enrichment, followed by global-local learning via attention and memory units, and culminates in a compact yet expressive prediction output. This hybrid structure not only leverages the strengths of each module but also addresses the key limitations of traditional time-series forecasting under noisy, nonlinear marine operating conditions.
Performance evaluation of the fuel consumption forecasting model
The experiments were conducted on a high-performance Windows workstation equipped with an AMD Ryzen 9 7945HX CPU, 16 GB RAM, and an NVIDIA GeForce RTX 4060 GPU (8 GB). The Transformer-LSTM-A model was developed using MATLAB R2024b and trained via the Deep Learning Toolbox. Z-score normalization was applied to all input features. The model was trained for 300 epochs with a batch size of 128, using the Adam optimizer and Mean Squared Error (MSE) as the loss function. A sequence length of 4 was adopted, with 8 features per time step and 20 historical FCR points included. The input composition strategy involved combining 32 predictive features with 20 historical values, producing 5-step ahead forecasts. Feature decomposition was performed using Variational Mode Decomposition with 7 Intrinsic Mode Functions and 1 residual. The Transformer module utilized 4 attention heads, each with a key dimension of 8, while the LSTM decoder included 32 hidden units. The learning rate was initialized at 0.001, and activation functions included tanh and sigmoid for LSTM layers, and ReLU for the Transformer encoder.
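For reference, the reported settings can be collected in a single configuration object. The study itself was implemented in MATLAB; the Python dataclass below is only an illustrative mirror of the stated hyperparameters, and the class and field names are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as stated in the text; field names are illustrative."""
    epochs: int = 300
    batch_size: int = 128
    learning_rate: float = 1e-3
    seq_len: int = 4
    features_per_step: int = 8
    history_points: int = 20
    horizon: int = 5
    vmd_modes: int = 7
    attention_heads: int = 4
    key_dim: int = 8
    lstm_hidden_units: int = 32

    @property
    def input_dim(self) -> int:
        # 4 steps x 8 features = 32 predictive features, plus 20 historical FCR values.
        return self.seq_len * self.features_per_step + self.history_points

cfg = TrainConfig()
print(cfg.input_dim)
```

Deriving the input width from the other fields makes the 32 + 20 = 52 composition explicit and keeps it consistent if the window length or history size is changed.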
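To evaluate the performance of the proposed model objectively, multiple statistical error metrics were employed. The definition and interpretation of these indicators are as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|, \qquad \mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}$$

Here $y_i$ and $\hat{y}_i$ are the actual and predicted values, $\bar{y}$ is the mean of the actual values, and $N$ is the number of samples. MAE quantifies the average absolute error, offering an intuitive measure of accuracy that is less sensitive to outliers than RMSE. MAPE expresses errors as a percentage, enabling scale-independent evaluation. RMSE emphasizes larger deviations, reflecting prediction consistency. The coefficient of determination $R^2$ measures the proportion of variance in the observed values that is explained by the model, with values closer to 1 indicating a better fit.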
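These four metrics can be computed with a few lines of NumPy, as sketched below using their standard definitions. The function name and the three-point toy example are assumptions; the numbers are unrelated to the results reported in this study.

```python
import numpy as np

def regression_metrics(y_true, y_pred) -> dict:
    """Standard MAE, MAPE (%), RMSE, and R^2; assumes no zero values in y_true for MAPE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mape = 100.0 * np.mean(np.abs(err / y_true))
    rmse = np.sqrt(np.mean(err ** 2))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MAE": float(mae), "MAPE": float(mape), "RMSE": float(rmse), "R2": float(r2)}

# Toy example (assumed values).
m = regression_metrics([1.0, 2.0, 4.0], [1.0, 2.5, 3.0])
print(m)
```

In this toy case the single error of 1.0 contributes one third of the MAE budget but dominates the RMSE, which shows why RMSE is the more outlier-sensitive of the two.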
To further validate model effectiveness, predicted fuel consumption trajectories were compared against actual values across datasets. Figure 4(a) to (c) shows 500 samples from each data partition, where the red lines represent predicted values and the blue dots indicate ground truth. The model accurately captures the fluctuations and level transitions in fuel consumption under various operational conditions. In particular, both rapid changes and stable plateaus are well predicted, with minimal visible lag or offset.

Predicted versus actual fuel consumption. Time series: (a) training set, (b) validation set, and (c) testing set. Density correlation analysis: (d) training set, (e) validation set, and (f) testing set.
Model fidelity was further assessed using density scatter plots (Figure 4(d)–(f)), where predicted values closely align with the reference line y = x, indicating strong agreement with ground truth. The high $R^2$ values of 0.9981 (training), 0.9972 (validation), and 0.9934 (test) confirm the model’s excellent predictive accuracy and generalization across datasets.
As shown in the experimental results, the proposed model demonstrates excellent predictive performance across both voyages and dataset partitions. For the Zhoushan–Maranhão route, it achieves a training MSE of 0.0006 and an $R^2$ of 0.9981, indicating strong fitting ability. On the validation and testing sets, the RMSE remains low at 0.0365 and 0.0507 respectively, with MAPE values of 2.23% and 2.58%, reflecting high generalization and stability. Similarly, for the Maranhão–Singapore route, the training phase yields an MSE of 0.0003 and an $R^2$ of 0.9969, while the test set records an RMSE of 0.0402 and a MAPE of just 1.10%. These metrics collectively confirm the model’s robustness and reliability in short-term fuel consumption forecasting under diverse operational scenarios.
To evaluate real-world applicability, a forward prediction test was performed using only historical data. As shown in Figure 5(a), the model predicted fuel consumption for five future steps based on the preceding 25 observations. Despite the absence of future inputs, it accurately captured local trends and magnitudes, demonstrating strong generalization and adaptability to dynamic maritime conditions.

Short-term forecasting and uncertainty estimation of fuel consumption: (a) Real-time fuel prediction, (b) Regression fit with residual histograms, and (c) Fuel consumption prediction intervals.
A comprehensive evaluation of the Transformer-LSTM-A model’s regression performance and uncertainty quantification is presented in Figure 5(b) and (c). Figure 5(b) illustrates predicted versus actual fuel consumption for training and testing datasets, accompanied by regression lines, 95% confidence bands, and marginal histograms. The tight clustering of data points along the fitted lines and narrow confidence intervals indicate high predictive fidelity, low variance, and minimal overfitting. Symmetric, unimodal histograms further confirm centered residuals and stable performance.
Figure 5(c) demonstrates predictive uncertainty through multi-level prediction intervals (90%, 70%, 50%), generated via an ensemble sampling strategy (η = 100). The predicted trajectory remains closely aligned with observed values, and the well-nested, stable-width intervals reflect both sharpness and coverage. These results affirm the model’s capability for accurate point forecasting and reliable uncertainty estimation, supporting risk-aware decision-making in maritime operations.
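Nested prediction intervals of this kind can be derived from an ensemble of forecasts by taking empirical quantiles, as sketched below. How the study generates its 100 ensemble members is not described here, so the Gaussian perturbation of a hypothetical point forecast is purely an assumption used to produce sample data.

```python
import numpy as np

def prediction_intervals(samples: np.ndarray, levels=(0.9, 0.7, 0.5)) -> dict:
    """samples: (n_members, horizon). Returns {level: (lower, upper)} central empirical intervals."""
    out = {}
    for level in levels:
        alpha = (1.0 - level) / 2.0
        lower = np.quantile(samples, alpha, axis=0)
        upper = np.quantile(samples, 1.0 - alpha, axis=0)
        out[level] = (lower, upper)
    return out

rng = np.random.default_rng(42)
point_forecast = np.array([1.00, 1.02, 1.05, 1.03, 1.01])        # hypothetical 5-step forecast
ensemble = point_forecast + rng.normal(0.0, 0.02, size=(100, 5))  # assumed ensemble of 100 members
bands = prediction_intervals(ensemble)
print({level: (lo.round(3), hi.round(3)) for level, (lo, hi) in bands.items()})
```

Because empirical quantiles are monotone in the quantile level, the 50% band always lies inside the 70% band, which in turn lies inside the 90% band.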
Comparative performance assessment across architectures
As part of the model evaluation process, a comparative analysis was carried out against a suite of benchmark models, including LSTM, GRU, CNN, Transformer, and several hybrid variants (CNN-LSTM, LSTM-LSTM, CNN-GRU). The comparative results on the test dataset are illustrated in Figure 6, where the predicted fuel consumption curves generated by different models are plotted against the ground truth trajectory.

Figure 6. Comparative visualization of fuel consumption forecasts from multiple models.
As shown in the figure, the Transformer-LSTM-A model (dark red) aligns closely with actual fuel consumption across both steady and transient phases. In regions with abrupt changes, highlighted in the zoomed subplots, it outperforms conventional models by exhibiting lower prediction errors and reduced oscillations. While RNN variants struggle with long-range dependencies and CNN-based models respond with a delay during transitions, the Transformer-LSTM-A architecture combines global self-attention with local temporal modeling to capture both structural and sequential dynamics. Against the Transformer, LSTM, GRU, and CNN-LSTM baselines, the proposed predictor exhibits a consistently lower error profile across voyages and operating regimes. The residual envelopes and tail behavior in Figure 6 indicate tighter concentration and fewer extreme deviations for the proposed model, which is consistent with its multi-scale feature fusion and attention-aided long-range dependency modeling.
This synergy enables high sensitivity to rapid fluctuations and robust tracking in stable intervals, avoiding the over-smoothing seen in other models. The model demonstrates a favorable bias-variance trade-off, making it well suited to complex maritime conditions with nonlinear and variable fuel consumption patterns, and supporting tasks such as energy optimization, emissions control, and voyage planning under uncertainty. By integrating global dependencies, local continuity, and adaptive feature weighting through a temporal attention gate, the Transformer-LSTM-A reduces RMSE, MAE, and MAPE relative to the single-model baselines. Computational and overfitting costs are curbed with compact heads, dropout, and early stopping on blocked validation splits.
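For reference, the three error metrics named above follow their standard definitions. The sketch below uses made-up values and hypothetical function names; it does not reproduce the paper's reported scores.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; y_true must be nonzero."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only (t/h).
actual = [1.78, 1.75, 1.80, 1.72]
predicted = [1.76, 1.77, 1.79, 1.70]
print(f"RMSE={rmse(actual, predicted):.4f}  "
      f"MAE={mae(actual, predicted):.4f}  "
      f"MAPE={mape(actual, predicted):.2f}%")
```

RMSE penalizes large deviations more heavily than MAE, so reporting both helps separate occasional large misses during transients from a uniformly small error.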
The collaborative optimization framework for ship energy efficiency
Design of segment-based optimization framework
This section presents a collaborative optimization framework aimed at improving ship energy efficiency under dynamic maritime conditions. The approach integrates a Transformer-LSTM-A-based fuel consumption prediction model with a segment-wise multi-objective optimization algorithm based on NSGA-III. By jointly optimizing navigation speed and trim, the framework seeks to minimize fuel consumption, CO2 emissions, and the Energy Efficiency Operational Indicator (EEOI), while incorporating distance deviation penalties and other soft constraints. As illustrated in Figure 7, the framework comprises four stages (data acquisition, historical sequence construction, predictive modeling, and population-based optimization), enabling an effective coupling of prediction and decision-making for optimal voyage planning. The full decision vector over N segments is expressed as:
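One way to write this vector is given below. The symbols are assumptions introduced here for illustration: v_i denotes the average speed over ground and θ_i the trim in segment i, and the interleaved ordering and the bound names may differ from the original notation.

```latex
\mathbf{x} = \left[\, v_1,\ \theta_1,\ v_2,\ \theta_2,\ \dots,\ v_N,\ \theta_N \,\right]^{\top} \in \mathbb{R}^{2N},
\qquad
v_{\min} \le v_i \le v_{\max},\quad
\theta_{\min} \le \theta_i \le \theta_{\max},\quad
i = 1, \dots, N
```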

Figure 7. The construction of the collaborative optimization processes.
Decision variables and parameters.
The raw operational dataset, initially sampled at 10-min intervals, is resampled into 2-h coarse-grained segments, where navigational and environmental variables—including wind speed, engine RPM, speed over ground, shaft power, engine load, torque, trim, and fuel consumption—are aggregated into representative average vectors. For each segment i, the average speed over ground and trim are defined as decision variables, forming a 2N-dimensional decision vector over N segments. Variable bounds and soft constraints are imposed to define the feasible search space. A population matrix of size M×2N encodes all candidate operational profiles, and the prediction model evaluates energy performance across scenarios. Additionally, a spatial update mechanism estimates vessel displacement after each segment by projecting optimized speed over ground using great-circle navigation. The resulting coordinates inform the next segment’s initialization, ensuring geographic continuity and enabling a closed-loop, segment-wise optimization process.
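The spatial update step can be illustrated with the standard great-circle destination formula on a spherical Earth. This is a sketch under stated assumptions: the function name `advance_position`, the start coordinates, and the per-segment bearing are hypothetical, since the source does not say how the course for each segment is obtained.

```python
import math

# Mean Earth radius converted from kilometres to nautical miles.
EARTH_RADIUS_NM = 6371.0 / 1.852


def advance_position(lat_deg, lon_deg, bearing_deg, speed_kn, hours):
    """Return (lat, lon) in degrees after sailing a great-circle arc.

    The distance covered is speed over ground (knots) times duration (hours).
    """
    delta = speed_kn * hours / EARTH_RADIUS_NM  # angular distance in radians
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    theta = math.radians(bearing_deg)

    sin_lat2 = (math.sin(lat1) * math.cos(delta)
                + math.cos(lat1) * math.sin(delta) * math.cos(theta))
    lat2 = math.asin(max(-1.0, min(1.0, sin_lat2)))  # clamp against rounding
    lon2 = lon1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * math.sin(lat2),
    )
    # Wrap longitude into [-180, 180).
    lon2_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg


# Chain two 2-h segments; speeds, bearing, and start point are illustrative.
position = (-20.0, 118.0)
for speed in (12.4, 13.4):
    position = advance_position(position[0], position[1], 300.0, speed, 2.0)
    print(f"lat={position[0]:.4f}, lon={position[1]:.4f}")
```

Feeding each segment's end point in as the next segment's start point is what gives the geographic continuity described above.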
Formulation of the multi-objective optimization model
A novel multi-objective optimization framework is proposed to improve energy efficiency in dynamic maritime conditions by simultaneously optimizing navigation speed and trim across voyage segments. The optimization minimizes four performance objectives: fuel consumption, CO2 emissions, transport efficiency (expressed through the EEOI), and route consistency (expressed through a voyage deviation penalty). The NSGA-III algorithm is employed with a population size of 105, a crossover probability of 0.9, a crossover distribution index of 20, and a mutation probability of 0.5, ensuring effective exploration of the solution space and robust convergence. The first objective (fuel consumption) is defined as:
where
Additionally, to support multi-objective evaluation, three supplementary objectives are introduced in addition to predicted fuel consumption:
Where,
Case study and optimization result analysis
Description of the research object
The effectiveness of the proposed collaborative optimization framework was validated through a case study on a large ore carrier commonly used in global long-haul bulk transportation. The selected vessel has a deadweight tonnage of 398,351 tons, an overall length of 361.9 m, a beam of 65.07 m, a draft of 23.0 m, a depth of 30.4 m, and a gross tonnage of 203,982, with a design service speed of 14.5 knots. Equipped with an energy efficiency data acquisition system and real-time monitoring sensors, the ship enables continuous collection of navigational, environmental, and engine-related parameters under varying sea states. These data form the empirical foundation for predictive modeling and segment-wise optimization, while the vessel’s specifications serve as constraints and references in the modeling process.
Optimization result analysis
Beyond the numerical results reported for the voyage-segmented NSGA-III, we provide a literature-grounded methodological comparison with alternative optimizers. In four-objective settings, NSGA-III’s reference-vector guidance is specifically designed to maintain diversity and convergence on high-dimensional Pareto fronts, which is an advantage over NSGA-II when there are more than three objectives and a well-spread trade-off set is required. Decomposition-based optimization is a strong alternative but typically requires problem-specific weight-vector design and may under-represent non-convex regions without careful adaptation; MOPSO variants are competitive yet sensitive to archive management and parameterization. Given the segment-level, bound-constrained, and non-convex landscape, the choice of NSGA-III is further supported by recent maritime applications in which collaborative predictor-optimizer schemes using NSGA-III demonstrated fuel-saving benefits.
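The reference-vector guidance mentioned above is usually built on structured reference points on the unit simplex, commonly generated with the Das-Dennis construction. The sketch below assumes that construction and an illustrative partition count of 6; the source does not state how its reference directions were generated.

```python
from itertools import combinations
from math import comb
from typing import List


def das_dennis(n_obj: int, n_partitions: int) -> List[List[float]]:
    """Enumerate reference points whose components are multiples of
    1/n_partitions and sum to 1 (stars-and-bars enumeration)."""
    total_slots = n_partitions + n_obj - 1
    points = []
    for bars in combinations(range(total_slots), n_obj - 1):
        counts = []
        previous = -1
        for bar in bars:
            counts.append(bar - previous - 1)  # stars between two bars
            previous = bar
        counts.append(total_slots - previous - 1)  # stars after the last bar
        points.append([c / n_partitions for c in counts])
    return points


# Four objectives as in the text; 6 partitions is an illustrative choice.
refs = das_dennis(n_obj=4, n_partitions=6)
print(len(refs), comb(6 + 4 - 1, 4 - 1))
print(refs[0], refs[-1])
```

Each reference point defines a direction from the ideal point, and NSGA-III's niching step associates candidate solutions with the nearest direction so that sparsely covered regions of the front are favored during selection.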
A segment-wise optimization strategy was employed by discretizing the voyage into 141 sequential intervals, each evaluated under distinct environmental and operational conditions. This structure enables adaptive adjustment of speed and trim to dynamic sea states. The proposed framework couples a Transformer-LSTM-A-based fuel prediction model with the NSGA-III algorithm, using predicted fuel consumption to derive CO2 emissions and the Energy Efficiency Operational Indicator (EEOI) based on IMO standards. Along with a voyage deviation penalty, these metrics form a four-objective trade-off space in which non-dominated solutions are identified, achieving fine-grained and flexible energy-efficient voyage planning.
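The derivation of CO2 emissions and EEOI from predicted fuel use can be sketched with the standard single-voyage IMO form, EEOI = Σ(FC × C_F) / (m_cargo × D). The carbon factor, cargo mass, and segment values below are illustrative assumptions, not figures from the case study, and the function names are hypothetical.

```python
from typing import Sequence

# Assumed fuel-to-CO2 conversion factor (t CO2 per t fuel) for heavy fuel oil;
# the source does not state the factor it uses.
CARBON_FACTOR_HFO = 3.114


def co2_emissions(fuel_tonnes: float, carbon_factor: float = CARBON_FACTOR_HFO) -> float:
    """CO2 mass (t) emitted by burning a given fuel mass (t)."""
    return fuel_tonnes * carbon_factor


def eeoi(
    fuel_by_segment: Sequence[float],
    distance_nm_by_segment: Sequence[float],
    cargo_tonnes: float,
    carbon_factor: float = CARBON_FACTOR_HFO,
) -> float:
    """Voyage EEOI in t CO2 per (t cargo x nautical mile)."""
    total_co2 = sum(co2_emissions(f, carbon_factor) for f in fuel_by_segment)
    transport_work = cargo_tonnes * sum(distance_nm_by_segment)
    return total_co2 / transport_work


# Illustrative 2-h segments: predicted fuel rate (t/h) and speed (kn).
hours = 2.0
fuel_rates = [1.78, 1.70, 1.74]
speeds = [12.4, 13.4, 13.0]
fuel = [rate * hours for rate in fuel_rates]
distance = [speed * hours for speed in speeds]
value = eeoi(fuel, distance, cargo_tonnes=300_000.0)
print(f"EEOI = {value * 1e3:.6f} kg CO2 / (t*nm)")
```

Because the cargo mass is fixed over a voyage, EEOI falls whenever emissions per nautical mile fall, which is why a faster but more efficient profile can improve it even as the distance covered grows.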
As shown in Figure 8, the structural evolution of the solution set begins with an initial population of 105 randomly generated individuals, each encoding a complete voyage profile with segment-wise speed and trim across 141 intervals. These individuals, visualized via parallel coordinate plots, show high dispersion and irregularity across the four normalized objectives (fuel consumption rate, CO2 emissions, EEOI, and voyage deviation penalty), indicating both diversity and suboptimality at the initial stage. The final non-dominated set in Figure 8 forms a structured and well-spread Pareto front: relative to the randomly scattered initial population, the solution set contracts coherently along the FCR, CO2, and EEOI dimensions, indicating simultaneous improvement in energy and carbon objectives and validating the convergence behavior of the voyage-segmented NSGA-III. The transition from a diffuse cloud to a cohesive front further suggests that the predictor provides a sufficiently informative fitness signal for the optimizer to discriminate and retain high-performing profiles under nonstationary operating conditions.
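The notion of a non-dominated set used above can be made concrete with a plain Pareto filter for minimization. The four-objective tuples below are invented for illustration, and this brute-force filter is not the fast non-dominated sorting routine used inside NSGA-III.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: Sequence[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates (all minimized)."""
    front = []
    for i, candidate in enumerate(points):
        dominated = any(
            dominates(other, candidate)
            for j, other in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(candidate)
    return front


# Illustrative (FCR, CO2, EEOI x 1e3, deviation penalty) tuples.
population = [
    (1.70, 9.73, 2.960, 0.08),
    (1.78, 10.03, 3.005, 0.00),
    (1.75, 9.90, 2.980, 0.04),
    (1.80, 10.10, 3.010, 0.05),
]
front = non_dominated(population)
print(front)
```

The first three profiles trade lower fuel and emissions against a larger deviation penalty, so none dominates another; the fourth is worse than the second in every objective and is discarded.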

Figure 8. Parallel coordinate visualization of initial and optimized solution populations: (a) Initial population and (b) Optimized population.
Pareto-efficient profiles exhibit moderate, segment-specific reallocations: mean speed over ground increases from 12.419 to 13.423 kn and mean trim decreases from 1.59 to 1.39 m. These changes are distributed non-uniformly across segments, consistent with hydrodynamic efficiency gains that do not compromise stability or route feasibility. In outcome space, the reported voyage is Pareto-efficient, jointly reducing FCR (1.784 → 1.699 t/h) and CO2 (10.034 → 9.729 t/h) at unchanged duration, with EEOI improved from 3.005 × 10⁻³ to 2.960 × 10⁻³ kg/(t·n mile). These solutions deliver material fuel- and carbon-efficiency gains under bounded trade-offs, supporting operator selection under policy and schedule constraints. Under matched decision variables, constraints, and compute budgets, NSGA-III consistently attains higher hypervolume and lower inverted generational distance than NSGA-II across multiple independent runs, indicating a stable, well-covered front rather than an isolated instance. As optimization progresses, iterative application of crossover, mutation, and non-dominated sorting in the NSGA-III algorithm drives convergence toward structured and efficient regions. The final Pareto front, shown in Figure 8(b), demonstrates marked reductions in FCR, CO2, and EEOI, confirming the optimizer’s convergence and the Transformer-LSTM-A model’s capacity to guide the evolution of high-performing control strategies.
At the control variable level, segment-specific adjustments in speed and trim are determined by the optimizer. As shown in Figure 9(a), the average sailing speed increases from 12.419 to 13.423 knots, while the mean trim decreases from 1.59 to 1.39 m. These adjustments are non-uniform and tailored to individual segments, enhancing hydrodynamic efficiency without compromising stability or navigational feasibility. Figure 9(b) presents the corresponding energy and emission outcomes, where the mean fuel consumption rate (FCR) is reduced from 1.784 t/h to 1.699 t/h, and CO2 emissions decline from 10.034 to 9.729 t/h. Notably, the voyage distance increases from 3184.15 to 3441.34 nautical miles, while total duration remains constant at 258 h—indicating that efficiency improvements stem from propulsion optimization rather than relaxed scheduling. These findings highlight the effectiveness of segment-wise control in achieving simultaneous reductions in fuel use and emissions.

Figure 9. Comparison of ship energy-efficient operational parameters before and after optimization: (a) Speed and trim: original vs. optimized and (b) Fuel and CO2: pre- and post-optimization.
The Energy Efficiency Operational Indicator (EEOI) exhibits a consistent reduction across the optimized segments, reflecting enhanced carbon efficiency per unit transport work. As reported in Table 1, the average EEOI decreases from 3.005 × 10⁻³ to 2.960 × 10⁻³ kg/(t·n mile), achieved without altering cargo load or voyage duration.
Table 1. Summary of key performance indicators before and after optimization.
At a fixed voyage duration and under identical decision encodings, constraints, and compute budgets, both evolutionary optimizers improve energy and carbon metrics relative to the original profile (Table 1). Compared with the original, NSGA-II reduces FCR by 2.30%, CO2 by 1.51%, and EEOI by 0.73%, with voyage distance increasing by 4.04%. NSGA-III achieves larger gains: FCR −4.76%, CO2 −3.04%, and EEOI −1.50%, alongside an 8.08% increase in distance. Relative to NSGA-II, NSGA-III delivers additional reductions of 2.52% (FCR), 1.55% (CO2), and 0.77% (EEOI). These improvements are attained through moderate, segment-specific control reallocations that remain within operational bounds: from Original to NSGA-II to NSGA-III, mean SOG rises from 12.419 to 13.021 to 13.423 kn and mean trim decreases from 1.59 to 1.48 to 1.39 m. Overall, NSGA-III provides the most favorable operational trade-offs among the three profiles.
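The percentage changes quoted for the NSGA-III profile can be re-derived from the absolute values reported in the text. The following is a minimal check; the dictionary keys and the helper name `pct_change` are chosen here for illustration.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Values reported in the text for the original and NSGA-III profiles.
original = {"FCR_t_per_h": 1.784, "CO2_t_per_h": 10.034,
            "EEOI_kg_per_t_nm": 3.005e-3, "distance_nm": 3184.15}
nsga3 = {"FCR_t_per_h": 1.699, "CO2_t_per_h": 9.729,
         "EEOI_kg_per_t_nm": 2.960e-3, "distance_nm": 3441.34}

changes = {key: pct_change(original[key], nsga3[key]) for key in original}
for key, value in changes.items():
    print(f"{key}: {value:+.2f}%")
```

The distance change evaluates to about +8.08%, which also matches the ratio of the mean speeds (13.423 / 12.419) at the fixed 258 h duration.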
As summarized in the results, the proposed framework achieves a 4.76% reduction in fuel consumption rate, a 3.04% decrease in CO2 emissions, and a 1.50% improvement in EEOI, despite an 8.08% increase in sailing distance. These gains result from the systematic evolution of the solution set, from a diverse, randomly initialized population to a well-converged Pareto front, as validated through parallel coordinate analysis of both initial and optimized solutions. This evolutionary trajectory demonstrates the effectiveness and transparency of the optimization process. Overall, the segment-based, prediction-informed, and Pareto-guided framework offers a scalable and interpretable approach to marine energy optimization, enabling consistent performance improvements under complex and dynamic maritime conditions.
Conclusion and perspectives
In this study, a segment-level optimization framework was proposed by coupling a Transformer-LSTM-A prediction model with NSGA-III to enhance ship energy efficiency. The voyage was divided into 141 segments, enabling adaptive control of speed and trim under varying sea conditions. Leveraging attention mechanisms and VMD-based features, the model provided accurate fuel forecasts to guide multi-objective optimization. Starting from 105 random solutions, the algorithm converged to a Pareto front balancing FCR, CO2 emissions, EEOI, and route deviation. Real-world experiments achieved 4.76% FCR reduction, 3.04% CO2 reduction, and 1.50% EEOI improvement without extending voyage time, demonstrating the framework’s effectiveness.
Future work will explore real-time environmental integration, reinforcement learning-based optimization, and broader validation across vessel types and routes. Incorporating economic and risk-based objectives may further enhance the model’s applicability to sustainable maritime operations.
Key abbreviations used in this study are listed below:
Footnotes
Ethical considerations
This study did not involve human participants, human data, or human tissue, and therefore ethics approval was not required.
Consent to participate
Not applicable.
Author contributions
In this study, Saihao Zhu was responsible for the main writing and preparation of the article; Guichen Zhang provided the technical guidance and methodological support; and Enrui Zhao assisted with the required technical tasks.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
