Abstract
In recent years, neural heuristics based on deep reinforcement learning have shown considerable promise for multi-objective combinatorial optimization problems (MOCOPs). Nonetheless, achieving both high learning efficiency and high solution quality remains challenging. To address this, we propose a multi-objective optimization algorithm grounded in information geometry and machine learning, which integrates adaptive gradient descent with meta-reinforcement learning to tackle MOCOPs effectively. Specifically, we present a meta-learning framework that improves model performance on multi-objective combinatorial optimization through tensor remodeling, preconditioned gradient descent, and entropy regularization. Experimental results demonstrate that the proposed method yields significant performance improvements on several classic MOCOP benchmarks, including the multi-objective traveling salesman problem (MOTSP), the multi-objective capacitated vehicle routing problem (MOCVRP), and the multi-objective knapsack problem (MOKP).
