VARIATIONAL METHODS FOR NEURAL NETWORK TRAINING: APPLICATIONS OF STURM-LIOUVILLE ENERGY ESTIMATES

Massimiliano Ferrara

doi:10.17654/0972087125015

Keywords:

variational methods, artificial neural networks, boundary value problems

Abstract

This paper establishes a novel connection between local minimization principles for Sturm-Liouville equations and optimization techniques used in training neural networks. By interpreting the training of neural networks as a variational problem, we demonstrate how recent results on energy estimates for mixed boundary value problems in Sturm-Liouville theory can be adapted to analyze and improve neural network convergence. We present two main theorems: the first establishes conditions for guaranteed convergence to non-zero local minima in neural network training, and the second demonstrates the existence of multiple critical points with energy estimates. Our theoretical results are supported by experimental validation on benchmark datasets, showing improved performance in avoiding trivial solutions during training. This work bridges the gap between classical differential equation theory and modern machine learning optimization.

Received: April 12, 2025
Accepted: May 9, 2025


Far East Journal of Mathematical Sciences (FJMS)

Pushpa Publishing House
