Optimization of Numerical Algorithms for Solving Large Linear Equation Systems in Industrial Mathematical Computing
DOI:
https://doi.org/10.62951/ijamc.v1i4.275
Keywords:
Gauss–Seidel, Jacobi, LU decomposition, Numerical computing, Parallel algorithm
Abstract
Advances in modern computing have driven extensive research on numerical algorithms for solving large-scale systems of linear equations. Classical methods such as LU decomposition, Jacobi, and Gauss–Seidel have been revisited and optimized for parallel architectures, GPUs, and even quantum platforms. Recent studies report that optimized algorithms can reduce computation time by more than 50% while maintaining high accuracy on high-dimensional problems. LU decomposition, particularly in its parallel and GPU-based implementations, has shown superior performance in batch processing and industrial-scale simulations. Iterative methods such as Jacobi and Gauss–Seidel remain relevant because of their flexibility in numerical modeling, with further developments for block matrix systems, finite element applications, and FPGA architectures. Integrating these enhanced algorithms benefits scientific software development and supports practical applications in engineering simulations, large-scale data optimization, and machine learning. An integrative review of modern numerical algorithm developments is therefore crucial for bridging the gap between industrial demands and research progress in scientific computing.
License
Copyright (c) 2024 International Journal of Applied Mathematics and Computing

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


