
Citation:
 Njideka Chima-Amaeshi, Chris O’Malley, Mark Willis. Predicting Marine Fuel with High Sulphur Content Using Machine Learning Algorithms[J]. Journal of Marine Science and Application, 2026, (2): 617-629. [doi:10.1007/s11804-025-00674-9]

Predicting Marine Fuel with High Sulphur Content Using Machine Learning Algorithms

Info

Title:
Predicting Marine Fuel with High Sulphur Content Using Machine Learning Algorithms
Author(s):
Njideka Chima-Amaeshi, Chris O’Malley, Mark Willis
Affiliations:
School of Engineering, Newcastle University, Newcastle Upon Tyne, NE1 7RU, UK
Keywords:
Machine learning|Marine fuel|Support vector machines|Agglomerative hierarchical clustering|High sulphur fuel oil|Very low sulphur fuel oil
Classification number:
-
DOI:
10.1007/s11804-025-00674-9
Abstract:
Marine transportation is a significant source of air pollution, especially around coastal areas, with maritime vessels accounting for 12% of global sulphur oxide emissions in 2014 alone. To comply with International Maritime Organisation (IMO) regulations, the sulphur content of marine fuels is typically determined using lengthy laboratory-based analyses. The regulations prohibit the use of High-Sulphur Fuel Oil (HSFO) (>0.5% sulphur by weight) in Emission Control Areas (ECA). There is a need for a more efficient means of predicting sulphur content and differentiating between HSFO and Very Low Sulphur Fuel Oil (VLSFO) samples. This study compares a Support Vector Machine (SVM) and an Agglomerative Hierarchical Clustering (AHC) algorithm enhanced with Principal Component Analysis (PCA) for dimensionality reduction, applied to classifying marine fuel samples as HSFO or VLSFO. The models use near-infrared (NIR) industrial data from North Sea operations correlated with laboratory-measured sulphur values, rather than relying on lengthy laboratory-based measurements. The study also compares two pre-processing options: normalising the data by setting the area under the curve to one, and standardising it by subtracting the mean of the predictor variables and scaling by the standard deviation. The results show that although >70% of HSFO samples were accurately predicted with the SVM, a better result was achieved using the unsupervised learning approach of AHC/PCA, with >80% of HSFO samples correctly predicted despite the imbalance in the industrial data. This provides vessel operators with an effective model for rapid and well-informed decision-making. Normalising the area under the curve to one produced similar results to using standardised data.
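
The two pre-processing options named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `normalise_area` and `standardise`, the use of the trapezoidal rule for the area under the curve, the sample standard deviation, and the synthetic spectra are all assumptions made for illustration only.

```python
import numpy as np


def normalise_area(spectra, wavelengths):
    """Scale each spectrum (row) so that its area under the curve equals one.

    The area is estimated with the trapezoidal rule (an assumption; the
    abstract does not say how the area is computed).
    """
    spectra = np.asarray(spectra, dtype=float)
    widths = np.diff(np.asarray(wavelengths, dtype=float))
    # Mean height of each adjacent pair of points times the interval width.
    areas = ((spectra[:, 1:] + spectra[:, :-1]) / 2.0 * widths).sum(axis=1)
    return spectra / areas[:, None]


def standardise(spectra):
    """Subtract the mean of each predictor (column) and divide by its
    sample standard deviation."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=0)
    std = spectra.std(axis=0, ddof=1)
    return (spectra - mean) / std


# Synthetic stand-in for NIR spectra: 6 samples x 50 wavelengths.
# These values are random and carry no chemical meaning.
rng = np.random.default_rng(0)
wavelengths = np.linspace(1000.0, 2500.0, 50)
spectra = rng.uniform(0.1, 1.0, size=(6, 50))

area_scaled = normalise_area(spectra, wavelengths)
z_scored = standardise(spectra)
```

Area normalisation works row by row, one spectrum at a time, whereas standardisation works column by column, one predictor variable at a time. Either output could then be passed to a classifier or clustering step.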

Memo

Memo:
Received date: 2024-12-27; Accepted date: 2025-04-22.
Foundation item: This work was supported by Newcastle University and the Engineering and Physical Sciences Research Council (EPSRC) [grant numbers 2020/21 DTP: ref.EP/T517914/1].
Corresponding author: Njideka Chima-Amaeshi, E-mail: n.chima-amaeshi2@newcastle.ac.uk
Last Update: 2026-06-08