Abstract
This work applies Machine Learning (ML) regression and feature ranking techniques to water quality monitoring from remotely sensed data. The regression methods investigated are Gaussian Process Regression (GPR), Support Vector Regression (SVR) and Partial Least Squares Regression (PLSR). Feature relevance in the GPR model is assessed by a probabilistic Sensitivity Analysis (SA) approach. This thesis introduces the SA of the predictive mean and variance functions of the GPR, which reveal the relevance of the input features and the spectral spacing of the input space, respectively. The approach was applied to both controlled and Chlorophyll-a (Chl-a)/Remote sensing reflectance (Rrs) matchup datasets with promising results.
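As an illustration of how a sensitivity analysis of the GPR predictive mean can rank input features, the following is a minimal sketch and not the implementation used in the thesis. It assumes one common formulation, in which the sensitivity of feature j is the average squared partial derivative of the predictive mean with respect to that feature, evaluated at the training points. The RBF kernel, the fixed hyperparameters (length_scale, noise_var), the function names and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, length_scale):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / length_scale**2)

def gpr_mean_sensitivity(X, y, length_scale=1.0, noise_var=1e-2):
    """Average squared gradient of the GPR predictive mean, per feature.

    Hyperparameters are fixed here for simplicity (an assumption); in
    practice they would be estimated from the data.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, length_scale)
    # Predictive mean: mu(x) = sum_i alpha_i * k(x, x_i)
    alpha = np.linalg.solve(K + noise_var * np.eye(n), y)
    # d mu(x_q) / d x_j = sum_i alpha_i * k(x_q, x_i) * (-(x_qj - x_ij) / l^2)
    diff = X[:, None, :] - X[None, :, :]  # shape (query, train, feature)
    grad = -(K[:, :, None] * diff * alpha[None, :, None]).sum(axis=1)
    grad /= length_scale**2
    return (grad**2).mean(axis=0)

# Synthetic data (hypothetical): only feature 0 drives the target.
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1.0, 1.0, size=(80, 3))
y_demo = np.sin(2.0 * X_demo[:, 0]) + 0.05 * rng.standard_normal(80)

sens = gpr_mean_sensitivity(X_demo, y_demo)
sa_ranking = np.argsort(sens)[::-1]  # most relevant feature first
```

In this toy setting the feature that actually drives the target should receive the largest sensitivity; with real Rrs spectra, the features would be spectral bands.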
The SA of the predictive mean function of the GPR was compared with the Automatic Relevance Determination (ARD) and Variable Importance in Projection (VIP) feature ranking methods. ARD is associated with the GPR model, and VIP assigns relevance to the input features in the PLSR model. The comparison showed that feature ranking methods can be used not only to reduce dimensionality while still obtaining satisfactory regression performance, but also to reveal the underlying biophysical properties of aquatic environments.
Feature ranking methods and ML regression models were combined to design an Automatic Model Selection Approach (AMSA). AMSA automatically compares and validates regression models by evaluating the number and combination of ranked input features. The output of AMSA is a regression model together with the number and position of the features that yield the strongest model according to user-defined statistical measures. AMSA was tested on several Chl-a/Rrs matchup datasets representing various water conditions.
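The selection loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the AMSA implementation from the thesis: linear ridge and kernel ridge regression stand in for the GPR, SVR and PLSR models, cross-validated RMSE stands in for the user-defined statistical measure, and the feature ranking is supplied by hand. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Linear ridge regression with an intercept; returns a predict function."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y)
    return lambda Z: np.hstack([Z, np.ones((Z.shape[0], 1))]) @ w

def fit_kernel_ridge(X, y, lam=1e-2, length_scale=1.0):
    """RBF kernel ridge regression; returns a predict function."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-0.5 * d2 / length_scale**2)
    alpha = np.linalg.solve(k(X, X) + lam * np.eye(len(X)), y)
    return lambda Z: k(Z, X) @ alpha

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def select_model(X, y, ranking, models, n_folds=5, seed=0):
    """Try every model with the top-k ranked features; keep the lowest CV RMSE."""
    order = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(order, n_folds)
    best = None
    for name, fit in models.items():
        for k in range(1, len(ranking) + 1):
            cols = list(ranking[:k])
            errs = []
            for fold in folds:
                train = np.setdiff1d(np.arange(len(y)), fold)
                predict = fit(X[train][:, cols], y[train])
                errs.append(rmse(y[fold], predict(X[fold][:, cols])))
            score = float(np.mean(errs))
            if best is None or score < best["rmse"]:
                best = {"model": name, "features": cols, "rmse": score}
    return best

# Synthetic data (hypothetical): the target depends on features 2 and 0 only.
rng = np.random.default_rng(1)
X_demo = rng.uniform(-1.0, 1.0, size=(120, 4))
y_demo = 2.0 * X_demo[:, 2] - X_demo[:, 0] + 0.05 * rng.standard_normal(120)

best = select_model(
    X_demo, y_demo,
    ranking=[2, 0, 3, 1],  # assumed ranking, most relevant feature first
    models={"ridge": fit_ridge, "kernel_ridge": fit_kernel_ridge},
)
```

The returned dictionary mirrors the output described in the text: the chosen model, the positions of the features it uses, and the score that justified the choice. Any other ranking method or validation measure could be substituted without changing the structure of the loop.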
Finally, AMSA was applied to an aquatic environment with a wide variety of water conditions. Lake Balaton was chosen as the test site because of its unique optical properties: it exhibits eutrophic, oligotrophic, turbid, and clear, open-ocean-like conditions. Being able to retrieve water quality with a single unified model established by AMSA across all these water conditions of the lake may therefore allow the model to be generalized.
Has part(s)
Paper 1: Blix, K., Camps-Valls, G. & Jenssen, R. (2017). Gaussian Process Sensitivity Analysis for Oceanic Chlorophyll Estimation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 10(4), 1265-1277. Published version not available in Munin due to publisher’s restrictions. Published version available at https://doi.org/10.1109/JSTARS.2016.2641583. Accepted manuscript version available in Munin at https://hdl.handle.net/10037/16500.
Paper 2: Blix, K. & Eltoft, T. (2018). Evaluation of feature ranking and regression methods for oceanic chlorophyll-a estimation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 11(5), 1403-1418. Published version not available in Munin due to publisher’s restrictions. Published version available at https://doi.org/10.1109/JSTARS.2018.2810704. Accepted manuscript version available in Munin at https://hdl.handle.net/10037/15028.
Paper 3: Blix, K. & Eltoft, T. (2018). Machine Learning Automatic Model Selection Algorithm for Oceanic Chlorophyll-a Content Retrieval. Remote Sensing, 10(5). Also available in Munin at https://hdl.handle.net/10037/14038.
Paper 4: Blix, K., Pálffy, K., Tóth, V.R. & Eltoft, T. (2018). Remote Sensing of Water Quality Parameters over Lake Balaton by Using Sentinel-3 OLCI. Water, 10(10). Also available in Munin at https://hdl.handle.net/10037/14037.