Comparing the performance of generalized linear model (GLM) and random forest (RF) models in predicting catch distribution of Caspian Kutum (Rutilus frisii)

Authors

1 Ph.D. Graduate, Department of Fisheries, Faculty of Natural Resources, University of Tehran, Karaj, Iran

2 Associate Professor, Department of Fisheries, Faculty of Natural Resources, University of Tehran, Karaj, Iran

3 Professor, Department of Forestry and Forest Economics, Faculty of Natural Resources, University of Tehran, Karaj, Iran

10.22059/jfisheries.2023.91491

Abstract

This study assessed the performance of a generalized linear model (GLM) and a random forest (RF) model in predicting the catch distribution of Caspian Kutum (Rutilus frisii). Catch per unit of effort (CPUE) of Caspian Kutum was used as the response variable. Remotely sensed data on five environmental parameters served as model predictors: daily sea surface temperature (SST), chlorophyll-a concentration (CHL), aerosol optical thickness (ASL), particulate organic carbon (POC) concentration, and particulate inorganic carbon (PIC) concentration. Model performance and accuracy were measured with the coefficient of determination (R2), mean absolute error (MAE), and root mean square error (RMSE). The best-fitting GLM retained only Log(PIC) and POC as significant predictors, whereas the RF model used all five predictors. RF had higher explanatory power than GLM (RF: R2 = 0.47; GLM: R2 = 0.053). RF was also more accurate (MAE = 972.4; RMSE = 1326.1 kg/hour.seine) than GLM (MAE = 1328.7; RMSE = 1465.6 kg/hour.seine). In the RF model, ASL had the highest relative influence (33.31%) and CHL the lowest (28.87%). These results suggest that random forest modelling is a practical technique for predicting fish catch distribution.
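As a minimal sketch of the three scores named in the abstract, the Python below computes R2, MAE, and RMSE from paired observed and predicted values. The function names and the CPUE numbers are hypothetical and invented for illustration; they are not the study's data. R2 is assumed here to be defined as 1 - SS_res/SS_tot, which the abstract does not confirm.

```python
import math


def mae(observed, predicted):
    """Mean absolute error: the average magnitude of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error: like MAE, but weights large errors more heavily."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical CPUE values in kg/hour.seine (illustrative only, not study data).
observed = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1500.0, 1500.0, 3500.0, 3500.0]

scores = {
    "R2": r_squared(observed, predicted),
    "MAE": mae(observed, predicted),
    "RMSE": rmse(observed, predicted),
}
print(scores)
```

MAE and RMSE are expressed in the units of the response variable, and RMSE can never be smaller than MAE for the same predictions. The values reported for both models are consistent with that relationship.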


