Nine tips for ecologists using machine learning

Marine Desprez, Vincent Miele, Olivier Gimenez

arXiv.org Artificial Intelligence 

Ecological datasets are generally characterised by complex interactions between variables, nonlinearity, missing values, dependence among observations and/or a continuously expanding size [1-3], especially since the recent increase in the use of remote sensing and automatic recorders [4]. A growing number of these datasets can no longer be processed effectively by humans and require methods that can handle a large number of variables and complex data structures [3, 5, 6]. Because they can process large and complicated datasets, machine learning models are expected to become a standard framework for the analysis of ecological data [3, 7, 8]. Over the last few years, machine learning algorithms have become increasingly popular owing to their high performance and flexibility [8]. In ecology, they have been successfully applied to tasks such as identifying species from images or sounds [9], monitoring animal behaviour [10] and modelling species distributions [11], and new studies and perspectives are regularly documented [3, 12]. However, implementing a machine learning model is not yet a trivial task and may seem intimidating to ecologists with no previous experience in this area. In this paper, we share nine tips to help ecologists avoid some of the most common errors and incorrect practices in machine learning. We focus on classification problems because a substantial number of ecological studies aim to assign data to predefined classes such as ecological states or biological entities. Typical examples of classification include species identification from pictures [9] or sound recordings [13-15], distinguishing phenological phases in the plant life cycle [16, 17], describing animal behaviour [18] and detecting disease in plants [19].
