Surrogate Models for Agent-Based Models: Challenges, Methods and Applications
Saves, Paul, Verstaevel, Nicolas, Gaudou, Benoît
Multi-agent simulation enables the modeling and analysis of the dynamic behaviors and interactions of autonomous entities evolving in complex environments. Agent-based models (ABMs) are widely used to study emergent phenomena arising from local interactions. However, their high computational cost poses a significant challenge, particularly for large-scale simulations requiring extensive parameter exploration, optimization, or uncertainty quantification. The increasing complexity of ABMs limits their feasibility for real-time decision-making and large-scale scenario analysis. To address these limitations, surrogate models offer an efficient alternative by learning approximations from sparse simulation data. These models provide cheap-to-evaluate predictions, significantly reducing computational costs while maintaining accuracy. Various machine learning techniques, including regression models, neural networks, random forests, and Gaussian processes, have been applied to construct robust surrogates. Moreover, uncertainty quantification and sensitivity analysis play a crucial role in enhancing model reliability and interpretability. This article explores the motivations, methods, and applications of surrogate modeling for ABMs, emphasizing the trade-offs between accuracy, computational efficiency, and interpretability. Through a case study on a segregation model, we highlight the challenges associated with building and validating surrogate models, comparing different approaches and evaluating their performance. Finally, we discuss future perspectives on integrating surrogate models within ABMs to improve scalability, explainability, and real-time decision support across fields such as ecology, urban planning, and economics.
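As a concrete illustration of the workflow this abstract describes, the sketch below fits a kernel-based surrogate to a handful of "simulator" runs and then queries the cheap approximation instead of the simulator. The one-dimensional `simulator` function, the Gaussian kernel, and all parameter values are illustrative assumptions, not taken from the article.

```python
import math

def simulator(x):
    # Stand-in for one expensive agent-based-model run (illustrative only).
    return math.sin(3.0 * x) + 0.5 * x

def rbf(r, length_scale=0.5):
    # Squared-exponential (Gaussian) kernel on a distance r.
    return math.exp(-0.5 * (r / length_scale) ** 2)

def solve(A, b):
    # Gaussian elimination with partial pivoting on a small dense system.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_surrogate(xs, ys):
    # Solve (K + nugget * I) w = y for the kernel-interpolation weights.
    K = [[rbf(abs(a - b)) for b in xs] for a in xs]
    for i in range(len(xs)):
        K[i][i] += 1e-9
    return xs, solve(K, ys)

def predict(model, x):
    # Cheap-to-evaluate prediction replacing a full simulator run.
    xs, w = model
    return sum(wi * rbf(abs(x - xi)) for wi, xi in zip(w, xs))

# A handful of expensive simulator runs stand in for sparse training data.
xs = [i * 2.0 / 7 for i in range(8)]
model = fit_surrogate(xs, [simulator(x) for x in xs])
err = max(abs(predict(model, x) - simulator(x)) for x in (0.7, 1.3))
print(err < 0.1)  # the surrogate tracks the simulator closely
```

A Gaussian-process surrogate would add the posterior variance on top of this mean prediction, which is what makes the uncertainty quantification mentioned above possible.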
$\pi$-yalli: A New Corpus for Nahuatl
Torres-Moreno, Juan-Manuel, Guzmán-Landa, Juan-José, Ranger, Graham, Garrido, Martha Lorena Avendaño, Figueroa-Saavedra, Miguel, Quintana-Torres, Ligia, González-Gallardo, Carlos-Emiliano, Pontes, Elvys Linhares, Morales, Patricia Velázquez, Jiménez, Luis-Gil Moreno
The NAHU$^2$ project is a Franco-Mexican collaboration aimed at building the $\pi$-YALLI corpus, adapted to machine learning, which will subsequently be used to develop computational resources for the Nahuatl language. Nahuatl has few computational resources, even though it is a living language spoken by around 2 million people. We have decided to build $\pi$-YALLI, a corpus that will enable research on Nahuatl and the development of Language Models (LMs), whether dynamic or not, which will in turn enable the development of Natural Language Processing (NLP) tools such as: a) a grapheme unifier, b) a word segmenter, c) a POS grammatical analyser, d) a content-based automatic text summarizer; and possibly e) a translator (probabilistic or learning-based).
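To make item (a) concrete, here is a minimal sketch of what a rule-based grapheme unifier could look like. The variant table is invented for illustration (a real unifier for Nahuatl would be derived from the corpus and linguistic expertise); the only real fact used is that spellings such as "hu" and "w" coexist across Nahuatl orthographies.

```python
# Invented, illustrative variant rules; longest patterns come first so
# digraphs are rewritten before single letters, and "CH" temporarily
# protects "ch" from the c -> k rule.
RULES = [("hu", "w"), ("qu", "k"), ("ch", "CH"), ("c", "k"), ("CH", "ch")]

def unify(word):
    # Normalize one token toward a single (illustrative) reference spelling.
    for src, dst in RULES:
        word = word.replace(src, dst)
    return word

print(unify("nahuatl"), unify("mochi"))  # nawatl mochi
```

Even this toy version shows why unification must run before segmentation or tagging: downstream tools should see one spelling per grapheme, not all its variants.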
Introduction to speech recognition
This document contains lectures and practical experiments using Matlab, implementing a system that correctly classifies three words ("one", "two", and "three") with the help of a very small database. To achieve this performance, it exploits the specificities of speech modeling, powerful algorithms (dynamic time warping and Dijkstra's algorithm), and machine learning (nearest neighbor). The document also introduces some machine learning evaluation metrics.
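The core of such a system (template matching with dynamic time warping, then nearest-neighbour classification) fits in a few lines. The sketch below uses invented one-dimensional "feature" sequences in place of real acoustic features, and Python instead of Matlab, purely for illustration.

```python
def dtw_distance(a, b):
    # Classic dynamic-programming alignment cost between two sequences.
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def classify(query, templates):
    # 1-nearest-neighbour under the DTW distance.
    return min(templates, key=lambda item: dtw_distance(query, item[1]))[0]

# Invented 1-D "feature" templates for the three words.
templates = [
    ("one",   [1, 3, 2, 1]),
    ("two",   [2, 2, 5, 5, 2]),
    ("three", [0, 4, 4, 0, 4]),
]
# A time-stretched utterance of "two" should still match its template.
query = [2, 2, 2, 5, 5, 5, 2, 2]
print(classify(query, templates))  # prints "two"
```

The point of DTW is visible here: the query is twice as long as its template, yet the alignment cost is still zero because timing, not content, differs.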
Comparative study of clustering models for multivariate time series from connected medical devices
Courrier, Violaine, Biernacki, Christophe, Preda, Cristian, Vittrant, Benjamin
In healthcare, patient data is often collected as multivariate time series, providing a comprehensive view of a patient's health status over time. While this data can be sparse, connected devices may increase its frequency. The goal is to create patient profiles from these time series. In the absence of labels, a predictive model can be used to predict future values while forming a latent cluster space, evaluated on the basis of predictive performance. We compare two models on Withings datasets: MagmaClust, which clusters entire time series, and DGM${}^2$, which allows the group affiliation of an individual to change over time (dynamic clustering).
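The contrast between the two families can be caricatured in a deliberately simplified sketch: clustering whole series gives each patient one label, while clustering each time step separately lets membership change over time. The toy series and the plain two-cluster k-means below are illustrative stand-ins only; MagmaClust and DGM${}^2$ are model-based methods and work quite differently.

```python
def kmeans2(points, n_iter=10):
    # Plain two-cluster k-means on 1-D points; deterministic initialization.
    centers = [min(points), max(points)]
    for _ in range(n_iter):
        groups = [[], []]
        for p in points:
            groups[0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1].append(p)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def assign(p, centers):
    return 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1

# Two toy patients: one stable, one whose measurements shift mid-series.
series = {"stable":   [0.1, 0.2, 0.1, 0.2, 0.1, 0.2],
          "shifting": [0.1, 0.2, 0.1, 5.0, 5.1, 5.2]}

centers = kmeans2([v for s in series.values() for v in s])
# Dynamic view: one label per time step, so membership can change.
dynamic = {name: [assign(v, centers) for v in s] for name, s in series.items()}
print(sorted(len(set(labels)) for labels in dynamic.values()))  # [1, 2]
```

The "shifting" patient visits two clusters over time, which a single whole-series label would hide; that is the behaviour dynamic clustering is designed to expose.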
Constructing Variables Using Classifiers as an Aid to Regression
Troisemaine, Colin, Lemaire, Vincent
This paper proposes a method for the automatic creation of variables (in the regression setting) that complement the information contained in the initial input vector. The method works as a pre-processing step in which the continuous values of the variable to be regressed are discretized into a set of intervals, which are then used to define value thresholds. Classifiers are then trained to predict whether the value to be regressed is less than or equal to each of these thresholds. The different outputs of the classifiers are then concatenated into an additional vector of variables that enriches the initial vector of the regression problem. The implemented system can thus be considered a generic pre-processing tool. We tested the proposed enrichment method with 5 types of regressors and evaluated it on 33 regression datasets. Our experimental results confirm the interest of the approach.
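The pre-processing step described above can be sketched as follows. Decision stumps stand in for whatever classifier family one would actually plug in, and the synthetic data and number of thresholds are arbitrary illustrative choices.

```python
import random

def stump_fit(X, labels):
    # Tiny classifier: best single-feature split minimizing training errors.
    best = (None, None, None, float("inf"))
    for f in range(len(X[0])):
        for row in X:
            s = row[f]
            for polarity in (True, False):
                errs = sum(((x[f] <= s) == polarity) != l
                           for x, l in zip(X, labels))
                if errs < best[3]:
                    best = (f, s, polarity, errs)
    return best[:3]

def stump_predict(model, x):
    f, s, polarity = model
    return 1.0 if (x[f] <= s) == polarity else 0.0

def enrich(X, y, n_thresholds=3):
    # Discretize y into quantile thresholds, train one classifier per
    # threshold ("is y <= t ?"), and append the outputs as new features.
    ys = sorted(y)
    thresholds = [ys[(k + 1) * len(ys) // (n_thresholds + 1)]
                  for k in range(n_thresholds)]
    models = [stump_fit(X, [yi <= t for yi in y]) for t in thresholds]
    X_new = [row + [stump_predict(m, row) for m in models] for row in X]
    return X_new, models

random.seed(0)
X = [[random.uniform(0, 1), random.uniform(0, 1)] for _ in range(40)]
y = [2 * a + b for a, b in X]          # illustrative regression target
X_enriched, models = enrich(X, y)
print(len(X[0]), "->", len(X_enriched[0]))  # 2 -> 5
```

Any downstream regressor is then trained on `X_enriched` instead of `X`, which is what makes the scheme a generic pre-processing tool.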
Model Interpretability: A Review of Methods and an Application to Insurance
Delcaillau, Dimitri, Ly, Antoine, Vermet, Franck, Papp, Alizé
Since May 2018, the General Data Protection Regulation (GDPR) has introduced new obligations for industries. By setting a legal framework, it notably imposes strong transparency on the use of personal data: people must be informed of the use of their data and must consent to it. Data is the raw material of many models which today make it possible to increase the quality and performance of digital services. Transparency on the use of data also requires a good understanding of how it is used by different models. The use of models, even efficient ones, must be accompanied by an understanding at all levels of the process that transforms data (upstream and downstream of a model), thus making it possible to define the relationships between an individual's data and the choice that an algorithm could make based on its analysis (for example, the recommendation of a product or promotional offer, or an insurance rate representative of the risk). Users of models must ensure that models do not discriminate and that their results can be explained. The widening of the panel of predictive algorithms, made possible by the evolution of computing capacities, leads scientists to be vigilant about the use of models and to consider new tools to better understand the decisions deduced from them. Recently, the community has been particularly active on model transparency, with a marked intensification of publications over the past three years. The increasingly frequent use of more complex algorithms (\textit{deep learning}, XGBoost, etc.) with attractive performance is undoubtedly one of the causes of this interest. This article thus presents an inventory of methods for interpreting models and their uses in an insurance context.
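Among the post-hoc interpretability methods such an inventory typically covers, permutation importance is simple enough to sketch directly: shuffle one feature and measure how much the model's error degrades. The black-box model and data below are invented for illustration.

```python
import random

def permutation_importance(model, X, y, feature, n_repeats=10, seed=0):
    # Post-hoc interpretability: how much does shuffling one feature
    # increase the model's error? Larger increase = more important feature.
    rng = random.Random(seed)
    def mse(rows):
        return sum((model(r) - yi) ** 2 for r, yi in zip(rows, y)) / len(y)
    base = mse(X)
    increases = []
    for _ in range(n_repeats):
        col = [row[feature] for row in X]
        rng.shuffle(col)
        shuffled = [row[:feature] + [c] + row[feature + 1:]
                    for row, c in zip(X, col)]
        increases.append(mse(shuffled) - base)
    return sum(increases) / n_repeats

def black_box(row):
    # Illustrative model: relies entirely on feature 0, ignores feature 1.
    return 4 * row[0]

random.seed(1)
X = [[random.uniform(0, 1), random.uniform(0, 1)] for _ in range(50)]
y = [black_box(row) for row in X]
print(permutation_importance(black_box, X, y, 0)
      > permutation_importance(black_box, X, y, 1))  # True
```

Because the method only needs predictions, it applies unchanged to deep learning or XGBoost models, which is why it recurs in insurance-oriented interpretability toolkits.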
An Object Model for the Representation of Empirical Knowledge
Colloc, Joël, Boulanger, Danielle
We are currently designing an object-oriented model which describes static and dynamic knowledge in different domains. It provides two conceptual levels. The internal level proposes: an object structure composed of a hierarchy of sub-objects, structure evolution through dynamic functions, and comparison of objects of the same type using evaluation functions. It uses multiple upward inheritance of properties from sub-objects to the object. The external level describes the object's environment; it enforces object types and uses simple external inheritance from a type to its sub-types.
A Proposal for a New Approach to Extracting Frequent Closed Patterns
This work was done as part of a master's thesis project. The increase in the volume of data has given rise to various issues related to the collection, storage, analysis, and exploitation of these data in order to create added value. In this thesis, we are interested in the search for frequent closed patterns in transaction databases. One way to process the data is to partition the search space into subcontexts and then explore the subcontexts simultaneously. In this context, we have proposed a new approach for extracting frequent closed itemsets. The main idea is to update frequent closed patterns together with their minimal generators by applying a partitioning strategy to the initial extraction context. Our new approach, called UFCIGs-DAC, was designed, implemented, and evaluated on test databases. The main originality of this approach is the simultaneous exploration of the search space through the update of frequent closed patterns and minimal generators. Moreover, our approach can be adapted to any algorithm that extracts frequent closed patterns with their minimal generators.
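For reference, the notion the approach manipulates can be stated directly in code: a frequent itemset is closed when no proper superset has the same support. The brute-force miner below is only a definition-level sketch on a toy database; it implements none of the partitioning or update machinery of UFCIGs-DAC.

```python
from itertools import combinations

def closed_frequent_itemsets(transactions, min_support):
    # Enumerate all frequent itemsets, then keep the closed ones:
    # those with no proper superset of identical support.
    items = sorted({i for t in transactions for i in t})
    support = {}
    for size in range(1, len(items) + 1):
        for cand in combinations(items, size):
            s = sum(1 for t in transactions if set(cand) <= t)
            if s >= min_support:
                support[frozenset(cand)] = s
    return {itemset: s for itemset, s in support.items()
            if not any(itemset < other and s == s2
                       for other, s2 in support.items())}

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
result = closed_frequent_itemsets(transactions, min_support=2)
for itemset, s in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), s)
```

Closed patterns matter because they losslessly compress the set of all frequent itemsets; efficient algorithms avoid this exhaustive enumeration, which is exponential in the number of items.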
Protection of an information system by artificial intelligence: a three-phase approach based on behaviour analysis to detect a hostile scenario
Fauvelle, Jean-Philippe, Dey, Alexandre, Navers, Sylvain
User and entity behaviour analytics (UEBA) is an area of artificial intelligence that detects hostile actions (e.g. attacks, fraud, influence, poisoning) through the unusual nature of observed events, as opposed to signature-based operation. A UEBA process usually involves two phases, learning and inference. Available intrusion detection systems (IDS) still suffer from biases, including over-simplification of problems, under-exploitation of the potential of AI, insufficient consideration of the temporality of events, and perfectible management of the memory cycle of behaviours. In addition, while an alert generated by a signature-based IDS can refer to the signature on which the detection is based, IDSs in the UEBA domain produce results, often associated with a score, whose explainability is less obvious. Our unsupervised approach enriches this process by adding a third phase that correlates events (incongruities, weak signals) presumed to be linked together, with the benefit of reducing false positives and false negatives. We also seek to avoid the so-called "boiled frog" bias inherent in continuous learning. Our first results are interesting and explainable, on both synthetic and real data.
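The three-phase idea (learn baselines, flag incongruities at inference, then correlate weak signals before alerting) can be caricatured in a few lines. Everything below, from the per-entity z-score detector to the time-window correlation rule, is an invented minimal stand-in for the authors' system, meant only to show why the third phase reduces false positives.

```python
from statistics import mean, stdev

def learn(history):
    # Phase 1 (learning): per-entity baseline of a numeric behaviour feature.
    return {entity: (mean(v), stdev(v)) for entity, v in history.items()}

def infer(model, entity, value, z_thresh=3.0):
    # Phase 2 (inference): flag values far from the entity's own baseline.
    mu, sigma = model[entity]
    return abs(value - mu) / sigma > z_thresh

def correlate(weak_signals, window=60, min_signals=2):
    # Phase 3 (correlation): raise an alert only when several weak
    # signals for the same entity fall inside one time window.
    alerts, by_entity = [], {}
    for t, entity in sorted(weak_signals):
        recent = [s for s in by_entity.get(entity, []) if t - s <= window]
        recent.append(t)
        by_entity[entity] = recent
        if len(recent) >= min_signals:
            alerts.append((t, entity))
    return alerts

history = {"alice": [10, 12, 11, 9, 10, 11],
           "bob": [100, 95, 105, 98, 102, 101]}
model = learn(history)
events = [(0, "alice", 11), (10, "alice", 40), (30, "alice", 45),
          (40, "bob", 99), (200, "alice", 50)]
weak = [(t, e) for t, e, v in events if infer(model, e, v)]
print(correlate(weak))  # [(30, 'alice')]
```

The isolated anomaly at t=200 never becomes an alert: only the pair of correlated incongruities at t=10 and t=30 does, which is the false-positive reduction the third phase is after.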
A Study of Bayesian Network Models for Aiding the Diagnosis of Brain Tumors
Lamine, Fradj Ben, Kalti, Karim, Mahjoub, Mohamed Ali
This article describes different models based on Bayesian networks (BN) that capture expert knowledge for the diagnosis of brain tumors. Indeed, Bayesian networks are well suited to representing the uncertainty involved in the diagnosis of these tumors. In our work, we first tested several structures: on the one hand, Bayesian network structures derived from the reasoning performed by doctors, and on the other hand, automatically generated structures. This step aims to find the structure that yields the best diagnostic accuracy. The structure-learning algorithms considered are MWST-EM, SEM, and SEM+T. To estimate the parameters of the Bayesian network from an incomplete database, we have proposed an extension of the EM algorithm that adds a priori knowledge in the form of thresholds calculated by the first phase of the RBE algorithm. The very encouraging results obtained are discussed at the end of the paper.
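The parameter-estimation problem the article addresses (EM on a Bayesian network with an incomplete database) can be illustrated on the smallest possible network. The two-node model, the data, and the initial values below are invented; the article's extension (thresholds from the RBE phase as prior knowledge) is not reproduced here.

```python
def em_two_node(data, n_iter=100):
    # EM for a two-node Bayesian network A -> B (both binary), where the
    # value of A is missing in some records.
    pA = 0.6                       # P(A=1), arbitrary initial guess
    pB = {0: 0.3, 1: 0.7}          # P(B=1 | A=a), arbitrary initial guess
    for _ in range(n_iter):
        cA = [0.0, 0.0]                       # expected counts of A = 0/1
        cB = {0: [0.0, 0.0], 1: [0.0, 0.0]}   # cB[a] = [#B=0, #B=1]
        for a, b in data:
            if a is None:
                # E-step: posterior P(A=1 | B=b) under current parameters.
                w1 = pA * (pB[1] if b else 1 - pB[1])
                w0 = (1 - pA) * (pB[0] if b else 1 - pB[0])
                q1 = w1 / (w0 + w1)
                cA[1] += q1
                cA[0] += 1 - q1
                cB[1][b] += q1
                cB[0][b] += 1 - q1
            else:
                cA[a] += 1
                cB[a][b] += 1
        # M-step: maximum-likelihood re-estimation from expected counts.
        pA = cA[1] / (cA[0] + cA[1])
        pB = {a: c[1] / (c[0] + c[1]) for a, c in cB.items()}
    return pA, pB

# Invented incomplete database: 20 complete records, 10 with A missing.
data = ([(1, 1)] * 8 + [(1, 0)] * 2 + [(0, 0)] * 8 + [(0, 1)] * 2
        + [(None, 1)] * 5 + [(None, 0)] * 5)
pA, pB = em_two_node(data)
print(round(pA, 2), round(pB[0], 2), round(pB[1], 2))
```

The article's proposal amounts to constraining the M-step of such a procedure with expert-derived thresholds, instead of leaving it purely data-driven.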