Goto

Collaborating Authors

 Cherrier, Noëlie


Unified Occupancy on a Public Transport Network through Combination of AFC and APC Data

arXiv.org Artificial Intelligence

In a transport network, the onboard occupancy is key for gaining insights into travelers' habits and adjusting the offer. Traditionally, operators have relied on field studies to evaluate ridership of a typical workday. However, automated fare collection (AFC) and automatic passenger counting (APC) data, which provide complete temporal coverage, are often available but underexploited. It should be noted, however, that each data source comes with its own biases: AFC data may not account for fraud, while not all vehicles are equipped with APC systems. This paper introduces the unified occupancy method, a geostatistical model to extrapolate occupancy to every course of a public transportation network by combining AFC and APC data with partial coverage. Unified occupancy completes missing APC information for courses on lines where other courses have APC measures, as well as for courses on lines where no APC data is available at all. The accuracy of this method is evaluated on real data from several public transportation networks in France.


Context-Aware Automated Passenger Counting Data Denoising

arXiv.org Artificial Intelligence

A reliable and accurate knowledge of the ridership in public transportation networks is crucial for public transport operators and public authorities to be aware of their network's use and optimize transport offering. Several techniques to estimate ridership exist nowadays, some of them in an automated manner. Among them, Automatic Passenger Counting (APC) systems detect passengers entering and leaving the vehicle at each station of its course. However, data resulting from these systems are often noisy or even biased, resulting in under or overestimation of onboard occupancy. In this work, we propose a denoising algorithm for APC data to improve their robustness and ease their analyzes. The proposed approach consists in a constrained integer linear optimization, taking advantage of ticketing data and historical ridership data to further constrain and guide the optimization. The performances are assessed and compared to other denoising methods on several public transportation networks in France, to manual counts available on one of these networks, and on simulated data.


Consistent Feature Construction with Constrained Genetic Programming for Experimental Physics

arXiv.org Machine Learning

A good feature representation is a determinant factor to achieve high performance for many machine learning algorithms in terms of classification. This is especially true for techniques that do not build complex internal representations of data (e.g. decision trees, in contrast to deep neural networks). To transform the feature space, feature construction techniques build new high-level features from the original ones. Among these techniques, Genetic Programming is a good candidate to provide interpretable features required for data analysis in high energy physics. Classically, original features or higher-level features based on physics first principles are used as inputs for training. However, physicists would benefit from an automatic and interpretable feature construction for the classification of particle collision events. Our main contribution consists in combining different aspects of Genetic Programming and applying them to feature construction for experimental physics. In particular, to be applicable to physics, dimensional consistency is enforced using grammars. Results of experiments on three physics datasets show that the constructed features can bring a significant gain to the classification accuracy. To the best of our knowledge, it is the first time a method is proposed for interpretable feature construction with units of measurement, and that experts in high-energy physics validate the overall approach as well as the interpretability of the built features.