LIFT: Interpretable truck driving risk prediction with literature-informed fine-tuned LLMs

Hu, Xiao, Lian, Yuansheng, Zhang, Ke, Li, Yunxuan, Su, Yuelong, Li, Meng

arXiv.org Artificial Intelligence

This study proposes an interpretable prediction framework with literature-informed fine-tuned (LIFT) LLMs for truck driving risk prediction. The framework integrates an LLM-driven Inference Core that predicts and explains truck driving risk, a Literature Processing Pipeline that filters and summarizes domain-specific literature into a literature knowledge base, and a Result Evaluator that assesses both the prediction performance and the interpretability of the LIFT LLM. After fine-tuning on a real-world truck driving risk dataset, the LIFT LLM achieved accurate risk prediction, outperforming benchmark models by 26.7% in recall and 10.1% in F1-score. Furthermore, guided by the literature knowledge base automatically constructed from 299 domain papers, the LIFT LLM produced a variable importance ranking consistent with that derived from the benchmark model, while its interpretation results remained robust under various data sampling conditions. The LIFT LLM also identified potential risky scenarios by detecting key combinations of variables in truck driving risk, which were verified by PERMANOVA tests. Finally, we demonstrated the contribution of the literature knowledge base and the fine-tuning process to the interpretability of the LIFT LLM, and discussed the potential of the LIFT LLM in data-driven knowledge discovery.


Discovering Car-following Dynamics from Trajectory Data through Deep Learning

Angah, Ohay, Enouen, James, Xuegang, null, Ban, null, Liu, Yan

arXiv.org Artificial Intelligence

There are two recent trends in transportation and the broader science/engineering fields, which make the headlines almost every day. The first is the emergence of connected/automated vehicles (CAVs) that i) may introduce new, complex traffic dynamics and interactions in current and future traffic streams, and ii) generate increasingly available and massive datasets from both vehicles and the infrastructure. The second trend is the rapid development and application of deep learning techniques that seem to revolutionize almost every aspect of technology, science, engineering, and society at large. While there have been numerous studies and applications of deep learning in transportation, in this paper we are interested in the question of whether deep learning can help discover traffic dynamics (car-following models in particular) from data directly, with no or little human involvement. An affirmative answer to this question will not only help discover and develop traffic dynamics models in this era but also have important implications for other science/engineering fields where dynamical systems and their governing equations are widely used and studied. Car-following depicts the driving behavior of how a vehicle (driver) follows and interacts with the vehicle in front of it. It is one of the basic traffic models for revealing traffic dynamics characteristics at the microscopic traffic flow level (Brackstone and McDonald [1999]). Car-following studies can be traced back to the 1950s and 1960s, when Pipes [1953], Chandler et al. [1958], Kometani and Sasaki [1958], Gazis et al. [1959, 1961], and Helly [1959] initiated an era of modeling car-following and traffic dynamics.
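To make the car-following idea concrete, the abstract's "governing equation" framing can be illustrated with the classic linear stimulus-response model of Chandler et al. [1958], where the follower's acceleration is proportional to the speed difference with the leader. This is a minimal sketch, not the paper's deep learning method; the sensitivity, time step, and speed values are illustrative assumptions.

```python
# Classic linear stimulus-response car-following model (Chandler et al., 1958):
#   a_f(t) = sensitivity * (v_leader - v_follower(t))
# Parameter values here are illustrative assumptions, not from the paper.

def simulate_car_following(v_leader, v0_follower, sensitivity=0.5, dt=0.1, steps=200):
    """Simulate a follower reacting to a constant-speed leader via Euler steps."""
    v = v0_follower
    speeds = [v]
    for _ in range(steps):
        a = sensitivity * (v_leader - v)  # stimulus-response law
        v += a * dt                       # forward Euler integration
        speeds.append(v)
    return speeds

speeds = simulate_car_following(v_leader=25.0, v0_follower=15.0)
# The follower's speed relaxes exponentially toward the leader's speed.
```

A deep learning approach, as studied in the paper, would instead try to learn the right-hand side of such a dynamics equation directly from trajectory data rather than assuming its functional form.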


Compare outlier detection methods with the OutliersO3 package

#artificialintelligence

There are many different methods for identifying outliers, and a lot of them are available in R. But are outliers a matter of opinion? Do all methods give the same results? Articles on outlier methods use a mixture of theory and practice. Theory is all very well, but outliers are outliers precisely because they don't follow theory. Practice involves testing methods on data: sometimes on data simulated from theory, and better still on 'real' datasets.
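The article's point that different methods disagree is easy to reproduce even without the R OutliersO3 package. The following plain-Python sketch (an analogue, not OutliersO3 itself) compares two common univariate rules on the same small dataset; the data values are an illustrative assumption.

```python
# Two common univariate outlier rules can disagree on the same data:
# the z-score rule vs. Tukey's IQR fences. (Illustrative sketch only;
# the R OutliersO3 package compares multivariate methods.)
import statistics

def zscore_outliers(data, threshold=3.0):
    """Flag points more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return {x for x in data if abs(x - mean) / sd > threshold}

def iqr_outliers(data, k=1.5):
    """Flag points outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return {x for x in data if x < q1 - k * iqr or x > q3 + k * iqr}

data = [10, 11, 12, 12, 13, 13, 14, 14, 15, 40]
# The extreme value 40 inflates the standard deviation, so the z-score
# rule flags nothing (masking), while the IQR rule flags 40.
print(zscore_outliers(data))
print(iqr_outliers(data))
```

The disagreement here is the masking effect: an extreme point inflates the very scale estimate used to judge it, which is exactly why comparing methods side by side, as OutliersO3 does, is informative.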


Exhaustive search for sparse variable selection in linear regression

Igarashi, Yasuhiko, Takenaka, Hikaru, Nakanishi-Ohno, Yoshinori, Uemura, Makoto, Ikeda, Shiro, Okada, Masato

arXiv.org Machine Learning

We propose a K-sparse exhaustive search (ES-K) method and a K-sparse approximate exhaustive search (AES-K) method for selecting variables in linear regression. With these methods, K-sparse combinations of variables are tested exhaustively, assuming that the optimal combination of explanatory variables is K-sparse. By collecting the results of the exhaustive ES-K computation, various approximate methods for selecting sparse variables can be summarized as a density of states. With this density of states, we can compare different methods for selecting sparse variables, such as relaxation and sampling. For large problems, where the combinatorial explosion of explanatory variables is crucial, the AES-K method enables the density of states to be effectively reconstructed by using the replica-exchange Monte Carlo method and the multiple histogram method. Applying the ES-K and AES-K methods to type Ia supernova data, we confirmed the conventional understanding in astronomy when an appropriate K is given beforehand. However, we found it difficult to determine K from the data. Using virtual measurement and analysis, we argue that this is caused by data shortage.
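The core ES-K idea, fitting every K-subset of explanatory variables and ranking the fits, can be sketched in a few lines. This is a minimal illustration under assumed synthetic data, not the paper's implementation (which also builds the density of states and the AES-K Monte Carlo machinery).

```python
# Minimal sketch of K-sparse exhaustive search (ES-K): for a fixed
# sparsity K, fit ordinary least squares on every K-subset of columns
# and keep the subset with the lowest residual sum of squares.
# The synthetic data below is an illustrative assumption.
from itertools import combinations
import numpy as np

def es_k(X, y, K):
    """Exhaustively evaluate all K-sparse variable combinations."""
    n, p = X.shape
    best_subset, best_rss = None, float("inf")
    for subset in combinations(range(p), K):
        Xs = X[:, subset]
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)  # OLS fit on the subset
        rss = float(np.sum((y - Xs @ coef) ** 2))
        if rss < best_rss:
            best_subset, best_rss = subset, rss
    return best_subset, best_rss

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
y = 2.0 * X[:, 1] - 3.0 * X[:, 4] + 0.01 * rng.normal(size=50)
subset, rss = es_k(X, y, K=2)
# With such low noise, the search should recover the true support (1, 4).
print(subset, rss)
```

The combinatorial cost, C(p, K) fits, is exactly why the paper introduces the approximate AES-K variant with replica-exchange Monte Carlo for large p.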