AITopics | literature data

Comparing hundreds of machine learning classifiers and discrete choice models in predicting travel behavior: an empirical benchmark

Wang, Shenhao, Mo, Baichuan, Zheng, Yunhan, Hess, Stephane, Zhao, Jinhua

arXiv.org Artificial Intelligence, Jan-17-2025

Numerous studies have compared machine learning (ML) and discrete choice models (DCMs) in predicting travel demand. However, these studies often lack generalizability because they compare models deterministically without considering contextual variations. To address this limitation, our study develops an empirical benchmark by designing a tournament model, thus efficiently summarizing a large number of experiments, quantifying the randomness in model comparisons, and using formal statistical tests to differentiate between model and contextual effects. The benchmark draws on two large-scale data sources: a database compiled from a literature review summarizing 136 experiments from 35 studies, and our own experiment data, encompassing a total of 6,970 experiments from 105 models and 12 model families. The benchmark yields two key findings. First, many ML models, particularly ensemble methods and deep learning, statistically outperform the DCM family (i.e., multinomial, nested, and mixed logit models). Second, contextual factors (i.e., data sources, inputs, and choice categories) play a crucial role and can explain models' predictive performance more effectively than differences in model type alone. Model performance varies significantly with data sources, improving with larger sample sizes and lower-dimensional alternative sets. After controlling for all the model and contextual factors, significant randomness still remains, implying inherent uncertainty in such model comparisons. Overall, we suggest that future researchers shift more focus from context-specific model comparisons towards examining model transferability across contexts and characterizing the inherent uncertainty in ML, thus creating more robust and generalizable next-generation travel demand models.
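For readers unfamiliar with the DCM family named in the abstract, the following is a minimal sketch of how a multinomial logit model turns systematic utilities into choice probabilities (a softmax over the alternatives). It is not taken from the paper: the function name `mnl_probabilities`, the three travel modes, and the utility values are all hypothetical and chosen only for illustration.

```python
import math


def mnl_probabilities(utilities):
    """Return multinomial logit choice probabilities for one decision maker.

    `utilities` maps each alternative to its systematic utility V_j;
    the probability of alternative i is exp(V_i) / sum_j exp(V_j).
    """
    # Subtract the largest utility for numerical stability;
    # this shift leaves the resulting probabilities unchanged.
    shift = max(utilities.values())
    exp_u = {alt: math.exp(v - shift) for alt, v in utilities.items()}
    total = sum(exp_u.values())
    return {alt: e / total for alt, e in exp_u.items()}


# Hypothetical utilities for three travel modes (illustrative values only).
utilities = {"car": -0.5, "bus": -1.2, "walk": -2.0}
probs = mnl_probabilities(utilities)

# A deterministic prediction picks the alternative with the highest probability.
predicted_mode = max(probs, key=probs.get)
```

In a benchmark of the kind the abstract describes, a predicted alternative such as `predicted_mode` would be compared against the observed choice to score the model; nested and mixed logit models relax the assumptions behind this simple form.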

artificial intelligence, machine learning, model family

2102.0113

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Semi-automated Extraction of Literature Data Using Machine Learning Methods

#artificialintelligence, Sep-1-2022, 21:57:37 GMT

NICEATM, other scientists within the NIEHS Division of the NTP, the DOE's Oak Ridge National Laboratory, and FDA are collaborating to automate the process of identifying high-quality developmental toxicity studies in the published scientific literature. The approach applies natural language processing and machine learning methods to identify specific data elements in the full text of scientific publications using both unsupervised and supervised approaches. Preliminary models were trained using a uterotrophic database (Kleinstreuer et al. 2016) built for the EPA Endocrine Disruptor Screening Program. The models leveraged natural language processing and multivariate machine learning models to identify papers that meet minimum criteria to be considered guideline-like studies (Herrmannova et al. 2018). Supervised and unsupervised approaches were developed to automatically extract text features that correspond to study descriptors and classify papers based on their adherence to minimum criteria derived from regulatory guideline studies.

literature data, machine learning method, semi-automated extraction

Country: North America > United States (1.00)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)