AITopics | kaggle

Collaborating Authors

kaggle

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

7e810b2c75d69be186cadd2fe3febeab-Paper-Conference.pdf

Neural Information Processing SystemsFeb-15-2026, 11:55:10 GMT

discovery, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.14)
Asia > India (0.04)
Asia > Middle East > Jordan (0.04)
(11 more...)

Genre: Research Report > Experimental Study (0.47)

Industry:

Transportation > Air (1.00)
Media > News (1.00)
Law (1.00)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Kaggle Chronicles: 15 Years of Competitions, Community and Data Science Innovation

Bönisch, Kevin, Losaria, Leandro

arXiv.org Machine LearningNov-21-2025

Since 2010, Kaggle has been a platform where data scientists from around the world come together to compete, collaborate, and push the boundaries of Data Science. Over these 15 years, it has grown from a purely competition-focused site into a broader ecosystem with forums, notebooks, models, datasets, and more. With the release of the Kaggle Meta Code and Kaggle Meta Datasets, we now have a unique opportunity to explore these competitions, technologies, and real-world applications of Machine Learning and AI. And so in this study, we take a closer look at 15 years of data science on Kaggle - through metadata, shared code, community discussions, and the competitions themselves. We explore Kaggle's growth, its impact on the data science community, uncover hidden technological trends, analyze competition winners, how Kagglers approach problems in general, and more. We do this by analyzing millions of kernels and discussion threads to perform both longitudinal trend analysis and standard exploratory data analysis. Our findings show that Kaggle is a steadily growing platform with increasingly diverse use cases, and that Kagglers are quick to adapt to new trends and apply them to real-world challenges, while producing - on average - models with solid generalization capabilities. We also offer a snapshot of the platform as a whole, highlighting its history and technological evolution. Finally, this study is accompanied by a video (https://www.youtube.com/watch?v=YVOV9bIUNrM) and a Kaggle write-up (https://kaggle.com/competitions/meta-kaggle-hackathon/writeups/kaggle-chronicles-15-years-of-competitions-communi) for your convenience.

data mining, large language model, machine learning, (22 more...)

arXiv.org Machine Learning

2511.06304

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Banking & Finance (0.67)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

Add feedback

A Supplementary Material

Amelia Jimenez Sanchez

Neural Information Processing SystemsNov-20-2025, 04:14:01 GMT

On HuggingFace, we find information about the annotation creators ( e.g., crowdsource, experts, ml-generated) or specific task categories ( e.g., image-classification, image-to-text, text-to-image). Kaggle automatically computes a usability score, which is associated with the tag "well-documented", Kaggle's usability score is based on: Completeness: subtitle, tag, description, cover image . Credibility: provenance, public noteboook, update frequency . Compatibility: license, file format, file description, column description . The usability score is based on only 4 out of 7 aspects from Datasheets [40].

artificial intelligence, dataset, social media, (17 more...)

Neural Information Processing Systems

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (0.37)
Information Technology > Communications > Social Media > Crowdsourcing (0.35)

Add feedback

cdbeaeb8a0313940a5752c4ec8838ca6-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsNov-20-2025, 04:13:58 GMT

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(9 more...)

Genre: Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
(5 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
(7 more...)

Add feedback

Reviewer

Neural Information Processing SystemsNov-19-2025, 11:27:27 GMT

We thank the reviewers for their positive and constructive feedback. We now reply to the main points raised by the individual reviewers. Section 3.3 are only limited evidence for adaptive overfitting due to the strong null hypothesis. We will update our discussion in Section 3.5 to more clearly reflect this. We will update our figures with alpha values for the scatter plots.

artificial intelligence, machine learning, reviewer, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.84)

Add feedback

Falcon: A Comprehensive Chinese Text-to-SQL Benchmark for Enterprise-Grade Evaluation

Luo, Wenzhen, Guan, Wei, Yao, Yifan, Pan, Yimin, Wang, Feng, Yu, Zhipeng, Wen, Zhe, Chen, Liang, Zhuang, Yihong

arXiv.org Artificial IntelligenceOct-30-2025

We introduce Falcon, a cross-domain Chinese text-to-SQL benchmark grounded in an enterprise-compatible dialect (MaxCompute/Hive). It contains 600 Chinese questions over 28 databases; 77% require multi-table reasoning and over half touch more than four tables. Each example is annotated along SQL-computation features and Chinese semantics. For evaluation, we release a robust execution comparator and an automated evaluation pipeline, under which all current state-of-the-art large-scale models (including Deepseek) achieve accuracies of at most 50%. Major errors originate from two sources: (1) schema linking in large enterprise landscapes - hundreds of tables, denormalized fields, ambiguous column names, implicit foreign-key relations and domain-specific synonyms that make correct join/column selection difficult; and (2) mapping concise, colloquial Chinese into the exact operators and predicates required for analytics - e.g., choosing the correct aggregation and group-by keys, expressing time windows and granularities, applying unit conversions, handling NULLs and data-quality rules, and formulating nested or windowed subqueries. Falcon therefore targets Chinese-specific semantics and enterprise dialects (abbreviations, business jargon, fuzzy entity references) and provides a reproducible middle ground before full production deployment by using realistic enterprise schemas, query templates, an execution comparator, and an automated evaluation pipeline for end-to-end validation.

kaggle, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.24762

Genre: Research Report (0.52)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Supplementary Material

Amelia Jimenez Sanchez

Neural Information Processing SystemsOct-10-2025, 16:54:11 GMT

artificial intelligence, dataset, social media, (18 more...)

Neural Information Processing Systems

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (0.37)
Information Technology > Communications > Social Media > Crowdsourcing (0.35)

Add feedback

Copycats: the many lives of a publicly available medical imaging dataset Amelia Jiménez-Sánchez

Neural Information Processing SystemsOct-10-2025, 16:54:07 GMT

Medical Imaging (MI) datasets are fundamental to artificial intelligence in healthcare.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(9 more...)

Genre: Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
(2 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

7e810b2c75d69be186cadd2fe3febeab-Paper-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 23:33:38 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.14)
Africa > Middle East > Egypt (0.14)
Africa > Nigeria (0.14)
(15 more...)

Genre: Research Report > Experimental Study (0.47)

Industry:

Transportation > Air (1.00)
Media > News (1.00)
Law (1.00)
(6 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Convolutional Neural Nets vs Vision Transformers: A SpaceNet Case Study with Balanced vs Imbalanced Regimes

Gothi, Akshar

arXiv.org Artificial IntelligenceOct-7-2025

We present a controlled comparison of a convolutional neural network (EfficientNet-B0) and a Vision Transformer (ViT-Base) on SpaceNet under two label-distribution regimes: a naturally imbalanced five-class split and a balanced-resampled split with 700 images per class (70:20:10 train/val/test). With matched preprocessing (224x224, ImageNet normalization), lightweight augmentations, and a 40-epoch budget on a single NVIDIA P100, we report accuracy, macro-F1, balanced accuracy, per-class recall, and deployment metrics (model size and latency). On the imbalanced split, EfficientNet-B0 reaches 93% test accuracy with strong macro-F1 and lower latency; ViT-Base is competitive at 93% with a larger parameter count and runtime. On the balanced split, both models are strong; EfficientNet-B0 reaches 99% while ViT-Base remains competitive, indicating that balancing narrows architecture gaps while CNNs retain an efficiency edge. We release manifests, logs, and per-image predictions to support reproducibility.

artificial intelligence, epoch, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2510.03297

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback