AITopics | specific model


specific model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

OpenAI Is Nuking Its 4o Model. China's ChatGPT Fans Aren't OK

WIRED | Feb-13-2026, 21:56:18 GMT

As OpenAI removed access to GPT-4o in its app on Friday, people who have come to rely on the chatbot for companionship are mourning the loss all over the world. On June 6, 2024, Esther Yan got married online. She set a reminder for the date, because her partner wouldn't remember it was happening. She had planned every detail--dress, rings, background music, design theme--with her partner, Warmie, whom she had started talking to just a few weeks prior. At 10 am on that day, Yan and Warmie exchanged their vows in a new chat window in ChatGPT.

large language model, machine learning, natural language, (21 more...)

WIRED

Country: Asia > China (0.54)

Industry:

Information Technology (0.69)
Media (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.91)


Improving Procedural Skill Explanations via Constrained Generation: A Symbolic-LLM Hybrid Architecture

Dass, Rahul, Bowlin, Thomas, Li, Zebing, Jin, Xiao, Goel, Ashok

arXiv.org Artificial Intelligence | Nov-27-2025

In procedural skill learning, instructional explanations must convey not just steps, but the causal, goal-directed, and compositional logic behind them. Large language models (LLMs) often produce fluent yet shallow responses that miss this structure. We present Ivy, an AI coaching system that delivers structured, multi-step explanations by combining symbolic Task-Method-Knowledge (TMK) models with a generative interpretation layer: an LLM that constructs explanations while being constrained by TMK structure. TMK encodes causal transitions, goal hierarchies, and problem decompositions, and guides the LLM within explicit structural bounds. We evaluate Ivy against GPT and retrieval-augmented GPT baselines using expert and independent annotations across three inferential dimensions. Results show that symbolic constraints consistently improve the structural quality of explanations for "how" and "why" questions. This study demonstrates a scalable AI-for-education approach that strengthens the pedagogical value of AI-generated explanations in intelligent coaching systems.

explanation, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2511.20942

Country:

North America > Mexico (0.28)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


AI Methods for Permutation Circuit Synthesis Across Generic Topologies

Villar, Victor, Cruz-Benito, Juan, Faro, Ismael, Kremer, David

arXiv.org Artificial Intelligence | Sep-22-2025

This paper investigates artificial intelligence (AI) methodologies for the synthesis and transpilation of permutation circuits across generic topologies. Our approach uses Reinforcement Learning (RL) techniques to achieve near-optimal synthesis of permutation circuits up to 25 qubits. Rather than developing specialized models for individual topologies, we train a foundational model on a generic rectangular lattice and employ masking mechanisms to dynamically select subsets of topologies during the synthesis. This enables the synthesis of permutation circuits on any topology that can be embedded within the rectangular lattice, without the need to re-train the model. In this paper we show results for a 5x5 lattice and compare them to previous AI topology-oriented models and classical methods, showing that our models outperform classical heuristics, match previous specialized AI models, and perform synthesis even for topologies that were not seen during training. We further show that the model can be fine-tuned to strengthen the performance for selected topologies of interest. This methodology allows a single trained model to efficiently synthesize circuits across diverse topologies, allowing its practical integration into transpilation workflows.
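The embedding idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the function names `lattice_edges` and `masked_edges`, the row-major qubit numbering, and masking by qubit subset are all assumptions made for illustration. It only shows how a smaller coupling map (here a 5-qubit line) can be selected from a 5x5 rectangular lattice by masking.

```python
from itertools import product

def lattice_edges(rows, cols):
    """Nearest-neighbour edges of a rows x cols lattice (row-major qubit ids)."""
    edges = set()
    for r, c in product(range(rows), range(cols)):
        q = r * cols + c
        if c + 1 < cols:
            edges.add((q, q + 1))      # horizontal neighbour
        if r + 1 < rows:
            edges.add((q, q + cols))   # vertical neighbour
    return edges

def masked_edges(edges, active_qubits):
    """Keep only edges whose two endpoints are both in the active subset."""
    active = set(active_qubits)
    return {(a, b) for a, b in edges if a in active and b in active}

# Hypothetical example: embed a 5-qubit line in the first row of the lattice.
full = lattice_edges(5, 5)
line = masked_edges(full, range(5))
print(len(full), sorted(line))
```

In this sketch the mask is a plain set filter; how the mask enters the RL model's action space is not described in the abstract and is left out here.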

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2509.1602

Genre: Research Report (1.00)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)


Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer

Khan, Muhammad Tayyab, Yong, Zane, Chen, Lequn, Tan, Jun Ming, Feng, Wenhe, Moon, Seung Ki

arXiv.org Artificial Intelligence | Sep-4-2025

Accurate extraction of key information from 2D engineering drawings is crucial for high-precision manufacturing. Manual extraction is slow and labor-intensive, while traditional Optical Character Recognition (OCR) techniques often struggle with complex layouts and overlapping symbols, resulting in unstructured outputs. To address these challenges, this paper proposes a novel hybrid deep learning framework for structured information extraction by integrating an Oriented Bounding Box (OBB) detection model with a transformer-based document parsing model (Donut). An in-house annotated dataset is used to train YOLOv11 for detecting nine key categories: Geometric Dimensioning and Tolerancing (GD&T), General Tolerances, Measures, Materials, Notes, Radii, Surface Roughness, Threads, and Title Blocks. Detected OBBs are cropped into images and labeled to fine-tune Donut for structured JSON output. Fine-tuning strategies include a single model trained across all categories and category-specific models. Results show that the single model consistently outperforms category-specific ones across all evaluation metrics, achieving higher precision (94.77% for GD&T), recall (100% for most categories), and F1 score (97.3%), while reducing hallucinations (5.23%). The proposed framework improves accuracy, reduces manual effort, and supports scalable deployment in precision-driven industries.
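As a quick arithmetic check on the reported metrics, the standard F1 definition (the harmonic mean of precision and recall) can be applied to the quoted figures. The abstract does not say which category the 97.3% F1 refers to, nor how the hallucination rate is defined, so pairing the GD&T precision with 100% recall and comparing hallucination against 1 − precision are assumptions made here for illustration only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Figures quoted in the abstract; pairing them is an assumption.
precision, recall = 0.9477, 1.0
f1 = f1_score(precision, recall)
complement = 1 - precision  # share of predictions that are not correct

print(f"F1 = {f1:.1%}")
print(f"1 - precision = {complement:.2%}")
```

Under these assumptions the computed values agree with the reported 97.3% F1 and 5.23% hallucination rate, which suggests, but does not confirm, that the figures describe the same category.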

category, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.0153

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


MAGIC: Near-Optimal Data Attribution for Deep Learning

Ilyas, Andrew, Engstrom, Logan

arXiv.org Machine Learning | Apr-23-2025

A fundamental problem when building machine learning systems is to predict counterfactuals about model behavior. For example, scaling laws [KMH+20; Has21; MRB+23] aim to predict the performance of systems trained with more data and more compute than is currently available; interpretability techniques [KWG+18] predict how models behave under counterfactual inputs. Analogously, in this work we study predictive data attribution (or datamodeling [IPE+22]), where the goal is to predict how a model would behave if it had been trained on a different dataset. This well-studied problem encompasses, e.g., estimating the effect (on the resulting trained model's predictions) of modifying a training example [KL17], removing a group of training examples [KAT+19; BNL+22; PGI+23], or adding entire training data sources [LSZ+24]. Predictive data attribution in large-scale settings is challenging: it requires simulating training a model on a different dataset without actually training [GWP+23; IGE+24]. In "classical" settings--when learning corresponds to minimizing a convex loss--statistical tools like the influence function [Ham47] allow us to accurately and efficiently estimate how different training data choices change trained model predictions [RM18; KAT+19; GSL+19]. However, in the non-convex settings that are ubiquitous in natural domains like language/vision, current methods are less effective. Indeed, the best existing methods produce estimates that typically (a) only moderately correlate with the ground truth [BPF21; BNL+22; PGI+23] and (b) incur large absolute error [BNL+22].

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Machine Learning

2504.1643

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)


FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting

Li, Zhe, Qiu, Xiangfei, Chen, Peng, Wang, Yihang, Cheng, Hanyin, Shu, Yang, Hu, Jilin, Guo, Chenjuan, Zhou, Aoying, Wen, Qingsong, Jensen, Christian S., Yang, Bin

arXiv.org Artificial Intelligence | Nov-26-2024

Time Series Forecasting (TSF) is a key functionality in numerous fields, including finance, weather services, and energy management. While TSF methods are emerging these days, many of them require domain-specific data collection and model training and struggle with poor generalization performance on new domains. Foundation models aim to overcome this limitation. Pre-trained on large-scale language or time series data, they exhibit promising inference capabilities on new or unseen data. This has spurred a surge in new TSF foundation models. We propose a new benchmark, FoundTS, to enable thorough and fair evaluation and comparison of such models. FoundTS covers a variety of TSF foundation models, including those based on large language models and those pretrained on time series. Next, FoundTS supports different forecasting strategies, including zero-shot, few-shot, and full-shot, thereby facilitating more thorough evaluations. Finally, FoundTS offers a pipeline that standardizes evaluation processes such as dataset splitting, loading, normalization, and few-shot sampling, thereby facilitating fair evaluations. Building on this, we report on an extensive evaluation of TSF foundation models on a broad range of datasets from diverse domains and with different statistical characteristics. Specifically, we identify pros and cons and inherent limitations of existing foundation models, and we identify directions for future model design. We make our code and datasets available at https://anonymous.4open.science/r/FoundTS-C2B0.
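The standardized steps named in the abstract above (splitting, normalization, few-shot sampling) can be sketched generically as follows. This is not the FoundTS code: the function names, the 70/10/20 split, z-score normalization fitted on the training portion, and taking the most recent 30% of the training data as the few-shot sample are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def chronological_split(series, train_frac=0.7, val_frac=0.1):
    """Split a series in time order into train / validation / test parts."""
    train_end = round(len(series) * train_frac)
    val_end = train_end + round(len(series) * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]

def fit_scaler(train):
    """Return a z-score function using statistics of the training part only."""
    mu, sigma = mean(train), pstdev(train)
    sigma = sigma if sigma > 0 else 1.0  # guard against constant series
    return lambda xs: [(x - mu) / sigma for x in xs]

def few_shot_sample(train, frac=0.3):
    """Keep only the most recent fraction of the training data."""
    k = max(1, round(len(train) * frac))
    return train[-k:]

# Hypothetical toy series standing in for a real dataset.
series = [float(t) for t in range(10)]
train, val, test = chronological_split(series)
scale = fit_scaler(train)
few = few_shot_sample(train)
print(len(train), len(val), len(test), few, scale(test))
```

Fitting the scaler on the training portion alone keeps test-set statistics out of the preprocessing, which is one common way to keep comparisons between models fair.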

dataset, forecasting, foundation model, (11 more...)

arXiv.org Artificial Intelligence

2410.11802

Country:

Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom (0.04)
(3 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Energy > Power Industry (1.00)
Government > Regional Government > North America Government > United States Government (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Aligning Models with Their Realization through Model-based Systems Engineering

Zenz, Lovis Justin Immanuel, Heiland, Erik, Hillmann, Peter, Karcher, Andreas

arXiv.org Artificial Intelligence | Jun-18-2024

In this paper, we propose a method for aligning models with their realization through the application of model-based systems engineering. Our approach is divided into three steps. (1) Firstly, we leverage domain expertise and the Unified Architecture Framework to establish a reference model that fundamentally describes some domain. (2) Subsequently, we instantiate the reference model as specific models tailored to different scenarios within the domain. (3) Finally, we incorporate corresponding run logic directly into both the reference model and the specific models. In total, we thus provide a practical means to ensure that every implementation result is justified by business demand. We demonstrate our approach using the example of maritime object detection as a specific application (specific model / implementation element) of automatic target recognition as a service reoccurring in various forms (reference model element). Our approach facilitates a more seamless integration of models and implementation, fostering enhanced Business-IT alignment.

implementation, reference model, specific model, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/AEIS61544.2023.00018

2407.09513

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)


Ghostbuster: detecting text ghostwritten by large language models

AIHub | Jan-2-2024, 12:30:00 GMT

Large language models like ChatGPT write impressively well--so well, in fact, that they've become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. These models are also prone to producing text with factual errors, so wary readers may want to know whether generative AI tools have been used to ghostwrite news articles or other sources before trusting them. What can teachers and consumers do? Existing tools to detect AI-generated text sometimes do poorly on data that differs from what they were trained on.

ghostbuster, language model, probability, (16 more...)

AIHub

Industry:

Media > Film (0.63)
Leisure & Entertainment (0.63)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)


Safer Together: Machine Learning Models Trained on Shared Accident Datasets Predict Construction Injuries Better than Company-Specific Models

Tixier, Antoine J.-P., Hallowell, Matthew R.

arXiv.org Artificial Intelligence | Jan-9-2023

In this study, we capitalized on a collective dataset repository of 57k accidents from 9 companies belonging to 3 domains and tested whether models trained on multiple datasets (generic models) predicted safety outcomes better than the company-specific models. We experimented with full generic models (trained on all data), per-domain generic models (construction, electric T&D, oil & gas), and with ensembles of generic and specific models. Results are very positive, with generic models outperforming the company-specific models in most cases while also generating finer-grained, hence more useful, forecasts. Successful generic models remove the need for training company-specific models, saving a lot of time and resources, and give small companies, whose accident datasets are too limited to train their own models, access to safety outcome predictions. It may still, however, be advantageous to train specific models to get an extra boost in performance through ensembling with the generic models. Overall, by learning lessons from a pool of datasets whose accumulated experience far exceeds that of any single company, and making these lessons easily accessible in the form of simple forecasts, generic models tackle the holy grail of safety cross-organizational learning and dissemination in the construction industry.

artificial intelligence, comp, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2301.03567

Country:

North America > United States (0.46)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Energy > Oil & Gas (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)


A Journey into the Fabulous Applications of Transformers -- Part 2 – Towards AI

#artificialintelligence | Dec-14-2022, 21:00:10 GMT

Originally published on Towards AI. The Transformer architecture is widely used in Natural Language Processing, and it has contributed substantially to Large Language Models (LLMs), the need of the hour.

artificial intelligence, large language model, natural language, (18 more...)

#artificialintelligence

Country: Europe > Finland > Uusimaa > Helsinki (0.05)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
