AITopics

2410.23069

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Finland > Southwest Finland > Turku (0.04)
North America > United States > Virginia (0.04)
(17 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.95)
Education > Educational Setting (0.67)
Education > Educational Technology (0.67)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

Zargham, Nima, Dubiel, Mateusz, Desai, Smit, Mildner, Thomas, Belz, Hanz-Joachim

Designing AI Personalities: Enhancing Human-Agent Interaction Through Thoughtful Persona Design

arXiv.org Artificial IntelligenceOct-30-2024

In the rapidly evolving field of artificial intelligence (AI) agents, designing the agent's characteristics is crucial for shaping user experience. This workshop aims to establish a research community focused on AI agent persona design for various contexts, such as in-car assistants, educational tools, and smart home environments. We will explore critical aspects of persona design, such as voice, embodiment, and demographics, and their impact on user satisfaction and engagement. Through discussions and hands-on activities, we aim to propose practices and standards that enhance the ecological validity of agent personas. Topics include the design of conversational interfaces, the influence of agent personas on user experience, and approaches for creating contextually appropriate AI agents. This workshop will provide a platform for building a community dedicated to developing AI agent personas that better fit diverse, everyday interactions.

computing machinery, conversational user interface, new york, (10 more...)

2410.22744

Country:

Europe > Germany > Bremen > Bremen (0.30)
North America > United States > New York > New York County > New York City (0.10)
Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.05)
(19 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Safa, Abdulfattah, Kapanadze, Tamta, Uzunoğlu, Arda, Şahin, Gözde Gül

A Systematic Survey on Instructional Text: From Representation Formats to Downstream NLP Tasks

arXiv.org Artificial IntelligenceOct-30-2024

Recent advances in large language models have demonstrated promising capabilities in following simple instructions through instruction tuning. However, real-world tasks often involve complex, multi-step instructions that remain challenging for current NLP systems. Despite growing interest in this area, there lacks a comprehensive survey that systematically analyzes the landscape of complex instruction understanding and processing. Through a systematic review of the literature, we analyze available resources, representation schemes, and downstream tasks related to instructional text. Our study examines 177 papers, identifying trends, challenges, and opportunities in this emerging field. We provide AI/NLP researchers with essential background knowledge and a unified view of various approaches to complex instruction understanding, bridging gaps between different research directions and highlighting future research opportunities.

computational linguistic, instruction, proceedings, (14 more...)

2410.18529

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Maryland > Baltimore (0.14)
(31 more...)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.45)
Research Report > Experimental Study (0.34)

Industry:

Education (0.92)
Leisure & Entertainment > Games > Computer Games (0.68)
Information Technology (0.67)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Franceschi, Luca, Donini, Michele, Perrone, Valerio, Klein, Aaron, Archambeau, Cédric, Seeger, Matthias, Pontil, Massimiliano, Frasconi, Paolo

Hyperparameter Optimization in Machine Learning

arXiv.org Machine LearningOct-30-2024

Hyperparameters are configuration variables controlling the behavior of machine learning algorithms. They are ubiquitous in machine learning and artificial intelligence and the choice of their values determine the effectiveness of systems based on these technologies. Manual hyperparameter search is often unsatisfactory and becomes unfeasible when the number of hyperparameters is large. Automating the search is an important step towards automating machine learning, freeing researchers and practitioners alike from the burden of finding a good set of hyperparameters by trial and error. In this survey, we present a unified treatment of hyperparameter optimization, providing the reader with examples and insights into the state-of-the-art. We cover the main families of techniques to automate hyperparameter search, often referred to as hyperparameter optimization or tuning, including random and quasi-random search, bandit-, model- and gradient- based approaches. We further discuss extensions, including online, constrained, and multi-objective formulations, touch upon connections with other fields such as meta-learning and neural architecture search, and conclude with open questions and future research directions.

algorithm, hyperparameter, optimization, (12 more...)

arXiv.org Machine Learning

2410.22854

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(12 more...)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Information Technology (0.67)
Education (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(6 more...)

Fernsel, Linda, Kalff, Yannick, Simbeck, Katharina

Assessing the Auditability of AI-integrating Systems: A Framework and Learning Analytics Case Study

arXiv.org Artificial IntelligenceOct-29-2024

Audits contribute to the trustworthiness of Learning Analytics (LA) systems that integrate Artificial Intelligence (AI) and may be legally required in the future. We argue that the efficacy of an audit depends on the auditability of the audited system. Therefore, systems need to be designed with auditability in mind. We present a framework for assessing the auditability of AI-integrating systems that consists of three parts: (1) Verifiable claims about the validity, utility and ethics of the system, (2) Evidence on subjects (data, models or the system) in different types (documentation, raw sources and logs) to back or refute claims, (3) Evidence must be accessible to auditors via technical means (APIs, monitoring tools, explainable AI, etc.). We apply the framework to assess the auditability of Moodle's dropout prediction system and a prototype AI-based LA. We find that Moodle's auditability is limited by incomplete documentation, insufficient monitoring capabilities and a lack of available test data. The framework supports assessing the auditability of AI-based LA systems in use and improves the design of auditable systems and thus of audits.

auditability, data mining, machine learning, (15 more...)

2411.08906

Country:

North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
(4 more...)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.93)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)

Kardanova, Elena, Ivanova, Alina, Tarasova, Ksenia, Pashchenko, Taras, Tikhoniuk, Aleksei, Yusupova, Elen, Kasprzhak, Anatoly, Kuzminov, Yaroslav, Kruchinskaia, Ekaterina, Brun, Irina

A Novel Psychometrics-Based Approach to Developing Professional Competency Benchmark for Large Language Models

arXiv.org Artificial IntelligenceOct-29-2024

The era of large language models (LLM) raises questions not only about how to train models, but also about how to evaluate them. Despite numerous existing benchmarks, insufficient attention is often given to creating assessments that test LLMs in a valid and reliable manner. To address this challenge, we accommodate the Evidence-centered design (ECD) methodology and propose a comprehensive approach to benchmark development based on rigorous psychometric principles. In this paper, we have made the first attempt to illustrate this approach by creating a new benchmark in the field of pedagogy and education, highlighting the limitations of existing benchmark development approach and taking into account the development of LLMs. We conclude that a new approach to benchmarking is required to match the growing complexity of AI applications in the educational context. We construct a novel benchmark guided by the Bloom's taxonomy and rigorously designed by a consortium of education experts trained in test development. Thus the current benchmark provides an academically robust and practical assessment tool tailored for LLMs, rather than human participants. Tested empirically on the GPT model in the Russian language, it evaluates model performance across varied task complexities, revealing critical gaps in current LLM capabilities. Our results indicate that while generative AI tools hold significant promise for education - potentially supporting tasks such as personalized tutoring, real-time feedback, and multilingual learning - their reliability as autonomous teachers' assistants right now remain rather limited, particularly in tasks requiring deeper cognitive engagement.

benchmark, large language model, machine learning, (18 more...)

2411.00045

Country:

Asia > Russia (0.14)
North America > United States > New York (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.66)

Industry:

Education > Educational Setting (1.00)
Education > Assessment & Standards (1.00)
Education > Curriculum > Subject-Specific Education (0.93)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Large Language Models for Manufacturing

Li, Yiwei, Zhao, Huaqin, Jiang, Hanqi, Pan, Yi, Liu, Zhengliang, Wu, Zihao, Shu, Peng, Tian, Jie, Yang, Tianze, Xu, Shaochen, Lyu, Yanjun, Blenk, Parker, Pence, Jacob, Rupram, Jason, Banu, Eliza, Liu, Ninghao, Wang, Linbing, Song, Wenzhan, Zhai, Xiaoming, Song, Kenan, Zhu, Dajiang, Li, Beiwen, Wang, Xianqiao, Liu, Tianming

The rapid advances in Large Language Models (LLMs) have the potential to transform manufacturing industry, offering new opportunities to optimize processes, improve efficiency, and drive innovation. This paper provides a comprehensive exploration of the integration of LLMs into the manufacturing domain, focusing on their potential to automate and enhance various aspects of manufacturing, from product design and development to quality control, supply chain optimization, and talent management. Through extensive evaluations across multiple manufacturing tasks, we demonstrate the remarkable capabilities of state-of-the-art LLMs, such as GPT-4V, in understanding and executing complex instructions, extracting valuable insights from vast amounts of data, and facilitating knowledge sharing. We also delve into the transformative potential of LLMs in reshaping manufacturing education, automating coding processes, enhancing robot control systems, and enabling the creation of immersive, data-rich virtual environments through the industrial metaverse. By highlighting the practical applications and emerging use cases of LLMs in manufacturing, this paper aims to provide a valuable resource for professionals, researchers, and decision-makers seeking to harness the power of these technologies to address real-world challenges, drive operational excellence, and unlock sustainable growth in an increasingly competitive landscape.

large language model, machine learning, natural language, (19 more...)

2410.21418

Country: North America > United States (1.00)

Genre:

Overview (1.00)
Instructional Material (1.00)
Workflow (0.93)
Research Report > Promising Solution (0.92)

Industry:

Semiconductors & Electronics (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.68)

A Tutorial on Clinical Speech AI Development: From Data Collection to Model Validation

Ng, Si-Ioi, Xu, Lingfeng, Siegert, Ingo, Cummins, Nicholas, Benway, Nina R., Liss, Julie, Berisha, Visar

There has been a surge of interest in leveraging speech as a marker of health for a wide spectrum of conditions. The underlying premise is that any neurological, mental, or physical deficits that impact speech production can be objectively assessed via automated analysis of speech. Recent advances in speech-based Artificial Intelligence (AI) models for diagnosing and tracking mental health, cognitive, and motor disorders often use supervised learning, similar to mainstream speech technologies like recognition and verification. However, clinical speech AI has distinct challenges, including the need for specific elicitation tasks, small available datasets, diverse speech representations, and uncertain diagnostic labels. As a result, application of the standard supervised learning paradigm may lead to models that perform well in controlled settings but fail to generalize in real-world clinical deployments. With translation into real-world clinical scenarios in mind, this tutorial paper provides an overview of the key components required for robust development of clinical speech AI. Specifically, this paper will cover the design of speech elicitation tasks and protocols most appropriate for different clinical conditions, collection of data and verification of hardware, development and validation of speech representations designed to measure clinical constructs of interest, development of reliable and robust clinical prediction models, and ethical and participant considerations for clinical speech AI. The goal is to provide comprehensive guidance on building models whose inputs and outputs link to the more interpretable and clinically meaningful aspects of speech, that can be interrogated and clinically validated on clinical datasets, and that adhere to ethical, privacy, and security considerations by design.

data mining, large language model, machine learning, (24 more...)

2410.2164

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
Asia > China > Hong Kong (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(10 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (0.99)
Overview (0.86)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
(8 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(6 more...)

Towards Multi-dimensional Explanation Alignment for Medical Classification

Hu, Lijie, Lai, Songning, Chen, Wenshuo, Xiao, Hongru, Lin, Hongbin, Yu, Lu, Zhang, Jingfeng, Wang, Di

The lack of interpretability in the field of medical image analysis has significant ethical and legal implications. Existing interpretable methods in this domain encounter several challenges, including dependency on specific models, difficulties in understanding and visualization, as well as issues related to efficiency. To address these limitations, we propose a novel framework called Med-MICN (Medical Multi-dimensional Interpretable Concept Network). Med-MICN provides interpretability alignment for various angles, including neural symbolic reasoning, concept semantics, and saliency maps, which are superior to current interpretable methods. Its advantages include high prediction accuracy, interpretability across multiple dimensions, and automation through an end-to-end concept labeling process that reduces the need for extensive human training effort when working with new datasets. To demonstrate the effectiveness and interpretability of Med-MICN, we apply it to four benchmark datasets and compare it with baselines. The results clearly demonstrate the superior performance and interpretability of our Med-MICN.

data mining, machine learning, natural language, (19 more...)

2410.21494

Country:

Asia > Vietnam (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Europe > Switzerland (0.04)
(2 more...)

Genre:

Research Report (1.00)
Instructional Material > Online (0.41)
Instructional Material > Course Syllabus & Notes (0.41)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

AutoGLM: Autonomous Foundation Agents for GUIs

Liu, Xiao, Qin, Bo, Liang, Dongzhu, Dong, Guang, Lai, Hanyu, Zhang, Hanchen, Zhao, Hanlin, Iong, Iat Long, Sun, Jiadai, Wang, Jiaqi, Gao, Junjie, Shan, Junjun, Liu, Kangning, Zhang, Shudan, Yao, Shuntian, Cheng, Siyi, Yao, Wentao, Zhao, Wenyi, Liu, Xinghan, Liu, Xinyi, Chen, Xinying, Yang, Xinyue, Yang, Yang, Xu, Yifan, Yang, Yu, Wang, Yujia, Xu, Yulin, Qi, Zehan, Dong, Yuxiao, Tang, Jie

We present AutoGLM, a new series in the ChatGLM family, designed to serve as foundation agents for autonomous control of digital devices through Graphical User Interfaces (GUIs). While foundation models excel at acquiring human knowledge, they often struggle with decision-making in dynamic real-world environments, limiting their progress toward artificial general intelligence. This limitation underscores the importance of developing foundation agents capable of learning through autonomous environmental interactions by reinforcing existing models. Focusing on Web Browser and Phone as representative GUI scenarios, we have developed AutoGLM as a practical foundation agent system for real-world GUI interactions. Our approach integrates a comprehensive suite of techniques and infrastructures to create deployable agent systems suitable for user delivery. Through this development, we have derived two key insights: First, the design of an appropriate "intermediate interface" for GUI control is crucial, enabling the separation of planning and grounding behaviors, which require distinct optimization for flexibility and accuracy respectively. Second, we have developed a novel progressive training framework that enables self-evolving online curriculum reinforcement learning for AutoGLM. Our evaluations demonstrate AutoGLM's effectiveness across multiple domains. For web browsing, AutoGLM achieves a 55.2% success rate on VAB-WebArena-Lite (improving to 59.1% with a second attempt) and 96.2% on OpenTable evaluation tasks. In Android device control, AutoGLM attains a 36.2% success rate on AndroidLab (VAB-Mobile) and 89.7% on common tasks in popular Chinese APPs.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

2411.0082

Country:

North America > United States (0.04)
Asia > China > Guangdong Province > Zhuhai (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.35)

Industry: Information Technology > Software (0.34)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)