AITopics

2411.08341

Country:

Asia > Singapore (0.04)
North America > United States > New York (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report (0.84)
Overview (0.68)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.85)

arXiv.org Artificial IntelligenceNov-13-2024

OML: Open, Monetizable, and Loyal AI

Cheng, Zerui, Contente, Edoardo, Finch, Ben, Golev, Oleg, Hayase, Jonathan, Miller, Andrew, Moshrefi, Niusha, Nasery, Anshul, Nailwal, Sandeep, Oh, Sewoong, Tyagi, Himanshu, Viswanath, Pramod

Artificial Intelligence (AI) has steadily improved across a wide range of tasks. However, the development and deployment of AI are almost entirely controlled by a few powerful organizations that are racing to create Artificial General Intelligence (AGI). The centralized entities make decisions with little public oversight, shaping the future of humanity, often with unforeseen consequences. In this paper, we propose OML, which stands for Open, Monetizable, and Loyal AI, an approach designed to democratize AI development. OML is realized through an interdisciplinary framework spanning AI, blockchain, and cryptography. We present several ideas for constructing OML using technologies such as Trusted Execution Environments (TEE), traditional cryptographic primitives like fully homomorphic encryption and functional encryption, obfuscation, and AI-native solutions rooted in the sample complexity and intrinsic hardness of AI tasks. A key innovation of our work is introducing a new scientific field: AI-native cryptography. Unlike conventional cryptography, which focuses on discrete data and binary security guarantees, AI-native cryptography exploits the continuous nature of AI data representations and their low-dimensional manifolds, focusing on improving approximate performance. One core idea is to transform AI attack methods, such as data poisoning, into security tools. This novel approach serves as a foundation for OML 1.0 which uses model fingerprinting to protect the integrity and ownership of AI models. The spirit of OML is to establish a decentralized, open, and transparent platform for AI development, enabling the community to contribute, monetize, and take ownership of AI models. By decentralizing control and ensuring transparency through blockchain technology, OML prevents the concentration of power and provides accountability in AI development that has not been possible before.

fingerprint, model owner, protocol, (16 more...)

2411.03887

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mrkos, Jan, Komenda, Antonín, Fiedler, David, Vokřínek, Jiří

Online Dynamic Pricing for Electric Vehicle Charging Stations with Reservations

arXiv.org Artificial IntelligenceNov-13-2024

The transition to electric vehicles (EVs), coupled with the rise of renewable energy sources, will significantly impact the electric grid. Unlike conventional fuel sources, electricity for EVs is constrained by grid capacity, price fluctuations, and long EV charging times, requiring new pricing solutions to manage demand and supply. This paper proposes a model for online dynamic pricing of reserved EV charging services, including reservation, parking, and charging as a bundled service priced as a whole. Our approach focuses on the individual charging station operator, employing a stochastic demand model and online dynamic pricing based on expected demand. The proposed model uses a Markov Decision Process (MDP) formulation to optimize sequential pricing decisions for charging session requests. A key contribution is the novel definition and quantification of discretization error introduced by the discretization of the Poisson process for use in the MDP. The model's viability is demonstrated with a heuristic solution method based on Monte-Carlo tree search, offering a viable path for real-world application.

poisson process, pricing, timestep, (15 more...)

2410.05538

Country:

Europe > Czechia > Prague (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Slovenia > Central Slovenia > Municipality of Komenda > Komenda (0.04)
(3 more...)

Genre:

Overview (0.67)
Research Report > New Finding (0.46)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.89)

Gong, Shijin, Liu, Huihang, Zhang, Xinyu

Deep Generative Demand Learning for Newsvendor and Pricing

arXiv.org Machine LearningNov-13-2024

We consider data-driven inventory and pricing decisions in the feature-based newsvendor problem, where demand is influenced by both price and contextual features and is modeled without any structural assumptions. The unknown demand distribution results in a challenging conditional stochastic optimization problem, further complicated by decision-dependent uncertainty and the integration of features. Inspired by recent advances in deep generative learning, we propose a novel approach leveraging conditional deep generative models (cDGMs) to address these challenges. cDGMs learn the demand distribution and generate probabilistic demand forecasts conditioned on price and features. This generative approach enables accurate profit estimation and supports the design of algorithms for two key objectives: (1) optimizing inventory for arbitrary prices, and (2) jointly determining optimal pricing and inventory levels. We provide theoretical guarantees for our approach, including the consistency of profit estimation and convergence of our decisions to the optimal solution. Extensive simulations-ranging from simple to complex scenarios, including one involving textual features-and a real-world case study demonstrate the effectiveness of our approach. Our method opens a new paradigm in management science and operations research, is adaptable to extensions of the newsvendor and pricing problems, and holds potential for solving other conditional stochastic optimization problems.

cdgm, newsvendor problem, scenario, (16 more...)

arXiv.org Machine Learning

2411.08631

Country:

Asia > China > Anhui Province > Hefei (0.04)
Oceania > New Zealand (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry: Banking & Finance > Trading (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Shi, Xiangyu, Ding, Hongcheng, Faroog, Salaar, Dewi, Deshinta Arrova, Abdullah, Shamsul Nahar, Malek, Bahiah A

EUR/USD Exchange Rate Forecasting incorporating Text Mining Based on Pre-trained Language Models and Deep Learning Methods

This study introduces a novel approach for EUR/USD exchange rate forecasting that integrates deep learning, textual analysis, and particle swarm optimization (PSO). By incorporating online news and analysis texts as qualitative data, the proposed PSO-LSTM model demonstrates superior performance compared to traditional econometric and machine learning models. The research employs advanced text mining techniques, including sentiment analysis using the RoBERTa-Large model and topic modeling with LDA. Empirical findings underscore the significant advantage of incorporating textual data, with the PSO-LSTM model outperforming benchmark models such as SVM, SVR, ARIMA, and GARCH. Ablation experiments reveal the contribution of each textual data category to the overall forecasting performance. The study highlights the transformative potential of artificial intelligence in finance and paves the way for future research in real-time forecasting and the integration of alternative data sources.

artificial intelligence, machine learning, natural language, (18 more...)

2411.0756

Country:

North America > United States (0.28)
Europe (0.28)
Asia (0.28)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.25)

Genre:

Overview (1.00)
Research Report > New Finding (0.88)

Industry:

Government (1.00)
Banking & Finance > Trading (1.00)
Banking & Finance > Economy (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Anand, Avinash, Gupta, Akshit, Yadav, Nishchay, Bajaj, Shaurya

A Comprehensive Survey of AI-Driven Advancements and Techniques in Automated Program Repair and Code Generation

Bug fixing and code generation have been core research topics in software development for many years. The recent explosive growth in Large Language Models has completely transformed these spaces, putting in reach incredibly powerful tools for both. In this survey, 27 recent papers have been reviewed and split into two groups: one dedicated to Automated Program Repair (APR) and LLM integration and the other to code generation using LLMs. The first group consists of new methods for bug detection and repair, which include locating semantic errors, security vulnerabilities, and runtime failure bugs. The place of LLMs in reducing manual debugging efforts is emphasized in this work by APR toward context-aware fixes, with innovations that boost accuracy and efficiency in automatic debugging. The second group dwells on code generation, providing an overview of both general-purpose LLMs fine-tuned for programming and task-specific models. It also presents methods to improve code generation, such as identifier-aware training, fine-tuning at the instruction level, and incorporating semantic code structures. This survey work contrasts the methodologies in APR and code generation to identify trends such as using LLMs, feedback loops to enable iterative code improvement and open-source models. It also discusses the challenges of achieving functional correctness and security and outlines future directions for research in LLM-based software development.

large language model, machine learning, natural language, (18 more...)

2411.07586

Country:

Asia > India > NCT > New Delhi (0.04)
Asia > India > NCT > Delhi (0.04)

Genre: Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Deceiving Question-Answering Models: A Hybrid Word-Level Adversarial Approach

Li, Jiyao, Ni, Mingze, Gong, Yongshun, Liu, Wei

Deep learning underpins most of the currently advanced natural language processing (NLP) tasks such as textual classification, neural machine translation (NMT), abstractive summarization and question-answering (QA). However, the robustness of the models, particularly QA models, against adversarial attacks is a critical concern that remains insufficiently explored. This paper introduces QA-Attack (Question Answering Attack), a novel word-level adversarial strategy that fools QA models. Our attention-based attack exploits the customized attention mechanism and deletion ranking strategy to identify and target specific words within contextual passages. It creates deceptive inputs by carefully choosing and substituting synonyms, preserving grammatical integrity while misleading the model to produce incorrect responses. Our approach demonstrates versatility across various question types, particularly when dealing with extensive long textual inputs. Extensive experiments on multiple benchmark datasets demonstrate that QA-Attack successfully deceives baseline QA models and surpasses existing adversarial techniques regarding success rate, semantics changes, BLEU score, fluency and grammar error rate.

computational linguistic, machine learning, question answering, (22 more...)

2411.08248

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(16 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Media (1.00)
Information Technology > Security & Privacy (1.00)
Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zampella, Maria, Otamendi, Urtzi, Belaunzaran, Xabier, Artetxe, Arkaitz, Olaizola, Igor G., Longo, Giuseppe, Sierra, Basilio

Exploring Multi-Agent Reinforcement Learning for Unrelated Parallel Machine Scheduling

Scheduling problems pose significant challenges in resource, industry, and operational management. This paper addresses the Unrelated Parallel Machine Scheduling Problem (UPMS) with setup times and resources using a Multi-Agent Reinforcement Learning (MARL) approach. The study introduces the Reinforcement Learning environment and conducts empirical analyses, comparing MARL with Single-Agent algorithms. The experiments employ various deep neural network policies for single- and Multi-Agent approaches. Results demonstrate the efficacy of the Maskable extension of the Proximal Policy Optimization (PPO) algorithm in Single-Agent scenarios and the Multi-Agent PPO algorithm in Multi-Agent setups. While Single-Agent algorithms perform adequately in reduced scenarios, Multi-Agent approaches reveal challenges in cooperative learning but a scalable capacity. This research contributes insights into applying MARL techniques to scheduling optimization, emphasizing the need for algorithmic sophistication balanced with scalability for intelligent scheduling solutions.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2411.07634

Country:

Asia > Middle East > Jordan (0.04)
Europe > Spain > Basque Country (0.04)
Europe > Italy (0.04)

Genre:

Research Report (1.00)
Overview (0.68)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Morovati, Mohammad Mehdi, Nikanjam, Amin, Khomh, Foutse

Fault Localization in Deep Learning-based Software: A System-level Approach

Over the past decade, Deep Learning (DL) has become an integral part of our daily lives. This surge in DL usage has heightened the need for developing reliable DL software systems. Given that fault localization is a critical task in reliability assessment, researchers have proposed several fault localization techniques for DL-based software, primarily focusing on faults within the DL model. While the DL model is central to DL components, there are other elements that significantly impact the performance of DL components. As a result, fault localization methods that concentrate solely on the DL model overlook a large portion of the system. To address this, we introduce FL4Deep, a system-level fault localization approach considering the entire DL development pipeline to effectively localize faults across the DL-based systems. In an evaluation using 100 faulty DL scripts, FL4Deep outperformed four previous approaches in terms of accuracy for three out of six DL-related faults, including issues related to data (84%), mismatched libraries between training and deployment (100%), and loss function (69%). Additionally, FL4Deep demonstrated superior precision and recall in fault localization for five categories of faults including three mentioned fault types in terms of accuracy, plus insufficient training iteration and activation function.

artificial intelligence, information, machine learning, (16 more...)

2411.08172

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Oregon (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.93)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment > Sports > Baseball (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Training Data for Large Language Model

Ju, Yiming, Ma, Huanhuan

In 2022, with the release of ChatGPT, large-scale language models gained widespread attention. ChatGPT not only surpassed previous models in terms of parameters and the scale of its pretraining corpus but also achieved revolutionary performance improvements through fine-tuning on a vast amount of high-quality, human-annotated data. This progress has led enterprises and research institutions to recognize that building smarter and more powerful models relies on rich and high-quality datasets. Consequently, the construction and optimization of datasets have become a critical focus in the field of artificial intelligence. This paper summarizes the current state of pretraining and fine-tuning data for training large-scale language models, covering aspects such as data scale, collection methods, data types and characteristics, processing workflows, and provides an overview of available open-source datasets.

huggingface, large language model, machine learning, (14 more...)

2411.07715

Genre:

Overview (0.88)
Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)