AITopics

2312.13397

Country:

Europe > Germany > Berlin (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Nepal (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Vertical Symbolic Regression

Jiang, Nan, Nasim, Md, Xue, Yexiang

Automating scientific discovery has been a grand goal of Artificial Intelligence (AI) and will bring tremendous societal impact. Learning symbolic expressions from experimental data is a vital step in AI-driven scientific discovery. Despite exciting progress, most endeavors have focused on the horizontal discovery paths, i.e., they directly search for the best expression in the full hypothesis space involving all the independent variables. Horizontal paths are challenging due to the exponentially large hypothesis space involving all the independent variables. We propose Vertical Symbolic Regression (VSR) to expedite symbolic regression. The VSR starts by fitting simple expressions involving a few independent variables under controlled experiments where the remaining variables are held constant. It then extends the expressions learned in previous rounds by adding new independent variables and using new control variable experiments allowing these variables to vary. The first few steps in vertical discovery are significantly cheaper than the horizontal path, as their search is in reduced hypothesis spaces involving a small set of variables. As a consequence, vertical discovery has the potential to supercharge state-of-the-art symbolic regression approaches in handling complex equations with many contributing factors. Theoretically, we show that the search space of VSR can be exponentially smaller than that of horizontal approaches when learning a class of expressions. Experimentally, VSR outperforms several baselines in learning symbolic expressions involving many independent variables.
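
The vertical path described in the abstract can be illustrated with a toy sketch. It assumes a hypothetical hidden law `y = 3*x1 + x1*x2` and replaces the symbolic search used in VSR with ordinary least squares over fixed templates, so it only shows the control-variable idea, not the authors' method. All function and variable names are illustrative.

```python
def ground_truth(x1, x2):
    # Hypothetical "experiment" standing in for real data: y = 3*x1 + x1*x2.
    return 3.0 * x1 + x1 * x2


def fit_slope(xs, ys):
    # Least-squares slope k for the template y = k * x (no intercept).
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def fit_line(xs, ys):
    # Least-squares slope and intercept for the template y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


x1_values = [1.0, 2.0, 3.0, 4.0]
x2_controls = [0.0, 1.0, 2.0, 3.0]

# Round 1: hold x2 constant at each control value and fit y = k * x1.
slopes = [
    fit_slope(x1_values, [ground_truth(x1, c) for x1 in x1_values])
    for c in x2_controls
]

# Round 2: let x2 vary and fit the learned coefficient k as a function of x2.
a, b = fit_line(x2_controls, slopes)

# Reassembled expression: y = (b + a * x2) * x1
print(f"y = ({b:.1f} + {a:.1f} * x2) * x1")
```

Each round searches over a single variable, which is the reason the abstract gives for the early vertical steps being cheaper than a search over all variables at once.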

artificial intelligence, evolutionary algorithm, machine learning, (19 more...)

2312.11955

Country:

North America > United States (0.45)
Europe > United Kingdom (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Studying the Practices of Testing Machine Learning Software in the Wild

Openja, Moses, Khomh, Foutse, Foundjem, Armstrong, Jiang, Zhen Ming, Abidi, Mouna, Hassan, Ahmed E.

Background: We are witnessing an increasing adoption of machine learning (ML), especially deep learning (DL) algorithms, in many software systems, including safety-critical systems such as health care systems or autonomous driving vehicles. Ensuring the software quality of these systems remains an open challenge for the research community, mainly due to the inductive nature of ML software systems. Traditionally, software systems were constructed deductively, by writing down the rules that govern the behavior of the system as program code. However, for ML software, these rules are inferred from training data. A few recent research advances in the quality assurance of ML systems have adapted different concepts from traditional software testing, such as mutation testing, to help improve the reliability of ML software systems. However, it is unclear if any of these proposed testing techniques from research are adopted in practice. There is little empirical evidence about the testing strategies of ML engineers. Aims: To fill this gap, we perform the first fine-grained empirical study on ML testing practices in the wild, to identify the ML properties being tested, the testing strategies followed, and their implementation throughout the ML workflow. Method: First, we systematically summarized the different testing strategies (e.g., Oracle Approximation), the tested ML properties (e.g., Correctness, Bias, and Fairness), and the testing methods (e.g., Unit test) from the literature. Then, we conducted a study to understand the practices of testing ML software. Results: 1) We identified four major categories of testing strategy (Grey-box, White-box, Black-box, and Heuristic-based techniques) that ML engineers use to find software bugs. 2) We identified 16 ML properties that are tested in the ML workflow.

ml workflow activity, test pyramid, value range analysis, (13 more...)

2312.12604

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Banking & Finance (0.92)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CGS-Mask: Making Time Series Predictions Intuitive for All

Lu, Feng, Li, Wei, Sun, Yifei, Song, Cheng, Ren, Yufei, Zomaya, Albert Y.

Artificial intelligence (AI) has immense potential in time series prediction, but most explainable tools have limited capabilities in providing a systematic understanding of important features over time. These tools typically rely on evaluating a single time point, overlook the time ordering of inputs, and neglect the time-sensitive nature of time series applications. These factors make it difficult for users, particularly those without domain knowledge, to comprehend AI model decisions and obtain meaningful explanations. We propose CGS-Mask, a post-hoc and model-agnostic cellular genetic strip mask-based saliency approach to address these challenges. CGS-Mask uses consecutive time steps as a cohesive entity to evaluate the impact of features on the final prediction, providing binary and sustained feature importance scores over time. Our algorithm optimizes the mask population iteratively to obtain the optimal mask in a reasonable time. We evaluated CGS-Mask on synthetic and real-world datasets, and it outperformed state-of-the-art methods in elucidating the importance of features over time. According to our pilot user study via a questionnaire survey, CGS-Mask is the most effective approach in presenting easily understandable time series prediction results, enabling users to comprehend the decision-making process of AI models with ease.

cg-mask, experiment, prediction, (17 more...)

2312.09513

Country:

Asia > China (0.04)
Asia > Middle East > Israel (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Health Care Providers & Services (0.46)
Materials > Chemicals (0.46)
Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Tran, Cong Dao, Hy, Truong Son

Graph Attention-based Deep Reinforcement Learning for solving the Chinese Postman Problem with Load-dependent costs

Recently, deep reinforcement learning (DRL) models have shown promising results in solving routing problems. However, most DRL solvers target node routing problems, such as the Traveling Salesman Problem (TSP). Meanwhile, there has been limited research on applying neural methods to arc routing problems, such as the Chinese Postman Problem (CPP), since they often feature irregular and complex solution spaces compared to TSP. To fill these gaps, this paper proposes a novel DRL framework to address the CPP with load-dependent costs (CPP-LC) (Corberan et al., 2018), which is a complex arc routing problem with load constraints. The novelty of our method is two-fold. First, we formulate the CPP-LC as a Markov Decision Process (MDP) sequential model. Second, we introduce an autoregressive model based on DRL, namely Arc-DRL, consisting of an encoder and decoder to address the CPP-LC challenge effectively. Such a framework allows the DRL model to work efficiently and scale to arc routing problems. Furthermore, we propose a new bio-inspired meta-heuristic solution based on Evolutionary Algorithm (EA) for CPP-LC. Extensive experiments show that Arc-DRL outperforms existing meta-heuristic methods such as Iterative Local Search (ILS) and Variable Neighborhood Search (VNS) proposed by Corberan et al. (2018) on large benchmark datasets for CPP-LC regarding both solution quality and running time, while the EA gives the best solution quality at the cost of much longer running time. We release our C++ implementations for metaheuristics such as EA, ILS and VNS along with the code for data generation and our generated data at https://github.com/HySonLab/Chinese_Postman_Problem

algorithm, cpp-lc, node, (10 more...)

2310.15516

Country:

North America > United States > Indiana > Vigo County > Terre Haute (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)

Genre: Research Report (0.50)

Industry: Transportation (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Martin, Carlos, Sandholm, Tuomas

Finding Nash equilibria by minimizing approximate exploitability with learned best responses

There has been substantial progress on finding game-theoretic equilibria. Most of that work has focused on games with finite, discrete action spaces. However, many games involving space, time, money, and other fine-grained quantities have continuous action spaces (or are best modeled as such). We study the problem of finding an approximate Nash equilibrium of games with continuous action sets. The standard measure of closeness to Nash equilibrium is exploitability, which measures how much players can benefit from unilaterally changing their strategy. We propose two new methods that minimize an approximation of the exploitability with respect to the strategy profile. The first method uses a learned best-response function, which takes the current strategy profile as input and returns candidate best responses for each player. The strategy profile and best-response functions are trained simultaneously, with the former trying to minimize exploitability while the latter tries to maximize it. The second method maintains an ensemble of candidate best responses for each player. In each iteration, the best-performing elements of each ensemble are used to update the current strategy profile. The strategy profile and best-response ensembles are simultaneously trained to minimize and maximize the approximate exploitability, respectively. We evaluate our methods on various continuous games, showing that they outperform prior methods.
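
As a minimal sketch of the exploitability measure mentioned in the abstract, the example below computes it exactly for a finite two-player matrix game. This is an assumption made for illustration: the paper addresses continuous action sets and approximates best responses with learned functions, whereas here the best responses are found by enumeration. The function names are hypothetical.

```python
def expected_payoff(payoff, x, y):
    # Expected payoff when the row player mixes with x and the column player with y.
    return sum(
        x[i] * payoff[i][j] * y[j] for i in range(len(x)) for j in range(len(y))
    )


def exploitability(A, B, x, y):
    # A[i][j] is the row player's payoff, B[i][j] the column player's payoff.
    # Best pure deviation for the row player against y.
    row_best = max(
        sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))
    )
    # Best pure deviation for the column player against x.
    col_best = max(
        sum(x[i] * B[i][j] for i in range(len(x))) for j in range(len(y))
    )
    # Sum over players of the gain from unilaterally deviating.
    row_gain = row_best - expected_payoff(A, x, y)
    col_gain = col_best - expected_payoff(B, x, y)
    return row_gain + col_gain


# Matching pennies: zero-sum, unique equilibrium at uniform mixing.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]

print(exploitability(A, B, [0.5, 0.5], [0.5, 0.5]))  # 0 at the equilibrium
print(exploitability(A, B, [1, 0], [1, 0]))          # positive away from it
```

Exploitability is zero exactly at a Nash equilibrium and positive otherwise, which is why it can serve as a minimization objective over strategy profiles.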

equilibria, equilibrium, nash equilibrium, (16 more...)

2301.0883

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Jersey (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Imitation of Life: A Search Engine for Biologically Inspired Design

Emuna, Hen, Borenstein, Nadav, Qian, Xin, Kang, Hyeonsu, Chan, Joel, Kittur, Aniket, Shahaf, Dafna

Biologically Inspired Design (BID), or Biomimicry, is a problem-solving methodology that applies analogies from nature to solve engineering challenges. For example, Speedo engineers designed swimsuits based on shark skin. Finding relevant biological solutions for real-world problems poses significant challenges, both due to the limited biological knowledge engineers and designers typically possess and to the limited BID resources. Existing BID datasets are hand-curated and small, and scaling them up requires costly human annotations. In this paper, we introduce BARcode (Biological Analogy Retriever), a search engine for automatically mining bio-inspirations from the web at scale. Using advances in natural language understanding and data programming, BARcode identifies potential inspirations for engineering challenges. Our experiments demonstrate that BARcode can retrieve inspirations that are valuable to engineers and designers tackling real-world problems, as well as recover famous historical BID examples. We release data and code; we view BARcode as a step towards addressing the challenges that have historically hindered the practical application of BID to engineering innovation.

bar code, query, right attr, (16 more...)

2312.12681

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.70)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.61)

Noordhoek, Kyle, Bartel, Christopher J.

Accelerating the prediction of inorganic surfaces with machine learning interatomic potentials

arXiv.org Artificial Intelligence, Dec-18-2023

The surface properties of solid-state materials often dictate their functionality, especially for applications where nanoscale effects become important. The relevant surface(s) and their properties are determined, in large part, by the material's synthesis or operating conditions. These conditions dictate thermodynamic driving forces and kinetic rates responsible for yielding the observed surface structure and morphology. Computational surface science methods have long been applied to connect thermochemical conditions to surface phase stability, particularly in the heterogeneous catalysis and thin film growth communities. This review provides a brief introduction to first-principles approaches to compute surface phase diagrams before introducing emerging data-driven approaches. The remainder of the review focuses on the application of machine learning, predominantly in the form of learned interatomic potentials, to study complex surfaces. As machine learning algorithms and large datasets on which to train them become more commonplace in materials science, computational methods are poised to become even more predictive and powerful for modeling the complexities of inorganic surfaces at the nanoscale.

facet, phy, surface energy, (17 more...)

2312.11708

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
Europe > United Kingdom > Wales (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Materials (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.93)

arXiv.org Artificial Intelligence, Dec-18-2023

Multi-Goal Optimal Route Planning Using the Cell Mapping Technique

Karagounis, Athanasios

This manuscript explores the complexities of multi-objective path planning, aiming to optimize routes against a backdrop of conflicting performance criteria. The study integrates the cell mapping approach as its foundational concept. A two-pronged search strategy is introduced. In the first phase, the cell mapping technique is used to build a comprehensive database covering all cells within the specified area. This database records the performance metrics for the most efficient routes from each cell to the designated target. The second phase analyzes this database to pinpoint the extent and count of all Pareto optimal routes from a selected starting cell to the target. This analysis contributes to solving the overarching multi-objective optimization challenge inherent in path planning. To validate this approach, case studies are included, and the results are benchmarked against the well-established multi-objective A* (MOA*) method. The study finds that while the cell mapping method achieves similar outcomes to the MOA* method for routes originating from a single point, it demonstrates superior computational benefits, particularly when the starting and ending points are in separate, non-overlapping areas.
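
The notion of a Pareto optimal route used in the second phase can be sketched as a dominance filter. The example assumes that every objective is minimized and uses made-up route names and cost pairs (for instance, length and risk). It does not implement the cell mapping technique or MOA*.

```python
def dominates(a, b):
    # a dominates b if it is no worse in every objective and strictly better in one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(routes):
    # Keep only the routes whose cost vector is not dominated by any other route.
    return {
        name: cost
        for name, cost in routes.items()
        if not any(dominates(other, cost) for other in routes.values())
    }


# Hypothetical routes with (length, risk) costs; both objectives are minimized.
routes = {
    "A": (10, 3),
    "B": (8, 7),
    "C": (12, 6),
    "D": (9, 5),
}

front = pareto_front(routes)
print(sorted(front))  # route C is dominated by route A and drops out
```

The filter compares every pair of routes, which is adequate for a small candidate set; a precomputed database like the one described above would avoid recomputing route costs for each query.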

algorithm, optimal path, path planning, (14 more...)

2312.11025

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Greece (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.96)

Dyoub, Abeer, Letteri, Ivan

Dataset Optimization for Chronic Disease Prediction with Bio-Inspired Feature Selection

arXiv.org Artificial Intelligence, Dec-17-2023

In this study, we investigated the application of bio-inspired optimization algorithms, including Genetic Algorithm, Particle Swarm Optimization, and Whale Optimization Algorithm, for feature selection in chronic disease prediction. The primary goal was to enhance the predictive accuracy of models, streamline data dimensionality, and make predictions more interpretable and actionable. The research encompassed a comparative analysis of the three bio-inspired feature selection approaches across diverse chronic diseases, including diabetes, cancer, kidney, and cardiovascular diseases. Performance metrics such as accuracy, precision, recall, and F1 score were used to assess the effectiveness of the algorithms in reducing the number of features needed for accurate classification. The results in general demonstrate that the bio-inspired optimization algorithms are effective in reducing the number of features required for accurate classification. However, the performance of the algorithms varied across datasets. The study highlights the importance of data pre-processing and cleaning in ensuring the reliability and effectiveness of the analysis. This study contributes to the advancement of predictive analytics in the realm of chronic diseases. The potential impact of this work extends to early intervention, precision medicine, and improved patient outcomes, providing new avenues for the delivery of healthcare services tailored to individual needs. The findings underscore the potential benefits of using bio-inspired optimization algorithms for feature selection in chronic disease prediction, offering valuable insights for improving healthcare outcomes.
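
The four performance metrics named in the abstract have standard definitions for binary classification, sketched below. The labels and predictions are invented for illustration and are not results from the study; the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    # Count the four confusion-matrix cells for labels in {0, 1}.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a feature selection comparison, these scores would be computed for a classifier trained on each candidate feature subset, so that a smaller subset can be checked against the full feature set.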

accuracy, algorithm, dataset, (12 more...)

2401.0538

Country:

North America > United States > Wisconsin (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)