 variable


SyMANTIC: An Efficient Symbolic Regression Method for Interpretable and Parsimonious Model Discovery in Science and Beyond

arXiv.org Artificial Intelligence

Symbolic regression (SR) is an emerging branch of machine learning focused on discovering simple and interpretable mathematical expressions from data. Although a wide variety of SR methods have been developed, they often face challenges such as high computational cost, poor scalability with respect to the number of input dimensions, fragility to noise, and an inability to balance accuracy and complexity. This work introduces SyMANTIC, a novel SR algorithm that addresses these challenges. SyMANTIC efficiently identifies (potentially several) low-dimensional descriptors from a large set of candidates (from $\sim 10^5$ to $\sim 10^{10}$ or more) through a unique combination of mutual information-based feature selection, adaptive feature expansion, and recursively applied $\ell_0$-based sparse regression. In addition, it employs an information-theoretic measure to produce an approximate set of Pareto-optimal equations, each offering the best-found accuracy for a given complexity. Furthermore, our open-source implementation of SyMANTIC, built on the PyTorch ecosystem, facilitates easy installation and GPU acceleration. We demonstrate the effectiveness of SyMANTIC across a range of problems, including synthetic examples, scientific benchmarks, real-world material property predictions, and chaotic dynamical system identification from small datasets. Extensive comparisons show that SyMANTIC uncovers similar or more accurate models at a fraction of the cost of existing SR methods.


Reviews: Learning Treewidth-Bounded Bayesian Networks with Thousands of Variables

Neural Information Processing Systems

The proposed method is very similar to previous work by Nie et al. -- both use k-trees to search for low-treewidth Bayesian networks, both start with a randomly chosen initial clique, and both propose using an A* method for finding the best tree. The differences are that Nie et al. score k-trees using a mutual information score and use BDeu for choosing the final consistent Bayesian network, while this paper proposes using BIC and incrementally building the Bayesian network along with the k-tree, using the BN to score the k-tree. This paper also includes the additional restriction that the complete variable (partial) order is chosen randomly, while in Nie et al. The main justification for these differences is the ability to scale to large treewidths. However, in the experiments, the previous S2 algorithm also can scale to large treewidths.


Applying Convolutional Neural Networks to Data on Unstructured Meshes with Space-Filling Curves

arXiv.org Artificial Intelligence

This paper presents the first classical Convolutional Neural Network (CNN) that can be applied directly to data from unstructured finite element meshes or control volume grids. CNNs have been hugely influential in the areas of image classification and image compression, both of which typically deal with data on structured grids. Unstructured meshes are frequently used to solve partial differential equations and are particularly suitable for problems that require the mesh to conform to complex geometries or for problems that require variable mesh resolution. Central to the approach are space-filling curves, which traverse the nodes or cells of a mesh tracing out a path that is as short as possible (in terms of numbers of edges) and that visits each node or cell exactly once. The space-filling curves (SFCs) are used to find an ordering of the nodes or cells that can transform multi-dimensional solutions on unstructured meshes into a one-dimensional (1D) representation, to which 1D convolutional layers can then be applied. Although developed in two dimensions, the approach is applicable to higher dimensional problems. To demonstrate the approach, the network we choose is a convolutional autoencoder (CAE) although other types of CNN could be used. The approach is tested by applying CAEs to data sets that have been reordered with an SFC. Sparse layers are used at the input and output of the autoencoder, and the use of multiple SFCs is explored. We compare the accuracy of the SFC-based CAE with that of a classical CAE applied to two idealised problems on structured meshes, and then apply the approach to solutions of flow past a cylinder obtained using the finite-element method and an unstructured mesh.
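The reordering idea in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a Hilbert curve on a small structured grid (the paper targets unstructured meshes, for which the curve construction is different), and the helper names `hilbert_d2xy`, `sfc_order`, and `conv1d_same` are hypothetical, not taken from the paper's code. It shows a 2D field being flattened along a curve that visits every cell once, after which an ordinary 1D filter can be applied.

```python
def hilbert_d2xy(n, d):
    """Map position d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two. This is the standard iterative construction.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate/flip the current quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sfc_order(n):
    """Return the grid cells in the order the curve visits them."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


def conv1d_same(signal, kernel):
    """Zero-padded 1D filter with an odd-length kernel; output has the input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


n = 4
order = sfc_order(n)
# Hypothetical 2D field stored as field[y][x].
field = [[float(x + y) for x in range(n)] for y in range(n)]
# Flatten the field along the curve, then apply a 1D moving-average filter.
flat = [field[y][x] for x, y in order]
smoothed = conv1d_same(flat, [1 / 3, 1 / 3, 1 / 3])
```

Because consecutive cells along the curve are neighbours on the grid, a 1D kernel sliding over `flat` still combines spatially nearby values, which is the property the approach relies on.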


Thinking Backward for Knowledge Acquisition

AI Magazine

This article examines the direction in which knowledge bases are constructed for diagnosis and decision making. When building an expert system, it is traditional to elicit knowledge from an expert in the direction in which the knowledge is to be applied, namely, from observable evidence toward unobservable hypotheses. However, experts usually find it simpler to reason in the opposite direction, from hypotheses to observable evidence, because this direction reflects causal relationships. Therefore, we argue that a knowledge base be constructed following the expert's natural reasoning direction, and then reversed for use. This choice of representation direction facilitates knowledge acquisition in deterministic domains and is essential when a problem involves uncertainty. We illustrate this concept with influence diagrams, a methodology for graphically representing a joint probability distribution. Influence diagrams provide a practical means by which an expert can characterize the qualitative and quantitative relationships among evidence and hypotheses in the appropriate direction. Once constructed, the relationships can easily be reversed into the less intuitive direction in order to perform inference and diagnosis. In this way, knowledge acquisition is made cognitively simple; the machine carries the burden of translating the representation. "OK," we replied, "If the tiger were present, what is the probability that you would see that image? On the other hand, if the tiger were not present, what is the probability you would see it?" Before we could say "what is the probability there is a tiger in the first place?" Since then, we have pondered this question. Why is it that we want to look at problems of evidential reasoning backward?
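The reversal described here, from the direction an expert assesses (evidence given hypothesis) to the direction used for diagnosis (hypothesis given evidence), is an application of Bayes' theorem. The sketch below is a minimal illustration with made-up numbers for the tiger example; the function name `posterior` and all probabilities are assumptions for illustration, not values from the article.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) from P(H), P(E | H), and P(E | not H) via Bayes' theorem."""
    evidence = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    return p_e_given_h * prior / evidence


# Hypothetical assessments, elicited in the causal direction:
#   P(tiger) = 0.01, P(image | tiger) = 0.9, P(image | no tiger) = 0.05
p_tiger_given_image = posterior(prior=0.01, p_e_given_h=0.9, p_e_given_not_h=0.05)
```

The expert supplies only the three causal-direction quantities; the diagnostic quantity is computed mechanically.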


A Prototype Expert System

AI Magazine

During the past year, a prototype expert system for tactical data fusion has been under development. This computer program combines various messages concerning electronic intelligence (ELINT) to aid in decision making concerning enemy actions and intentions. The prototype system is written in Prolog, a language that has proved to be very powerful and easy to use for problem/rule development. The resulting prototype system (called EXPRS - Expert PRolog System) uses English-like rule constructs of Prolog code. This approach enables the system to generate answers automatically to "why" a rule fired, and "how" that rule fired. In addition, a rule clause construct is provided which allows direct access to Prolog code routines. This paper describes the structure of the rules used and provides typical user interactions. In the modern military environment, multiple sensor inputs need to be interpreted in a timely manner to assess developing battlefield conditions. The high volume of data from such sensor systems, as well as their high rate of data transfer, makes this timely interpretation difficult and very demanding of human resources. THE AI MAGAZINE Summer 1984 37 is inherently probabilistic as well as time varying and nonmonotonic. The fusion process can also require numerical analysis to be done on the raw sensor data. This "number crunching" analysis is best done (and is currently being done) with languages such as This form of representation is very general, offering good future growth potential for the system.


Distributed Problem Solving

AI Magazine

In this article, we illustrate the motivations for distributed problem solving and provide an overview of two distributed problem-solving models, namely distributed constraint-satisfaction problems (DCSPs) and distributed constraint-optimization problems (DCOPs), and some of their algorithms. These agents are often assumed to be cooperative, that is, they are part of a team or they are self-interested but incentives or disincentives have been applied such that the individual agent rewards are aligned with the team reward. We illustrate the motivations for distributed problem solving with an example. Imagine a decentralized channel-allocation problem in a wireless local area network (WLAN), where each access point (agent) in the WLAN needs to allocate itself a channel to broadcast such that no two access points with overlapping broadcast regions (neighboring agents) are allocated the same channel to avoid interference. Figure 1 shows example mobile WLAN access points, where each access point is a Create robot fitted with a wireless CenGen radio card. Figure 2a shows an illustration of such a problem with three access points in a WLAN, where each oval ring represents the broadcast region of an access point. This problem can, in principle, be solved with a centralized approach by having each and every agent transmit all the relevant information, that is, the set of possible channels that the agent can allocate itself and its set of neighboring agents, to a centralized server.
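To make the channel-allocation example concrete, the sketch below solves a small instance with the centralized approach mentioned at the end of the passage, using plain backtracking search. It is not one of the distributed DCSP or DCOP algorithms the article surveys, and the topology, channel sets, and the name `allocate_channels` are hypothetical.

```python
def allocate_channels(neighbors, channels):
    """Assign one channel per access point so that no two neighbors share a channel.

    neighbors: dict mapping each access point to the set of access points
               whose broadcast regions overlap with it.
    channels:  dict mapping each access point to its list of allowed channels.
    Returns a dict access point -> channel, or None if no allocation exists.
    """
    agents = sorted(neighbors)
    assignment = {}

    def consistent(agent, channel):
        # Unassigned neighbors yield None, which never equals a channel.
        return all(assignment.get(nb) != channel for nb in neighbors[agent])

    def backtrack(i):
        if i == len(agents):
            return True
        agent = agents[i]
        for ch in channels[agent]:
            if consistent(agent, ch):
                assignment[agent] = ch
                if backtrack(i + 1):
                    return True
                del assignment[agent]
        return False

    return dict(assignment) if backtrack(0) else None


# Hypothetical instance: AP2 overlaps with both AP1 and AP3.
neighbors = {"AP1": {"AP2"}, "AP2": {"AP1", "AP3"}, "AP3": {"AP2"}}
channels = {ap: [1, 2] for ap in neighbors}
allocation = allocate_channels(neighbors, channels)
```

In a distributed setting each access point would hold only its own row of `neighbors` and `channels` and would reach the same kind of allocation by exchanging messages with its neighbors instead of reporting to a server.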


Design Prototypes: A Knowledge Representation Schema for Design

AI Magazine

Although there are designers who claim design is a mysterious activity not amenable to scientific examination, research into design continues. Although there are publications by designers on how to design dating back to Roman times, notably by Vitruvius, the nineteenth-century design thinkers actually began work on articulating design as a process (Durand 1802). However, it was not until the 1960s that major research programs were initiated. These programs were originally founded on the systems view and used concepts from operations research (Jones and Thornley 1963). More recently, information-processing models founded on AI concepts have provided an impetus for renewed research into design in its various aspects (Simon 1969; Coyne et al. 1990). Many foundational ideas in AI are proving to be useful in developing formal models of design as an activity.


Algorithms for Constraint-Satisfaction Problems: A Survey

AI Magazine

A large number of problems in AI and other areas of computer science can be viewed as special cases of the constraint-satisfaction problem. Some examples are machine vision, belief maintenance, scheduling, temporal reasoning, graph problems, floor plan design, the planning of genetic experiments, and the satisfiability problem. A number of different approaches have been developed for solving these problems. Some of them use constraint propagation to simplify the original problem. Others use backtracking to directly search for possible solutions.
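As an illustration of the constraint-propagation family mentioned above, the following sketch implements AC-3 style arc consistency for binary constraints. It is one well-known propagation scheme chosen for the example, not necessarily the formulation the survey presents; the data layout and the name `ac3` are assumptions.

```python
from collections import deque


def ac3(domains, constraints):
    """Prune values that cannot take part in any solution of a binary CSP.

    domains:     dict variable -> iterable of candidate values.
    constraints: dict (x, y) -> predicate(value_x, value_y); each constraint
                 is listed once per direction.
    Returns the pruned domains, or None if some domain becomes empty.
    """
    domains = {v: set(vals) for v, vals in domains.items()}
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        allowed = constraints[(x, y)]
        # Values of x with no supporting value left in the domain of y.
        removed = {vx for vx in domains[x]
                   if not any(allowed(vx, vy) for vy in domains[y])}
        if removed:
            domains[x] -= removed
            if not domains[x]:
                return None
            # The domain of x shrank, so arcs pointing at x must be rechecked.
            for (z, w) in constraints:
                if w == x and z != y:
                    queue.append((z, w))
    return domains


# Hypothetical problem: A < B < C with every variable in {1, 2, 3}.
domains = {"A": [1, 2, 3], "B": [1, 2, 3], "C": [1, 2, 3]}
constraints = {
    ("A", "B"): lambda a, b: a < b,
    ("B", "A"): lambda b, a: a < b,
    ("B", "C"): lambda b, c: b < c,
    ("C", "B"): lambda c, b: b < c,
}
pruned = ac3(domains, constraints)
```

Here propagation alone reduces every domain to a single value. In general it only simplifies the problem, and a backtracking search is then run on the reduced domains.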


A Constraint-Based Dental School Timetabling System

AI Magazine

This system has been deployed since 2010. Dental school timetabling differs from other university course scheduling in that certain clinic sessions can be used by multiple courses at the same time, provided a limit on room capacity is satisfied. Starting from a constraint-programming solution using a web interface, we have moved to a mixed integer programming-based solver to deal with multiple objective functions, along with a dedicated Java application, which provides a rich user interface. Solutions for the years 2010, 2011, and 2012 have been used in the dental school, replacing a manual timetabling process, which could no longer cope with increasing student numbers and resulting resource bottlenecks. The use of the automated system allowed the dental school to increase the number of students enrolled to the maximum possible given the available resources.


@machinelearnbot

This powerful quote by William Shakespeare applies well to techniques used in data science and analytics. Allow me to prove it using a short story. In May 2015, we conducted a Data Hackathon (a data science competition) in Delhi-NCR, India. We gave participants a challenge based on the Human Activity Recognition Using Smartphones Data Set. The data set had 561 variables for training the model used to identify human activity in the test data set.