Toward the Integration of Exploration and Modeling in a Planning Framework

AAAI Conferences

For example, the abstract operation of fitting a robust line to a relationship may expand to partitioning the relationship, calculating medians, and combining the results. Selection of the most effective exploratory strategy is akin to selection of an appropriate plan to satisfy a given goal. We must often evaluate different paths to find the most effective one. Just as plans fail and require repair, an exploratory operation may require iteration for adequate results. Retrying an operation is analogous to retrying an action as a part of plan failure recovery.
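The expansion described here (partition the relationship, take medians, combine) can be sketched as a Tukey-style three-group resistant line. This is a minimal illustration of the kind of concrete operations the abstract operation expands to, not code from the paper; the function name and structure are illustrative.

```python
def resistant_line(xs, ys):
    """Fit a robust line by partitioning the points into thirds by x,
    taking medians of x and y in the outer thirds, and combining the
    two median points into a slope (Tukey-style resistant line)."""
    def median(vals):
        s = sorted(vals)
        m = len(s) // 2
        return s[m] if len(s) % 2 else (s[m - 1] + s[m]) / 2

    pts = sorted(zip(xs, ys))          # partition by x
    third = len(pts) // 3
    left, right = pts[:third], pts[len(pts) - third:]

    # medians within the outer partitions
    xl, yl = median([p[0] for p in left]), median([p[1] for p in left])
    xr, yr = median([p[0] for p in right]), median([p[1] for p in right])

    # combine: slope from the two median points, intercept from
    # the median residual over all points
    slope = (yr - yl) / (xr - xl)
    intercept = median([y - slope * x for x, y in pts])
    return slope, intercept
```

In practice the procedure is iterated on the residuals until the slope stabilizes, which is exactly the retry-on-inadequate-results behavior the passage compares to plan-failure recovery.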

Intelligent Data Analysis

AAAI Conferences

Experimental Knowledge Systems Laboratory, Department of Computer Science, University of Massachusetts, Amherst, MA

Abstract

We believe the problem of automating the process of building models from empirical data is a critical issue for both Artificial Intelligence and other scientific computing researchers. Although both fields require models of the behavior of complex systems, as AI researchers we may more directly address our particular needs. AI researchers require models that let us determine the influence of design decisions and environmental factors on the performance of AI programs, so as to inform the design of the next generation of intelligent agents. Our research includes the complementary projects of building a blackboard-based automated model-building assistant and analyzing the efficacy of heuristics used in function-finding programs.

Introduction

As Artificial Intelligence researchers, our basic objective is to increase our knowledge of how and why intelligent agents work the way they do.

Application of the TETRAD II Program to the Study of Student Retention in U.S. Colleges

AAAI Conferences

Our analysis focuses on possible causes of low freshman retention in U.S. colleges. TETRAD II finds a set of causal structures that are compatible with the data. One apparently robust finding is that student retention is directly related to the average test scores and high school class standing of the incoming freshmen. When test scores and class standing are controlled for, factors such as student-faculty ratio, faculty salary, and the university's educational expenses per student are all independent of both retention and graduation rates, and therefore do not seem to directly influence student retention. Furthermore, simple linear regression applied to test scores, class standing, and retention data showed that test scores and class standing explain 52.6% of the variance in freshman retention rate and 62.5% of the variance in graduation rate (test scores alone explain 50.5% and 62.0%, respectively). This result becomes even stronger when computed for the group of top-ranking colleges: regression applied to a group of 41 top-ranking colleges showed explanation of 68.3% of the variance in freshman retention rate and 77.0% in graduation rate (66.6% and 75.0%, respectively, for test scores alone). As test scores and class standing are both indicators of the overall quality of the incoming students, we predict that one of the most effective ways of improving student retention at an individual college is increasing the college's selectivity. High selectivity will lead to higher quality of the incoming students and, effectively, to a higher retention rate.
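The variance-explained figures above are R² values from ordinary least squares; a minimal sketch of that computation (the data and variable names here are illustrative, not the study's) is:

```python
import numpy as np

def variance_explained(X, y):
    """R^2 of ordinary least squares of y on the columns of X
    (with an intercept term): the fraction of variance in y
    explained by the predictors."""
    A = np.column_stack([np.ones(len(y)), X])      # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # OLS fit
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()

# Illustrative data: a "retention rate" that is exactly linear in
# two predictors (stand-ins for test scores and class standing).
scores = np.array([50.0, 55.0, 60.0, 65.0, 70.0, 75.0])
standing = np.array([30.0, 45.0, 35.0, 50.0, 55.0, 60.0])
retention = 0.6 * scores + 0.3 * standing + 10.0
r2 = variance_explained(np.column_stack([scores, standing]), retention)
```

On a perfectly linear relationship like this synthetic one, R² is 1; on the study's real data it was 0.526 for retention with both predictors.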

Improving Model Robustness Using Causal Knowledge

Machine Learning

For decades, researchers in fields such as the natural and social sciences have been verifying causal relationships and investigating hypotheses that are now well established or understood as truth. These causal mechanisms are properties of the natural world, and thus are invariant conditions regardless of the collection domain or environment. We show in this paper how prior knowledge in the form of a causal graph can be used to guide model selection, i.e., to identify, from a set of trained networks, the models that are the most robust and invariant to unseen domains. Our method incorporates prior knowledge (which may be incomplete) as a Structural Causal Model (SCM) and calculates a score based on the likelihood of the SCM given the target predictions of a candidate model and the provided input variables. We show on both publicly available and synthetic datasets that our method identifies models that are more robust in terms of generalizability to unseen out-of-distribution test examples and to domains where covariates have shifted.
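The selection criterion can be illustrated with a deliberately simplified stand-in for the paper's likelihood score: assume a linear-Gaussian structural equation for the target, and score each candidate model by the Gaussian log-likelihood of its predictions under that equation given the target's causal parents. The linear-Gaussian form and all names below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def scm_score(X, y_hat, parent_idx):
    """Score a candidate model's predictions y_hat by how well they fit
    a linear-Gaussian structural equation whose inputs are the target's
    causal parents (columns parent_idx of X). Higher scores indicate
    predictions more consistent with the assumed SCM.
    Simplified stand-in for a likelihood-based SCM score."""
    P = np.column_stack([np.ones(len(y_hat)), X[:, parent_idx]])
    coef, *_ = np.linalg.lstsq(P, y_hat, rcond=None)   # fit structural eq.
    resid = y_hat - P @ coef
    sigma2 = max(resid.var(), 1e-12)                   # avoid log(0)
    # Gaussian log-likelihood of the residuals at the MLE variance
    return -0.5 * len(y_hat) * (np.log(2 * np.pi * sigma2) + 1)

def select_model(X, candidate_preds, parent_idx):
    """Pick the candidate whose predictions best agree with the SCM."""
    scores = [scm_score(X, y_hat, parent_idx) for y_hat in candidate_preds]
    return int(np.argmax(scores))
```

A model whose predictions depend on the causal parents scores higher than one leaning on a spurious covariate, which is the intended selection behavior: the causally grounded model should transfer better when the spurious covariate shifts.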