Industry
Networks and Natural Language Processing
Radev, Dragomir R. (University of Michigan) | Mihalcea, Rada (University of North Texas)
Over the last few years, a number of areas of natural language processing have begun applying graph-based techniques. These include, among others, text summarization, syntactic parsing, word-sense disambiguation, ontology construction, sentiment and subjectivity analysis, and text clustering. In this paper, we present some of the most successful graph-based representations and algorithms used in language processing and try to explain how and why they work.
Introduction to the Special Issue on AI and Networks
Jardins, Marie des (University of Maryland) | Gaston, Matthew E. (Viz) | Radev, Dragomir R. (University of Michigan)
This introduction to AI Magazine's Special Issueon Networks and AI summarizes the seven articles in thespecial issue by characterizing the nature of thenetworks that are the focus of each of the papers.A short tutorial on graph theory and network structuresis included for those less familiar with the topic.
Normalized Information Distance
Vitanyi, Paul M. B., Balbach, Frank J., Cilibrasi, Rudi L., Li, Ming
The normalized information distance is a universal distance measure for objects of all kinds. It is based on Kolmogorov complexity and thus uncomputable, but there are ways to utilize it. First, compression algorithms can be used to approximate the Kolmogorov complexity if the objects have a string representation. Second, for names and abstract concepts, page count statistics from the World Wide Web can be used. These practical realizations of the normalized information distance can then be applied to machine learning tasks, expecially clustering, to perform feature-free and parameter-free data mining. This chapter discusses the theoretical foundations of the normalized information distance and both practical realizations. It presents numerous examples of successful real-world applications based on these distance measures, ranging from bioinformatics to music clustering to machine translation.
Randomized Distributed Configuration Management of Wireless Networks: Multi-layer Markov Random Fields and Near-Optimality
Distributed configuration management is imperative for wireless infrastructureless networks where each node adjusts locally its physical and logical configuration through information exchange with neighbors. Two issues remain open. The first is the optimality. The second is the complexity. We study these issues through modeling, analysis, and randomized distributed algorithms. Modeling defines the optimality. We first derive a global probabilistic model for a network configuration which characterizes jointly the statistical spatial dependence of a physical- and a logical-configuration. We then show that a local model which approximates the global model is a two-layer Markov Random Field or a random bond model. The complexity of the local model is the communication range among nodes. The local model is near-optimal when the approximation error to the global model is within a given error bound. We analyze the trade-off between an approximation error and complexity, and derive sufficient conditions on the near-optimality of the local model. We validate the model, the analysis and the randomized distributed algorithms also through simulation.
ECOLANG - Communications Language for Ecological Simulations Network
This document ("ECOLANG_v_1_3c_Eng.doc") describes the communication language used in one multi-agent systems environment for ecological simulations, based on the EcoDynamo simulator application (Pereira and Duarte 2005) linked with several intelligent agents and visualisation applications and extends the initial definition of the language (Pereira et al. 2005). The agents' actions and perceptions are translated into messages exchanged with the simulator application and other agents. The concepts' definitions used follow the BNF notation (Backus et al. 1960) and it's inspired in the Coach Unilang language (Reis and Lau 2002). ECOLANG notation is an extension to the original BNF formalism adding the following meta-symbols: { } used for repetitive items (one or more times); [ ] encloses types of values; Terminal symbols use bold face letters.
Tableau-based decision procedures for logics of strategic ability in multi-agent systems
Goranko, Valentin, Shkatov, Dmitry
Multiagent systems ([10], [31], [33], [26]) are an increasingly important and active area of interdisciplinary research on the border of computer science, artificial intelligence, and game theory, as they model a wide variety of phenomena in these fields, including open and interactive systems, distributed computations, security protocols, knowledge and information exchange, coalitional abilities in games, etc. Not surprisingly, a number of logical formalisms have been proposed for specification, verification, and reasoning about multiagent systems.
Variable Neighborhood Search for the University Lecturer-Student Assignment Problem
Geiger, Martin Josef, Wenger, Wolf
The paper presents a study of local search heuristics in general and variable neighborhood search in particular for the resolution of an assignment problem studied in the practical work of universities. Here, students have to be assigned to scientific topics which are proposed and supported by members of staff. The problem involves the optimization under given preferences of students which may be expressed when applying for certain topics. It is possible to observe that variable neighborhood search leads to superior results for the tested problem instances. One instance is taken from an actual case, while others have been generated based on the real world data to support the analysis with a deeper analysis. An extension of the problem has been formulated by integrating a second objective function that simultaneously balances the workload of the members of staff while maximizing utility of the students. The algorithmic approach has been prototypically implemented in a computer system. One important aspect in this context is the application of the research work to problems of other scientific institutions, and therefore the provision of decision support functionalities.
A framework for the interactive resolution of multi-objective vehicle routing problems
Geiger, Martin Josef, Wenger, Wolf
The article presents a framework for the resolution of rich vehicle routing problems which are difficult to address with standard optimization techniques. We use local search on the basis on variable neighborhood search for the construction of the solutions, but embed the techniques in a flexible framework that allows the consideration of complex side constraints of the problem such as time windows, multiple depots, heterogeneous fleets, and, in particular, multiple optimization criteria. In order to identify a compromise alternative that meets the requirements of the decision maker, an interactive procedure is integrated in the resolution of the problem, allowing the modification of the preference information articulated by the decision maker. The framework is prototypically implemented in a computer system. First results of test runs on multiple depot vehicle routing problems with time windows are reported.
A Computational Study of Genetic Crossover Operators for Multi-Objective Vehicle Routing Problem with Soft Time Windows
The article describes an investigation of the effectiveness of genetic algorithms for multi-objective combinatorial optimization (MOCO) by presenting an application for the vehicle routing problem with soft time windows. The work is motivated by the question, if and how the problem structure influences the effectiveness of different configurations of the genetic algorithm. Computational results are presented for different classes of vehicle routing problems, varying in their coverage with time windows, time window size, distribution and number of customers. The results are compared with a simple, but effective local search approach for multi-objective combinatorial optimization problems.
From Data to the p-Adic or Ultrametric Model
We model anomaly and change in data by embedding the data in an ultrametric space. Taking our initial data as cross-tabulation counts (or other input data formats), Correspondence Analysis allows us to endow the information space with a Euclidean metric. We then model anomaly or change by an induced ultrametric. The induced ultrametric that we are particularly interested in takes a sequential - e.g. temporal - ordering of the data into account. We apply this work to the flow of narrative expressed in the film script of the Casablanca movie; and to the evolution between 1988 and 2004 of the Colombian social conflict and violence.