
Collaborating Authors

 University of Groningen


A Set of Recommendations for Assessing Human–Machine Parity in Language Translation

Journal of Artificial Intelligence Research

The quality of machine translation has increased remarkably over the past years, to the degree that it was found to be indistinguishable from professional human translation in a number of empirical investigations. We reassess Hassan et al.'s 2018 investigation into Chinese to English news translation, showing that the finding of human–machine parity was owed to weaknesses in the evaluation design—which is currently considered best practice in the field. We show that the professional human translations contained significantly fewer errors, and that perceived quality in human evaluation depends on the choice of raters, the availability of linguistic context, and the creation of reference translations. Our results call for revisiting current best practices to assess strong machine translation systems in general and human–machine parity in particular, for which we offer a set of recommendations based on our empirical findings.


Cognitive Architectures: Innate or Learned?

AAAI Conferences

Cognitive architectures are generally considered to be theories of the innate capabilities of the (human) cognitive system. Any knowledge that is not innate is encoded in the architecture's memory systems, either by the modeler or learned by the architecture itself. However, in human intelligent behavior few things are innate. An alternative is to acknowledge that learning occurs at different levels of abstraction. A standard model of the mind should therefore span multiple levels of abstraction, encouraging research efforts to establish learning mechanisms that connect them.


HowNutsAreTheDutch: Personalized Feedback on a National Scale

AAAI Conferences

A paradigm shift is taking place in the field of mental healthcare and patient wellbeing. Traditionally, attempts at sustaining and enhancing wellbeing were mainly based on comparing the individual with the population average. Recently, attention has shifted towards a more personal, idiographic approach. Such a shift calls for new solutions to gather data about individuals, create personalized models of wellbeing, and translate these into personalized advice. Idiographic research can be conducted on a large scale by letting people measure themselves. Repeated collection of data, for example by means of questionnaires, provides individuals with feedback on and insight into their wellbeing. A way to partially automate this feedback process is to create software that statistically analyzes repeated questionnaire data, using a method known as vector autoregression, to determine cause-effect relationships between the measured features. In this paper we describe a means to facilitate these repeated measurements and to partially automate the feedback process. The paper provides an overview and technical description of such automated analysis software, named Autovar, and its use in an online self-measurement platform.
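
As a rough illustration of what vector autoregression does with repeated questionnaire data, the sketch below fits a first-order model, x_t = c + A x_(t-1), by ordinary least squares. It is not Autovar's actual procedure: the function names `solve_linear` and `fit_var1`, the single lag, and the toy two-variable data are all assumptions made for this example. The fitted coefficients describe lagged associations only; any cause-effect reading would need further testing.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_var1(series):
    """Fit a VAR(1) model by least squares (hypothetical helper, one lag only).

    series: list of observations, each a list of k floats, ordered in time.
    Returns (intercepts, coef), where coef[i][j] is the estimated effect of
    variable j at time t-1 on variable i at time t.
    """
    k = len(series[0])
    design = [[1.0] + list(prev) for prev in series[:-1]]  # rows: [1, x_(t-1)]
    targets = series[1:]                                   # rows: x_t
    p = k + 1
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    intercepts, coef = [], []
    for var in range(k):
        xty = [sum(row[i] * y[var] for row, y in zip(design, targets))
               for i in range(p)]
        beta = solve_linear(xtx, xty)  # normal equations, one per variable
        intercepts.append(beta[0])
        coef.append(beta[1:])
    return intercepts, coef


# Toy usage: two made-up daily scores generated without noise from known values.
true_c = [1.0, 2.0]
true_a = [[0.5, 0.2], [0.0, 0.3]]
data = [[0.0, 0.0]]
for _ in range(9):
    prev = data[-1]
    data.append([true_c[i] + sum(true_a[i][j] * prev[j] for j in range(2))
                 for i in range(2)])
est_c, est_a = fit_var1(data)
```

With real questionnaire data the observations are noisy, so the estimates would only approximate any underlying relationship, and the number of lags would have to be chosen rather than fixed at one.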



ICAIL 2013: The Fourteenth International Conference on Artificial Intelligence and Law

AI Magazine

The 14th International Conference on AI and Law (ICAIL 2013) was held in Rome, Italy, June 10-14, 2013.


ICAIL 2013: The Fourteenth International Conference on Artificial Intelligence and Law

AI Magazine

To emphasize the importance of implemented systems for the field, we also called for system demonstrations; seven were accepted for the conference, one of them associated with a research abstract and six of them described in a demonstration extended abstract. At this edition of ICAIL, the Donald H. Berman best student paper award was won by Tran Thi Oanh (Japan Advanced Institute of Science and Technology; JAIST) for the paper entitled "Reference Resolution in Legal Texts" that she wrote with Minh Le Nguyen and Akira Shimazu. Traditionally, ICAIL hosts a lively and varied program of tutorials and workshops. At this conference, there were tutorials covering an introduction to artificial intelligence and law, web ontology and data design, LegalRuleML, and textual information extraction. There were workshops on argumentation, coherence, open and smart data, evidence, e-discovery, e-justice, and network analysis. Also, the international workshop series Computational Models of Natural Argument joined ICAIL for its 13th edition (CMNA XIII). The conference was held under the auspices of the Senate of the Italian Republic and was hosted by the Consiglio Nazionale delle Ricerche (National Research Council of Italy), central unit in Rome. The conference was held in cooperation with both AAAI and ACM SIGART. Conference officials were Bart Verheij (program chair), Enrico Francesconi (conference chair), and Anne Gardner (secretary/treasurer).


The Gap between Architecture and Model: Strategies for Executive Control

AAAI Conferences

One major limitation of current cognitive architectures is that models are typically constructed in an "empty" architecture, and that the knowledge specifications (typically production rules) are specific to the particular task. As a result, general executive control strategies have to be implemented separately for each model, which leads to a lack of consistency and constraint. Alternatively, they are implemented as part of the architecture itself, which is often implausible, because strategies are learned and differ among individuals. The alternative proposed here is to assume that executive control consists of strategies that can transfer from one task to another. The PRIMs theory (Taatgen 2013) provides a modeling framework for this transfer. The approach is discussed using the example of working memory control.


Benchmarking Intelligent Service Robots through Scientific Competitions: The RoboCup@Home Approach

AAAI Conferences

The dynamic and uncertain environments of domestic service robots, which include humans, require rethinking the benchmarking principles for testing these robots. In RoboCup@Home, statistical procedures have been used to track and steer the progress of domestic service robots since 2006. This paper explains the procedures and shows outcomes of these international benchmarking efforts. Although aspects such as shopping in a supermarket receive a fair amount of attention in the robotics community, the authors think that a recently started test is the most important outcome of RoboCup@Home, namely the benchmarking of robot cognition.


Continual Planning with Sensing for Web Service Composition

AAAI Conferences

Web Service (WS) domains constitute an application field where automated planning can significantly contribute towards achieving customisable and adaptable compositions. Following the vision of using domain-independent planning and declarative complex goals to generate compositions based on atomic service descriptions, we apply a planning framework based on Constraint Satisfaction techniques to a domain consisting of WSs with diverse functionalities. Key requirements of such domains are the ability to address the incomplete knowledge problem and to recover from failures that may occur during execution. We propose an algorithm for interleaving planning, monitoring and execution, in which continual planning is performed by altering the CSP in light of the feedback acquired at runtime. The system is evaluated against a number of scenarios including real WSs, demonstrating the range of situations that can be tackled effectively compared with previous approaches.
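
The interleaving of planning, execution and monitoring can be pictured with a small loop like the one below. This is only a sketch under strong assumptions: a breadth-first search stands in for the CSP-based planner described in the abstract, services are reduced to (preconditions, effects) pairs over sets of facts, and the names `plan`, `run_continual`, and the booking services are invented for illustration.

```python
from collections import deque


def plan(state, goal, actions):
    """Breadth-first search for a sequence of actions that reaches the goal.

    Stand-in for the CSP-based planner; actions maps a name to a
    (preconditions, effects) pair of sets of facts.
    """
    start = frozenset(state)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        facts, steps = frontier.popleft()
        if goal <= facts:
            return steps
        for name, (pre, add) in actions.items():
            if pre <= facts:
                nxt = frozenset(facts | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None  # no plan exists with the currently known actions


def run_continual(state, goal, actions, execute):
    """Plan, execute step by step, monitor each outcome, and replan on failure."""
    state = set(state)
    actions = dict(actions)
    log = []
    while not goal <= state:
        steps = plan(state, goal, actions)
        if steps is None:
            return False, log
        for name in steps:
            ok = execute(name)          # monitoring: observe the real outcome
            log.append((name, ok))
            if not ok:
                del actions[name]       # feedback: stop relying on the failed service
                break                   # abandon the rest of the plan and replan
            state |= actions[name][1]   # apply the effects of a successful step
    return True, log


# Toy usage: the direct booking service fails at runtime, so a detour is planned.
services = {
    "book_direct": ({"request"}, {"booked"}),
    "get_quote": ({"request"}, {"quote"}),
    "confirm_quote": ({"quote"}, {"booked"}),
}
succeeded, trace = run_continual(
    {"request"}, {"booked"}, services,
    execute=lambda name: name != "book_direct",
)
```

Dropping a failed service outright is the crudest possible use of runtime feedback; the approach in the abstract instead alters the constraint problem, which allows finer-grained updates than this sketch shows.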


Modeling Deliberation in Teamwork

AAAI Conferences

Cooperation in multiagent systems essentially hinges on appropriate communication. This paper shows how to model communication in teamwork within TeamLog, the first multi-modal framework wholly capturing a methodology for working together. Taking off from the dialogue theory of Walton and Krabbe, the paper focuses on deliberation, the main type of dialogue during team planning. We provide a four-stage schema of deliberation dialogue along with semantics of adequate speech acts, filling the gap in logical modeling of communication during planning.