AITopics | Law

Law

Utilisation of Metadata Fields and Query Expansion in Cross-Lingual Search of User-Generated Internet Video

Khwileh, Ahmad, Ganguly, Debasis, Jones, Gareth J. F.

Journal of Artificial Intelligence Research, Jan-27-2016

Recent years have seen significant efforts in the area of Cross Language Information Retrieval (CLIR) for text retrieval. This work initially focused on formally published content, but more recently research has begun to concentrate on CLIR for informal social media content. However, despite the current expansion in online multimedia archives, there has been little work on CLIR for this content. While there has been some limited work on Cross-Language Video Retrieval (CLVR) for professional videos, such as documentaries or TV news broadcasts, there has, to date, been no significant investigation of CLVR for the rapidly growing archives of informal user-generated content (UGC). Key differences between such UGC and professionally produced content are the nature and structure of the textual UGC metadata associated with it, as well as the form and quality of the content itself. In this setting, retrieval effectiveness may suffer not only from translation errors common to all CLIR tasks, but also from recognition errors associated with the automatic speech recognition (ASR) systems used to transcribe the spoken content of the video, and from the informality and inconsistency of the associated user-created metadata for each video. This work proposes and evaluates techniques to improve CLIR effectiveness for such noisy UGC. Our experimental investigation shows that different sources of evidence, e.g. the content from different fields of the structured metadata, significantly affect CLIR effectiveness. Results from our experiments also show that each metadata field has a varying robustness to query expansion (QE), and hence QE can have a negative impact on CLIR effectiveness. Our work proposes a novel adaptive QE technique that predicts the most reliable source for expansion, and shows how this technique can be effective for improving CLIR effectiveness for UGC.

effectiveness, query, retrieval, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4775

AI Access Foundation

10979

Country:

North America > United States > Maryland (0.04)
Europe > Ireland (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law > Statutes (0.92)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.71)

Bayesian Estimation of Bipartite Matchings for Record Linkage

Sadinle, Mauricio

arXiv.org Machine Learning, Jan-25-2016

The bipartite record linkage task consists of merging two disparate datafiles containing information on two overlapping sets of entities. This is non-trivial in the absence of unique identifiers and it is important for a wide variety of applications given that it needs to be solved whenever we have to combine information from different sources. Most statistical techniques currently used for record linkage are derived from a seminal paper by Fellegi and Sunter (1969). These techniques usually assume independence in the matching statuses of record pairs to derive estimation procedures and optimal point estimators. We argue that this independence assumption is unreasonable and instead target a bipartite matching between the two datafiles as our parameter of interest. Bayesian implementations allow us to quantify uncertainty on the matching decisions and derive a variety of point estimators using different loss functions. We propose partial Bayes estimates that allow uncertain parts of the bipartite matching to be left unresolved. We evaluate our approach to record linkage using a variety of challenging scenarios and show that it outperforms the traditional methodology. We illustrate the advantages of our methods merging two datafiles on casualties from the civil war of El Salvador.

artificial intelligence, bipartite, machine learning, (17 more...)

arXiv.org Machine Learning

1601.0663

Country:

North America > United States (1.00)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)

Genre:

Research Report (0.64)
Overview (0.46)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Client Profiling for an Anti-Money Laundering System

Alexandre, Claudio, Balsa, João

arXiv.org Artificial Intelligence, Jan-11-2016

Preventing and fighting money laundering (ML) crimes is a priority for almost every government in the world, on the same level as the most relevant global issues. Money laundering is a crime that typically consists of turning an illegal financial gain into an apparently legal one. According to the United Nations Office on Drugs and Crime (UNODC), the annual global estimate of laundered money is about 2% to 5% of the Gross World Product, or US$800 billion to US$2 trillion [1]. As if the financial volume were not enough, another reason for governments to focus on this crime is that it is clearly connected to other types of crime, such as the illegal drug trade, fraud, corruption, kidnapping, terrorism, and arms smuggling, among others. Most countries' financial authorities, usually central banks, are responsible for controlling and defining anti-money laundering (AML) regulations, and require financial institutions to implement procedures that apply the defined norms.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1510.00878

Country:

Europe > Portugal > Lisbon > Lisbon (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > New Zealand > North Island > Waikato (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Banking & Finance (1.00)
Law Enforcement & Public Safety > Fraud (0.84)
Government > Intergovernmental Programs (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.96)

Research Priorities for Robust and Beneficial Artificial Intelligence

Russell, Stuart (University of California, Berkeley) | Dewey, Daniel (Oxford University) | Tegmark, Max (Massachusetts Institute of Technology)

AI Magazine, Dec-31-2015

Success in the quest for artificial intelligence has the potential to bring unprecedented benefits to humanity, and it is therefore worthwhile to investigate how to maximize these benefits while avoiding potential pitfalls. This article gives numerous examples (which should by no means be construed as an exhaustive list) of such worthwhile research aimed at ensuring that AI remains robust and beneficial.

ai system, data mining, machine learning, (15 more...)

AI Magazine

Country:

Europe (0.93)
North America > United States > California > Santa Clara County (0.28)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Military (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(3 more...)

Human Memory Search as Initial-Visit Emitting Random Walk

Jun, Kwang-Sung, Zhu, Jerry, Rogers, Timothy T., Yang, Zhuoran, Yuan, Ming

Neural Information Processing Systems, Dec-31-2015

Imagine a random walk that outputs a state only when visiting it for the first time. The observed output is therefore a repeat-censored version of the underlying walk, and consists of a permutation of the states or a prefix of it. We call this model initial-visit emitting random walk (INVITE). Prior work has shown that random walks with such a repeat-censoring mechanism explain human behavior in memory search tasks well, which is of great interest in both the study of human cognition and various clinical applications. However, parameter estimation in INVITE is challenging, because naive likelihood computation by marginalizing over infinitely many hidden random walk trajectories is intractable. In this paper, we propose the first efficient maximum likelihood estimate (MLE) for INVITE by decomposing the censored output into a series of absorbing random walks. We also prove theoretical properties of the MLE, including identifiability and consistency. We show that INVITE outperforms several existing methods on real-world human response data from memory search tasks.
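
As an illustration of the repeat-censoring mechanism described in this abstract, the following is a minimal sketch of the generative process only, not the authors' estimation procedure. The function name `invite_sample`, the row-stochastic list-of-lists `transition`, and the `max_steps` cutoff are assumptions introduced here for the example.

```python
import random


def invite_sample(transition, start, max_steps=10_000, rng=None):
    """Run a random walk and emit each state only on its first visit.

    transition: square list of lists; row i holds the (hypothetical)
                probabilities of moving from state i to each state.
    start:      index of the initial state (emitted first).
    max_steps:  safety cutoff, so the result may be a prefix of a
                permutation if some states are never reached.
    """
    rng = rng or random.Random()
    n = len(transition)
    state = start
    seen = {state}
    emitted = [state]
    for _ in range(max_steps):
        if len(seen) == n:
            break  # every state has been emitted once
        # Take one step of the underlying (hidden) walk.
        state = rng.choices(range(n), weights=transition[state])[0]
        # Repeat visits are censored: only first visits are output.
        if state not in seen:
            seen.add(state)
            emitted.append(state)
    return emitted


# Hypothetical 3-state chain in which state 1 can fall back to state 0.
chain = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
censored = invite_sample(chain, start=0, rng=random.Random(0))
```

However many times the hidden walk bounces between states 0 and 1, the observed list contains each state at most once, which is why the likelihood has to account for many hidden trajectories.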

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Law > Civil Rights & Constitutional Law (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

CrossCat: A Fully Bayesian Nonparametric Method for Analyzing Heterogeneous, High Dimensional Data

Mansinghka, Vikash, Shafto, Patrick, Jonas, Eric, Petschulat, Cap, Gasner, Max, Tenenbaum, Joshua B.

arXiv.org Machine Learning, Dec-3-2015

There is a widespread need for statistical methods that can analyze high-dimensional datasets without imposing restrictive or opaque modeling assumptions. This paper describes a domain-general data analysis method called CrossCat. CrossCat infers multiple non-overlapping views of the data, each consisting of a subset of the variables, and uses a separate nonparametric mixture to model each view. CrossCat is based on approximately Bayesian inference in a hierarchical, nonparametric model for data tables. This model consists of a Dirichlet process mixture over the columns of a data table in which each mixture component is itself an independent Dirichlet process mixture over the rows; the inner mixture components are simple parametric models whose form depends on the types of data in the table. CrossCat combines strengths of mixture modeling and Bayesian network structure learning. Like mixture modeling, CrossCat can model a broad class of distributions by positing latent variables, and produces representations that can be efficiently conditioned and sampled from for prediction. Like Bayesian networks, CrossCat represents the dependencies and independencies between variables, and thus remains accurate when there are multiple statistical signals. Inference is done via a scalable Gibbs sampling scheme; this paper shows that it works well in practice. This paper also includes empirical results on heterogeneous tabular data of up to 10 million cells, such as hospital cost and quality measures, voting records, unemployment rates, gene expression measurements, and images of handwritten digits. CrossCat infers structure that is consistent with accepted findings and common-sense knowledge in multiple domains and yields predictive accuracy competitive with generative, discriminative, and model-free alternatives.

artificial intelligence, crosscat, machine learning, (19 more...)

arXiv.org Machine Learning

1512.01272

Country: North America > United States > Texas (0.28)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(3 more...)

Ethical Artificial Intelligence

Hibbard, Bill

arXiv.org Artificial Intelligence, Nov-17-2015

This book-length article combines several peer-reviewed papers and new material to analyze the issues of ethical artificial intelligence (AI). The behavior of future AI systems can be described by mathematical equations, which are adapted to analyze possible unintended AI behaviors and ways that AI designs can avoid them. This article makes the case for utility-maximizing agents and for avoiding infinite sets in agent definitions. It shows how to avoid agent self-delusion using model-based utility functions and how to avoid agents that corrupt their reward generators (sometimes called "perverse instantiation") using utility functions that evaluate outcomes at one point in time from the perspective of humans at a different point in time. It argues that agents can avoid unintended instrumental actions (sometimes called "basic AI drives" or "instrumental goals") by accurately learning human values. This article defines a self-modeling agent framework and shows how it can avoid problems of resource limits, being predicted by other agents, and inconsistency between the agent's utility function and its definition (one version of this problem is sometimes called "motivated value selection"). This article also discusses how future AI will differ from current AI, the politics of AI, and the ultimate use of AI to help understand the nature of the universe and our place in it.

logic & formal reasoning, machine learning, programming language, (21 more...)

arXiv.org Artificial Intelligence

1411.1373

Country:

North America > United States (1.00)
Europe (0.67)

Genre:

Summary/Review (1.00)
Research Report (0.63)

Industry:

Transportation > Ground > Road (1.00)
Leisure & Entertainment > Games > Chess (1.00)
Law Enforcement & Public Safety (1.00)
(13 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
(6 more...)

Expressiveness of Two-Valued Semantics for Abstract Dialectical Frameworks

Strass, Hannes

Journal of Artificial Intelligence Research, Nov-1-2015

By expressiveness we mean the ability to encode a desired set of two-valued interpretations over a given propositional vocabulary A using only atoms from A. We also compare ADFs' expressiveness with that of (the two-valued semantics of) abstract argumentation frameworks, normal logic programs and propositional logic. While the computational complexity of the two-valued model existence problem for all these languages is (almost) the same, we show that the languages form a neat hierarchy with respect to their expressiveness. We then demonstrate that this hierarchy collapses once we allow the introduction of a linear number of new vocabulary elements. Finally, we analyse and compare the representational succinctness of ADFs (for two-valued model semantics), that is, their capability to represent two-valued interpretation sets in a space-efficient manner.

adf, formula, logic program, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4879

AI Access Foundation

10962

Country:

Europe > Austria > Vienna (0.14)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
Europe > Germany > Saxony > Leipzig (0.04)
(4 more...)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.94)

Extracting Structured Information via Automatic + Human Computation

Pavlick, Ellie (University of Pennsylvania) | Callison-Burch, Chris (University of Pennsylvania)

AAAI Conferences, Nov-1-2015

We present a system for extracting structured information from unstructured text using a combination of information retrieval, natural language processing, machine learning, and crowdsourcing. We test our pipeline by building a structured database of gun violence incidents in the United States. The results of our pilot study demonstrate that the proposed methodology is a viable way of collecting large-scale, up-to-date data for public health, public policy, and social science research.

database, human computation, information, (11 more...)

AAAI Conferences

Third AAAI Conference on Human Computation and Crowdsourcing

Country: North America > United States > Pennsylvania (0.05)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine (1.00)
Law (0.92)
Government (0.91)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (0.38)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.35)

Machine Interface for Contracting Assistance

Summers, Jason E. (Applied Research in Acoustics LLC) | Redmond, Daniel T. (Applied Research in Acoustics LLC) | Gaumond, Charles F. (Applied Research in Acoustics LLC)

AAAI Conferences, Nov-1-2015

We describe a cognitive assistant in early-stage development for the United States Air Force as an aid to contracting officers and potential commercial offerors for navigating the government-contracting process. The goal is easing compliance and affording flexibility and transparency so as to support an innovative and rapid acquisition process. The motivation, use cases, and technical approach for MICA, a Machine Interface for Contracting Assistance, are discussed here along with the technical challenges posed.

artificial intelligence, machine learning, natural language, (19 more...)

AAAI Conferences

2015 AAAI Fall Symposium Series

Country: North America > United States > District of Columbia > Washington (0.04)

Industry:

Law > Statutes (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.70)
