The New Intelligence Game

#artificialintelligence

The relevance of the video is that the browser identified the application being used by the IAI as Google Earth. According to the OSC 2006 report, the Arabic-language caption reads "Islamic Army in Iraq/The Military Engineering Unit – Preparations for Rocket Attack," and the video was recorded on 5/1/2006; we provide, in Appendix A, a reproduction of the screenshot made available in the OSC report. Prior to the release of this video demonstrating the use of Google Earth to plan attacks, according to the OSC 2006 report, discussions had already taken place in OSC-monitored online forums on the use of Google Earth as a GEOINT tool for terrorist planning. On August 5, 2005, the user "Al-Illiktrony" posted a message to the Islamic Renewal Organization forum titled A Gift for the Mujahidin, a Program To Enable You to Watch Cities of the World Via Satellite, in which the author dedicated Google Earth to the mujahidin brothers and to Shaykh Muhammad al-Mas'ari. The post was answered in the forum by "Al-Mushtaq al-Jannah," who warned that Google programs retain complete information about their users. This is a relevant issue, but there are two caveats. First, given the number of Google Earth users, it may be difficult for Google to flag a jihadist using the functionality in time to prevent an attack plan; one possible solution would be for Google to flag computers based on searched websites and locations, for instance computers that visit certain critical sites, but this breaks down when landmarks are used. Second, one may not use one's own computer to perform the search, or one may mask the IP address. On October 3, 2005, as described in the OSC 2006 report, in a reply to a posting by Saddam Al-Arab on the Baghdad al-Rashid forum requesting the identification of a roughly sketched map, "Almuhannad" posted a link to a site that provided a free download of Google Earth, suggesting that the satellite imagery from Google's service could help identify the sketch.


OpenAI rolls out new text-generating models that it claims are less toxic

#artificialintelligence

Large language models (LLMs) such as OpenAI's GPT-3, which can "write" sentences that read nearly like they were written by a human, can be prompted to perform a range of writing tasks given only a few examples of the tasks. For example, LLMs have been used to create marketing materials and video game levels in addition to recipes, poetry, and movie scripts. But because LLMs learn to write from examples taken from sometimes toxic communities, they can fall victim to parroting misinformation, sexism, ageism, racism, and conspiracies. Efforts have been made to combat toxicity in LLMs -- with mixed results.
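As a concrete illustration of the few-shot prompting the article describes, the sketch below builds a prompt from a handful of worked example completions and sends it to a completion-style endpoint. The model name, the example pairs, and the use of the legacy openai Completion interface are assumptions for illustration, not details from the article.

```python
# Minimal sketch of few-shot prompting a large language model.
# Assumes the legacy openai<1.0 completion interface and an assumed
# model name; adapt to whatever completion-style API is available.
import openai

examples = [
    ("Write a tagline for a coffee shop.", "Wake up to something better."),
    ("Write a tagline for a bike store.", "Every ride starts here."),
]

def build_prompt(task: str) -> str:
    """Concatenate a few worked examples, then the new task."""
    parts = []
    for instruction, completion in examples:
        parts.append(f"Task: {instruction}\nOutput: {completion}\n")
    parts.append(f"Task: {task}\nOutput:")
    return "\n".join(parts)

response = openai.Completion.create(
    model="text-davinci-002",   # assumed model name
    prompt=build_prompt("Write a tagline for a bookshop."),
    max_tokens=30,
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```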


Challenges of Artificial Intelligence -- From Machine Learning and Computer Vision to Emotional Intelligence

arXiv.org Artificial Intelligence

Artificial intelligence (AI) has become a part of everyday conversation and our lives. It is considered the new electricity that is revolutionizing the world. AI is heavily invested in by both industry and academia. However, there is also a lot of hype in the current AI debate. AI based on so-called deep learning has achieved impressive results in many problems, but its limits are already visible. AI has been under research since the 1940s, and the industry has seen many ups and downs due to over-expectations and the disappointments that have followed. The purpose of this book is to give a realistic picture of AI, its history, its potential, and its limitations. We believe that AI is a helper, not a ruler of humans. We begin by describing what AI is and how it has evolved over the decades. After the fundamentals, we explain the importance of massive data for the current mainstream of artificial intelligence. The most common AI representations, methods, and machine learning approaches are covered. In addition, the main application areas are introduced. Computer vision has been central to the development of AI. The book provides a general introduction to computer vision and includes an exposure to the results and applications of our own research. Emotions are central to human intelligence, but little use has been made of them in AI. We present the basics of emotional intelligence and our own research on the topic. We discuss super-intelligence that transcends human understanding, explaining why such an achievement seems impossible on the basis of present knowledge, and how AI could be improved. Finally, we summarize the current state of AI and what to do in the future. In the appendix, we look at the development of AI education, especially from the perspective of the contents at our own university.


GPT-3 Creative Fiction

#artificialintelligence

"What if I told a story here, how would that story start?" Thus, the summarization prompt: "My second grader asked me what this passage means: …" When a given prompt isn't working and GPT-3 keeps pivoting into other modes of completion, that may mean that one hasn't constrained it enough by imitating a correct output, and one needs to go further; writing the first few words or sentence of the target output may be necessary.
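The excerpt's advice about seeding the target output can be made concrete with a small sketch: instead of only describing the task, the prompt ends with the first words of the answer we want, pushing the model into the intended mode of completion. The prompt wording and the helper below are illustrative assumptions modeled on the summarization prompt quoted above, not the article's exact text.

```python
# Sketch: constrain a completion model by writing the first words of the
# desired output, per the prompting advice in the excerpt above.
def summarization_prompt(passage: str) -> str:
    return (
        "My second grader asked me what this passage means:\n\n"
        f'"""{passage}"""\n\n'
        # Seeding the answer's opening words steers the model toward a
        # plain-language summary instead of, e.g., continuing the passage.
        "I rephrased it for him, in plain language a second grader can "
        'understand: "Basically,'
    )

print(summarization_prompt("Photosynthesis is the process by which plants..."))
```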


Imposing Regulation on Advanced Algorithms

arXiv.org Artificial Intelligence

This book discusses the necessity and perhaps urgency for the regulation of algorithms on which new technologies rely; technologies that have the potential to reshape human societies. From commerce and farming to medical care and education, it is difficult to find any aspect of our lives that will not be affected by these emerging technologies. At the same time, artificial intelligence, deep learning, machine learning, cognitive computing, blockchain, virtual reality and augmented reality are among the fields most likely to affect law and, in particular, administrative law. The book examines universally applicable patterns in administrative decisions and judicial rulings. First, similarities and divergence in behavior among the different cases are identified by analyzing parameters ranging from geographical location and administrative decisions to judicial reasoning and legal basis. As it turns out, in several of the cases presented, sources of general law, such as competition or labor law, are invoked as a legal basis, due to the lack of current specialized legislation. This book also investigates the role and significance of national and indeed supranational regulatory bodies for advanced algorithms and considers ENISA, an EU agency that focuses on network and information security, as an interesting candidate for a European regulator of advanced algorithms. Lastly, it discusses the involvement of representative institutions in algorithmic regulation.


Natural language processing for word sense disambiguation and information extraction

arXiv.org Artificial Intelligence

This research work deals with Natural Language Processing (NLP) and the extraction of essential information in an explicit form. The most common among information management strategies are Document Retrieval (DR) and Information Filtering. DR systems may work as combine harvesters, bringing back useful material from the vast fields of raw material. With a large amount of potentially useful information in hand, an Information Extraction (IE) system can then transform the raw material by refining and reducing it to the germ of the original text. A Document Retrieval system collects the relevant documents carrying the required information from the repository of texts. An IE system then transforms them into information that is more readily digested and analyzed. It isolates relevant text fragments, extracts relevant information from the fragments, and then arranges the targeted information in a coherent framework. The thesis presents a new approach to Word Sense Disambiguation using a thesaurus. Illustrative examples support the effectiveness of this approach for speedy and effective disambiguation. A Document Retrieval method based on Fuzzy Logic is described and its application illustrated. A question-answering system describes the operation of information extraction from the retrieved text documents. The process of information extraction for answering a query is considerably simplified by using a Structured Description Language (SDL), which is based on the cardinal question forms who, what, when, where, and why. The thesis concludes with the presentation of a novel strategy, based on the Dempster-Shafer theory of evidential reasoning, for document retrieval and information extraction. This strategy permits the relaxation of many limitations inherent in the Bayesian probabilistic approach.
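As a rough illustration of thesaurus-based word sense disambiguation of the kind the thesis describes (the thesis's own algorithm is not reproduced here), the sketch below applies a simplified Lesk-style overlap between a word's dictionary glosses and its sentence context, using WordNet through NLTK as a stand-in for the thesaurus.

```python
# Simplified Lesk-style disambiguation: pick the WordNet sense whose gloss
# shares the most words with the sentence context. Generic illustration,
# not the thesis's own method.
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def disambiguate(word: str, sentence: str):
    context = set(sentence.lower().split())
    best_sense, best_overlap = None, -1
    for sense in wn.synsets(word):
        gloss = set(sense.definition().lower().split())
        # Include example sentences in the signature, as classic Lesk does.
        for example in sense.examples():
            gloss |= set(example.lower().split())
        overlap = len(gloss & context)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

sense = disambiguate("bank", "he sat on the bank of the river and watched the water")
print(sense, "-", sense.definition() if sense else "no sense found")
```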


On the Integration of Linguistic Features into Statistical and Neural Machine Translation

arXiv.org Artificial Intelligence

New machine translation (MT) technologies are emerging rapidly and with them, bold claims of achieving human parity such as: (i) the results produced approach "accuracy achieved by average bilingual human translators" (Wu et al., 2017b) or (ii) the "translation quality is at human parity when compared to professional human translators" (Hassan et al., 2018) have seen the light of day (Laubli et al., 2018). Aside from the fact that many of these papers craft their own definition of human parity, these sensational claims are often not supported by a complete analysis of all aspects involved in translation. Establishing the discrepancies between the strengths of statistical approaches to MT and the way humans translate has been the starting point of our research. By looking at MT output and linguistic theory, we were able to identify some remaining issues. The problems range from simple number and gender agreement errors to more complex phenomena such as the correct translation of aspectual values and tenses. Our experiments confirm, along with other studies (Bentivogli et al., 2016), that neural MT has surpassed statistical MT in many aspects. However, some problems remain and others have emerged. We cover a series of problems related to the integration of specific linguistic features into statistical and neural MT, aiming to analyse and provide a solution to some of them. Our work focuses on addressing three main research questions that revolve around the complex relationship between linguistics and MT in general. We identify linguistic information that is lacking in order for automatic translation systems to produce more accurate translations and integrate additional features into the existing pipelines. We identify overgeneralization or 'algorithmic bias' as a potential drawback of neural MT and link it to many of the remaining linguistic issues.
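One common way to integrate linguistic features into an MT pipeline, in the spirit of the factored-input approaches this line of work discusses, is to annotate each source token with features such as its part of speech and feed the combined factors to the model. The sketch below is a generic illustration using spaCy for the annotation; the tag set, separator, and downstream model are assumptions, not the thesis's actual setup.

```python
# Sketch: attach part-of-speech factors to source tokens before feeding
# them to an MT system (generic illustration of factored input).
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def to_factored_input(sentence: str, separator: str = "|") -> str:
    """Return 'token|POS' factors, e.g. 'The|DET cat|NOUN sleeps|VERB'."""
    doc = nlp(sentence)
    return " ".join(f"{tok.text}{separator}{tok.pos_}" for tok in doc)

print(to_factored_input("The agreement between the translations was broken."))
# A factored MT model would embed the word and the POS factor separately
# and concatenate (or sum) the embeddings at the encoder input.
```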



Probabilistic Graphical Models for Credibility Analysis in Evolving Online Communities

arXiv.org Machine Learning

One of the major hurdles preventing the full exploitation of information from online communities is the widespread concern regarding the quality and credibility of user-contributed content. Prior works in this domain operate on a static snapshot of the community, make strong assumptions about the structure of the data (e.g., relational tables), or consider only shallow features for text classification. To address the above limitations, we propose probabilistic graphical models that can leverage the joint interplay between multiple factors in online communities -- like user interactions, community dynamics, and textual content -- to automatically assess the credibility of user-contributed online content, and the expertise of users and its evolution, with user-interpretable explanations. To this end, we devise new models based on Conditional Random Fields for different settings, such as incorporating partial expert knowledge for semi-supervised learning and handling discrete labels as well as numeric ratings for fine-grained analysis. This enables applications such as extracting reliable side-effects of drugs from user-contributed posts in health forums and identifying credible content in news communities. Online communities are dynamic, as users join and leave, adapt to evolving trends, and mature over time. To capture these dynamics, we propose generative models based on Hidden Markov Models, Latent Dirichlet Allocation, and Brownian Motion to trace the continuous evolution of user expertise and their language models over time. This allows us to identify expert users and credible content jointly over time, improving state-of-the-art recommender systems by explicitly considering the maturity of users. It also enables applications such as identifying helpful product reviews and detecting fake or anomalous reviews with limited information.
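To make the Conditional Random Field setting more concrete, the sketch below trains a linear-chain CRF over the posts of a thread, using a few shallow per-post features to predict a credible/not-credible label for each post. The features, labels, toy data, and the use of sklearn-crfsuite are illustrative assumptions; the paper's actual models and feature sets are considerably richer.

```python
# Toy linear-chain CRF over a thread of posts: each post gets simple
# features and a credible / not-credible label. Illustrative only; the
# paper's models couple users, interactions, and language far more deeply.
import sklearn_crfsuite

def post_features(post: dict) -> dict:
    return {
        "has_citation": post["has_citation"],
        "author_karma": post["author_karma"],
        "exclamations": post["text"].count("!"),
    }

# One training "sequence" = the posts of one thread, in order (toy data).
threads = [
    [
        {"text": "Source: a randomized trial, link below.", "has_citation": True,  "author_karma": 820},
        {"text": "This cures everything!!!",                 "has_citation": False, "author_karma": 3},
    ],
]
labels = [["credible", "not_credible"]]

X = [[post_features(p) for p in thread] for thread in threads]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, labels)
print(crf.predict(X))
```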


Notes on a New Philosophy of Empirical Science

arXiv.org Machine Learning

This book presents a methodology and philosophy of empirical science based on large scale lossless data compression. In this view a theory is scientific if it can be used to build a data compression program, and it is valuable if it can compress a standard benchmark database to a small size, taking into account the length of the compressor itself. This methodology therefore includes an Occam principle as well as a solution to the problem of demarcation. Because of the fundamental difficulty of lossless compression, this type of research must be empirical in nature: compression can only be achieved by discovering and characterizing empirical regularities in the data. Because of this, the philosophy provides a way to reformulate fields such as computer vision and computational linguistics as empirical sciences: the former by attempting to compress databases of natural images, the latter by attempting to compress large text databases. The book argues that the rigor and objectivity of the compression principle should set the stage for systematic progress in these fields. The argument is especially strong in the context of computer vision, which is plagued by chronic problems of evaluation. The book also considers the field of machine learning. Here the traditional approach requires that the models proposed to solve learning problems be extremely simple, in order to avoid overfitting. However, the world may contain intrinsically complex phenomena, which would require complex models to understand. The compression philosophy can justify complex models because of the large quantity of data being modeled (if the target database is 100 GB, it is easy to justify a 10 MB model). The complex models and abstractions learned on the basis of the raw data (images, language, etc.) can then be reused to solve any specific learning problem, such as face recognition or machine translation.
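The two-part scoring idea in this summary (the length of the compressor itself plus the length of the compressed benchmark, with the 100 GB versus 10 MB trade-off) can be sketched directly. In the sketch below, zlib and the named files stand in for a compressor actually derived from a theory; both are illustrative assumptions, not the book's benchmark or method.

```python
# Sketch of the two-part (MDL-style) score behind the book's methodology:
# total description length = size of the compressor program + size of the
# benchmark after compression. zlib stands in for a theory-derived codec.
import os
import zlib

def description_length(compressor_source: str, data: bytes, level: int = 9) -> int:
    """Bytes for the compressor program plus bytes for the compressed benchmark."""
    model_bytes = os.path.getsize(compressor_source)   # length of the "theory"
    data_bytes = len(zlib.compress(data, level))       # benchmark after compression
    return model_bytes + data_bytes

# A theory is valuable if its total is smaller than its competitors':
# a 10 MB model is easily justified when it shaves far more than 10 MB
# off a 100 GB benchmark.
with open("benchmark.txt", "rb") as f:                 # hypothetical benchmark file
    benchmark = f.read()
print(description_length("compressor.py", benchmark))  # hypothetical compressor source
```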