Wise Sliding Window Segmentation: A classification-aided approach for trajectory segmentation

arXiv.org Machine Learning

Large amounts of mobility data are being generated from many different sources, and several data mining methods have been proposed for this data. One of the most critical steps for trajectory data mining is segmentation. This task can be seen as a pre-processing step in which a trajectory is divided into several meaningful consecutive sub-sequences. This process is necessary because trajectory patterns may not hold over the entire trajectory but only over parts of it. In this work, we propose a supervised trajectory segmentation algorithm, called Wise Sliding Window Segmentation (WS-II). It processes the trajectory coordinates to find behavioral changes in space and time, generating an error signal that is further used to train a binary classifier for segmenting trajectory data. This algorithm is flexible and can be used in different domains. We evaluate our method over three real datasets from different domains (meteorology, fishing, and individuals' movements), and compare it with four other trajectory segmentation algorithms: OWS, GRASP-UTS, CB-SMoT, and SPD. We observed that the proposed algorithm achieves the highest performance for all datasets, with statistically significant differences in terms of the harmonic mean of purity and coverage.
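
The abstract does not say how the error signal is computed, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes a simple midpoint interpolation: for each trajectory point, the expected position is taken as the midpoint of the two window endpoints, and the error is the distance between the observed and expected positions. The names `error_signal`, `half_window`, and `harmonic_mean` are hypothetical. In a supervised setting, a signal like this could be fed to a binary classifier that labels each point as a segment boundary or not; that step is omitted here.

```python
import math


def error_signal(points, half_window=1):
    """Return one error value per trajectory point.

    points: list of (x, y) tuples in visiting order.
    half_window: number of points on each side of the center (assumed parameter).
    Points too close to either end of the trajectory get an error of 0.0.
    """
    n = len(points)
    errors = [0.0] * n
    for i in range(half_window, n - half_window):
        x0, y0 = points[i - half_window]
        x1, y1 = points[i + half_window]
        # Assumption: the expected position is the midpoint of the window endpoints.
        expected = ((x0 + x1) / 2.0, (y0 + y1) / 2.0)
        errors[i] = math.dist(points[i], expected)
    return errors


def harmonic_mean(purity, coverage):
    """Harmonic mean of two scores in [0, 1]; returns 0.0 if both are zero."""
    if purity + coverage == 0:
        return 0.0
    return 2.0 * purity * coverage / (purity + coverage)


if __name__ == "__main__":
    # A path that moves east and then turns north at (2, 0).
    path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    print(error_signal(path, half_window=1))
    print(harmonic_mean(0.5, 1.0))
```

In this toy path the error is zero along the straight stretches and peaks at the turn, which is the kind of behavioral change a segmentation classifier would be expected to pick up.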


A 20-Year Community Roadmap for Artificial Intelligence Research in the US

arXiv.org Artificial Intelligence

Decades of research in artificial intelligence (AI) have produced formidable technologies that are providing immense benefit to industry, government, and society. AI systems can now translate across multiple languages, identify objects in images and video, streamline manufacturing processes, and control cars. The deployment of AI systems has not only created a trillion-dollar industry that is projected to quadruple in three years, but has also exposed the need to make AI systems fair, explainable, trustworthy, and secure. Future AI systems will rightfully be expected to reason effectively about the world in which they (and people) operate, handling complex tasks and responsibilities effectively and ethically, engaging in meaningful communication, and improving their awareness through experience. Achieving the full potential of AI technologies poses research challenges that require a radical transformation of the AI research enterprise, facilitated by significant and sustained investment. These are the major recommendations of a recent community effort coordinated by the Computing Community Consortium and the Association for the Advancement of Artificial Intelligence to formulate a Roadmap for AI research and development over the next two decades.


Top Data Sources for Journalists in 2018 (350 Sources)

@machinelearnbot

There are many different types of sites that provide a wealth of free, freemium and paid data that can help audience developers and journalists with their reporting and storytelling efforts. The team at State of Digital Publishing would like to acknowledge these, as derived from manual searches and recognition from our existing audience. Kaggle is a site that allows users to discover machine learning while writing and sharing cloud-based code. Relying primarily on the enthusiasm of its sizable community, the site hosts dataset competitions for cash prizes, and as a result it has massive amounts of data compiled into it. Whether you're looking for historical data from the New York Stock Exchange, an overview of candy production trends in the US, or cutting-edge code, this site is chock-full of information. It's impossible to be on the Internet for long without running into a Wikipedia article.


Truth Is a Lie: Crowd Truth and the Seven Myths of Human Annotation

AI Magazine

Big data is having a disruptive impact across the sciences. Human annotation of semantic interpretation tasks is a critical part of big data semantics, but it is based on an antiquated ideal of a single correct truth that needs to be similarly disrupted. We expose seven myths about human annotation, most of which derive from that antiquated ideal of truth, and dispel these myths with examples from our research. We propose a new theory of truth, crowd truth, that is based on the intuition that human interpretation is subjective, and that measuring annotations on the same objects of interpretation (in our examples, sentences) across a crowd will provide a useful representation of their subjectivity and the range of reasonable interpretations.