
Multi-Agent Deep Reinforcement Learning for Request Dispatching in Distributed-Controller Software-Defined Networking

arXiv.org Artificial Intelligence

Recently, distributed controller architectures have been quickly gaining popularity in Software-Defined Networking (SDN). However, the use of distributed controllers introduces a new and important Request Dispatching (RD) problem, in which every SDN switch must properly dispatch its requests among all controllers so as to optimize network performance. This goal can be fulfilled by designing an RD policy to guide the distribution of requests at each switch. In this paper, we propose a Multi-Agent Deep Reinforcement Learning (MA-DRL) approach to automatically design RD policies with high adaptability and performance. This is achieved through a new problem formulation in the form of a Multi-Agent Markov Decision Process (MA-MDP), a new adaptive RD policy design and a new MA-DRL algorithm called MA-PPO. Extensive simulation studies show that our MA-DRL technique can effectively train RD policies to significantly outperform man-made policies, model-based policies, as well as RD policies learned via single-agent DRL algorithms.


The Arc of the Data Scientific Universe

arXiv.org Artificial Intelligence

In this paper I explore the scaffolding of normative assumptions that supports Sabina Leonelli's implicit appeal to the values of epistemic integrity and the global public good that conjointly animate the ethos of responsible and sustainable data work in the context of COVID-19. Drawing primarily on the writings of sociologist Robert K. Merton, the thinkers of the Vienna Circle, and Charles Sanders Peirce, I make some of these assumptions explicit by telling a longer story about the evolution of social thinking about the normative structure of science from Merton's articulation of his well-known norms (those of universalism, communism, organized skepticism, and disinterestedness) to the present. I show that while Merton's norms and his intertwinement of these with the underlying mechanisms of democratic order provide us with an especially good starting point to explore and clarify the commitments and values of science, Leonelli's broader, more context-responsive, and more holistic vision of the epistemic integrity of data scientific understanding, and her discernment of the global and biospheric scope of its moral-practical reach, move beyond Merton's schema in ways that effectively draw upon important critiques. Stepping past Merton, I argue that a combination of situated universalism, methodological pluralism, strong objectivity, and unbounded communalism must guide the responsible and sustainable data work of the future.


Vampire With a Brain Is a Good ITP Hammer

arXiv.org Artificial Intelligence

Vampire has long been the strongest first-order automated theorem prover, widely used for hammer-style proof automation in ITPs such as Mizar, Isabelle, HOL and Coq. In this work, we considerably improve the performance of Vampire in hammering over the full Mizar library by enhancing its saturation procedure with efficient neural guidance. In particular, we employ a recursive neural network that classifies the generated clauses based only on their derivation history. Compared to previous neural methods that consider the logical content of the clauses, this leads to a large real-time speedup of the neural guidance. The resulting system shows good learning capability and achieves state-of-the-art performance on the Mizar library, while proving many theorems that the related ENIGMA system could not prove in a similar hammering evaluation.


Equilibrium Learning in Combinatorial Auctions: Computing Approximate Bayesian Nash Equilibria via Pseudogradient Dynamics

arXiv.org Artificial Intelligence

Applications of combinatorial auctions (CA) as market mechanisms are prevalent in practice, yet their Bayesian Nash equilibria (BNE) remain poorly understood. Analytical solutions are known only for a few cases where the problem can be reformulated as a tractable partial differential equation (PDE). In the general case, finding BNE is known to be computationally hard. Previous work on numerical computation of BNE in auctions has relied either on solving such PDEs explicitly, calculating pointwise best-responses in strategy space, or iteratively solving restricted subgames. In this study, we present a generic yet scalable alternative multi-agent equilibrium ...

While the complexity of computing Bayes-Nash equilibria (BNE) is not well understood, Cai and Papadimitriou [14] show that BNE computation for a specific combinatorial auction is already (at least) PP-hard. Furthermore, finding an ε-approximation to a BNE is still NP-hard. Explicit solutions exist for very few specific environments, but in general, we neither know whether a BNE exists nor do we have a solution theory. Combinatorial auctions have become a pivotal research problem in algorithmic game theory [29] and they are widely used in the field [8, 15]. Thus, understanding their equilibria is paramount, and access to scalable numerical methods for computing or approximating BNE can have a significant impact.


AI Can Stop Mass Shootings, and More

arXiv.org Artificial Intelligence

We propose to build directly upon our longstanding, prior R&D in AI/machine ethics in order to attempt to make real the blue-sky idea of AI that can thwart mass shootings, by bringing to bear its ethical reasoning. The R&D in question is overtly and avowedly logicist in form, and since we are hardly the only ones who have established a firm foundation in the attempt to imbue AIs with their own ethical sensibility, the pursuit of our proposal by those in different methodological camps should, we believe, be considered as well. We seek herein to make our vision at least somewhat concrete by anchoring our exposition to two simulations, one in which the AI saves the lives of innocents by locking out a malevolent human's gun, and a second in which this malevolent agent is allowed by the AI to be neutralized by law enforcement. Along the way, some objections are anticipated, and rebutted.


Applications of Machine Learning in Document Digitisation

arXiv.org Machine Learning

Data acquisition forms the primary step in all empirical research. The availability of data directly impacts the quality and extent of conclusions and insights. In particular, larger and more detailed datasets provide convincing answers even to complex research questions. The main problem is that 'large and detailed' usually implies 'costly and difficult', especially when the data medium is paper and books. Human operators and manual transcription have been the traditional approach for collecting historical data. We instead advocate the use of modern machine learning techniques to automate the digitisation process. We give an overview of the potential for applying machine digitisation for data collection through two illustrative applications. The first demonstrates that unsupervised layout classification applied to raw scans of nurse journals can be used to construct a treatment indicator. Moreover, it allows an assessment of assignment compliance. The second application uses attention-based neural networks for handwritten text recognition in order to transcribe age and birth and death dates from a large collection of Danish death certificates. We describe each step in the digitisation pipeline and provide implementation insights.


Hyperspherical embedding for novel class classification

arXiv.org Artificial Intelligence

Deep learning models have become increasingly useful in many different industries. In the domain of image classification, convolutional neural networks have proved able to learn robust features for the closed set problem, as shown on many different datasets, such as MNIST, FASHIONMNIST, CIFAR10, CIFAR100, and IMAGENET. These approaches use deep neural networks with dense layers and softmax activation functions in order to learn features that can separate classes in a latent space. However, this traditional approach is not useful for identifying classes unseen in the training set, known as the open set problem. A similar problem occurs in scenarios involving learning on small data. To tackle both problems, few-shot learning has been proposed. In particular, metric learning learns features that obey constraints of a metric distance in the latent space in order to perform classification. However, while this approach proves to be useful for the open set problem, current implementations require pair-wise training, where both positive and negative examples of similar images are presented during the training phase, which limits the applicability of these approaches in large data or large class scenarios given the combinatorial nature of the possible inputs. In this paper, we present a constraint-based approach applied to the representations in the latent space under the normalized softmax loss, proposed by [18]. We experimentally validate the proposed approach for the classification of unseen classes on different datasets using both metric learning and the normalized softmax loss, on disjoint and joint scenarios. Our results show that our proposed strategy not only can be efficiently trained on a larger set of classes, as it does not require pairwise learning, but also presents better classification results than the metric learning strategies, surpassing their accuracy by a significant margin.
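To make the loss named in the abstract more concrete, the following is a minimal sketch of one common formulation of a normalized softmax loss: the embedding and each class weight vector are L2-normalized, so the logits are cosine similarities on the unit hypersphere, scaled by a temperature. The function names, the `temperature` value, and the toy vectors are assumptions for illustration only. The abstract does not give the exact formulation from [18], and the paper's additional latent-space constraints are not reproduced here.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def normalized_softmax_loss(embedding, class_weights, label, temperature=0.05):
    """Cross-entropy over temperature-scaled cosine similarities.

    `temperature` is a hypothetical hyperparameter, not a value from the paper.
    """
    unit_embedding = l2_normalize(embedding)
    logits = []
    for weights in class_weights:
        unit_weights = l2_normalize(weights)
        cosine = sum(a * b for a, b in zip(unit_embedding, unit_weights))
        logits.append(cosine / temperature)
    # Numerically stable log-sum-exp.
    largest = max(logits)
    log_partition = largest + math.log(sum(math.exp(z - largest) for z in logits))
    return log_partition - logits[label]


# Toy example: two classes whose weight vectors lie along the two axes.
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_embedding = [3.0, 4.0]  # points mostly towards class 1
loss_class_0 = normalized_softmax_loss(toy_embedding, toy_weights, label=0)
loss_class_1 = normalized_softmax_loss(toy_embedding, toy_weights, label=1)
```

Because both sides are normalized, only the direction of the embedding matters: rescaling `toy_embedding` leaves the loss unchanged, which is what places the learned representations on a hypersphere.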


Speedy robots gather spectra for sky surveys

Science

It was one of the stranger and more monotonous jobs in astronomy: plugging optical fibers into hundreds of holes in aluminum plates. Every day, technicians with the Sloan Digital Sky Survey (SDSS) prepped up to 10 plates that would be placed that night at the focus of the survey's telescopes in Chile and New Mexico. The holes matched the exact positions of stars, galaxies, or other bright objects in the telescopes' view. Light from each object fell directly on a fiber and was whisked off to a spectrograph, which split the light into its component wavelengths, revealing key details such as what the object is made of and how it is moving. Now, after 20 years, the SDSS is going robotic. For the project's upcoming fifth set of surveys, known as the SDSS-V, plug plates are being replaced by 500 tiny robot arms, each holding fiber tips that patrol a small area of the telescope's focal plane. They can be reconfigured for a new sky map in 2 minutes. Other sky surveys are also adopting the speedy robots. They will not only save valuable observation time, but also allow the surveys to keep up with Europe's Gaia satellite, the upcoming Vera C. Rubin Observatory in Chile, and other efforts that produce huge catalogs of objects needing spectroscopic study. “It's driven by the science of enormous imaging surveys,” says astronomer Richard Ellis of University College London. COVID-19 has delayed the SDSS's robotic makeover. The survey's northern telescope at Apache Point Observatory in New Mexico began to take SDSS-V data in October 2020 using plug plates. It aims to switch over to the robots by mid-2021. The southern scope at Las Campanas Observatory in Chile will follow later in the year. “It's bananas,” says SDSS-V Director Juna Kollmeier of the Carnegie Observatories, “but we're seeing the end of the tunnel.” The robots mark a new chapter for the SDSS. 
For 10 years much of its time went to the study of dark energy, the mysterious force that is accelerating the universe's expansion. The SDSS prised apart the light of millions of galaxies to determine their distance, via a redshift—a Doppler shift in their light due to the expansion of the universe, like the wail of a receding siren. Results from the galaxy survey, released in July 2020, traced the universe's expansion back through 80% of its history with 1% precision, confirming the effects of dark energy, perhaps the biggest mystery in cosmology. Cracking it will require looking further back in time to fainter galaxies, which is beyond the capabilities of the survey's 2.5-meter telescopes. Instead, the scopes will carry out three new surveys. Milky Way Mapper will gather spectra from 6 million stars, probing their composition to find out how long they've been burning and forging heavy elements. “Stars are all clocks,” Kollmeier explains. With age estimates, astronomers can work out when parts of the Milky Way formed. Subtle shifts in composition can also reveal whether a group of stars originated in another galaxy or star cluster that has been subsumed into ours—an unwinding of Milky Way history called galactic archaeology. In a second survey, Black Hole Mapper, the optical fibers will gather light from bright galaxies to learn about the supermassive black holes they harbor. Doppler shifts in the spectra of glowing gases surrounding these black holes could reveal how fast they fling this material around—and thus how heavy they are. Shifts in the spectra could trace how they gobble up and spit out streams of this gas. By tracking the gases over time, Kollmeier says, astronomers may learn how the black holes grow, seemingly in concert with their galaxies. The third survey, Local Volume Mapper, will bunch fibers together like a multipixel detector to get spectra from clouds of interstellar gas within nearby galaxies. 
“We're mapping a whole galaxy in exquisite detail at one time,” Kollmeier says. By determining the motions and composition of the gas clouds, the SDSS team hopes to identify why some collapse into stars and others don't. Meanwhile, the dark energy quest pioneered by the SDSS will move to the Dark Energy Spectroscopic Instrument, a 5000-fiber robotic spectrograph on a 4-meter telescope in Arizona. It will soon begin to track the distances to tens of millions of galaxies in the remote universe (Science, 13 September 2019, p. [1066][1]). In the coming months, the William Herschel Telescope, a 4.2-meter telescope in the Canary Islands, will join the robot revolution by sending light to a 1000-fiber spectrograph called the WHT Enhanced Area Velocity Explorer (WEAVE). Instead of using robots to hold fibers in place, WEAVE has two of them working offline, picking and placing magnetic fiber ends onto a metal plate, automating what the SDSS's plate pluggers did. One of WEAVE's goals is to gather Doppler shifts from the billion stars Gaia has mapped, nailing down their full 3D motions. Then, “We can run the clock backwards and see where they came from,” says project scientist Scott Trager of the University of Groningen. It's another way to do galactic archaeology. Next year, the European Southern Observatory's (ESO's) 4-metre Multi-Object Spectroscopic Telescope in Chile will be fitted with yet another robotic technology. Its 2400 fibers will be fed through controllable “spines” that stick up into the telescope's focal plane and can be made to move, like wheat stalks in a breeze. Like WEAVE, it will follow up on sources identified by European spacecraft, including Gaia and Euclid, an upcoming dark energy mission. It and other fiber spectrographs will also help with studies of fast-moving cosmic events such as supernovae or the violent collisions that produce gravitational waves. The Rubin Observatory will spot many of them.
From 2023, it's expected to detect 10 million fast-changing objects every night. For the thousands that demand scrutiny, “spectra are really critical for understanding what a source is,” says Eric Bellm of the University of Washington, Seattle, who is the science lead for Rubin's alert stream. Even some of the world's largest scopes, in the 8-meter range, are adding robotic spectrographs. Japan's Subaru and ESO's Very Large Telescope are both developing systems that will vacuum up spectra from faint, distant objects. Ellis says a fiber spectrograph combined with Subaru's 8.2-meter mirror would be able to pick out spectra of individual stars in the Andromeda galaxy, the Milky Way's nearby twin. “With a big telescope, we can do galactic archaeology in our nearest neighbor,” he says.

[1]: http://www.sciencemag.org/content/365/6458/1066


RECol: Reconstruction Error Columns for Outlier Detection

arXiv.org Machine Learning

Detecting outliers or anomalies is a common data analysis task. As a sub-field of unsupervised machine learning, it offers a large variety of approaches, but the vast majority treat the input features as independent and often fail to recognize even simple (linear) relationships in the input feature space. Hence, we introduce RECol, a generic data pre-processing approach to generate additional columns in a leave-one-out fashion: for each column, we try to predict its values based on the other columns, generating reconstruction error columns. We run experiments across a large variety of common baseline approaches and benchmark datasets with and without our RECol pre-processing method and show that the generated reconstruction error feature space generally seems to support common outlier detection methods and often considerably improves their ROC-AUC and PR-AUC values.
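The leave-one-out idea described in the abstract can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the choice of a (lightly ridge-regularized) linear regressor, the use of absolute error, and all function names and toy data are assumptions made here. The abstract only states that each column is predicted from the others and that the errors become new columns.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_linear(features, target, ridge=1e-6):
    """Least-squares weights [intercept, w1, ...]; tiny ridge term avoids singularity."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    for i in range(k):
        gram[i][i] += ridge
    moment = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve_linear_system(gram, moment)


def add_recol_columns(data):
    """Append one reconstruction-error column per original column."""
    n_cols = len(data[0])
    error_columns = []
    for j in range(n_cols):
        others = [[v for c, v in enumerate(row) if c != j] for row in data]
        target = [row[j] for row in data]
        w = fit_linear(others, target)
        preds = [w[0] + sum(wi * xi for wi, xi in zip(w[1:], f)) for f in others]
        # Absolute error is an assumed choice of reconstruction error.
        error_columns.append([abs(t - p) for t, p in zip(target, preds)])
    return [list(row) + [col[i] for col in error_columns]
            for i, row in enumerate(data)]


# Toy data: y = 2x for eight points, plus one row that breaks the relationship
# even though each of its values lies within the range of its own column.
toy_data = [[float(x), 2.0 * x] for x in range(1, 9)] + [[5.0, 2.0]]
augmented = add_recol_columns(toy_data)
```

In this toy example the last row looks ordinary in each column taken alone, but it receives the largest values in both appended error columns. The augmented table could then be passed to any standard outlier detector.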


The effect of differential victim crime reporting on predictive policing systems

arXiv.org Machine Learning

Police departments around the world have been experimenting with forms of place-based data-driven proactive policing for over two decades. Modern incarnations of such systems are commonly known as hot spot predictive policing. These systems predict where future crime is likely to concentrate such that police can allocate patrols to these areas and deter crime before it occurs. Previous research on fairness in predictive policing has concentrated on the feedback loops which occur when models are trained on discovered crime data, but has limited implications for models trained on victim crime reporting data. We demonstrate how differential victim crime reporting rates across geographical areas can lead to outcome disparities in common crime hot spot prediction models. Our analysis is based on a simulation patterned after district-level victimization and crime reporting survey data for Bogotá, Colombia. Our results suggest that differential crime reporting rates can lead to a displacement of predicted hotspots from high crime but low reporting areas to high or medium crime and high reporting areas. This may lead to misallocations both in the form of over-policing and under-policing.