Goto

Collaborating Authors

 South America


Adaptive Multi-Resolution Attention with Linear Complexity

arXiv.org Artificial Intelligence

Transformers have improved the state-of-the-art across numerous tasks in sequence modeling. Besides the quadratic computational and memory complexity w.r.t the sequence length, the self-attention mechanism only processes information at the same scale, i.e., all attention heads are in the same resolution, resulting in the limited power of the Transformer. To remedy this, we propose a novel and efficient structure named Adaptive Multi-Resolution Attention (AdaMRA for short), which scales linearly to sequence length in terms of time and space. Specifically, we leverage a multi-resolution multi-head attention mechanism, enabling attention heads to capture long-range contextual information in a coarse-to-fine fashion. Moreover, to capture the potential relations between query representation and clues of different attention granularities, we leave the decision of which resolution of attention to use to query, which further improves the model's capacity compared to vanilla Transformer. In an effort to reduce complexity, we adopt kernel attention without degrading the performance. Extensive experiments on several benchmarks demonstrate the effectiveness and efficiency of our model by achieving a state-of-the-art performance-efficiency-memory trade-off. To facilitate AdaMRA utilization by the scientific community, the code implementation will be made publicly available.


Evaluation of Load Prediction Techniques for Distributed Stream Processing

arXiv.org Artificial Intelligence

Distributed Stream Processing (DSP) systems enable processing large streams of continuous data to produce results in near to real time. They are an essential part of many data-intensive applications and analytics platforms. The rate at which events arrive at DSP systems can vary considerably over time, which may be due to trends, cyclic, and seasonal patterns within the data streams. A priori knowledge of incoming workloads enables proactive approaches to resource management and optimization tasks such as dynamic scaling, live migration of resources, and the tuning of configuration parameters during run-times, thus leading to a potentially better Quality of Service. In this paper we conduct a comprehensive evaluation of different load prediction techniques for DSP jobs. We identify three use-cases and formulate requirements for making load predictions specific to DSP jobs. Automatically optimized classical and Deep Learning methods are being evaluated on nine different datasets from typical DSP domains, i.e. the IoT, Web 2.0, and cluster monitoring. We compare model performance with respect to overall accuracy and training duration. Our results show that the Deep Learning methods provide the most accurate load predictions for the majority of the evaluated datasets.


Data Driven VRP: A Neural Network Model to Learn Hidden Preferences for VRP

arXiv.org Artificial Intelligence

But more often, the objective involves multiple criteria including not only the total distance of the tour but also other factors such as travel costs, travel time, and fuel consumption. Moreover, in reality, there are numerous implicit preferences ingrained in the minds of the route planners and the drivers. Drivers, for instance, have familiarity with certain neighborhoods and knowledge of the state of roads, and often consider the best places for rest and lunch breaks. This knowledge is difficult to formulate and balance when operational routing decisions have to be made. This motivates us to learn the implicit preferences from past solutions and to incorporate these learned preferences in the optimization process. These preferences are in the form of arc probabilities, i.e., the more preferred a route is, the higher is the joint probability. The novelty of this work is the use of a neural network model to estimate the arc probabilities, which allows for additional features and automatic parameter estimation. This first requires identifying suitable features, neural architectures and loss functions, taking into account that there is typically few data available. We investigate the difference with a prior weighted Markov counting approach, and study the applicability of neural networks in this setting.


PyEuroVoc: A Tool for Multilingual Legal Document Classification with EuroVoc Descriptors

arXiv.org Artificial Intelligence

EuroVoc is a multilingual thesaurus that was built for organizing the legislative documentary of the European Union institutions. It contains thousands of categories at different levels of specificity and its descriptors are targeted by legal texts in almost thirty languages. In this work we propose a unified framework for EuroVoc classification on 22 languages by fine-tuning modern Transformer-based pretrained language models. We study extensively the performance of our trained models and show that they significantly improve the results obtained by a similar tool - JEX - on the same dataset. The code and the fine-tuned models were open sourced, together with a programmatic interface that eases the process of loading the weights of a trained model and of classifying a new document.


Insights on the Artificial Intelligence (AI) In Radiology Global Market to 2027 – by Offering, Application, End-use Industry and Geography - KhelPanda

#artificialintelligence

The Artificial Intelligence (AI) In Radiology market report also offers a thorough analysis of the key market drivers and restraints that play a vital role in influencing the growth of the industry. The report considers the currently unfolding coronavirus pandemic as one of the key influencing factors. Some of the key and emerging players operating in the market are profiled in the report to offer a comprehensive overview of the competitive landscape. The report further covers the strategic initiatives taken by the companies such as mergers and acquisitions, product launches and brand promotions, collaborations and joint ventures, agreements and partnerships, and government deals, among others. The strategic initiatives offer the companies a chance to expand their foothold in the industry and gain a significant global position.


Mobile Artificial Intelligence (MAI) Market to Witness Notable Growth by 2027 - Digital Journal

#artificialintelligence

This comprehensive analysis in this Mobile Artificial Intelligence (MAI) market report describes data on a variety of topics, including growth strategies and restrictions. This market report contains critical information about the market landscape that considerably aids key stakeholders in making the best decision possible before investing in a business. In order to deliver the most accurate estimations and forecasts possible, this market study adopts a systematic and progressive research process focused on reducing deviation. For segregating and evaluating quantitative features of the market, the market report incorporates elements of bottom-up and top-down methodologies. On a great scale, raw market statistics is collected and analyzed in this Mobile Artificial Intelligence (MAI) market report.


The State of AI Ethics Report (Volume 5)

arXiv.org Artificial Intelligence

This report from the Montreal AI Ethics Institute covers the most salient progress in research and reporting over the second quarter of 2021 in the field of AI ethics with a special emphasis on "Environment and AI", "Creativity and AI", and "Geopolitics and AI." The report also features an exclusive piece titled "Critical Race Quantum Computer" that applies ideas from quantum physics to explain the complexities of human characteristics and how they can and should shape our interactions with each other. The report also features special contributions on the subject of pedagogy in AI ethics, sociology and AI ethics, and organizational challenges to implementing AI ethics in practice. Given MAIEI's mission to highlight scholars from around the world working on AI ethics issues, the report also features two spotlights sharing the work of scholars operating in Singapore and Mexico helping to shape policy measures as they relate to the responsible use of technology. The report also has an extensive section covering the gamut of issues when it comes to the societal impacts of AI covering areas of bias, privacy, transparency, accountability, fairness, interpretability, disinformation, policymaking, law, regulations, and moral philosophy.


Automated Olfactory Bulb Segmentation on High Resolutional T2-Weighted MRI

arXiv.org Artificial Intelligence

The neuroimage analysis community has neglected the automated segmentation of the olfactory bulb (OB) despite its crucial role in olfactory function. The lack of an automatic processing method for the OB can be explained by its challenging properties. Nonetheless, recent advances in MRI acquisition techniques and resolution have allowed raters to generate more reliable manual annotations. Furthermore, the high accuracy of deep learning methods for solving semantic segmentation problems provides us with an option to reliably assess even small structures. In this work, we introduce a novel, fast, and fully automated deep learning pipeline to accurately segment OB tissue on sub-millimeter T2-weighted (T2w) whole-brain MR images. To this end, we designed a three-stage pipeline: (1) Localization of a region containing both OBs using FastSurferCNN, (2) Segmentation of OB tissue within the localized region through four independent AttFastSurferCNN - a novel deep learning architecture with a self-attention mechanism to improve modeling of contextual information, and (3) Ensemble of the predicted label maps. The OB pipeline exhibits high performance in terms of boundary delineation, OB localization, and volume estimation across a wide range of ages in 203 participants of the Rhineland Study. Moreover, it also generalizes to scans of an independent dataset never encountered during training, the Human Connectome Project (HCP), with different acquisition parameters and demographics, evaluated in 30 cases at the native 0.7mm HCP resolution, and the default 0.8mm pipeline resolution. We extensively validated our pipeline not only with respect to segmentation accuracy but also to known OB volume effects, where it can sensitively replicate age effects.


Intelligence as information processing: brains, swarms, and computers

arXiv.org Artificial Intelligence

There is no agreed definition of intelligence, so it is problematic to simply ask whether brains, swarms, computers, or other systems are intelligent or not. To compare the potential intelligence exhibited by different cognitive systems, I use the common approach used by artificial intelligence and artificial life: Instead of studying the substrate of systems, let us focus on their organization. This organization can be measured with information. Thus, I apply an informationist epistemology to describe cognitive systems, including brains and computers. This allows me to frame the usefulness and limitations of the brain-computer analogy in different contexts. I also use this perspective to discuss the evolution and ecology of intelligence.


On the Hyperparameters in Stochastic Gradient Descent with Momentum

arXiv.org Machine Learning

Following the same routine as [SSJ20], we continue to present the theoretical analysis for stochastic gradient descent with momentum (SGD with momentum) in this paper. Differently, for SGD with momentum, we demonstrate it is the two hyperparameters together, the learning rate and the momentum coefficient, that play the significant role for the linear rate of convergence in non-convex optimization. Our analysis is based on the use of a hyperparameters-dependent stochastic differential equation (hp-dependent SDE) that serves as a continuous surrogate for SGD with momentum. Similarly, we establish the linear convergence for the continuous-time formulation of SGD with momentum and obtain an explicit expression for the optimal linear rate by analyzing the spectrum of the Kramers-Fokker-Planck operator. By comparison, we demonstrate how the optimal linear rate of convergence and the final gap for SGD only about the learning rate varies with the momentum coefficient increasing from zero to one when the momentum is introduced. Then, we propose a mathematical interpretation why the SGD with momentum converges faster and more robust about the learning rate than the standard SGD in practice. Finally, we show the Nesterov momentum under the existence of noise has no essential difference with the standard momentum.