
Collaborating Authors: Mori, Greg


Attention as an RNN

arXiv.org Artificial Intelligence

The advent of Transformers marked a significant breakthrough in sequence modelling, providing a highly performant architecture capable of leveraging GPU parallelism. However, Transformers are computationally expensive at inference time, limiting their applications, particularly in low-resource settings (e.g., mobile and embedded devices). Addressing this, we (1) begin by showing that attention can be viewed as a special Recurrent Neural Network (RNN) with the ability to compute its many-to-one RNN output efficiently. We then (2) show that popular attention-based models such as Transformers can be viewed as RNN variants. However, unlike traditional RNNs (e.g., LSTMs), these models cannot be updated efficiently with new tokens, an important property in sequence modelling. Tackling this, we (3) introduce a new efficient method of computing attention's many-to-many RNN output based on the parallel prefix scan algorithm. Building on the new attention formulation, we (4) introduce Aaren, an attention-based module that can not only (i) be trained in parallel (like Transformers) but also (ii) be updated efficiently with new tokens, requiring only constant memory for inference (like traditional RNNs). Empirically, we show that Aarens achieve performance comparable to Transformers on 38 datasets spread across four popular sequential problem settings: reinforcement learning, event forecasting, time series classification, and time series forecasting, while being more time- and memory-efficient.
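As a concrete illustration of the many-to-one view, the sketch below computes attention for a single query as a left-to-right recurrence over key/value pairs, carrying only a running maximum, a running weighted sum, and a running normalizer (a standard numerically stable online softmax). This is a minimal reading of the idea, not the paper's Aaren module; all names are illustrative.

    # Attention over a token stream with constant memory: a hedged sketch.
    import numpy as np

    def attention_as_rnn(q, keys, values):
        """Return softmax(q K^T) V, computed one (key, value) pair at a time."""
        m = -np.inf                                       # running max of scores (numerical stability)
        num = np.zeros_like(values[0], dtype=float)       # running weighted sum of values
        den = 0.0                                         # running normalizer
        for k, v in zip(keys, values):
            s = float(q @ k)                              # attention score for this token
            m_new = max(m, s)
            scale = np.exp(m - m_new) if np.isfinite(m) else 0.0
            num = num * scale + np.exp(s - m_new) * v
            den = den * scale + np.exp(s - m_new)
            m = m_new
        return num / den

    # Sanity check against the standard batched formula.
    rng = np.random.default_rng(0)
    q = rng.normal(size=8)
    K = rng.normal(size=(16, 8))
    V = rng.normal(size=(16, 4))
    w = np.exp(q @ K.T - np.max(q @ K.T)); w /= w.sum()
    assert np.allclose(attention_as_rnn(q, K, V), w @ V)

These per-token statistics combine associatively, which is what makes a parallel prefix scan applicable for producing the outputs at every prefix rather than only the final one.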


Pretext Training Algorithms for Event Sequence Data

arXiv.org Artificial Intelligence

Pretext training followed by task-specific fine-tuning has been a successful approach in vision and language domains. This paper proposes a self-supervised pretext training framework tailored to event sequence data. We introduce a novel alignment verification task that is specialized to event sequences, building on good practices in masked reconstruction and contrastive learning. Our pretext tasks unlock foundational representations that are generalizable across different downstream tasks, including next-event prediction for temporal point process models, event sequence classification, and missing event interpolation. Experiments on popular public benchmarks demonstrate the potential of the proposed method across different tasks and data domains.
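For a flavour of what a pretext example for event sequences can look like, here is a generic, hypothetical sketch of an alignment-style pair construction: a positive example keeps event times and marks aligned, a negative example permutes the marks, and a classifier learns to tell them apart. This only illustrates the general idea; the paper's alignment verification task, masking scheme, and contrastive objectives are more specific.

    # Hypothetical construction of alignment-verification training pairs.
    import numpy as np

    def alignment_pretext_pair(times, marks, rng):
        """Return (times, marks, label) tuples: label 1 = correctly aligned, 0 = shuffled."""
        positive = (times, marks, 1)
        negative = (times, rng.permutation(marks), 0)   # misaligned marks
        return positive, negative

    rng = np.random.default_rng(0)
    times = np.cumsum(rng.exponential(1.0, size=10))    # increasing event times
    marks = rng.integers(0, 4, size=10)                 # categorical event types
    pos, neg = alignment_pretext_pair(times, marks, rng)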


OPSurv: Orthogonal Polynomials Quadrature Algorithm for Survival Analysis

arXiv.org Artificial Intelligence

This paper introduces the Orthogonal Polynomials Quadrature Algorithm for Survival Analysis (OPSurv), a new method providing time-continuous functional outputs for both single and competing risks scenarios in survival analysis. OPSurv utilizes the initial zero condition of the Cumulative Incidence function and a unique decomposition of probability densities using orthogonal polynomials, allowing it to learn functional approximation coefficients for each risk event and construct Cumulative Incidence Function estimates via Gauss-Legendre quadrature. This approach effectively counters overfitting, particularly in competing risks scenarios, enhancing model expressiveness and control. The paper further details empirical validations and theoretical justifications of OPSurv, highlighting its robust performance as an advancement in survival analysis with competing risks.
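To make the quadrature step concrete, the numpy sketch below assumes the learned per-risk density is represented by a Legendre expansion on [0, t_max] and builds the CIF estimate as its integral from 0 to t with Gauss-Legendre quadrature, so CIF(0) = 0 holds by construction. The coefficients and the clipping are illustrative assumptions, not the paper's exact parameterization.

    # Legendre-series density and CIF via Gauss-Legendre quadrature: a hedged sketch.
    import numpy as np
    from numpy.polynomial import legendre

    def density(t, coeffs, t_max):
        """Evaluate the Legendre-series density after mapping [0, t_max] to [-1, 1]."""
        x = 2.0 * np.asarray(t) / t_max - 1.0
        return np.clip(legendre.legval(x, coeffs), 0.0, None)   # clip tiny negative values

    def cif(t, coeffs, t_max, n_nodes=32):
        """Cumulative incidence CIF(t) = integral_0^t density(s) ds via Gauss-Legendre."""
        nodes, weights = legendre.leggauss(n_nodes)              # quadrature nodes on [-1, 1]
        s = 0.5 * t * (nodes + 1.0)                              # map nodes to [0, t]
        return 0.5 * t * np.sum(weights * density(s, coeffs, t_max))

    coeffs = np.array([0.5, 0.1, -0.05])     # illustrative, made-up coefficients
    print(cif(0.0, coeffs, t_max=10.0))      # 0.0 by construction
    print(cif(5.0, coeffs, t_max=10.0))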


Continuous-time Particle Filtering for Latent Stochastic Differential Equations

arXiv.org Artificial Intelligence

Particle filtering is a standard Monte Carlo approach for a wide range of sequential inference tasks. The key component of a particle filter is a set of particles with importance weights that serve as a proxy for the true posterior distribution of some stochastic process. In this work, we propose continuous latent particle filters, an approach that extends particle filtering to the continuous-time domain. We demonstrate how continuous latent particle filters can be used as a generic plug-in replacement for inference techniques relying on a learned variational posterior. Our experiments with different model families based on latent neural stochastic differential equations demonstrate superior performance of continuous-time particle filtering in inference tasks like likelihood estimation and sequential prediction for a variety of stochastic processes.
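The sketch below is a minimal bootstrap particle filter in this continuous-time setting: particles are propagated between irregular observation times with Euler-Maruyama steps of a latent SDE, reweighted by an observation likelihood, and resampled. The drift, diffusion, and Gaussian observation model are toy stand-ins for the learned neural components used with latent neural SDEs.

    # Bootstrap particle filtering for a latent SDE: a hedged, toy sketch.
    import numpy as np

    rng = np.random.default_rng(0)
    f = lambda z: -z            # drift (illustrative pull towards 0)
    g = lambda z: 0.5           # diffusion (illustrative constant)
    obs_std = 0.3               # assumed Gaussian observation noise

    def propagate(z, t0, t1, dt=0.01):
        """Euler-Maruyama simulation of each particle from t0 to t1."""
        t = t0
        while t < t1:
            h = min(dt, t1 - t)
            z = z + f(z) * h + g(z) * np.sqrt(h) * rng.normal(size=z.shape)
            t += h
        return z

    def filter_step(particles, logw, t0, t1, y):
        particles = propagate(particles, t0, t1)
        logw = logw - 0.5 * ((y - particles) / obs_std) ** 2        # Gaussian log-likelihood (up to a constant)
        w = np.exp(logw - logw.max()); w /= w.sum()
        idx = rng.choice(len(particles), size=len(particles), p=w)  # multinomial resampling
        return particles[idx], np.zeros(len(particles))

    particles, logw = rng.normal(size=1000), np.zeros(1000)
    for t0, t1, y in [(0.0, 0.3, 0.1), (0.3, 1.0, -0.2)]:           # irregular observation times
        particles, logw = filter_step(particles, logw, t0, t1, y)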


Continuous Latent Process Flows

arXiv.org Machine Learning

Partial observations of continuous time-series dynamics at arbitrary time stamps arise in many disciplines. Fitting this type of data using statistical models with continuous dynamics is not only promising at an intuitive level but also has practical benefits, including the ability to generate continuous trajectories and to perform inference on previously unseen time stamps. Despite exciting progress in this area, the existing models still face challenges in terms of their representational power and the quality of their variational approximations. We tackle these challenges with continuous latent process flows (CLPF), a principled architecture decoding continuous latent processes into continuous observable processes using a time-dependent normalizing flow driven by a stochastic differential equation. To optimize our model using maximum likelihood, we propose a novel piecewise construction of a variational posterior process and derive the corresponding variational lower bound using trajectory re-weighting. Our ablation studies demonstrate the effectiveness of our contributions in various inference tasks on irregular time grids. Comparisons to state-of-the-art baselines show our model's favourable performance on both synthetic and real-world time-series data.
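As a schematic of the decoding idea, the sketch below simulates a one-dimensional latent SDE path and, at each requested time stamp, pushes a base sample through an invertible map whose scale and shift depend on the latent state and the time. A single conditional affine transform stands in for the full time-dependent normalizing flow; the toy drift, diffusion, and conditioner are assumptions made purely for illustration.

    # Latent SDE path decoded by a time-dependent invertible map: a hedged sketch.
    import numpy as np

    rng = np.random.default_rng(0)

    def simulate_latent(ts, dt=0.01):
        """Euler-Maruyama path of dz = -z dt + 0.5 dW, evaluated at irregular times ts."""
        z, t, out = 0.0, 0.0, []
        for t_target in ts:
            while t < t_target:
                h = min(dt, t_target - t)
                z += -z * h + 0.5 * np.sqrt(h) * rng.normal()
                t += h
            out.append(z)
        return np.array(out)

    def affine_params(z, t):
        """Toy conditioner: log-scale and shift of the flow as functions of (z, t)."""
        return np.tanh(z) + 0.1 * t, 2.0 * z

    def decode(z, t, eps):
        """Push a base sample through the time-dependent affine transform."""
        log_scale, shift = affine_params(z, t)
        return np.exp(log_scale) * eps + shift

    def log_prob(x, z, t):
        """Observation log-density via the change-of-variables formula."""
        log_scale, shift = affine_params(z, t)
        eps = (x - shift) * np.exp(-log_scale)
        return -0.5 * (eps ** 2 + np.log(2 * np.pi)) - log_scale

    ts = np.array([0.1, 0.35, 1.2])                      # irregular observation times
    zs = simulate_latent(ts)
    xs = decode(zs, ts, rng.normal(size=ts.shape))       # generate continuous-time observations
    print(log_prob(xs, zs, ts))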


Variational Selective Autoencoder: Learning from Partially-Observed Heterogeneous Data

arXiv.org Machine Learning

Learning from heterogeneous data poses challenges such as combining data from various sources and of different types. Meanwhile, heterogeneous data are often associated with missingness in real-world applications due to the heterogeneity and noise of the input sources. In this work, we propose the variational selective autoencoder (VSAE), a general framework to learn representations from partially-observed heterogeneous data. VSAE learns the latent dependencies in heterogeneous data by modeling the joint distribution of observed data, unobserved data, and the imputation mask, which represents how the data are missing. This results in a unified model for various downstream tasks, including data generation and imputation. Evaluation on both low-dimensional and high-dimensional heterogeneous datasets for these two tasks shows improvement over state-of-the-art models.
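A compact, hedged sketch of the kind of objective described here: a VAE whose decoder reconstructs both the data and the missingness mask, with only observed entries contributing to the data reconstruction term. The single Gaussian encoder and decoder below simplify away VSAE's selective, per-attribute design; all module names and sizes are assumptions.

    # Mask-aware VAE objective: a simplified illustration, not VSAE itself.
    import torch
    import torch.nn as nn

    class MaskAwareVAE(nn.Module):
        def __init__(self, d_in, d_lat=16):
            super().__init__()
            self.enc = nn.Linear(d_in * 2, d_lat * 2)    # sees data (zeros at missing entries) plus the mask
            self.dec_x = nn.Linear(d_lat, d_in)          # data reconstruction
            self.dec_m = nn.Linear(d_lat, d_in)          # missingness-mask logits

        def forward(self, x, mask):
            x_in = torch.cat([x * mask, mask], dim=-1)
            mu, logvar = self.enc(x_in).chunk(2, dim=-1)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
            recon = ((self.dec_x(z) - x) ** 2 * mask).sum(-1)                   # observed entries only
            mask_nll = nn.functional.binary_cross_entropy_with_logits(
                self.dec_m(z), mask, reduction="none").sum(-1)                  # model the mask itself
            kl = 0.5 * (mu ** 2 + logvar.exp() - 1 - logvar).sum(-1)
            return (recon + mask_nll + kl).mean()

    x = torch.randn(32, 8)
    mask = (torch.rand(32, 8) > 0.3).float()             # 1 = observed, 0 = missing
    loss = MaskAwareVAE(8)(x, mask)
    loss.backward()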


Neural fidelity warping for efficient robot morphology design

arXiv.org Artificial Intelligence

We consider the problem of optimizing a robot morphology to achieve the best performance for a target task, under computational resource limitations. The evaluation process for each morphological design involves learning a controller for the design, which can consume substantial time and computational resources. To address the challenge of expensive robot morphology evaluation, we present a continuous multi-fidelity Bayesian Optimization framework that efficiently utilizes computational resources via low-fidelity evaluations. We identify the problem of non-stationarity over the fidelity space. Our proposed fidelity warping mechanism can learn representations of learning epochs and tasks to model non-stationary covariances between continuous fidelity evaluations, which prove challenging for off-the-shelf stationary kernels. Various experiments demonstrate that our method can utilize the low-fidelity evaluations to efficiently search for the optimal robot morphology, outperforming state-of-the-art methods.
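The sketch below illustrates the warping idea in isolation: a stationary RBF kernel is applied to warped fidelity coordinates, so correlations can vary non-stationarily along the fidelity axis (e.g., controller training epochs). The fixed log warp and the one-dimensional design space are toy assumptions; in the paper the warp is a learned neural representation.

    # Warped-fidelity kernel for a multi-fidelity GP surrogate: a hedged sketch.
    import numpy as np

    def rbf(a, b, lengthscale=1.0):
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / lengthscale) ** 2)

    def warp(s):
        """Toy monotone warp of fidelity (epochs): compresses the early, noisy region."""
        return np.log1p(s)

    def multifidelity_kernel(x1, s1, x2, s2):
        """Separable kernel over (design x, fidelity s) with warped fidelity inputs."""
        return rbf(x1, x2, lengthscale=0.5) * rbf(warp(s1), warp(s2), lengthscale=1.0)

    x = np.array([0.1, 0.4, 0.9])        # toy 1-D morphology design parameters
    s = np.array([1.0, 10.0, 100.0])     # continuous fidelities, e.g., controller training epochs
    K = multifidelity_kernel(x, s, x, s) # Gram matrix used inside the GP surrogate
    print(np.round(K, 3))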


Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization

Neural Information Processing Systems

We propose a new weakly-supervised structured learning approach for recognition and spatio-temporal localization of actions in video. As part of the proposed approach we develop a generalization of the Max-Path search algorithm, which allows us to efficiently search over a structured space of multiple spatio-temporal paths, while also allowing us to incorporate context information into the model. Instead of using spatial annotations, in the form of bounding boxes, to guide the latent model during training, we utilize human gaze data in the form of a weak supervisory signal. This is achieved by incorporating gaze, along with the classification, into the structured loss within the latent SVM learning framework. Experiments on a challenging benchmark dataset, UCF-Sports, show that our model is more accurate, in terms of classification, and achieves state-of-the-art results in localization.


Learning a discriminative hidden part model for human action recognition

Neural Information Processing Systems

We present a discriminative part-based approach for human action recognition from video sequences using motion features. Our model is based on the recently proposed hidden conditional random field (hCRF) for object recognition. Similar to hCRF for object recognition, we model a human action by a flexible constellation of parts conditioned on image observations. Different from object recognition, our model combines both large-scale global features and local patch features to distinguish various actions. Our experimental results show that our model is comparable to other state-of-the-art approaches in action recognition.


Point Process Flows

arXiv.org Machine Learning

Event sequences can be modeled by temporal point processes (TPPs) to capture their asynchronous and probabilistic nature. We contribute an intensity-free framework that directly models the point process as a non-parametric distribution by utilizing normalizing flows. This approach is capable of capturing highly complex temporal distributions and does not rely on restrictive parametric forms. Comparisons with state-of-the-art baseline models on both synthetic and challenging real-life datasets show that the proposed framework is effective at modeling the stochasticity of discrete event sequences.
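For a minimal, hedged instance of the intensity-free idea, the sketch below models positive inter-event gaps by pushing a Gaussian base distribution through invertible transforms and fits the parameters by exact maximum likelihood. A single affine-plus-exp transform (a log-normal) is the simplest such flow and ignores history conditioning; the flows used in the paper are considerably more expressive.

    # Inter-event gaps modeled by a (trivially simple) normalizing flow: a hedged sketch.
    import torch
    import torch.distributions as D

    loc = torch.zeros(1, requires_grad=True)
    log_scale = torch.zeros(1, requires_grad=True)

    def gap_distribution():
        base = D.Normal(torch.zeros(1), torch.ones(1))
        transforms = [D.transforms.AffineTransform(loc, log_scale.exp()),
                      D.transforms.ExpTransform()]       # maps to positive inter-event times
        return D.TransformedDistribution(base, transforms)

    # Toy training step: maximize the exact likelihood of observed inter-event gaps.
    gaps = torch.tensor([[0.2], [0.5], [1.3], [0.7]])
    opt = torch.optim.Adam([loc, log_scale], lr=0.05)
    for _ in range(100):
        opt.zero_grad()
        nll = -gap_distribution().log_prob(gaps).mean()
        nll.backward()
        opt.step()

    next_gap = gap_distribution().sample((1,))           # simulate the next arrival gap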