AITopics | osil

Collaborating Authors

osil

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

OSIL: Learning Offline Safe Imitation Policies with Safety Inferred from Non-preferred Trajectories

Burnwal, Returaj, Bhatt, Nirav Pravinbhai, Ravindran, Balaraman

arXiv.org Machine LearningFeb-12-2026

This work addresses the problem of offline safe imitation learning (IL), where the goal is to learn safe and reward-maximizing policies from demonstrations that do not have per-timestep safety cost or reward information. In many real-world domains, online learning in the environment can be risky, and specifying accurate safety costs can be difficult. However, it is often feasible to collect trajectories that reflect undesirable or unsafe behavior, implicitly conveying what the agent should avoid. We refer to these as non-preferred trajectories. We propose a novel offline safe IL algorithm, OSIL, that infers safety from non-preferred demonstrations. We formulate safe policy learning as a Constrained Markov Decision Process (CMDP). Instead of relying on explicit safety cost and reward annotations, OSIL reformulates the CMDP problem by deriving a lower bound on reward maximizing objective and learning a cost model that estimates the likelihood of non-preferred behavior. Our approach allows agents to learn safe and reward-maximizing behavior entirely from offline demonstrations. We empirically demonstrate that our approach can learn safer policies that satisfy cost constraints without degrading the reward performance, thus outperforming several baselines.

machine learning, reinforcement learning, trajectory, (17 more...)

arXiv.org Machine Learning

2602.11018

Country:

Asia > India > Tamil Nadu > Chennai (0.04)
Europe > Middle East > Cyprus > Pafos > Paphos (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Semi-Supervised One-Shot Imitation Learning

Wu, Philipp, Hakhamaneshi, Kourosh, Du, Yuqing, Mordatch, Igor, Rajeswaran, Aravind, Abbeel, Pieter

arXiv.org Artificial IntelligenceAug-9-2024

One-shot Imitation Learning~(OSIL) aims to imbue AI agents with the ability to learn a new task from a single demonstration. To supervise the learning, OSIL typically requires a prohibitively large number of paired expert demonstrations -- i.e. trajectories corresponding to different variations of the same semantic task. To overcome this limitation, we introduce the semi-supervised OSIL problem setting, where the learning agent is presented with a large dataset of trajectories with no task labels (i.e. an unpaired dataset), along with a small dataset of multiple demonstrations per semantic task (i.e. a paired dataset). This presents a more realistic and practical embodiment of few-shot learning and requires the agent to effectively leverage weak supervision from a large dataset of trajectories. Subsequently, we develop an algorithm specifically applicable to this semi-supervised OSIL setting. Our approach first learns an embedding space where different tasks cluster uniquely. We utilize this embedding space and the clustering it supports to self-generate pairings between trajectories in the large unpaired dataset. Through empirical results on simulated control tasks, we demonstrate that OSIL models trained on such self-generated pairings are competitive with OSIL models trained with ground-truth labels, presenting a major advancement in the label-efficiency of OSIL.

demonstration, learning, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2408.05285

Country:

North America > United States > California > Alameda County > Berkeley (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.50)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Add feedback

Characterization and Development of Average Silhouette Width Clustering

Batool, Fatima, Hennig, Christian

arXiv.org Machine LearningOct-24-2019

The purpose of this paper is to introduced a new clustering methodology. This paper is divided into three parts. In the first part we have developed the axiomatic theory for the average silhouette width (ASW) index. There are different ways to investigate the quality and characteristics of clustering methods such as validation indices using simulations and real data experiments, model-based theory, and non-model-based theory known as the axiomatic theory. In this work we have not only taken the empirical approach of validation of clustering results through simulations, but also focus on the development of the axiomatic theory. In the second part we have presented a novel clustering methodology based on the optimization of the ASW index. We have considered the problem of estimation of number of clusters and finding clustering against this number simultaneously. Two algorithms are proposed. The proposed algorithms are evaluated against several partitioning and hierarchical clustering methods. An intensive empirical comparison of the different distance metrics on the various clustering methods is conducted. In the third part we have considered two application domains\textemdash novel single cell RNA sequencing datasets and rainfall data to cluster weather stations.

algorithm, complexity, dimension, (14 more...)

arXiv.org Machine Learning

1910.11339

Country:

Europe > United Kingdom (0.28)
Europe > Austria > Vienna (0.14)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
(7 more...)

Genre: Research Report (0.63)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Clustering by Optimizing the Average Silhouette Width

Batool, Fatima, Hennig, Christian

arXiv.org Machine LearningOct-18-2019

In this paper, we propose a unified clustering approach that can estimate number of clusters and produce clustering against this number simultaneously. Average silhouette width (ASW) is a widely used standard cluster quality index. We define a distance based objective function that optimizes ASW for clustering. The proposed algorithm named as OSil, only, needs data observations as an input without any prior knowledge of the number of clusters. This work is about thorough investigation of the proposed methodology, its usefulness and limitations. A vast spectrum of clustering structures were generated, and several well-known clustering methods including partitioning, hierarchical, density based, and spatial methods were consider as the competitor of the proposed methodology. Simulation reveals that OSil algorithm has shown superior perform in terms of clustering quality than all clustering methods included in the study. OSil can find well separated, compact clusters and have shown better performance for the estimation of number of clusters than several methods. Apart from the proposal of the new methodology and it's investigation this papers offer a systematic analysis on the estimation of cluster indices, some of which never appeared together in comparative simulation setup before. The study offers many insightful findings useful for the selection of the clustering methods and indices.

dimension, estimation, osil, (16 more...)

arXiv.org Machine Learning

1910.08644

Country:

Europe > United Kingdom (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback