Expanded Comprehensive Robotic Cholecystectomy Dataset (CRCD)

Oh, Ki-Hwan, Borgioli, Leonardo, Mangano, Alberto, Valle, Valentina, Di Pangrazio, Marco, Toti, Francesco, Pozza, Gioia, Ambrosini, Luciano, Ducas, Alvaro, Žefran, Miloš, Chen, Liaohai, Giulianotti, Pier Cristoforo

arXiv.org Artificial Intelligence

In recent years, the application of machine learning to minimally invasive surgery (MIS) has attracted considerable interest. Datasets are critical to the use of such techniques. This paper presents a unique dataset recorded during ex vivo pseudo-cholecystectomy procedures on pig livers using the da Vinci Research Kit (dVRK). Unlike existing datasets, it addresses a critical gap by providing comprehensive kinematic data, recordings of all pedal inputs, and a time-stamped record of the endoscope's movements. This expanded version also includes segmentation and keypoint annotations of images, enhancing its utility for computer vision applications. Contributed by seven surgeons with varied backgrounds and experience levels, whose profiles are provided as part of this expanded version, the dataset is an important new resource for surgical robotics research. It enables the development of advanced methods for evaluating surgeon skills, tools for providing better context awareness, and automation of surgical tasks. Our work overcomes the limitations of incomplete recordings and imprecise kinematic data found in other datasets. To demonstrate the dataset's potential for advancing automation in surgical robotics, we introduce two models that predict clutch usage and camera activation, a 3D scene reconstruction example, and results from our keypoint and segmentation models.


Elon Musk's Tesla Cybercab is a hollow promise of a robotaxi future

New Scientist

At a glitzy event held at Warner Bros. Studios Burbank in California, Tesla CEO Elon Musk unveiled the Cybercab: a robotic, self-driving taxi. Musk said that the vehicle, which has two seats, no steering wheel and no pedals, would be available before 2027. "I think it's going to be a glorious future," he told the crowd on 10 October. Meanwhile, just a few kilometres south in Los Angeles, people are already being ferried about by autonomous vehicles operated by Waymo.


Inside Tesla's futuristic Robotaxi: Elon Musk's long-awaited driverless cab features NO steering wheel or pedals - and could cost less than $30,000

Daily Mail - Science & tech

You could easily mistake it for a prop from the latest science-fiction blockbuster. But after years of promises, Elon Musk now says Tesla's driverless Robotaxi will soon be a reality. The futuristic autonomous car has no steering wheel, pedals, or rear window and has just enough room for two passengers. Launched at Tesla's 'We, Robot' event last night, the all-electric vehicle will cost less than $30,000 (£23,000) and only 20 cents (15p) per mile to run. Even better, tech fans may not have to wait long to see it take to the streets, as the billionaire SpaceX founder claims the Robotaxi will be available before the end of 2027.


Is THIS what Tesla's Robotaxi will look like? Elon Musk's long-awaited driverless vehicle could feature NO steering wheel or pedals - and take passengers on Uber-style trips for 'less money than a bus ticket'

Daily Mail - Science & tech

After years of teases, Tesla is finally about to pull back the curtain on one of its quirkiest products yet. Tesla CEO Elon Musk will unveil the 'Robotaxi' at an event in Los Angeles on Thursday (October 10). Also referred to as the 'Cybercab', the taxi is expected to be fully driverless – with no steering wheel or pedals – and offer a new Tesla-operated ride-hailing service. Ahead of its official unveiling, ChatGPT has provided a glimpse at what Tesla's Robotaxi could look like. The chatbot's artistic impression features two seats for passengers, a silver steel body and a camera on the roof for sensing its surroundings.


Urban context and delivery performance: Modelling service time for cargo bikes and vans across diverse urban environments

Schrader, Maxwell, Kumar, Navish, Sørig, Esben, Yoon, Soonmyeong, Srivastava, Akash, Xu, Kai, Astefanoaei, Maria, Collignon, Nicolas

arXiv.org Artificial Intelligence

Light goods vehicles (LGVs), used extensively in the last mile of delivery, are among the leading polluters in cities. Cargo-bike logistics and light electric vehicles (LEVs) have been put forward as high-impact candidates for replacing LGVs. Studies have estimated that over half of urban van deliveries could be replaced by cargo bikes, owing to their faster speeds, shorter parking times and more efficient routes across cities. However, the logistics sector suffers from a lack of publicly available data, particularly pertaining to cargo-bike deliveries, limiting the understanding of their potential benefits. In particular, service time (which includes cruising for parking and walking to the destination) is a major but often overlooked component of delivery-time modelling. The aim of this study is to establish a framework for measuring the performance of delivery vehicles, with an initial focus on modelling service times of vans and cargo bikes across diverse urban environments. We introduce two datasets that allow in-depth analysis and modelling of cargo-bike service times, and we use existing datasets to reason about differences in delivery performance across vehicle types. We introduce a modelling framework that predicts delivery service times from urban context. We employ Uber's H3 index to divide cities into hexagonal cells and aggregate OpenStreetMap tags for each cell, providing a detailed assessment of urban context. Leveraging this spatial grid, we use GeoVex to represent micro-regions as points in a continuous vector space, which then serve as input for predicting vehicle service times. We show that geospatial embeddings can effectively capture urban contexts and facilitate generalization to new contexts and cities. Our methodology addresses the challenge of limited comparative data for different vehicle types within the same urban settings.
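The core aggregation step the abstract describes can be illustrated with a minimal sketch: assign each delivery to a spatial cell, then summarize service times per cell and vehicle type. The paper uses Uber's H3 hexagonal index; here a simple rounded-coordinate square grid stands in for it, and the delivery records are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical delivery records: (lat, lon, vehicle, service_time_minutes).
deliveries = [
    (51.5074, -0.1278, "cargo_bike", 3.2),
    (51.5075, -0.1280, "van", 7.8),
    (51.5200, -0.1000, "cargo_bike", 2.9),
    (51.5201, -0.1002, "van", 9.1),
]

def cell_id(lat, lon, res=2):
    """Assign a point to a coarse grid cell (stand-in for an H3 index)."""
    return (round(lat, res), round(lon, res))

# Aggregate mean service time per (cell, vehicle type) pair.
by_cell = defaultdict(list)
for lat, lon, vehicle, t in deliveries:
    by_cell[(cell_id(lat, lon), vehicle)].append(t)

cell_means = {key: mean(times) for key, times in by_cell.items()}
```

In the paper, each cell would additionally carry aggregated OpenStreetMap tags as context features, and a learned embedding (GeoVex) of those features would feed the service-time predictor.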


PEDAL: Enhancing Greedy Decoding with Large Language Models using Diverse Exemplars

Prabhu, Sumanth

arXiv.org Artificial Intelligence

Self-ensembling techniques with diverse reasoning paths, such as Self-Consistency, have demonstrated remarkable performance gains in text generation with Large Language Models (LLMs). However, such techniques depend on the availability of an accurate answer-extraction process to aggregate across multiple outputs. Moreover, they incur higher inference cost than Greedy Decoding because they generate a relatively larger number of output tokens. Research has shown that the free-form text outputs from Self-Consistency can be aggregated reliably using LLMs to produce the final output. Additionally, recent advancements in LLM inference have demonstrated that using diverse exemplars in prompts can induce diversity in LLM outputs. Such proven techniques can be easily extended to self-ensembling approaches to achieve enhanced results in text generation. In this paper, we introduce PEDAL (Prompts based on Exemplar Diversity Aggregated using LLMs), a hybrid self-ensembling approach that combines the strengths of diverse exemplar-based prompts and LLM-based aggregation to improve overall performance. On the publicly available SVAMP and ARC datasets, our experiments reveal that PEDAL achieves better accuracy than Greedy-Decoding-based strategies at lower inference cost than Self-Consistency-based approaches.
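The two stages the abstract describes can be sketched with a mock LLM: (1) run greedy decoding several times, each prompt built from a different random subset of exemplars to induce output diversity; (2) aggregate the free-form candidates with an LLM call. Everything here is hypothetical scaffolding: `mock_llm_greedy` is a toy stand-in for a real model, and a majority vote stands in for the LLM-based aggregator.

```python
import random
from collections import Counter

EXEMPLARS = ["ex_a", "ex_b", "ex_c", "ex_d", "ex_e"]  # illustrative few-shot pool

def mock_llm_greedy(prompt):
    """Hypothetical stand-in for a greedy-decoded LLM call."""
    # Deterministic toy behavior: the answer depends weakly on the prompt,
    # mimicking how exemplar choice can shift a model's output.
    return "42" if "ex_a" in prompt else "41"

def pedal(question, n_prompts=3, k_exemplars=2, seed=0):
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_prompts):
        shots = rng.sample(EXEMPLARS, k_exemplars)  # diverse exemplars per prompt
        prompt = " ".join(shots) + " Q: " + question
        candidates.append(mock_llm_greedy(prompt))
    # LLM-based aggregation step, approximated here by a majority vote.
    return Counter(candidates).most_common(1)[0][0]

answer = pedal("What is 6*7?")
```

The cost argument follows directly: each of the `n_prompts` calls is a single greedy decode, so the token budget stays close to Greedy Decoding, while Self-Consistency samples many long reasoning paths per question.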


VR Isle Academy: A VR Digital Twin Approach for Robotic Surgical Skill Development

Filippidis, Achilleas, Marmaras, Nikolaos, Maravgakis, Michael, Plexousaki, Alexandra, Kamarianakis, Manos, Papagiannakis, George

arXiv.org Artificial Intelligence

Contemporary progress in the field of robotics, marked by improved efficiency and stability, has paved the way for the global adoption of surgical robotic systems (SRS). While these systems enhance surgeons' skills by offering a more accurate and less invasive approach to operations, they come at a considerable cost. Moreover, SRS components often involve heavy machinery, making the training process challenging due to limited access to such equipment. In this paper we introduce a cost-effective way to facilitate training for a simulator of an SRS via a portable, device-agnostic, ultra-realistic simulation with hand- and feet-tracking support. Error assessment is accessible both in real time and offline, enabling the monitoring and tracking of users' performance. The VR application has been objectively evaluated by several untrained testers, showing a significant reduction in error metrics as the number of training sessions increases. This indicates that the proposed VR application, denoted VR Isle Academy, operates efficiently, improving testers' robot-controlling skills in an intuitive and immersive way and reducing the learning curve at minimal cost.


Scoring Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription

Yan, Yujia, Duan, Zhiyao

arXiv.org Artificial Intelligence

The neural semi-Markov Conditional Random Field (semi-CRF) framework has demonstrated promise for event-based piano transcription. In this framework, all events (notes or pedals) are represented as closed intervals tied to specific event types. The neural semi-CRF approach requires an interval scoring matrix that assigns a score to every candidate interval. However, designing an efficient and expressive architecture for scoring intervals is not trivial. In this paper, we introduce a simple method for scoring intervals using scaled inner-product operations that resemble how attention scoring is done in transformers. We show theoretically that, due to the special structure arising from encoding the non-overlapping intervals, under a mild condition the inner-product operations are expressive enough to represent an ideal scoring matrix that can yield the correct transcription result. We then demonstrate that an encoder-only non-hierarchical transformer backbone, operating only on a low-time-resolution feature map, is capable of transcribing piano notes and pedals with high accuracy and time precision. Experiments show that our approach achieves new state-of-the-art performance across all subtasks in terms of the F1 measure on the Maestro dataset.
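The scaled inner-product scoring the abstract describes can be illustrated in a few lines: produce a "start" embedding s_i and an "end" embedding e_j per frame, score candidate interval [i, j] as ⟨s_i, e_j⟩/√d (the same form as transformer attention logits), and mask out degenerate intervals with j < i. Shapes and the random embeddings below are illustrative only, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 8                      # frames, embedding dimension
S = rng.normal(size=(T, d))      # start embeddings, one per frame
E = rng.normal(size=(T, d))      # end embeddings, one per frame

# scores[i, j] scores the candidate interval [i, j], attention-style.
scores = (S @ E.T) / np.sqrt(d)

# Only intervals with i <= j are valid candidates; mask the rest.
valid = np.triu(np.ones((T, T), dtype=bool))
scores = np.where(valid, scores, -np.inf)
```

The appeal of this design is that the full T×T scoring matrix comes from one matrix product over a low-time-resolution feature map, rather than a separate network evaluation per candidate interval.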


Tesla Recalls Cybertruck Over Trapped Pedals--Its Worst Flaw Yet

WIRED

Tesla's Cybertruck has been widely derided. Its panel gaps are wide and amateurish, it's prone to rust, and it looks like an ergonomic cheese grater. Its most serious flaw to date, though, has resulted in a recall of nearly 4,000 vehicles. The US National Highway Traffic Safety Administration has recalled 3,878 Cybertrucks, comprising any that were manufactured between November 13 of last year and April 4. At issue is the accelerator pedal: Its pad can become dislodged, resulting in the pedal becoming trapped in the trim above it. This is, needless to say, quite bad.


Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity

Varley, Jake, Singh, Sumeet, Jain, Deepali, Choromanski, Krzysztof, Zeng, Andy, Chowdhury, Somnath Basu Roy, Dubey, Avinava, Sindhwani, Vikas

arXiv.org Artificial Intelligence

We present an embodied AI system that receives open-ended natural-language instructions from a human and controls two arms to collaboratively accomplish potentially long-horizon tasks over a large workspace. Our system is modular: it deploys state-of-the-art Large Language Models for task planning, Vision-Language Models for semantic perception, and point-cloud transformers for grasping. With semantic and physical safety in mind, these modules are interfaced with a real-time trajectory optimizer and a compliant tracking controller to enable human-robot proximity. We demonstrate performance on bi-arm sorting, bottle opening, and trash disposal tasks. These are done zero-shot: the models used have not been trained with any real-world data from this bi-arm robot, its scenes, or its workspace. Composing both learning- and non-learning-based components in a modular fashion with interpretable inputs and outputs allows the user to easily debug points of failure and fragility. One may also swap modules in place to improve the robustness of the overall platform, for instance with imitation-learned policies.
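The modularity claim can be made concrete with a minimal pipeline sketch: a language planner, a semantic perception module, and a grasp module composed through interpretable intermediate outputs, so any stage can be inspected or swapped in place. All functions below are hypothetical stubs, not the paper's models.

```python
def plan_tasks(instruction):
    """LLM task-planner stand-in: split an instruction into steps."""
    return [step.strip() for step in instruction.split(" and ")]

def perceive(step):
    """VLM perception stand-in: name the object a step refers to."""
    return {"object": step.split()[-1]}

def grasp(percept):
    """Grasp-module stand-in: return an interpretable action record."""
    return {"action": "grasp", "target": percept["object"]}

def run(instruction):
    # Each interface passes plain, inspectable data, which is what makes
    # debugging a failing stage (or swapping in a new module) easy.
    return [grasp(perceive(step)) for step in plan_tasks(instruction)]

actions = run("open the bottle and sort the trash")
```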