LIGHTHOUSE: Fast and precise distance to shoreline calculations from anywhere on earth

Beukema, Patrick, Herzog, Henry, Zhang, Yawen, Pitelka, Hunter, Bastani, Favyen

arXiv.org Artificial Intelligence

We introduce a new dataset and algorithm for fast and efficient coastal distance calculations from Anywhere on Earth (AoE). Existing global coastal datasets are only available at coarse resolution (e.g. 1-4 km), which limits their utility. Publicly available satellite imagery combined with computer vision enables much higher precision. We provide a global coastline dataset at 10 meter resolution, a more than 100-fold improvement in precision over existing data. To handle the computational challenge of querying at such an increased scale, we introduce a new library: Layered Iterative Geospatial Hierarchical Terrain-Oriented Unified Search Engine (Lighthouse). Lighthouse is both exceptionally fast and resource-efficient, requiring only 1 CPU and 2 GB of RAM to achieve millisecond online inference, making it well suited for real-time applications in resource-constrained environments.
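To make the query concrete, here is a minimal sketch of the naive version of a distance-to-shoreline lookup: brute-force great-circle distance to every shoreline point. A hierarchical index such as Lighthouse's would prune most candidates before this step; the function names and the point-list representation below are illustrative assumptions, not the library's API.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def distance_to_shoreline(query, shoreline_points):
    """Brute-force nearest-shoreline distance over a list of (lat, lon)
    shoreline samples. At 10 m resolution this list is enormous, which is
    why a layered spatial index is needed for millisecond queries."""
    return min(haversine_m(query[0], query[1], p[0], p[1])
               for p in shoreline_points)
```

The brute-force loop is linear in the number of shoreline samples; a coarse-to-fine search over gridded tiles turns it into a handful of cell lookups.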


Using machine learning for fault detection in lighthouse light sensors

Kampouridis, Michael, Vastardis, Nikolaos, Rayment, George

arXiv.org Artificial Intelligence

Lighthouses play a crucial role in ensuring maritime safety by signaling hazardous areas such as dangerous coastlines, shoals, reefs, and rocks, along with aiding harbor entries and aerial navigation. This is achieved through the use of photoresistor sensors that activate or deactivate based on the time of day. However, a significant issue is the potential malfunction of these sensors, leading to the gradual misalignment of the light's operational timing. This paper introduces an innovative machine learning-based approach for automatically detecting such malfunctions. We evaluate four distinct algorithms: decision trees, random forest, extreme gradient boosting, and multi-layer perceptron. Our findings indicate that the multi-layer perceptron is the most effective, capable of detecting timing discrepancies as small as 10-15 minutes. This accuracy makes it a highly efficient tool for automating the detection of faults in lighthouse light sensors.
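The paper trains an MLP on the sensor data; as a stdlib-only illustration of how the task is framed, the sketch below flags drift between observed switch-on times and expected (sunset-based) times. The function name, inputs, and decision rule are hypothetical simplifications, not the paper's model.

```python
def timing_fault(observed_on_minutes, expected_on_minutes, threshold_min=10):
    """Flag a sensor fault when the mean absolute deviation between the
    light's observed switch-on times and the expected sunset-based times
    reaches threshold_min minutes -- the 10-15 minute discrepancy range
    the paper reports its MLP can detect. Times are minutes past midnight."""
    devs = [abs(o - e) for o, e in zip(observed_on_minutes, expected_on_minutes)]
    return sum(devs) / len(devs) >= threshold_min
```

A learned classifier replaces the fixed threshold with a decision boundary fitted to labeled fault histories, which is what makes the small 10-15 minute discrepancies separable from normal variation.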


Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection

Nishimura, Taichi, Nakada, Shota, Munakata, Hokuto, Komatsu, Tatsuya

arXiv.org Artificial Intelligence

We propose Lighthouse, a user-friendly library for reproducible video moment retrieval and highlight detection (MR-HD). Although researchers have proposed various MR-HD approaches, the research community faces two main issues. The first is a lack of comprehensive and reproducible experiments across various methods, datasets, and video-text features, because no unified training and evaluation codebase covers multiple settings. The second is the user-unfriendly design of existing codebases: because previous works use different libraries, researchers must set up a separate environment for each, and most works release only the training code, requiring users to implement the whole MR-HD inference process themselves. Lighthouse addresses these issues by implementing a unified reproducible codebase that includes six models, three features, and five datasets. In addition, it provides an inference API and web demo to make these methods easily accessible for researchers and developers. Our experiments demonstrate that Lighthouse generally reproduces the reported scores in the reference papers. The code is available at https://github.com/line/lighthouse.


Learning to Follow Object-Centric Image Editing Instructions Faithfully

Chakrabarty, Tuhin, Singh, Kanishk, Saakyan, Arkadiy, Muresan, Smaranda

arXiv.org Artificial Intelligence

Natural language instructions are a powerful interface for editing the outputs of text-to-image diffusion models. However, several challenges need to be addressed: 1) underspecification (the need to model the implicit meaning of instructions), 2) grounding (the need to localize where the edit has to be performed), and 3) faithfulness (the need to preserve the elements of the image not affected by the edit instruction). Current approaches focusing on image editing with natural language instructions rely on automatically generated paired data, which, as shown in our investigation, is noisy and sometimes nonsensical, exacerbating the above issues. Building on recent advances in segmentation, Chain-of-Thought prompting, and visual question answering, we significantly improve the quality of the paired data. In addition, we enhance the supervision signal by highlighting parts of the image that need to be changed by the instruction. The model fine-tuned on the improved data is capable of performing fine-grained object-centric edits better than state-of-the-art baselines, mitigating the problems outlined above, as shown by automatic and human evaluations. Moreover, our model is capable of generalizing to domains unseen during training, such as visual metaphors.


Design, Implementation and Evaluation of an External Pose-Tracking System for Underwater Cameras

Winkel, Birger, Nakath, David, Woelk, Felix, Köser, Kevin

arXiv.org Artificial Intelligence

PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science

In order to advance underwater computer vision and robotics from lab environments and clear water, determining the camera pose is essential for many underwater robotic or photogrammetric applications, and known ground truth is mandatory to evaluate the performance of, e.g., simultaneous localization and mapping approaches in such extreme environments. As already argued in Nakath et al. (2022), underwater validation scenarios severely suffer from the lack of exact ground truth. GPS, for example, is suitable for determining the position of vehicles that have a direct line of sight to multiple satellites, but this method cannot be used underwater; instead, acoustic methods or optical markers can be used here (Kinsey et al. 2006). This paper presents the conception, calibration and implementation of an external reference system for determining the underwater camera pose in real-time. The approach, based on an HTC Vive tracking system in air, calculates the underwater camera pose by fusing the poses of two controllers tracked above the water surface of a tank; to reduce tracking and transformation errors, the two controllers are mounted on the upper end of the camera stick. It is shown that the mean deviation of this approach from the ground truth is 3 mm and 0.3°. Finally, the usability of the system for underwater applications is demonstrated.

Figure 1: Application in a deep sea AUV scenario, from left to right: the whole system with the external reference system, underwater camera, and artificial lights; an overview of the scene without external light sources; and a view through the underwater camera.
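The geometric core of the stick setup can be sketched simply: with two tracked controller positions on a rigid stick, the camera position follows by extrapolating along the stick axis by a known offset. This is a hedged simplification (positions only, no orientation fusion); the function name and offset parameter are illustrative, not the paper's implementation.

```python
import math

def camera_position(c1, c2, offset_m):
    """Estimate the submerged camera position from two controller positions
    c1 and c2 (meters, with c2 nearer the water) mounted on a rigid stick:
    extrapolate offset_m meters past c2 along the c1 -> c2 axis. Using two
    tracked points fixes the stick's direction and averages out part of
    the individual tracking error."""
    d = [b - a for a, b in zip(c1, c2)]          # stick axis c1 -> c2
    n = math.sqrt(sum(x * x for x in d))          # axis length
    return [b + offset_m * x / n for b, x in zip(c2, d)]
```

The full system additionally fuses the controllers' orientations and calibrated rigid-body transforms, which is what brings the deviation down to the millimeter range.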


Tell Me a Story! Narrative-Driven XAI with Large Language Models

Martens, David, Dams, Camille, Hinns, James, Vergouwen, Mark

arXiv.org Artificial Intelligence

In today's critical domains, the predominance of black-box machine learning models amplifies the demand for Explainable AI (XAI). The widely used SHAP values, while quantifying feature importance, are often too intricate and lack human-friendly explanations. Furthermore, counterfactual (CF) explanations present 'what ifs' but leave users grappling with the 'why'. To bridge this gap, we introduce XAIstories. Leveraging Large Language Models, XAIstories provide narratives that shed light on AI predictions: SHAPstories do so based on SHAP explanations to explain a prediction score, while CFstories do so for CF explanations to explain a decision. Our results are striking: over 90% of the surveyed general audience finds the narratives generated by SHAPstories convincing. Data scientists primarily see the value of SHAPstories in communicating explanations to a general audience, with 92% of data scientists indicating that it will contribute to the ease and confidence of nonspecialists in understanding AI predictions, and 83% indicating they are likely to use SHAPstories for this purpose. In image classification, CFstories are considered as convincing as, or more convincing than, users' own crafted stories by over 75% of lay participants. CFstories also bring a tenfold speed gain in creating a narrative and improve accuracy by over 20% compared to manually created narratives. The results thereby suggest that XAIstories may provide the missing link in truly explaining and understanding AI predictions.
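The SHAPstories idea, stripped to its mechanics, is a prompt built from precomputed SHAP values that asks an LLM for a plain-language narrative. The sketch below assumes the SHAP values are already computed (as a feature-to-contribution dict); the prompt wording and function name are illustrative assumptions, not the paper's actual prompt.

```python
def shap_story_prompt(prediction, shap_values, top_k=3):
    """Build an LLM prompt from precomputed SHAP values, keeping only the
    top_k features by absolute contribution so the narrative stays focused.
    shap_values: dict mapping feature name -> SHAP contribution."""
    top = sorted(shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_k]
    lines = [f"- {name}: SHAP contribution {val:+.2f}" for name, val in top]
    return (
        f"The model predicted {prediction}. The top feature contributions were:\n"
        + "\n".join(lines)
        + "\nWrite a short narrative explaining this prediction to a non-expert."
    )
```

Filtering to the top contributions before prompting is what keeps the generated story short and faithful to the explanation rather than to the model's full feature set.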


Lighthouses and Global Graph Stabilization: Active SLAM for Low-compute, Narrow-FoV Robots

Deshpande, Mohit, Kim, Richard, Kumar, Dhruva, Park, Jong Jin, Zamiska, Jim

arXiv.org Artificial Intelligence

Autonomous exploration to build a map of an unknown environment is a fundamental robotics problem. However, the quality of the map directly influences the quality of subsequent robot operation. Instability in a simultaneous localization and mapping (SLAM) system can lead to poor-quality maps and subsequent navigation failures during or after exploration. This becomes particularly noticeable in consumer robotics, where limited compute budgets and narrow fields of view are very common. In this work, we propose (i) the concept of lighthouses: panoramic views with high visual information content that can be used to maintain the stability of the map locally in their neighborhoods, and (ii) a final stabilization strategy for the global pose graph. We call our novel exploration strategy SLAM-aware exploration (SAE) and evaluate its performance on real-world home environments.
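The lighthouse-selection step can be pictured as a greedy filter over candidate panoramic frames: keep frames whose visual-information score is high, spaced far enough apart that their stabilizing neighborhoods do not overlap. The scoring and spacing criterion below are illustrative assumptions, not the paper's actual selection rule.

```python
def select_lighthouses(frame_scores, min_score, min_gap):
    """Greedy 'lighthouse' selection: frame_scores[i] is a visual-information
    score for panoramic frame i. Keep frames scoring at least min_score,
    at least min_gap indices after the previously kept frame."""
    chosen = []
    for idx, score in enumerate(frame_scores):
        if score >= min_score and (not chosen or idx - chosen[-1] >= min_gap):
            chosen.append(idx)
    return chosen
```

In the SAE setting, each kept lighthouse anchors loop closures in its neighborhood, and the final pass stabilizes the pose graph globally across all of them.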


How Artificial Intelligence Can Empower Retail Frontline Workers

#artificialintelligence

Shoppers are returning to stores, especially as society learns to live with the pandemic. During the start of the 2022 holiday season, the number of shoppers going back to physical stores well exceeded the expectations of the National Retail Federation. But stores continue to face the reality of not having enough store associates to manage the foot traffic. A study from UKG finds many U.S.-based retail stores are struggling to meet sales goals because they are short-staffed (80 percent, up from 68 percent in 2021). The study says customers will likely feel the impact of these labor challenges when shopping for the holidays (72 percent).


Generating image captions with external encyclopedic knowledge

Nikiforova, Sofia, Deoskar, Tejaswini, Paperno, Denis, Winter, Yoad

arXiv.org Artificial Intelligence

Accurately reporting what objects are depicted in an image is largely a solved problem in automatic caption generation. The next big challenge on the way to truly humanlike captioning is being able to incorporate the context of the image and related real world knowledge. We tackle this challenge by creating an end-to-end caption generation system that makes extensive use of image-specific encyclopedic data. Our approach includes a novel way of using image location to identify relevant open-domain facts in an external knowledge base, with their subsequent integration into the captioning pipeline at both the encoding and decoding stages. Our system is trained and tested on a new dataset with naturally produced knowledge-rich captions, and achieves significant improvements over multiple baselines. We empirically demonstrate that our approach is effective for generating contextualized captions with encyclopedic knowledge that is both factually accurate and relevant to the image.


Automatic Calibration of a Six-Degrees-of-Freedom Pose Estimation System

Jansen, Wouter, Laurijssen, Dennis, Daems, Walter, Steckel, Jan

arXiv.org Artificial Intelligence

Systems for estimating the six-degrees-of-freedom human body pose have been improving for over two decades. Technologies such as motion capture cameras, advanced gaming peripherals and, more recently, both deep learning techniques and virtual reality systems have shown impressive results. However, most systems that provide high accuracy and high precision are expensive and not easy to operate. Recently, research has been carried out to estimate the human body pose using the HTC Vive virtual reality system. This system shows accurate results while keeping the cost under 1000 USD. It uses an optical approach: two transmitter devices emit infrared pulses and swept laser planes, which are tracked using photodiodes on receiver hardware. A system using these transmitter devices combined with low-cost custom-made receiver hardware was developed previously, but requires manual measurement of the position and orientation of the transmitter devices. These manual measurements can be time-consuming, prone to error, and not possible in particular setups. We propose an algorithm to automatically calibrate the poses of the transmitter devices in any chosen environment with custom receiver/calibration hardware. Results show that the calibration works in a variety of setups while being more accurate than what manual measurements would allow. Furthermore, the calibration movement and speed have no noticeable influence on the precision of the results.
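The measurement primitive underlying such transmitter-based tracking is simple: a laser plane sweeps at a fixed rotation rate, so the time between a sync pulse and the plane hitting a photodiode maps linearly to a sweep angle. The sketch below shows that conversion; the 60 Hz default and function name are stated as assumptions for illustration.

```python
def sweep_angle_deg(t_sync, t_hit, rotation_hz=60.0):
    """Convert the delay between a transmitter's sync pulse (t_sync) and the
    swept laser plane hitting a photodiode (t_hit) into a sweep angle in
    degrees. With the rotor spinning at rotation_hz, elapsed time within
    one rotation period is proportional to angle. Times in seconds."""
    period = 1.0 / rotation_hz
    return 360.0 * ((t_hit - t_sync) % period) / period
```

Angles from several photodiodes at known positions on the receiver constrain the transmitter pose, which is what the proposed calibration solves for automatically instead of measuring by hand.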