eye view
LLMs are Not Just Next Token Predictors
Downes, Stephen M., Forber, Patrick, Grzankowski, Alex
LLMs are statistical models of language learning through stochastic gradient descent with a next token prediction objective, prompting a popular view among AI modelers: LLMs are just next token predictors. While LLMs are engineered using next token prediction, and trained based on their success at this task, our view is that a reduction to just next token predictor sells LLMs short. Moreover, there are important explanations of LLM behavior and capabilities that are lost when we engage in this kind of reduction. In order to draw this out, we will make an analogy with a once prominent research program in biology explaining evolution and development from the gene's eye view. The 'just next token predictors' view is explicitly laid out by Shanahan (2024): "A great many tasks that demand intelligence in humans can be reduced to next-token prediction with a sufficiently performant model" (2024, 68), and "surely what they are doing is more than 'just' next-token prediction? Well, it is an engineering fact that this is what an LLM does. The noteworthy thing is that next-token prediction is sufficient for solving previously unseen reasoning problems" (2024, 77).
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Utah (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
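To make the "engineering fact" in the Shanahan quote concrete, here is a minimal sketch of the next token prediction objective as it is commonly implemented; the toy model, vocabulary size, and random sequence are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of next-token-prediction training (illustrative toy model,
# not from the paper). The loss at each position is the cross-entropy between
# the model's predicted distribution and the token that actually comes next.
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),  # stand-in for a full transformer stack
)

tokens = torch.randint(0, vocab_size, (1, 16))  # one toy training sequence
logits = model(tokens[:, :-1])                  # predict from each prefix
targets = tokens[:, 1:]                         # shifted by one: the "next" tokens
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()  # stochastic gradient descent updates proceed from here
```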
Improving Bird's Eye View Semantic Segmentation by Task Decomposition
Zhao, Tianhao, Chen, Yongcan, Wu, Yu, Liu, Tianyang, Du, Bo, Xiao, Peilun, Qiu, Shi, Yang, Hongda, Li, Guozhen, Yang, Yi, Lin, Yutian
Semantic segmentation in bird's eye view (BEV) plays a crucial role in autonomous driving. Previous methods usually follow an end-to-end pipeline, directly predicting the BEV segmentation map from monocular RGB inputs. However, a challenge arises because the RGB inputs and BEV targets come from distinct perspectives, making direct point-to-point prediction hard to optimize. In this paper, we decompose the original BEV segmentation task into two stages, namely BEV map reconstruction and RGB-BEV feature alignment. In the first stage, we train a BEV autoencoder to reconstruct BEV segmentation maps from corrupted noisy latent representations, which pushes the decoder to learn fundamental knowledge of typical BEV patterns. The second stage maps RGB input images into the BEV latent space of the first stage, directly optimizing the correlations between the two views at the feature level. Our approach separates perception and generation into distinct steps, equipping the model to handle intricate and challenging scenes effectively. Besides, we propose to transform the BEV segmentation map from the Cartesian to the polar coordinate system to establish a column-wise correspondence between RGB images and BEV maps. Moreover, our method requires neither multi-scale features nor camera intrinsic parameters for depth estimation, saving computational overhead. Extensive experiments on nuScenes and Argoverse show the effectiveness and efficiency of our method. Code is available at https://github.com/happytianhao/TaDe.
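A minimal sketch of the Cartesian-to-polar resampling step described above, so that each polar column corresponds to one viewing ray; the grid sizes and the bottom-center ego origin are assumptions for illustration, not values from the paper or its repository.

```python
# Hypothetical sketch of resampling a Cartesian BEV map into polar coordinates
# so each column corresponds to one azimuth ray (assumed grid sizes; not the
# TaDe implementation).
import numpy as np

def cartesian_to_polar_bev(bev, n_rays=128, n_ranges=128, max_range=50.0):
    """bev: (H, W) map with the ego vehicle at the bottom-center edge."""
    h, w = bev.shape
    rays = np.linspace(-np.pi / 2, np.pi / 2, n_rays)   # azimuth per column
    ranges = np.linspace(0.0, max_range, n_ranges)      # distance per row
    rr, aa = np.meshgrid(ranges, rays, indexing="ij")
    # Polar -> Cartesian sample locations (meters), then meters -> pixels.
    x = rr * np.sin(aa)                                 # lateral offset
    y = rr * np.cos(aa)                                 # forward distance
    col = np.clip((x / max_range * (w / 2) + w / 2).astype(int), 0, w - 1)
    row = np.clip((h - 1 - y / max_range * h).astype(int), 0, h - 1)
    return bev[row, col]                                # (n_ranges, n_rays)
```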
Ring announces a new battery-powered doorbell with 3D motion detection and improved visuals
Ring has announced a refresh of its popular Battery Doorbell Plus outdoor camera. The Battery Doorbell Pro is an upgrade in nearly every way, as is usually the case when companies slap "Pro" at the end of a name. Ring says this new model is its "most advanced battery powered doorbell" ever and that it's packed with features that exceed even its wired doorbells. It boasts radar-powered 3D motion detection, which was also included with the company's Stick Up Cam Pro. Otherwise called "Bird's Eye View", this technology tracks an object's path through the camera's field of view so you can monitor where visitors are going and the route they took to get there.
- Energy > Energy Storage (1.00)
- Electrical Industrial Apparatus (1.00)
- Commercial Services & Supplies > Security & Alarm Services (0.74)
MotionBEV: Attention-Aware Online LiDAR Moving Object Segmentation with Bird's Eye View based Appearance and Motion Features
Zhou, Bo, Xie, Jiapeng, Pan, Yan, Wu, Jiajie, Lu, Chuanzhao
Identifying moving objects is an essential capability for autonomous systems, as it provides critical information for pose estimation, navigation, collision avoidance, and static map construction. In this paper, we present MotionBEV, a fast and accurate framework for LiDAR moving object segmentation, which segments moving objects with appearance and motion features in the bird's eye view (BEV) domain. Our approach converts 3D LiDAR scans into a 2D polar BEV representation to improve computational efficiency. Specifically, we learn appearance features with a simplified PointNet and compute motion features through the height differences of consecutive frames of point clouds projected onto vertical columns in the polar BEV coordinate system. We employ a dual-branch network bridged by the Appearance-Motion Co-attention Module (AMCM) to adaptively fuse the spatio-temporal information from appearance and motion features. Our approach achieves state-of-the-art performance on the SemanticKITTI-MOS benchmark. Furthermore, to demonstrate the practical effectiveness of our method, we provide a LiDAR-MOS dataset recorded by a solid-state LiDAR, which features non-repetitive scanning patterns and a small field of view.
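The motion feature described above reduces to a per-column height comparison between consecutive scans; a rough sketch follows, under assumed bin counts, and is not the released MotionBEV code.

```python
# Rough sketch of polar-BEV motion features: project each LiDAR scan onto
# vertical polar columns, keep the max height per column, and difference
# consecutive frames (assumed bin counts; not the MotionBEV implementation).
import numpy as np

def polar_height_map(points, n_rings=64, n_sectors=256, max_range=50.0):
    """points: (N, 3) array of x, y, z. Returns (n_rings, n_sectors) max heights."""
    r = np.hypot(points[:, 0], points[:, 1])
    theta = np.arctan2(points[:, 1], points[:, 0])      # in [-pi, pi]
    ring = np.clip((r / max_range * n_rings).astype(int), 0, n_rings - 1)
    sector = ((theta + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    heights = np.full((n_rings, n_sectors), -np.inf)
    np.maximum.at(heights, (ring, sector), points[:, 2])  # max z per column
    return heights

def motion_feature(scan_prev, scan_curr):
    h0, h1 = polar_height_map(scan_prev), polar_height_map(scan_curr)
    valid = np.isfinite(h0) & np.isfinite(h1)           # columns hit in both frames
    return np.where(valid, h1 - h0, 0.0)                # height change per column
```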
Herd's Eye View: Improving Game AI Agent Learning with Collaborative Perception
Nash, Andrew, Vardy, Andrew, Churchill, David
We present a novel perception model named Herd's Eye View (HEV) that adopts a global perspective derived from multiple agents to boost the decision-making capabilities of reinforcement learning (RL) agents in multi-agent environments, specifically in the context of game AI. The HEV approach utilizes cooperative perception to empower RL agents with a global reasoning ability, enhancing their decision-making. We demonstrate the effectiveness of the HEV within simulated game environments and highlight its superior performance compared to traditional ego-centric perception models. This work contributes to cooperative perception and multi-agent reinforcement learning by offering a more realistic and efficient perspective for global coordination and decision-making within game environments. Moreover, our approach promotes broader AI applications beyond gaming by addressing constraints faced by AI in other fields such as robotics. The code is available at https://github.com/andrewnash/Herds-Eye-View
- Leisure & Entertainment > Games > Computer Games (0.69)
- Information Technology > Software (0.55)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
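As an illustration of the cooperative perception idea above, the sketch below stamps each agent's egocentric observation into a shared world-frame grid that every RL agent can then observe; the grid encoding and the max-fusion rule are assumptions, not the HEV paper's implementation.

```python
# Illustrative sketch of cooperative perception: each agent's egocentric
# observation is written into one shared world-frame grid (assumed encoding;
# not the Herd's Eye View implementation).
import numpy as np

def fuse_observations(world_shape, agent_poses, local_obs, local_extent=5):
    """agent_poses: list of (row, col) world positions; local_obs: list of
    (2*local_extent+1)-square egocentric grids. Returns the fused global view."""
    world = np.zeros(world_shape)
    for (r, c), obs in zip(agent_poses, local_obs):
        r0, c0 = r - local_extent, c - local_extent
        for i in range(obs.shape[0]):
            for j in range(obs.shape[1]):
                rr, cc = r0 + i, c0 + j
                if 0 <= rr < world_shape[0] and 0 <= cc < world_shape[1]:
                    # Keep the strongest evidence reported by any agent.
                    world[rr, cc] = max(world[rr, cc], obs[i, j])
    return world
```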
AI has a bird's eye view
What can we learn by looking down from above? To see a city from the sky is to see it as an eagle would: breathtaking drone footage can reveal a landscape of hope and rich culture. The drone's view is a powerful tool, and a new method now allows bird's eye views to be created from a single frontal photo.
Simple-BEV: What Really Matters for Multi-Sensor BEV Perception?
Building 3D perception systems for autonomous vehicles that do not rely on high-density LiDAR is a critical research problem because of the expense of LiDAR systems compared to cameras and other sensors. Recent research has developed a variety of camera-only methods, where features are differentiably "lifted" from the multi-camera images onto the 2D ground plane, yielding a "bird's eye view" (BEV) feature representation of the 3D space around the vehicle. This line of work has produced a variety of novel "lifting" methods, but we observe that other details in the training setups have shifted at the same time, making it unclear what really matters in top-performing methods. We also observe that using cameras alone is not a real-world constraint, considering that additional sensors like radar have been integrated into real vehicles for years already. We find that batch size and input resolution greatly affect performance, while lifting strategies have a more modest effect; even a simple parameter-free lifter works well.
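A minimal sketch of what a "parameter-free lifter" can look like: define 3D voxel centers around the vehicle, project them into a camera with its calibration, and bilinearly sample image features at the projected locations. The pinhole model, shapes, and single-camera setup are illustrative assumptions, not the Simple-BEV code.

```python
# Minimal sketch of parameter-free lifting: project 3D voxel centers into a
# camera image and bilinearly sample features there (pinhole model and shapes
# are assumptions; not the Simple-BEV implementation).
import torch
import torch.nn.functional as F

def lift_to_bev(feats, intrinsics, voxels):
    """feats: (C, H, W) image features; intrinsics: (3, 3) camera matrix;
    voxels: (X, Y, Z, 3) voxel centers in the camera frame (meters)."""
    C, H, W = feats.shape
    pts = voxels.reshape(-1, 3) @ intrinsics.T          # pinhole projection
    uv = pts[:, :2] / pts[:, 2:3].clamp(min=1e-5)       # perspective divide
    # (Points behind the camera should also be masked out; omitted for brevity.)
    # Normalize pixel coordinates to [-1, 1] as grid_sample expects.
    grid = torch.stack([uv[:, 0] / W * 2 - 1, uv[:, 1] / H * 2 - 1], dim=-1)
    sampled = F.grid_sample(
        feats[None], grid[None, None], align_corners=False
    )[0, :, 0]                                          # (C, X*Y*Z)
    sampled = sampled.reshape(C, *voxels.shape[:3])
    return sampled.mean(dim=3)                          # reduce height -> (C, X, Y)
```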
Ring brings radar detection to its Spotlight Cam Pro
We've already seen Ring add Bird's Eye View (its fancy 3D motion detection) to its flagship security camera and its flagship outdoor light camera. Consequently, you get no prizes for guessing that the feature is now coming to the new Ring Spotlight Cam Pro. The new Pro Spotlight Cam is joined by a Spotlight Cam Plus, which offers a slightly nicer design than its predecessor. For the uninitiated, Bird's Eye View is a system that offers users a top-down map of their area, showing the path a person took to your front door. It's designed to let you know if someone's been peering into your windows, or anywhere else, while on your porch.
Experimental Analysis of Trajectory Control Using Computer Vision and Artificial Intelligence for Autonomous Vehicles
Abbas, Ammar N., Irshad, Muhammad Asad, Ammar, Hossam Hassan
Perception of the lane boundaries is crucial for tasks related to autonomous trajectory control. In this paper, several methodologies for lane detection are discussed with an experimental illustration: Hough transformation, blob analysis, and bird's eye view. After abstracting the lane marks from the boundary, the next step is applying a control law based on this perception to control steering and speed. A comparative analysis is then made between an open-loop response, PID control, and a neural network control law through graphical statistics. To perceive the surroundings, a wireless streaming camera connected to a Raspberry Pi is used. After pre-processing, the signal received from the camera is sent back to the Raspberry Pi, which processes the input and communicates the control commands to the motors through an Arduino via serial communication.
- Information Technology > Hardware (0.56)
- Transportation > Ground > Road (0.49)
- Information Technology > Robotics & Automation (0.49)
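The lane-detection stage above combines standard OpenCV building blocks; a hedged sketch follows. The Canny thresholds, Hough parameters, and warp corner points are illustrative assumptions, not values reported in the paper.

```python
# Sketch of the lane-detection steps named above using standard OpenCV calls
# (thresholds and warp corner points are illustrative assumptions).
import cv2
import numpy as np

def detect_lanes(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)                    # edge map for Hough
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                            threshold=50, minLineLength=40, maxLineGap=20)

    # Bird's eye view: warp a trapezoid on the road plane to a rectangle.
    h, w = gray.shape
    src = np.float32([[w * 0.4, h * 0.6], [w * 0.6, h * 0.6], [w, h], [0, h]])
    dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    bev = cv2.warpPerspective(frame, cv2.getPerspectiveTransform(src, dst), (w, h))
    return lines, bev
```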