sav
Complexity in Complexity: Understanding Visual Complexity Through Structure, Color, and Surprise
Sarıtaş, Karahan, Dayan, Peter, Shen, Tingke, Nath, Surabhi S
Understanding human perception of visual complexity is crucial in visual cognition. Recently, Shen et al. (2024) proposed an interpretable segmentation-based model that accurately predicted complexity across various datasets, supporting the idea that complexity can be explained simply. In this work, we investigate the failure of their model to capture structural, color and surprisal contributions to complexity. To this end, we propose Multi-Scale Sobel Gradient, which measures spatial intensity variations; Multi-Scale Unique Color, which quantifies colorfulness across multiple scales; and surprise scores generated using a Large Language Model. We test our features on existing benchmarks and a novel dataset containing surprising images from Visual Genome. Our experiments demonstrate that modeling complexity accurately is not as simple as previously thought, requiring additional perceptual and semantic factors to address dataset biases. Thus, our results offer deeper insights into how humans assess visual complexity.
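The two hand-crafted features named in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so the scale factors, the average-pooling downsampler, the edge padding, and the 4-bit color quantization below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool by an integer factor, cropping edges that do not divide evenly."""
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    img = img[:h, :w]
    new_shape = (h // factor, factor, w // factor, factor) + img.shape[2:]
    return img.reshape(new_shape).mean(axis=(1, 3))

def sobel_magnitude(gray):
    """Gradient magnitude from the 3x3 Sobel kernels, with edge-replicated padding."""
    p = np.pad(gray.astype(float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def multi_scale_sobel(gray, factors=(1, 2, 4)):
    """Mean Sobel magnitude, averaged over several scales (assumed aggregation)."""
    return float(np.mean([sobel_magnitude(downsample(gray, f)).mean() for f in factors]))

def multi_scale_unique_colors(rgb, factors=(1, 2, 4), bits=4):
    """Number of distinct quantized RGB colors, averaged over several scales."""
    counts = []
    for f in factors:
        small = downsample(rgb.astype(float), f)
        # Keep only the top `bits` bits per channel (assumed quantization).
        quantized = (small.astype(np.uint8) >> (8 - bits)).reshape(-1, 3)
        counts.append(len(np.unique(quantized, axis=0)))
    return float(np.mean(counts))

# Toy inputs: a vertical step edge and a two-color image.
step = np.zeros((8, 8))
step[:, 4:] = 255
two_color = np.zeros((8, 8, 3), dtype=np.uint8)
two_color[:, :4, 0] = 255
print(multi_scale_sobel(step), multi_scale_unique_colors(two_color))
```

A uniform image scores zero on the gradient feature and one on the color feature, which is the behavior one would expect from measures of intensity variation and colorfulness.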
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Europe > Germany > Saxony > Leipzig (0.04)
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers
Mitra, Chancharik, Huang, Brandon, Chai, Tianning, Lin, Zhiqiu, Arbelle, Assaf, Feris, Rogerio, Karlinsky, Leonid, Darrell, Trevor, Ramanan, Deva, Herzig, Roei
Generative Large Multimodal Models (LMMs) like LLaVA and Qwen-VL excel at a wide variety of vision-language (VL) tasks such as image captioning or visual question answering. Despite strong performance, LMMs are not directly suited for foundational discriminative vision-language tasks (i.e., tasks requiring discrete label predictions) such as image classification and multiple-choice VQA. One key challenge in utilizing LMMs for discriminative tasks is the extraction of useful features from generative models. To overcome this issue, we propose an approach for finding features in the model's latent space to more effectively leverage LMMs for discriminative tasks. Toward this end, we present Sparse Attention Vectors (SAVs) -- a finetuning-free method that leverages sparse attention head activations (fewer than 1% of the heads) in LMMs as strong features for VL tasks. With only few-shot examples, SAVs demonstrate state-of-the-art performance compared to a variety of few-shot and finetuned baselines on a collection of discriminative tasks. Our experiments also imply that SAVs can scale in performance with additional examples and generalize to similar tasks, establishing SAVs as both effective and robust multimodal feature representations.
- Europe (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Information Technology (0.67)
- Education (0.48)
- Media (0.46)
- Health & Medicine (0.46)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Aerial Grasping with Soft Aerial Vehicle Using Disturbance Observer-Based Model Predictive Control
Cheung, Hiu Ching, Jiang, Bailun, Hu, Yang, Chu, Henry K., Wen, Chih-Yung, Chang, Ching-Wei
Aerial grasping, particularly soft aerial grasping, holds significant promise for drone delivery and harvesting tasks. However, controlling UAV dynamics during aerial grasping presents considerable challenges. The increased mass during payload grasping adversely affects thrust prediction, while unpredictable environmental disturbances further complicate control efforts. In this study, we aim to enhance the control of the Soft Aerial Vehicle (SAV) during aerial grasping by incorporating a disturbance observer into a Nonlinear Model Predictive Control (NMPC) SAV controller. By integrating the disturbance observer into the NMPC SAV controller, we aim to compensate for dynamic model idealization and uncertainties arising from additional payloads and unpredictable disturbances. Our approach combines a disturbance observer-based NMPC with the SAV controller, effectively minimizing tracking errors and enabling precise aerial grasping along all three axes. The proposed SAV equipped with Disturbance Observer-based Nonlinear Model Predictive Control (DOMPC) demonstrates remarkable capabilities in handling both static and non-static payloads, leading to the successful grasping of various objects. Notably, our SAV achieves an impressive payload-to-weight ratio, surpassing previous investigations in the domain of soft grasping. Using the proposed soft aerial vehicle weighing 1.002 kg, we achieve a maximum payload of 337 g by grasping.
Simplicity in Complexity : Explaining Visual Complexity using Deep Segmentation Models
Shen, Tingke, Nath, Surabhi S, Brielmann, Aenne, Dayan, Peter
The complexity of visual stimuli plays an important role in many cognitive phenomena, including attention, engagement, memorability, time perception and aesthetic evaluation. Despite its importance, complexity is poorly understood and ironically, previous models of image complexity have been quite complex. There have been many attempts to find handcrafted features that explain complexity, but these features are usually dataset specific, and hence fail to generalise. On the other hand, more recent work has employed deep neural networks to predict complexity, but these models remain difficult to interpret, and do not guide a theoretical understanding of the problem. Here we propose to model complexity using segment-based representations of images. We use state-of-the-art segmentation models, SAM and FC-CLIP, to quantify the number of segments at multiple granularities, and the number of classes in an image respectively. We find that complexity is well-explained by a simple linear model with these two features across six diverse image-sets of naturalistic scene and art images. This suggests that the complexity of images can be surprisingly simple.
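The "simple linear model with these two features" can be sketched as an ordinary least-squares fit. This is only an illustration: the function names are hypothetical, the toy numbers are invented so that the fit is exactly recoverable, and a single segment count stands in for the multi-granularity counts the abstract mentions. Any feature transforms used in the paper are not stated in the abstract and are not reproduced here.

```python
import numpy as np

def fit_linear_complexity(num_segments, num_classes, ratings):
    """Least-squares fit of: rating ~ a * segments + b * classes + c."""
    X = np.column_stack([num_segments, num_classes, np.ones(len(ratings))])
    coef, *_ = np.linalg.lstsq(X, np.asarray(ratings, dtype=float), rcond=None)
    return coef  # [a, b, c]

def predict_complexity(coef, num_segments, num_classes):
    """Apply the fitted linear model to new feature values."""
    return coef[0] * np.asarray(num_segments) + coef[1] * np.asarray(num_classes) + coef[2]

# Invented toy data (not from the paper): ratings follow 0.5*seg + 2*cls + 10.
segments = np.array([12, 40, 7, 95, 33])
classes = np.array([3, 8, 2, 15, 6])
ratings = 0.5 * segments + 2.0 * classes + 10.0

coef = fit_linear_complexity(segments, classes, ratings)
print(coef, predict_complexity(coef, [20], [5]))
```

In practice the two inputs would come from segmentation models such as the SAM and FC-CLIP models named in the abstract; here they are plain arrays so the sketch stays self-contained.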
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
- Europe > Germany > Saxony > Leipzig (0.04)
On The Impact of Replacing Private Cars with Autonomous Shuttles: An Agent-Based Approach
Bogdoll, Daniel, Karsch, Louis, Amritzer, Jennifer, Zöllner, J. Marius
The European Green Deal aims to achieve climate neutrality by 2050, which demands improved emissions efficiency from the transportation industry. This study uses an agent-based simulation to analyze the sustainability impacts of shared autonomous shuttles. We forecast travel demands for 2050 and simulate regulatory interventions in the form of replacing private cars with a fleet of shared autonomous shuttles in specific areas. We derive driving-related emissions, energy consumption, and non-driving-related emissions to calculate life-cycle emissions. We observe reductions of 0.4% to 9.6% in life-cycle emissions and of 1.5% to 12.2% in energy consumption.
- North America > United States > Texas > Travis County > Austin (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- Europe > Sweden (0.04)
- Transportation > Passenger (1.00)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
Same or Different? Diff-Vectors for Authorship Analysis
Corbara, Silvia, Moreo, Alejandro, Sebastiani, Fabrizio
We investigate the effects on authorship identification tasks of a fundamental shift in how to conceive the vectorial representations of documents that are given as input to a supervised learner. In "classic" authorship analysis a feature vector represents a document, the value of a feature represents (an increasing function of) the relative frequency of the feature in the document, and the class label represents the author of the document. We instead investigate the situation in which a feature vector represents an unordered pair of documents, the value of a feature represents the absolute difference in the relative frequencies (or increasing functions thereof) of the feature in the two documents, and the class label indicates whether the two documents are from the same author or not. This latter (learner-independent) type of representation has been occasionally used before, but has never been studied systematically. We argue that it is advantageous, and that in some cases (e.g., authorship verification) it provides a much larger quantity of information to the training process than the standard representation. The experiments that we carry out on several publicly available datasets (including one that we make available here for the first time) show that feature vectors representing pairs of documents (that we here call Diff-Vectors) bring about systematic improvements in the effectiveness of authorship identification tasks, especially when training data are scarce (as is often the case in real-life authorship identification scenarios). Our experiments tackle same-author verification, authorship verification, and closed-set authorship attribution; while DVs are naturally geared for solving the first, we also provide two novel methods for solving the second and third that use a solver for the first as a building block.
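The pair representation described in this abstract can be sketched in a few lines. The helper names, the toy vocabulary, and the use of raw relative frequencies (rather than some increasing function of them) are assumptions for illustration; this is not the authors' code.

```python
from collections import Counter
from itertools import combinations

def relative_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary feature in one tokenized document."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[f] / total if total else 0.0 for f in vocabulary]

def diff_vector(doc_a, doc_b, vocabulary):
    """Absolute difference of relative frequencies; symmetric in its two documents."""
    fa = relative_frequencies(doc_a, vocabulary)
    fb = relative_frequencies(doc_b, vocabulary)
    return [abs(x - y) for x, y in zip(fa, fb)]

def build_pair_dataset(docs, authors, vocabulary):
    """One training example per unordered document pair; label 1 means same author."""
    X, y = [], []
    for i, j in combinations(range(len(docs)), 2):
        X.append(diff_vector(docs[i], docs[j], vocabulary))
        y.append(1 if authors[i] == authors[j] else 0)
    return X, y

# Toy corpus (invented): three short documents by two authors.
vocabulary = ["the", "of", "and"]
docs = [
    "the cat and the dog".split(),
    "the end of the road".split(),
    "of mice and of men".split(),
]
authors = ["A", "A", "B"]

X, y = build_pair_dataset(docs, authors, vocabulary)
print(X, y)
```

Because every unordered pair becomes an example, n documents yield n(n-1)/2 training vectors, which is one way to read the abstract's remark that this representation can supply more information to the learner when documents are scarce. The resulting X and y can be passed to any binary classifier, since the representation is learner-independent.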
- Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)
- Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
- North America > United States > Massachusetts > Middlesex County > Reading (0.04)
Shared Autonomous Vehicle Mobility for a Transportation Underserved City
Meneses-Cime, Karina, Aksun-Guvenc, Bilin, Guvenc, Levent
This paper proposes the use of an on-demand, ride-hailed, ride-shared autonomous vehicle (SAV) service as a feasible solution to serve the mobility needs of a small city where fixed-route, circulator-type public transportation may be too expensive to operate. The presented work builds upon our earlier work that modeled the city of Marysville, Ohio as an example of such a city, with realistic traffic behavior and trip requests. A simple SAV dispatcher is implemented to model the behavior of the proposed on-demand mobility service. The goal of the service is to optimally distribute SAVs along the network to allocate passengers and shared rides. The pickup and drop-off locations are strategically placed along the network to provide mobility from affordable housing areas, which are also transit deserts, to locations corresponding to jobs and other opportunities. The study is carried out by varying the behavior of the SAV driving system from cautious to aggressive, along with the size of the SAV fleet, and analyzing the corresponding performance. It is found that the size of the network and the behavior of the AV driving system result in an optimal number of SAVs, beyond which increasing the number of SAVs does not improve overall mobility. For the Marysville network, which is a 9-mile by 8-mile network, this happens at a fleet of 8 deployed SAVs. The results show that the introduction of the proposed SAV service with a simple optimal shared scheme can provide access to services and jobs to hundreds of people in a small city.
- North America > United States > Ohio > Union County > Marysville (0.25)
- North America > United States > Ohio > Franklin County > Columbus (0.14)
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
Will you let Self-Driving Cars Make Moral Decisions?
A few decades ago, even the smartphones of today would've been considered pretty much "impossible". Looking back at the massive leaps in technology, we sometimes forget the incremental changes that have led us to where we are. So it's only reasonable to assume that cars will also evolve to become AVs, and maybe even something we don't consider cars at all. It's that the world most of us live in right now isn't built to handle them. We have designed this world with the thought of people making the decisions, not machines.
- Automobiles & Trucks (0.87)
- Transportation > Passenger (0.67)
- Transportation > Ground > Road (0.53)
- Information Technology > Robotics & Automation (0.43)
Coverage based testing for V&V and Safety Assurance of Self-driving Autonomous Vehicles: A Systematic Literature Review
Self-driving Autonomous Vehicles (SAVs) are attracting growing interest from industry as well as the general public. Tech and automobile companies are investing huge amounts of capital in research and development of SAVs to make sure they have a head start in the future SAV market. One of the major hurdles to SAVs reaching public roads is the public's lack of confidence in their safety. To assure safety and build public confidence in SAVs, researchers around the world have used coverage-based testing for Verification and Validation (V&V) and safety assurance of SAVs. The objective of this paper is to investigate the coverage criteria proposed and the coverage-maximizing techniques used by researchers over the last decade to assure the safety of SAVs. We conduct a Systematic Literature Review (SLR) for this investigation. We present a classification of existing research based on the coverage criteria used. Several research gaps and research directions are also provided in this SLR to enable further research in this domain. This paper provides a body of knowledge in the domain of safety assurance of SAVs. We believe the results of this SLR will be helpful in the progression of V&V and safety assurance of SAVs.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts (0.04)
- Europe > United Kingdom > England > Leicestershire > Loughborough (0.04)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)