tfr
A New Semidefinite Relaxation for Linear and Piecewise-Affine Optimal Control with Time Scaling
Yang, Lujie, Marcucci, Tobia, Parrilo, Pablo A., Tedrake, Russ
We introduce a semidefinite relaxation for optimal control of linear systems with time scaling. These problems are inherently nonconvex, since the system dynamics involve bilinear products between the discretization time step and the system state and controls. The proposed relaxation is closely related to the standard second-order semidefinite relaxation for quadratic constraints, but we carefully select a subset of the possible bilinear terms and apply a change of variables to achieve empirically tight relaxations while keeping the computational load light. We further extend our method to handle piecewise-affine (PWA) systems by formulating the PWA optimal-control problem as a shortest-path problem in a graph of convex sets (GCS). In this GCS, different paths represent different mode sequences for the PWA system, and the convex sets model the relaxed dynamics within each mode. By combining a tight convex relaxation of the GCS problem with our semidefinite relaxation with time scaling, we can solve PWA optimal-control problems through a single semidefinite program.
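To make the source of the nonconvexity concrete, here is a minimal sketch that assumes a forward-Euler discretization with a decision-variable time step $h_k$. The discretization scheme and the symbols $A$, $B$, $x_k$, $u_k$, and $h_k$ are illustrative assumptions and not necessarily the formulation used in the paper.

```latex
\begin{align}
\dot{x}(t) &= A x(t) + B u(t), \\
x_{k+1} &= x_k + h_k \left( A x_k + B u_k \right)
         = x_k + A \, (h_k x_k) + B \, (h_k u_k).
\end{align}
```

When $h_k$ is optimized jointly with $x_k$ and $u_k$, the products $h_k x_k$ and $h_k u_k$ are bilinear in the decision variables, so the equality constraint is nonconvex even though the continuous-time dynamics are linear.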
Semi-supervised classification of bird vocalizations
Hexeberg, Simen, Chitre, Mandar, Hoffmann-Kuhnt, Matthias, Low, Bing Wen
Changes in bird populations can indicate broader changes in ecosystems, making birds one of the most important animal groups to monitor. Combining machine learning and passive acoustics enables continuous monitoring over extended periods without direct human involvement. However, most existing techniques require extensive expert-labeled datasets for training and cannot easily detect time-overlapping calls in busy soundscapes. We propose a semi-supervised acoustic bird detector designed to allow both the detection of time-overlapping calls (when separated in frequency) and the use of few labeled training samples. The classifier is trained and evaluated on a combination of community-recorded open-source data and long-duration soundscape recordings from Singapore. It outperforms the state-of-the-art BirdNET classifier on a test set of 103 bird species despite significantly fewer labeled training samples. The detector is further tested on 144 microphone-hours of continuous soundscape data. The rich soundscape in Singapore makes suppression of false positives a challenge on raw, continuous data streams. Nevertheless, we demonstrate that achieving high precision in such environments with minimal labeled training data is possible.
Introduction
Biodiversity monitoring is a critical aspect of biodiversity conservation, as it helps inform decision making, improves our knowledge and enhances public education and awareness. Birds are one of the most surveyed animal groups in biodiversity monitoring programmes, with point counts and transect surveys being well-established survey techniques for monitoring bird communities [1]. However, birds can be very difficult to detect and identify especially in tropical regions characterised by high avian diversity and numerous rare species [2], [3]. Additionally, such manned survey techniques are manpower-intensive, require highly specialized expertise, and tend to overlook rare species that are sensitive to human presence [4], [5], [6].
Passive monitoring of biodiversity using acoustics is thus an area of great potential, as various animal groups including birds make unique vocalizations, which can be used to validate their presence.
FameBias: Embedding Manipulation Bias Attack in Text-to-Image Models
Roh, Jaechul, Yuan, Andrew, Mao, Jinsong
Text-to-Image (T2I) diffusion models have rapidly advanced, enabling the generation of high-quality images that align closely with textual descriptions. However, this progress has also raised concerns about their misuse for propaganda and other malicious activities. Recent studies reveal that attackers can embed biases into these models through simple fine-tuning, causing them to generate targeted imagery when triggered by specific phrases. This underscores the potential for T2I models to act as tools for disseminating propaganda, producing images aligned with an attacker's objective for end-users. Building on this concept, we introduce FameBias, a T2I biasing attack that manipulates the embeddings of input prompts to generate images featuring specific public figures. Unlike prior methods, FameBias operates solely on the input embedding vectors without requiring additional model training. We evaluate FameBias comprehensively using Stable Diffusion V2, generating a large corpus of images based on various trigger nouns and target public figures. Our experiments demonstrate that FameBias achieves a high attack success rate while preserving the semantic context of the original prompts across multiple trigger-target pairs.
Logic-Constrained Shortest Paths for Flight Planning
Euler, Ricardo, Casas, Pedro Maristany de las, Borndörfer, Ralf
The Logic-Constrained Shortest Path Problem (LCSP) combines a one-to-one shortest path problem with satisfiability constraints imposed on the routing graph. This setting arises in flight planning, where air traffic control (ATC) authorities enforce a set of traffic flow restrictions (TFRs) on aircraft routes in order to increase safety and throughput. We propose a new branch and bound-based algorithm for the LCSP. The resulting algorithm has three main degrees of freedom: the node selection rule, the branching rule and the conflict. While node selection and branching rules have long been studied in the MIP and SAT communities, most of them cannot be applied out of the box to the LCSP. We review the existing literature and develop tailored variants of the most prominent rules. The conflict, the set of variables to which the branching rule is applied, is unique to the LCSP. We analyze its theoretical impact on the B&B algorithm. In the second part of the paper, we show how to model the Flight Planning Problem with TFRs as an LCSP and solve it using the branch and bound algorithm. We demonstrate the algorithm's efficiency on a dataset consisting of a global flight graph and a set of around 20000 real TFRs obtained from our industry partner Lufthansa Systems GmbH. We make this dataset publicly available. Finally, we conduct an in-depth empirical analysis of node selection rules, branching rules and conflicts. Carefully choosing an appropriate combination yields an improvement of an order of magnitude compared to an uninformed choice.
Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper
Yang, Chih-Kai, Huang, Kuan-Po, Lee, Hung-yi
This research explores how the information in prompts interacts with the high-performing speech recognition model Whisper. We compare its performance when given prompts containing correct information against its performance when given prompts corrupted with incorrect information. Our results unexpectedly show that Whisper may not understand the textual prompts in a human-expected way. Additionally, we find that performance improvement is not guaranteed even with stronger adherence to the topic information in textual prompts. We also note that English prompts generally outperform Mandarin ones on datasets of both languages, likely due to differences in training data distributions for these languages despite the mismatch with pre-training scenarios. Conversely, we discover that Whisper exhibits awareness of misleading information in language tokens by ignoring incorrect language tokens and focusing on the correct ones. In sum, we raise insightful questions about Whisper's prompt understanding and reveal its counter-intuitive behaviors. We encourage further studies.
EEL: Efficiently Encoding Lattices for Reranking
Singhal, Prasann, Xu, Jiacheng, Ye, Xi, Durrett, Greg
Standard decoding approaches for conditional text generation tasks typically search for an output hypothesis with high model probability, but this may not yield the best hypothesis according to human judgments of quality. Reranking to optimize for "downstream" metrics can better optimize for quality, but many metrics of interest are computed with pre-trained language models, which are slow to apply to large numbers of hypotheses. We explore an approach for reranking hypotheses by using Transformers to efficiently encode lattices of generated outputs, a method we call EEL. With a single Transformer pass over the entire lattice, we can approximately compute a contextualized representation of each token as if it were only part of a single hypothesis in isolation. We combine this approach with a new class of token-factored rerankers (TFRs) that allow for efficient extraction of high reranker-scoring hypotheses from the lattice. Empirically, our approach incurs minimal degradation error compared to the exponentially slower approach of encoding each hypothesis individually. When applying EEL with TFRs across three text generation tasks, our results show both substantial speedup compared to naive reranking and often better performance on downstream metrics than comparable approaches.
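The reason a token-factored score permits efficient extraction can be illustrated with a small sketch. It assumes that a hypothesis score is simply the sum of per-token scores, and it uses a hand-built toy lattice; the function name `best_path`, the lattice layout, and the scores are hypothetical and are not taken from the EEL implementation.

```python
# Toy lattice: each node carries one token and one (assumed) reranker score.
# Because the hypothesis score is a sum over tokens, the best path through
# the DAG can be found with a single dynamic-programming sweep instead of
# enumerating every hypothesis.

tokens = {0: "<s>", 1: "the", 2: "a", 3: "cat", 4: "dog", 5: "</s>"}
token_scores = {0: 0.0, 1: 0.5, 2: 0.25, 3: 1.0, 4: 2.0, 5: 0.0}
edges = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5]}
topological_order = [0, 1, 2, 3, 4, 5]  # assumed to be given


def best_path(order, edges, scores, start, end):
    """Return (score, node_path) of the highest-scoring start-to-end path."""
    # best[node] = (best score of any path ending at node, that path)
    best = {start: (scores[start], [start])}
    for node in order:
        if node not in best:
            continue  # node is unreachable from start
        score, path = best[node]
        for nxt in edges.get(node, []):
            candidate = score + scores[nxt]
            if nxt not in best or candidate > best[nxt][0]:
                best[nxt] = (candidate, path + [nxt])
    return best[end]


score, path = best_path(topological_order, edges, token_scores, 0, 5)
print(score, " ".join(tokens[n] for n in path))
```

The sweep visits each edge once, so its cost grows with the size of the lattice rather than with the number of hypotheses the lattice encodes, which can be exponential.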
Building a Recommender System Using TFRS
The first part of this tutorial covered importing and cleaning the dataset. This part focuses on feature engineering, training, and evaluating the model. In the next section, we will run both the remove_repeating_subs() and build_training_sequences() functions. For the sake of brevity, we won't include the code for these two functions here; it can be found at the link at the end of the tutorial.
Originally published on Towards AI.
TensorFlow Releases New Package For Recommendation Systems: TFRS
From Amazon to Netflix to Pinterest, recommendation systems are the cornerstone of a majority of modern-day billion-dollar industries. However, building recommender systems is not a straightforward task. What if we could build them in a few lines? Most data scientists would welcome the chance to drop the nitty-gritty details and concentrate on implementing algorithms more easily. Abstraction is a common trait among popular machine learning libraries and frameworks like TensorFlow.
Introducing TensorFlow Recommenders
From recommending movies or restaurants to coordinating fashion accessories and highlighting blog posts and news articles, recommender systems are an important application of machine learning, surfacing new discoveries and helping users find what they love. At Google, we have spent the last several years exploring new deep learning techniques to provide better recommendations through multi-task learning, reinforcement learning, better user representations and fairness objectives. These and other advancements have allowed us to greatly improve our recommendations. Today, we're excited to introduce TensorFlow Recommenders (TFRS), an open-source TensorFlow package that makes building, evaluating, and serving sophisticated recommender models easy. Built with TensorFlow 2.x, TFRS makes it possible to: TFRS is based on TensorFlow 2.x and Keras, making it instantly familiar and user-friendly.