pier
Multigranular Evaluation for Brain Visual Decoding
Existing evaluation protocols for brain visual decoding predominantly rely on coarse metrics that obscure inter-model differences, lack neuroscientific foundation, and fail to capture fine-grained visual distinctions. To address these limitations, we introduce BASIC, a unified, multigranular evaluation framework that jointly quantifies structural fidelity, inferential alignment, and contextual coherence between decoded and ground-truth images. For the structural level, we introduce a hierarchical suite of segmentation-based metrics, including foreground, semantic, instance, and component masks, anchored in granularity-aware correspondence across mask structures. For the semantic level, we extract structured scene representations encompassing objects, attributes, and relationships using multimodal large language models, enabling detailed, scalable, and context-rich comparisons with ground-truth stimuli. We benchmark a diverse set of visual decoding methods across multiple stimulus-neuroimaging datasets within this unified evaluation framework. Together, these criteria provide a more discriminative, interpretable, and comprehensive foundation for evaluating brain visual decoding methods.
DiscoSG: Towards Discourse-Level Text Scene Graph Parsing through Iterative Graph Refinement
Lin, Shaoqing, Teng, Chong, Li, Fei, Ji, Donghong, Qu, Lizhen, Li, Zhuang
Vision-Language Models (VLMs) generate discourse-level, multi-sentence visual descriptions, challenging text scene graph parsers built for single-sentence caption-to-graph mapping. Current approaches typically merge sentence-level parsing outputs for discourse input, often missing phenomena like cross-sentence coreference, resulting in fragmented graphs and degraded downstream VLM task performance. We introduce a new task, Discourse-level text Scene Graph parsing (DiscoSG), and release DiscoSG-DS, a dataset of 400 expert-annotated and 8,430 synthesised multi-sentence caption-graph pairs. Each caption averages 9 sentences, and each graph contains at least 3 times more triples than those in existing datasets. Fine-tuning GPT-4o on DiscoSG-DS yields over 40% higher SPICE metric than the best sentence-merging baseline. However, its high inference cost and licensing restrict open-source use. Smaller fine-tuned open-source models (e.g., Flan-T5) perform well on simpler graphs yet degrade on denser, more complex graphs. To bridge this gap, we introduce DiscoSG-Refiner, a lightweight open-source parser that drafts a seed graph and iteratively refines it with a novel learned graph-editing model, achieving 30% higher SPICE than the baseline while delivering 86 times faster inference than GPT-4o. It generalises from simple to dense graphs, thereby consistently improving downstream VLM tasks, including discourse-level caption evaluation and hallucination detection, outperforming alternative open-source parsers. Code and data are available at https://github.com/ShaoqLin/DiscoSG.
Adapting Language Balance in Code-Switching Speech
Ugan, Enes Yavuz, Pham, Ngoc-Quan, Waibel, Alexander
Despite achieving impressive results on standard benchmarks, large foundational models still struggle against code-switching test cases. When data scarcity cannot be used as the usual justification for poor performance, the reason may lie in the infrequent occurrence of code-switched moments, where the embedding of the second language appears subtly. Instead of expecting the models to learn this infrequency on their own, it might be beneficial to provide the training process with labels. Evaluating model performance on code-switching data requires careful localization of code-switching points where recognition errors are most consequential, so that the analysis emphasizes mistakes occurring at those moments. Building on this observation, we leverage the difference between the embedded and the main language to highlight those code-switching points and thereby emphasize learning at those locations. This simple yet effective differentiable surrogate mitigates context bias during generation -- the central challenge in code-switching -- thereby improving the model's robustness. Our experiments with Arabic and Chinese-English showed that the models are able to predict the switching places more correctly, reflected by the reduced substitution error.
If you love AI, you'll love Ken Liu's new cyberpunk thriller
In Ken Liu's All That We See or Seem, a once-famous hacker must find a missing dream-weaver. The latest novel by Ken Liu, All That We See or Seem, is the near-future story of the mysterious disappearance of a professional dream-weaver called Elli. It is being marketed as a cyberpunk thriller. Full disclosure: I don't generally seek out thrillers or cyberpunk books, so I may not be the target audience for this. But I was keen to read it because Liu has not one but two claims to fame: as well as being the author of a celebrated fantasy series called The Dandelion Dynasty, he is also the translator of the sensationally good Remembrance of Earth's Past trilogy by Cixin Liu.
PIER: A Novel Metric for Evaluating What Matters in Code-Switching
Ugan, Enes Yavuz, Pham, Ngoc-Quan, Bärmann, Leonard, Waibel, Alex
Code-switching, the alternation of languages within a single discourse, presents a significant challenge for Automatic Speech Recognition. Despite the unique nature of the task, performance is commonly measured with established metrics such as Word-Error-Rate (WER). However, in this paper, we question whether these general metrics accurately assess performance on code-switching. Specifically, using both Connectionist-Temporal-Classification and Encoder-Decoder models, we show that fine-tuning on non-code-switched data from both the matrix and the embedded language improves classical metrics on code-switching test sets, although recognition of the actual code-switched words worsens (as expected). Therefore, we propose Point-of-Interest Error Rate (PIER), a variant of WER that focuses only on specific words of interest. We instantiate PIER on code-switched utterances and show that it describes code-switching performance more accurately, revealing substantial room for improvement in future work. This focused evaluation allows for a more precise assessment of model performance, particularly in challenging aspects such as inter-word and intra-word code-switching.
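The idea of restricting an error rate to words of interest can be illustrated with a minimal Python sketch. This is not the authors' reference implementation: the function names (`edit_ops`, `wer`, `pier_sketch`) and the boolean `is_poi` mask are hypothetical, and the sketch makes the simplifying assumption that only substitutions and deletions on reference words of interest are counted, with insertions ignored.

```python
from typing import List, Optional, Tuple


def edit_ops(ref: List[str], hyp: List[str]) -> List[Tuple[str, Optional[int]]]:
    """Levenshtein-align hyp to ref; return (operation, reference index) pairs."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + cost,  # match or substitution
                           dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1)         # insertion
    # Backtrace from the bottom-right corner, preferring the diagonal on ties.
    ops: List[Tuple[str, Optional[int]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (0 if same else 1):
            ops.append(("ok" if same else "sub", i - 1))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("del", i - 1))
            i -= 1
        else:
            ops.append(("ins", None))  # inserted word has no reference index
            j -= 1
    ops.reverse()
    return ops


def wer(ref: List[str], hyp: List[str]) -> float:
    """Standard word error rate: all edit operations over reference length."""
    return sum(op != "ok" for op, _ in edit_ops(ref, hyp)) / len(ref)


def pier_sketch(ref: List[str], hyp: List[str], is_poi: List[bool]) -> float:
    """Simplified point-of-interest error rate (assumption: insertions ignored)."""
    n_poi = sum(is_poi)
    if n_poi == 0:
        raise ValueError("reference contains no words of interest")
    errors = sum(1 for op, idx in edit_ops(ref, hyp)
                 if op in ("sub", "del") and is_poi[idx])
    return errors / n_poi


# Hypothetical German-English utterance; only "meeting" is code-switched.
ref = "ich habe das meeting heute verpasst".split()
hyp = "ich habe das mieting heute verpasst".split()
is_poi = [False, False, False, True, False, False]
print(wer(ref, hyp), pier_sketch(ref, hyp, is_poi))
```

In this toy utterance a single error on the only code-switched word gives a WER of 1/6 but a point-of-interest score of 1.0, which is the kind of gap between general and focused metrics the abstract describes.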
Santa Monica uses police drone to catch car burglar in the act
Santa Monica Police spotted and stopped a man who was burglarizing vehicles in a parking lot near the pier by using a drone. On July 6, a Santa Monica police officer was directing the department's drone back to the station from a radio call when the officer decided to survey the Fourth of July weekend crowd near the pier and the nearby parking lots. As the drone flew over Lot 1 North, the parking lot next to the pier, he noticed a man wandering the lot, according to a video the department posted on their YouTube account. "As [the pilot] watched, the subject approached an unoccupied parked vehicle, pulled out tools from his sweatshirt and quickly punched open the lock of the driver's side door," the department said in the video. The drone footage shows the suspected burglar break the lock of the driver's side door of a black SUV then climb into the car.
Multi-Robot Rendezvous in Unknown Environment with Limited Communication
Song, Kun, Chen, Gaoming, Liu, Wenhang, Xiong, Zhenhua
Rendezvous aims at gathering all robots at a specific location, which is an important collaborative behavior for multi-robot systems. However, in an unknown environment, it is challenging to achieve rendezvous. Previous research mainly focuses on special scenarios where communication is not allowed and each robot executes a random searching strategy, which is highly time-consuming, especially in large-scale environments. In this work, we focus on rendezvous in unknown environments where communication is available. We divide this task into two steps: rendezvous-based environment exploration with relative pose (RP) estimation, and rendezvous point selection. A new strategy called partitioned and incomplete exploration for rendezvous (PIER) is proposed to efficiently explore the unknown environment, where lightweight topological maps are constructed and shared among robots for RP estimation with very little communication. Then, a rendezvous point selection algorithm based on the merged topological map is proposed for efficient rendezvous for multi-robot systems. The effectiveness of the proposed methods is validated in both simulations and real-world experiments.
PIER: Permutation-Level Interest-Based End-to-End Re-ranking Framework in E-commerce
Shi, Xiaowen, Yang, Fan, Wang, Ze, Wu, Xiaoxu, Guan, Muzhi, Liao, Guogang, Wang, Yongkang, Wang, Xingxing, Wang, Dong
Re-ranking, which rearranges the ranking list by modeling the mutual influence among items to better meet users' demands, has drawn increased attention from both academia and industry. Many existing re-ranking methods directly take the initial ranking list as input and generate the optimal permutation through a well-designed context-wise model, which brings the evaluation-before-reranking problem. Meanwhile, evaluating all candidate permutations brings unacceptable computational costs in practice. Thus, to better balance efficiency and effectiveness, online systems usually use a two-stage architecture that first uses heuristic methods such as beam search to generate a suitable number of candidate permutations, which are then fed into the evaluation model to get the optimal permutation. However, existing methods in both stages can be improved in the following respects. In the generation stage, heuristic methods only use point-wise prediction scores and lack an effective judgment. In the evaluation stage, most existing context-wise evaluation models only consider the item context and lack more fine-grained feature context modeling. This paper presents a novel end-to-end re-ranking framework named PIER to tackle the above challenges, which still follows the two-stage architecture and contains two main modules named FPSM and OCPM. We apply SimHash in FPSM to efficiently select the top-K candidates from the full permutation based on the user's permutation-level interest. Then we design a novel omnidirectional attention mechanism in OCPM to capture the context information in the permutation. Finally, we jointly train these two modules end-to-end by introducing a comparative learning loss. Offline experiment results demonstrate that PIER outperforms baseline models on both public and industrial datasets, and we have successfully deployed PIER on the Meituan food delivery platform.
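As a rough illustration of how SimHash can narrow a large candidate set to a top-K shortlist, the following Python sketch hashes vectors with random hyperplanes and ranks candidates by Hamming distance to a user signature. It is a generic SimHash example, not the paper's FPSM: the helper names, the 32-bit signature length, and the idea that each candidate permutation and the user's interest are already available as fixed-length vectors are all assumptions made for illustration.

```python
import random
from typing import List, Sequence


def make_hyperplanes(dim: int, n_bits: int, seed: int = 0) -> List[List[float]]:
    """Draw n_bits random Gaussian hyperplanes (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def simhash(vec: Sequence[float], planes: List[List[float]]) -> int:
    """Pack one sign bit per hyperplane into an integer signature."""
    bits = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


def top_k_candidates(user_vec: Sequence[float],
                     candidates: List[Sequence[float]],
                     planes: List[List[float]],
                     k: int) -> List[int]:
    """Return indices of the k candidates whose signatures are closest to the user's."""
    user_sig = simhash(user_vec, planes)
    order = sorted(range(len(candidates)),
                   key=lambda i: hamming(user_sig, simhash(candidates[i], planes)))
    return order[:k]


# Hypothetical 4-dimensional embeddings of a user's interest and three permutations.
planes = make_hyperplanes(dim=4, n_bits=32)
user = [1.0, 0.5, -0.2, 0.3]
perms = [[-1.0, -0.5, 0.2, -0.3],   # points in the opposite direction
         [1.0, 0.5, -0.2, 0.3],     # identical to the user vector
         [0.2, -0.9, 0.4, 0.1]]
print(top_k_candidates(user, perms, planes, k=2))
```

Because signatures are compared with cheap bit operations rather than a full model forward pass, this kind of filter can be applied to far more candidates than a context-wise evaluator could score directly.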
How artificial intelligence robots can support NJ's underwater infrastructure Video NJTV News
Doctoral students at Stevens Institute of Technology hope the robot they're developing will be able to dive into waters and perform tasks that could be very dangerous for humans. "We would like the robot to be able to do infrastructure inspections, ideally to assess the integrity of underwater infrastructure, make sure everything is intact, working properly, that there are no damages or defects. Or potentially from a security standpoint, that there are no anomalies planted on an underwater piece of infrastructure," said Dr. Brendan Englot, professor of mechanical engineering. To do this, the robot must be able to understand its location in the water and be able to accurately find the structures it needs to assess. Through a process called machine learning, the robot gathers data and improves its own performance.
Review: Mavic 2 Pro drone soars overhead with best image quality
So when the new DJI Mavic 2 Pro takes off from the edge of the Pier, without bothering any of the local fishing enthusiasts, soars off without a hitch and returns safely, without hitting anything, you're a happy camper. And that was before we got a chance to examine the footage, which turns out to be way more impressive than what we saw with the previous edition of the Mavic Pro. Newsflash: This is a drone with a camera-size sensor and lens on it, that flies on command. The Mavic 2 is the update to the original Mavic Pro drone, which broke ground in 2016 as the first somewhat affordable, quality drone that was also compact enough to fit into a backpack.