We modified the original 60,000/10,000 train/test split to a 50,000/10,000/10,000 train/validate/test split by partitioning away the last 10,000 training samples to the validation set. First, NALSM exhibited coexistence of small and large synaptic weights, which is necessary for chaotic activity in spiking networks [5]. As expected, spike-train autocorrelation functions of input neurons remained flat, showing no decay with respect to lag time. For N-MNIST, the average was 3,033,045.40
TRUST: A Decentralized Framework for Auditing Large Language Model Reasoning
Huang, Morris Yu-Chao, Tan, Zhen, Zhang, Mohan, Li, Pingzhi, Zhang, Zhuo, Chen, Tianlong
Large Language Models generate complex reasoning chains that reveal their decision-making, yet verifying the faithfulness and harmlessness of these intermediate steps remains a critical unsolved problem. Existing auditing methods are centralized, opaque, and hard to scale, creating significant risks for deploying proprietary models in high-stakes domains. We identify four core challenges: (1) Robustness: Centralized auditors are single points of failure, prone to bias or attacks. (2) Scalability: Reasoning traces are too long for manual verification. (3) Opacity: Closed auditing undermines public trust. (4) Privacy: Exposing full reasoning risks model theft or distillation. We propose TRUST, a transparent, decentralized auditing framework that overcomes these limitations via: (1) A consensus mechanism among diverse auditors, guaranteeing correctness under up to $30\%$ malicious participants. (2) A hierarchical DAG decomposition of reasoning traces, enabling scalable, parallel auditing. (3) A blockchain ledger that records all verification decisions for public accountability. (4) Privacy-preserving segmentation, sharing only partial reasoning steps to protect proprietary logic. We provide theoretical guarantees for the security and economic incentives of the TRUST framework. Experiments across multiple LLMs (GPT-OSS, DeepSeek-r1, Qwen) and reasoning tasks (math, medical, science, humanities) show TRUST effectively detects reasoning flaws and remains robust against adversarial auditors. Our work pioneers decentralized AI auditing, offering a practical path toward safe and trustworthy LLM deployment.
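As a rough illustration of the consensus step, here is a minimal Python sketch of majority voting over auditor verdicts, with a quorum sized for the abstract's 30% malicious-participant bound. The Verdict/reach_consensus names and the exact quorum rule are assumptions for illustration, not the paper's protocol.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    auditor_id: str
    approves: bool  # True if the reasoning segment passes this auditor's check

def reach_consensus(verdicts: list[Verdict], malicious_fraction: float = 0.30) -> bool:
    """Accept a reasoning segment only with a supermalicious-proof quorum.

    Requiring strictly more than (1 + malicious_fraction) / 2 of all votes means a
    coalition of at most `malicious_fraction` adversarial auditors cannot force
    approval on its own, while unanimous honest approval still clears the bar.
    """
    if not verdicts:
        raise ValueError("no auditor verdicts supplied")
    approvals = sum(v.approves for v in verdicts)
    quorum = (1 + malicious_fraction) / 2 * len(verdicts)
    return approvals > quorum

# Example: 10 auditors; 7 honest approvals clear the >65% quorum,
# while the 3 assumed-malicious votes alone could not.
votes = [Verdict(f"a{i}", i < 7) for i in range(10)]
print(reach_consensus(votes))  # True
```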
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > North Carolina (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Structured Task Solving via Modular Embodied Intelligence: A Case Study on Rubik's Cube
Fan, Chongshan, Yuan, Shenghai
This paper presents Auto-RubikAI, a modular autonomous planning framework that integrates a symbolic Knowledge Base (KB), a vision-language model (VLM), and a large language model (LLM) to solve structured manipulation tasks exemplified by Rubik's Cube restoration. Unlike traditional robot systems based on predefined scripts, or modern approaches relying on pretrained networks and large-scale demonstration data, Auto-RubikAI enables interpretable, multi-step task execution with minimal data requirements and no prior demonstrations. The proposed system employs a KB module to solve group-theoretic restoration steps, overcoming LLMs' limitations in symbolic reasoning. A VLM parses RGB-D input to construct a semantic 3D scene representation, while the LLM generates structured robotic control code via prompt chaining. This tri-module architecture enables robust performance under spatial uncertainty. We deploy Auto-RubikAI in both simulation and real-world settings using a 7-DOF robotic arm, demonstrating effective Sim-to-Real adaptation without retraining. Experiments show a 79% end-to-end task success rate across randomized configurations. Compared to CFOP, DeepCubeA, and Two-Phase baselines, our KB-enhanced method reduces average solution steps while maintaining interpretability and safety. Auto-RubikAI provides a cost-efficient, modular foundation for embodied task planning in smart manufacturing, robotics education, and autonomous execution scenarios. Code, prompts, and hardware modules will be released upon publication.
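The tri-module flow can be sketched as a short prompt-chaining pipeline. The VLM and LLM stubs below are hypothetical placeholders, and the off-the-shelf kociemba two-phase solver stands in for the paper's group-theoretic knowledge-base module (which the authors report improves on Two-Phase baselines); none of this is the authors' code.

```python
# Minimal sketch of the KB -> VLM -> LLM prompt-chaining flow described above.
# parse_scene_with_vlm and generate_robot_code_with_llm are hypothetical stubs;
# kociemba (pip install kociemba) is a real two-phase solver used only as a
# stand-in for the paper's knowledge-base module.
import kociemba

def parse_scene_with_vlm(rgbd_frame) -> str:
    """Hypothetical VLM step: return the 54-sticker facelet string of the cube."""
    raise NotImplementedError

def generate_robot_code_with_llm(move_sequence: str) -> str:
    """Hypothetical LLM step: turn a move sequence into structured control code."""
    raise NotImplementedError

def plan(rgbd_frame) -> str:
    facelets = parse_scene_with_vlm(rgbd_frame)   # e.g. "UUUUUUUUURRR..."
    moves = kociemba.solve(facelets)              # KB step, e.g. "R U R' U' ..."
    return generate_robot_code_with_llm(moves)    # prompt-chained code generation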
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Why China's auto, tech giants threaten Tesla's self-driving future
Chinese electric-vehicle makers led by BYD beat Tesla in the competition to produce affordable electric vehicles. Now, many of those same fierce competitors are pulling into the passing lane in the global race to produce self-driving cars. BYD shook up China's smart-EV industry earlier this year by offering its "God's Eye" driver-assistance package for free, undercutting the technology Tesla sells for nearly $9,000 in China. "With God's Eye, Tesla's strategy starts to fall apart," said Shenzhen-based BYD investor Taylor Ogan, an American who has owned several Teslas and driven BYD cars with God's Eye, which he called more capable than Tesla's "Full Self-Driving" (FSD). Other Chinese auto and tech companies are offering affordable EVs with FSD-like technology for a relative pittance.
- Transportation > Ground > Road (1.00)
- Transportation > Electric Vehicle (1.00)
- Automobiles & Trucks > Manufacturer (1.00)
Using In-Context Learning for Automatic Defect Labelling of Display Manufacturing Data
Hussain, Babar, Liu, Qiang, Chen, Gang, She, Bihai, Yu, Dahai
This paper presents an AI-assisted auto-labeling system for display panel defect detection that leverages in-context learning capabilities. We adopt and enhance the SegGPT architecture with several domain-specific training techniques and introduce a scribble-based annotation mechanism to streamline the labeling process. Our two-stage training approach, validated on industrial display panel datasets, demonstrates significant improvements over the baseline model, achieving an average IoU increase of 0.22 and a 14% improvement in recall across multiple product types, while maintaining approximately 60% auto-labeling coverage. Experimental results show that models trained on our auto-labeled data match the performance of those trained on human-labeled data, offering a practical solution for reducing manual annotation efforts in industrial inspection systems.
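For readers unfamiliar with the reported metrics, a minimal sketch of mask IoU and auto-labeling coverage follows; the 0.5 keep-threshold and the helper names are illustrative assumptions, not the paper's acceptance rule.

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union between two boolean defect masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return np.logical_and(pred, gt).sum() / union

def auto_label_coverage(confidences: list[float], threshold: float = 0.5) -> float:
    """Fraction of images the auto-labeler keeps; the rest go to human annotators."""
    kept = sum(c >= threshold for c in confidences)
    return kept / len(confidences)
```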
AI-Assisted Decision-Making for Clinical Assessment of Auto-Segmented Contour Quality
Wang, Biling, Maniscalco, Austen, Bai, Ti, Wang, Siqiu, Dohopolski, Michael, Lin, Mu-Han, Shen, Chenyang, Nguyen, Dan, Huang, Junzhou, Jiang, Steve, Wang, Xinlei
Purpose: This study introduces a novel Deep Learning (DL)-based quality assessment (QA) approach specifically designed for evaluating auto-generated contours (auto-contours) in auto-segmentation for radiotherapy, with a focus on Online Adaptive Radiotherapy (OART). The proposed method leverages Bayesian Ordinal Classification (BOC), combined with calibrated thresholds derived from uncertainty quantification, to deliver confident QA predictions. This approach addresses key challenges in clinical auto-segmentation QA workflows such as the absence of ground truth contours, limited availability of manually labeled data, and inherent uncertainty in AI model predictions. Methods: We developed a BOC model to classify the quality of auto-contours and quantify uncertainty. To enhance predictive reliability, we implemented a calibration step to determine optimal uncertainty thresholds that meet specific clinical accuracy requirements. The method was validated under three distinct data availability scenarios: absence of manual labels, limited manual labeling, and extensive manual labeling. We specifically tested our method on auto-segmented rectum contours in prostate cancer radiotherapy. Geometric surrogate labels were employed in the absence of manual labels, transfer learning was applied when manual labels were limited, and manual labels were used directly when extensive labeling was available. Results: The BOC model demonstrated robust performance across all data availability scenarios for confident predictions, with significant accuracy gains when pre-trained with surrogate labels and fine-tuned with limited manually labeled data. Specifically, fine-tuning the pretrained model with just 30 manually labeled cases and calibrating with 34 subjects achieved an accuracy of over 90% against manual labels in the test dataset.
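A hedged sketch of the calibration idea: sweep an uncertainty cutoff on a calibration set and keep the loosest cutoff whose confidently predicted cases still meet the target accuracy. The entropy-based uncertainty measure and the sweep below are assumptions; the paper's exact BOC calibration rule may differ.

```python
import numpy as np

def calibrate_uncertainty_threshold(probs, labels, target_accuracy=0.90):
    """Pick the loosest entropy cutoff whose 'confident' subset still meets the
    target accuracy on a calibration set.

    probs:  (n_cases, n_quality_grades) predictive probabilities
    labels: (n_cases,) manually assigned quality grades
    """
    probs, labels = np.asarray(probs), np.asarray(labels)
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)  # predictive uncertainty
    preds = probs.argmax(axis=1)
    best = None
    for t in np.sort(entropy):                 # candidate cutoffs, ascending
        confident = entropy <= t               # cases the model may decide on its own
        acc = (preds[confident] == labels[confident]).mean()
        if acc >= target_accuracy:
            best = t                           # ends as the largest passing cutoff
    return best
```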
- North America > United States > Texas > Dallas County > Dallas (0.04)
- North America > United States > Texas > Tarrant County > Arlington (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Geometrically-Aware One-Shot Skill Transfer of Category-Level Objects
de Farias, Cristiana, Figueredo, Luis, Laha, Riddhiman, Adjigble, Maxime, Tamadazte, Brahim, Stolkin, Rustam, Haddadin, Sami, Marturi, Naresh
Robotic manipulation of unfamiliar objects in new environments is challenging and requires extensive training or laborious pre-programming. We propose a new skill transfer framework, which enables a robot to transfer complex object manipulation skills and constraints from a single human demonstration. Our approach addresses the challenge of skill acquisition and task execution by deriving geometric representations from demonstrations focusing on object-centric interactions. By leveraging the Functional Maps (FM) framework, we efficiently map interaction functions between objects and their environments, allowing the robot to replicate task operations across objects of similar topologies or categories, even when they have significantly different shapes. Additionally, our method incorporates a Task-Space Imitation Algorithm (TSIA) which generates smooth, geometrically-aware robot paths to ensure the transferred skills adhere to the demonstrated task constraints. We validate the effectiveness and adaptability of our approach through extensive experiments, demonstrating successful skill transfer and task execution in diverse real-world environments without requiring additional training.
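The Functional Maps step the abstract relies on can be illustrated with the standard least-squares formulation: find the spectral map C with C A approximately equal to B between corresponding descriptor coefficients on two shapes. This is a generic textbook-style sketch, not the authors' implementation.

```python
import numpy as np

def fit_functional_map(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Least-squares functional map C with C @ A ~= B.

    A: (k, d) spectral coefficients of d descriptor functions on the source shape
    B: (k, d) coefficients of the corresponding descriptors on the target shape
    Both are expressed in the first k Laplace-Beltrami eigenfunctions of each shape.
    """
    # Solve A.T @ C.T ~= B.T, then transpose back to get C.
    C_T, *_ = np.linalg.lstsq(A.T, B.T, rcond=None)
    return C_T.T

def transfer_function(C: np.ndarray, a: np.ndarray) -> np.ndarray:
    """Map a function's spectral coefficients from the source to the target shape."""
    return C @ a
```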
- Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
- Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- (3 more...)
Deep Sequence Models for Predicting Average Shear Wave Velocity from Strong Motion Records
Yilmaz, Baris, Akagündüz, Erdem, Tileylioglu, Salih
This study explores the use of deep learning for predicting the time-averaged shear wave velocity in the top 30 m of the subsurface ($V_{s30}$) at strong motion recording stations in Türkiye. $V_{s30}$ is a key parameter in site characterization and, as a result, in seismic hazard assessment. However, it is often unavailable due to the lack of direct measurements and is therefore estimated using empirical correlations. Such correlations, however, are commonly inadequate in capturing complex, site-specific variability, which motivates the need for data-driven approaches. In this study, we employ a hybrid deep learning model combining convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to capture both spatial and temporal dependencies in strong motion records. Furthermore, we explore how using different parts of the signal influences our deep learning model. Our results suggest that the hybrid approach effectively learns complex, nonlinear relationships within seismic signals. We observed that an improved P-wave arrival time model increased the prediction accuracy of $V_{s30}$. We believe the study provides valuable insights into improving $V_{s30}$ predictions using a CNN-LSTM framework, demonstrating its potential for improving site characterization for seismic studies. Our code is available at: https://github.com/brsylmz23/CNNLSTM_DeepEQ
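Below is a hedged PyTorch sketch of a CNN-LSTM regressor of the kind described: convolutions extract local waveform features, an LSTM models their temporal evolution, and a linear head outputs a scalar $V_{s30}$. Layer sizes and the single-layer LSTM are illustrative guesses; the authors' actual architecture is in the linked repository.

```python
import torch
import torch.nn as nn

class CnnLstmVs30(nn.Module):
    """Hypothetical CNN-LSTM regressor: 3-component strong-motion waveform -> Vs30 (m/s)."""

    def __init__(self, in_channels: int = 3, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(                     # local waveform features
            nn.Conv1d(in_channels, 32, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
        )
        self.lstm = nn.LSTM(64, hidden, batch_first=True)  # temporal dependencies
        self.head = nn.Linear(hidden, 1)                    # scalar Vs30 estimate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time)
        feats = self.cnn(x).transpose(1, 2)       # -> (batch, time', 64)
        _, (h_n, _) = self.lstm(feats)
        return self.head(h_n[-1]).squeeze(-1)     # -> (batch,)
```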
- Asia > Middle East > Republic of Türkiye (0.47)
- North America > United States (0.46)
Overconfident Oracles: Limitations of In Silico Sequence Design Benchmarking
Surana, Shikha, Grinsztajn, Nathan, Atkinson, Timothy, Duckworth, Paul, Barrett, Thomas D.
Machine learning methods can automate the in silico design of biological sequences, aiming to reduce costs and accelerate medical research. Given the limited access to wet labs, in silico design methods commonly use an oracle model to evaluate de novo generated sequences. However, the use of different oracle models across methods makes it challenging to compare them reliably, motivating the question: are in silico sequence design benchmarks reliable? In this work, we examine 12 sequence design methods that utilise ML oracles common in the literature and find that there are significant challenges with their cross-consistency and reproducibility. Indeed, oracles differing by architecture, or even just training seed, are shown to yield conflicting relative performance with our analysis suggesting poor out-of-distribution generalisation as a key issue. To address these challenges, we propose supplementing the evaluation with a suite of biophysical measures to assess the viability of generated sequences and limit out-of-distribution sequences the oracle is required to score, thereby improving the robustness of the design procedure. Our work aims to highlight potential pitfalls in the current evaluation process and contribute to the development of robust benchmarks, ultimately driving the improvement of in silico design methods.
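One way to quantify the cross-consistency problem the abstract raises is rank agreement between oracles. The sketch below uses Spearman correlation over two oracles' scores for the same set of design methods; the numbers are made up purely for illustration.

```python
from scipy.stats import spearmanr

def oracle_rank_agreement(scores_a: list[float], scores_b: list[float]) -> float:
    """Spearman correlation between how two oracles rank the same design methods.

    Values near 1 mean the oracles agree on which methods are best; low or negative
    values mean the benchmark conclusion depends on the oracle choice.
    """
    rho, _ = spearmanr(scores_a, scores_b)
    return rho

# Illustrative (made-up) numbers: two oracles scoring five design methods.
print(oracle_rank_agreement([0.9, 0.7, 0.6, 0.4, 0.2],
                            [0.5, 0.8, 0.9, 0.3, 0.6]))
```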
- Europe > Austria > Vienna (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
Iterative Auto-Annotation for Scientific Named Entity Recognition Using BERT-Based Models
This paper presents an iterative approach to performing Scientific Named Entity Recognition (SciNER) using BERT-based models. We leverage transfer learning to fine-tune pre-trained models with a small but high-quality set of manually annotated data. The process is iteratively refined by using the fine-tuned model to auto-annotate a larger dataset, followed by additional rounds of fine-tuning. We evaluated two models, dslim/bert-large-NER and bert-large-cased, and found that bert-large-cased consistently outperformed the former. Our approach demonstrated significant improvements in prediction accuracy and F1 scores, especially for less common entity classes. Future work could include pre-training with unlabeled data, exploring more powerful encoders like RoBERTa, and expanding the scope of manual annotations. This methodology has broader applications in NLP tasks where access to labeled data is limited.
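A minimal sketch of one auto-annotation pass with the Hugging Face transformers pipeline follows; the 0.9 confidence cutoff and the surrounding loop structure are assumptions for illustration rather than the paper's exact procedure.

```python
from transformers import pipeline

def auto_annotate(model_dir: str, unlabeled_sentences: list[str], min_score: float = 0.9):
    """One auto-annotation pass: keep only high-confidence entity predictions."""
    ner = pipeline("token-classification", model=model_dir,
                   aggregation_strategy="simple")
    silver = []
    for sent in unlabeled_sentences:
        ents = [e for e in ner(sent) if e["score"] >= min_score]  # confident entities only
        if ents:
            silver.append((sent, ents))
    return silver  # feed back into the next round of fine-tuning
```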
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > District of Columbia > Washington (0.04)