dos
DOS: Dependency-Oriented Sampler for Masked Diffusion Language Models
Zhou, Xueyu, Hu, Yangrong, Huang, Jian
Masked diffusion language models (MDLMs) have recently emerged as a new paradigm in language modeling, offering flexible generation dynamics and enabling efficient parallel decoding. However, existing decoding strategies for pre-trained MDLMs predominantly rely on token-level uncertainty criteria, while largely overlooking sequence-level information and inter-token dependencies. To address this limitation, we propose Dependency-Oriented Sampler (DOS), a training-free decoding strategy that leverages inter-token dependencies to inform token updates during generation. Specifically, DOS exploits attention matrices from transformer blocks to approximate inter-token dependencies, emphasizing information from unmasked tokens when updating masked positions. Empirical results demonstrate that DOS consistently achieves superior performance on both code generation and mathematical reasoning tasks. Moreover, DOS can be seamlessly integrated with existing parallel sampling methods, leading to improved generation efficiency without sacrificing generation quality.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Density of States Prediction of Crystalline Materials via Prompt-guided Multi-Modal Transformer Namkyeong Lee
That is, DOS is not solely determined by the crystalline material but also by the energy levels, which has been neglected in previous works. In this paper, we propose to integrate heterogeneous information obtained from the crystalline materials and the energies via a multi-modal transformer, thereby modeling the complex relationships between the atoms in the crystalline materials and various energy levels for DOS prediction. Moreover, we propose to utilize prompts to guide the model to learn the crystal structural system-specific interactions between crystalline materials and energies. Extensive experiments on two types of DOS, i.e., Phonon DOS and Electron DOS, with various real-world scenarios demonstrate the superiority of DOST ransformer .
- North America > United States (0.14)
- Asia > Middle East > Israel (0.04)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Energy (0.93)
- Health & Medicine (0.67)
- Materials (0.67)
Bayesian Optimization of Multi-Bit Pulse Encoding in In2O3/Al2O3 Thin-film Transistors for Temporal Data Processing
Meza-Arroyo, Javier, Dunn, Benius, Xu, Weijie, Chen, Yu-Chieh, Chen, Jen-Sue, Hsu, Julia W. P.
Utilizing the intrinsic history-dependence and nonlinearity of hardware, physical reservoir computing is a promising neuromorphic approach to encode time-series data for in-sensor computing. The accuracy of this encoding critically depends on the distinguishability of multi-state outputs, which is often limited by suboptimal and empirically chosen reservoir operation conditions. In this work, we demonstrate a machine learning approach, Bayesian optimization, to improve the encoding fidelity of solution-processed Al2O3/In2O3 thin-film transistors (TFTs). We show high-fidelity 6-bit temporal encoding by exploring five key pulse parameters and using the normalized degree of separation (nDoS) as the metric of output state separability. Additionally, we show that a model trained on simpler 4-bit data can effectively guide optimization of more complex 6-bit encoding tasks, reducing experimental cost. Specifically, for the encoding and reconstruction of binary-patterned images of a moving car across 6 sequential frames, we demonstrate that the encoding is more accurate when operating the TFT using optimized pulse parameters and the 4-bit optimized operating condition performs almost as well as the 6-bit optimized condition. Finally, interpretability analysis via Shapley Additive Explanations (SHAP) reveals that gate pulse amplitude and drain voltage are the most influential parameters in achieving higher state separation. This work presents the first systematic method to identify optimal operating conditions for reservoir devices, and the approach can be extended to other physical reservoir implementations across different material platforms.
- North America > United States (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Taiwan (0.04)
- Materials > Chemicals (0.46)
- Information Technology > Software (0.41)
- North America > United States (0.14)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Energy (0.93)
- Health & Medicine (0.67)
- Materials (0.67)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Netherlands (0.04)
Enhancing ML Model Interpretability: Leveraging Fine-Tuned Large Language Models for Better Understanding of AI
Bokstaller, Jonas, Altheimer, Julia, Dormehl, Julian, Buss, Alina, Wiltfang, Jasper, Schneider, Johannes, Röglinger, Maximilian
Across various sectors applications of eXplainableAI (XAI) gained momentum as the increasing black-boxedness of prevailing Machine Learning (ML) models became apparent. In parallel, Large Language Models (LLMs) significantly developed in their abilities to understand human language and complex patterns. By combining both, this paper presents a novel reference architecture for the interpretation of XAI through an interactive chatbot powered by a fine-tuned LLM. We instantiate the reference architecture in the context of State-of-Health (SoH) prediction for batteries and validate its design in multiple evaluation and demonstration rounds. The evaluation indicates that the implemented prototype enhances the human interpretability of ML, especially for users with less experience with XAI.
- Europe > Austria > Vienna (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- Europe > Liechtenstein > Vaduz > Vaduz (0.04)
- (14 more...)
- Research Report (1.00)
- Overview (0.93)
- Information Technology (0.93)
- Energy > Energy Storage (0.68)
- Electrical Industrial Apparatus (0.68)
Interpretable, multi-dimensional Evaluation Framework for Causal Discovery from observational i.i.d. Data
Velev, Georg, Lessmann, Stefan
Nonlinear causal discovery from observational data imposes strict identifiability assumptions on the formulation of structural equations utilized in the data generating process. The evaluation of structure learning methods under assumption violations requires a rigorous and interpretable approach, which quantifies both the structural similarity of the estimation with the ground truth and the capacity of the discovered graphs to be used for causal inference. Motivated by the lack of unified performance assessment framework, we introduce an interpretable, six-dimensional evaluation metric, i.e., distance to optimal solution (DOS), which is specifically tailored to the field of causal discovery. Furthermore, this is the first research to assess the performance of structure learning algorithms from seven different families on increasing percentage of non-identifiable, nonlinear causal patterns, inspired by real-world processes. Our large-scale simulation study, which incorporates seven experimental factors, shows that besides causal order-based methods, amortized causal discovery delivers results with comparatively high proximity to the optimal solution.
- Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom (0.04)
- Europe > Germany > Berlin (0.04)
- Education (0.34)
- Health & Medicine (0.31)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
- (2 more...)
MYCROFT: Towards Effective and Efficient External Data Augmentation
Sarwar, Zain, Tran, Van, Bhagoji, Arjun Nitin, Feamster, Nick, Zhao, Ben Y., Chakraborty, Supriyo
Machine learning (ML) models often require large amounts of data to perform well. When the available data is limited, model trainers may need to acquire more data from external sources. Often, useful data is held by private entities who are hesitant to share their data due to propriety and privacy concerns. This makes it challenging and expensive for model trainers to acquire the data they need to improve model performance. To address this challenge, we propose Mycroft, a data-efficient method that enables model trainers to evaluate the relative utility of different data sources while working with a constrained data-sharing budget. By leveraging feature space distances and gradient matching, Mycroft identifies small but informative data subsets from each owner, allowing model trainers to maximize performance with minimal data exposure. Experimental results across four tasks in two domains show that Mycroft converges rapidly to the performance of the full-information baseline, where all data is shared. Moreover, Mycroft is robust to noise and can effectively rank data owners by utility. Mycroft can pave the way for democratized training of high performance ML models.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
Towards Safe and Reliable Autonomous Driving: Dynamic Occupancy Set Prediction
Shao, Wenbo, Xu, Jiahui, Yu, Wenhao, Li, Jun, Wang, Hong
In the rapidly evolving field of autonomous driving, reliable prediction is pivotal for vehicular safety. However, trajectory predictions often deviate from actual paths, particularly in complex and challenging environments, leading to significant errors. To address this issue, our study introduces a novel method for Dynamic Occupancy Set (DOS) prediction, it effectively combines advanced trajectory prediction networks with a DOS prediction module, overcoming the shortcomings of existing models. It provides a comprehensive and adaptable framework for predicting the potential occupancy sets of traffic participants. The innovative contributions of this study include the development of a novel DOS prediction model specifically tailored for navigating complex scenarios, the introduction of precise DOS mathematical representations, and the formulation of optimized loss functions that collectively advance the safety and efficiency of autonomous systems. Through rigorous validation, our method demonstrates marked improvements over traditional models, establishing a new benchmark for safety and operational efficiency in intelligent transportation systems.
- Automobiles & Trucks (1.00)
- Transportation > Ground > Road (0.73)
- Information Technology > Robotics & Automation (0.64)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Data Science > Data Mining (0.94)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Agent-oriented Joint Decision Support for Data Owners in Auction-based Federated Learning
Tang, Xiaoli, Yu, Han, Li, Xiaoxiao
Auction-based Federated Learning (AFL) has attracted extensive research interest due to its ability to motivate data owners (DOs) to join FL through economic means. While many existing AFL methods focus on providing decision support to model users (MUs) and the AFL auctioneer, decision support for data owners remains open. To bridge this gap, we propose a first-of-its-kind agent-oriented joint Pricing, Acceptance and Sub-delegation decision support approach for data owners in AFL (PAS-AFL). By considering a DO's current reputation, pending FL tasks, willingness to train FL models, and its trust relationships with other DOs, it provides a systematic approach for a DO to make joint decisions on AFL bid acceptance, task sub-delegation and pricing based on Lyapunov optimization to maximize its utility. It is the first to enable each DO to take on multiple FL tasks simultaneously to earn higher income for DOs and enhance the throughput of FL tasks in the AFL ecosystem. Extensive experiments based on six benchmarking datasets demonstrate significant advantages of PAS-AFL compared to six alternative strategies, beating the best baseline by 28.77% and 2.64% on average in terms of utility and test accuracy of the resulting FL models, respectively.
- Asia > Singapore (0.05)
- North America > Canada > Ontario > Toronto (0.04)
- North America > Canada > British Columbia (0.04)
- Asia > China (0.04)