A Proofs of Main Results

Neural Information Processing Systems

(conclusion 1). (conclusion 2). Z contains and only contains exogenous noises w.r.t. " means source and " Based on Theorem 6, we can readily give proof to Theorem 2. Note that in our setting where " is equivalent to " Theorem 7 (Trek-separation for directed graphical models, Theorem 2.8 in [ We now show that Theorem 2 can also be proved by trek-separation theorem: Proof of Theorem 2 (another version). 's noise components that is not shared in Therefore, the direction between X and Y is unidentifiable. GIN( Z, Y) must hold, with solution ω .



SigmaCollab: An Application-Driven Dataset for Physically Situated Collaboration

Bohus, Dan, Andrist, Sean, Paradiso, Ann, Saw, Nick, Schoonbeek, Tim, Stiber, Maia

arXiv.org Artificial Intelligence

We introduce SigmaCollab, a dataset enabling research on physically situated human-AI collaboration. The dataset consists of a set of 85 sessions in which untrained participants were guided by a mixed-reality assistive AI agent in performing procedural tasks in the physical world. SigmaCollab includes a set of rich, multimodal data streams, such as the participant and system audio, egocentric camera views from the head-mounted device, depth maps, head, hand and gaze tracking information, as well as additional annotations performed post-hoc. While the dataset is relatively small in size (~14 hours), its application-driven and interactive nature brings to the fore novel research challenges for human-AI collaboration, and provides more realistic testing grounds for various AI models operating in this space. In future work, we plan to use the dataset to construct a set of benchmarks for physically situated collaboration in mixed-reality task assistive scenarios. SigmaCollab is available at https://github.com/microsoft/SigmaCollab.


DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning

Gao, Zeyu, Mu, Yao, Qu, Jinye, Hu, Mengkang, Guo, Lingyue, Luo, Ping, Lu, Yanfeng

arXiv.org Artificial Intelligence

Dual-arm robots offer enhanced versatility and efficiency over single-arm counterparts by enabling concurrent manipulation of multiple objects or cooperative execution of tasks using both arms. However, effectively coordinating the two arms for complex long-horizon tasks remains a significant challenge. Existing task planning methods predominantly focus on single-arm robots or rely on predefined bimanual operations, failing to fully leverage the capabilities of dual-arm systems. To address this limitation, we introduce DAG-Plan, a structured task planning framework tailored for dual-arm robots. DAG-Plan harnesses large language models (LLMs) to decompose intricate tasks into actionable sub-tasks represented as nodes within a directed acyclic graph (DAG). Critically, DAG-Plan dynamically assigns these sub-tasks to the appropriate arm based on real-time environmental observations, enabling parallel and adaptive execution. We evaluate DAG-Plan on the novel Dual-Arm Kitchen Benchmark, comprising 9 sequential tasks with 78 sub-tasks and 26 objects. Extensive experiments demonstrate the superiority of DAG-Plan over directly using an LLM to generate plans, achieving nearly 50% higher efficiency compared to the single-arm task planning baseline and nearly double the success rate of the dual-arm task planning baseline.
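To make the idea of executing a sub-task DAG on two arms concrete, here is a minimal sketch. It is not DAG-Plan's implementation: the function name `schedule_dual_arm`, the kitchen sub-tasks, and the greedy "first ready sub-task goes to the first arm" rule are all assumptions for illustration, whereas the abstract describes arm assignment driven by real-time environmental observations.

```python
def schedule_dual_arm(subtasks, deps, arms=("left", "right")):
    """Greedily schedule DAG sub-tasks onto two arms, one step at a time.

    subtasks: list of sub-task names (list order breaks ties)
    deps:     dict mapping a sub-task to the set of sub-tasks it depends on
    Returns a list of steps; each step maps an arm name to one sub-task.
    """
    remaining = {t: set(deps.get(t, ())) for t in subtasks}
    done = set()
    steps = []
    while remaining:
        # A sub-task is ready once all of its dependencies have finished.
        ready = [t for t in subtasks if t in remaining and remaining[t] <= done]
        if not ready:
            raise ValueError("no runnable sub-task: cycle or unknown dependency")
        # Hypothetical assignment rule: pair arms with ready sub-tasks in order,
        # so at most one sub-task per arm runs in each step.
        step = dict(zip(arms, ready))
        steps.append(step)
        for task in step.values():
            del remaining[task]
        done.update(step.values())
    return steps


# Hypothetical kitchen task, invented for this example.
subtasks = ["open_fridge", "grab_bowl", "take_milk", "pour_milk"]
deps = {
    "take_milk": {"open_fridge"},
    "pour_milk": {"take_milk", "grab_bowl"},
}
plan = schedule_dual_arm(subtasks, deps)
for i, step in enumerate(plan, start=1):
    print(i, step)
```

In this toy plan the two independent sub-tasks run in parallel in the first step, so the four sub-tasks finish in three steps instead of four. A cyclic dependency raises `ValueError`, since the graph is then not a DAG.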


Temporal Interest Network for Click-Through Rate Prediction

Zhou, Haolin, Pan, Junwei, Zhou, Xinyi, Chen, Xihua, Jiang, Jie, Gao, Xiaofeng, Chen, Guihai

arXiv.org Artificial Intelligence

The history of user behaviors constitutes one of the most significant characteristics in predicting the click-through rate (CTR), owing to their strong semantic and temporal correlation with the target item. While the literature has individually examined each of these correlations, research has yet to analyze them in combination, that is, the quadruple correlation of (behavior semantics, target semantics, behavior temporal, and target temporal). The effect of this correlation on performance and the extent to which existing methods learn it remain unknown. To address this gap, we empirically measure the quadruple correlation and observe intuitive yet robust quadruple patterns. We measure the learned correlation of several representative user behavior methods, but to our surprise, none of them learn such a pattern, especially the temporal one. In this paper, we propose the Temporal Interest Network (TIN) to capture the quadruple semantic and temporal correlation between behaviors and the target. We achieve this by incorporating target-aware temporal encoding, in addition to semantic embedding, to represent behaviors and the target. Furthermore, we deploy target-aware attention, along with target-aware representation, to explicitly conduct the 4-way interaction. We performed comprehensive evaluations on the Amazon and Alibaba datasets. Our proposed TIN outperforms the best-performing baselines by 0.43% and 0.29% on the two datasets, respectively. Comprehensive analysis and visualization show that TIN is indeed capable of learning the quadruple correlation effectively, while all existing methods fail to do so. We provide our implementation of TIN in TensorFlow.
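The combination of temporal encoding, target-aware attention, and target-aware representation can be sketched in a few lines. This is an illustrative sketch only, not the authors' TensorFlow implementation: the function name `tin_style_interest`, the dot-product scoring, the softmax normalization, and the element-wise product used as the target-aware representation are all assumptions, and the embeddings are hand-picked toy values.

```python
import math


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def tin_style_interest(behavior_sem, behavior_pos, target_sem, target_pos):
    """Aggregate past behaviors into one interest vector.

    behavior_sem: semantic embedding of each past behavior
    behavior_pos: temporal encoding of each behavior (e.g. indexed by its
                  position relative to the target)
    target_sem / target_pos: the same two parts for the target item
    """
    # Each side is represented as semantic embedding + temporal encoding.
    target = add(target_sem, target_pos)
    behaviors = [add(s, p) for s, p in zip(behavior_sem, behavior_pos)]

    # Target-aware attention (assumed here: dot product, then softmax).
    scores = [dot(b, target) for b in behaviors]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Target-aware representation (assumed here: element-wise product),
    # summed with the attention weights.
    interest = [0.0] * len(target)
    for w, b in zip(weights, behaviors):
        interest = [acc + w * x * t for acc, x, t in zip(interest, b, target)]
    return weights, interest


# Two behaviors with identical semantics but different temporal encodings.
weights, interest = tin_style_interest(
    behavior_sem=[[1.0, 0.0], [1.0, 0.0]],
    behavior_pos=[[0.0, 0.0], [1.0, 0.0]],
    target_sem=[1.0, 0.0],
    target_pos=[0.0, 0.0],
)
print(weights, interest)
```

Because the two behaviors differ only in their temporal encoding, any difference in their attention weights comes from the temporal part, which is the kind of signal the abstract says existing methods fail to learn.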


Automated Testing and Improvement of Named Entity Recognition Systems

Yu, Boxi, Hu, Yiyan, Mang, Qiuyang, Hu, Wenhan, He, Pinjia

arXiv.org Artificial Intelligence

Named entity recognition (NER) systems have seen rapid progress in recent years due to the development of deep neural networks. These systems are widely used in various natural language processing applications, such as information extraction, question answering, and sentiment analysis. However, the complexity and intractability of deep neural networks can make NER systems unreliable in certain circumstances, resulting in incorrect predictions. For example, NER systems may misidentify female names as chemicals or fail to recognize the names of minority groups, leading to user dissatisfaction. To tackle this problem, we introduce TIN, a novel, widely applicable approach for automatically testing and repairing various NER systems. The key idea for automated testing is that the NER predictions of the same named entities under similar contexts should be identical. The core idea for automated repairing is that similar named entities should have the same NER prediction under the same context. We use TIN to test two SOTA NER models and two commercial NER APIs, i.e., Azure NER and AWS NER. We manually verify 784 of the suspicious issues reported by TIN and find that 702 are erroneous issues, leading to high precision (85.0%-93.4%) across four categories of NER errors: omission, over-labeling, incorrect category, and range error. For automated repairing, TIN achieves a high error reduction rate (26.8%-50.6%) over the four systems under test, which successfully repairs 1,056 out of the 1,877 reported NER errors.
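The testing idea, that the same named entity under similar contexts should receive identical predictions, can be illustrated with a small consistency check. This is a sketch under stated assumptions, not TIN's implementation: `toy_ner` is a hypothetical stand-in with a deliberately planted bug, `find_inconsistencies` is an invented helper, and the similar contexts are written by hand rather than generated automatically.

```python
def toy_ner(sentence):
    """Hypothetical stand-in for an NER system: returns {entity_text: label}."""
    labels = {}
    if "Marie Curie" in sentence:
        # Deliberately planted bug: the context word "radium" flips the label.
        labels["Marie Curie"] = "CHEMICAL" if "radium" in sentence else "PERSON"
    return labels


def find_inconsistencies(ner, entity, contexts):
    """Flag contexts where the entity's label differs from the first context's."""
    predictions = [
        (template, ner(template.format(entity=entity)).get(entity))
        for template in contexts
    ]
    reference_context, reference_label = predictions[0]
    return [
        (reference_context, template, reference_label, label)
        for template, label in predictions[1:]
        if label != reference_label
    ]


# Hand-written similar contexts (an assumption made for this example).
contexts = [
    "{entity} gave a lecture in Paris.",
    "{entity} gave a talk in Paris.",
    "{entity} studied radium in Paris.",
]
issues = find_inconsistencies(toy_ner, "Marie Curie", contexts)
for reference, suspicious, expected, got in issues:
    print(f"suspicious: {suspicious!r} gives {got}, reference gives {expected}")
```

A flagged pair is only a suspicious issue, not a confirmed error, which matches the abstract's manual verification step: the check cannot tell which of the two predictions is wrong without a human or a second signal.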


Independence Testing-Based Approach to Causal Discovery under Measurement Error and Linear Non-Gaussian Models

Dai, Haoyue, Spirtes, Peter, Zhang, Kun

arXiv.org Artificial Intelligence

Causal discovery aims to recover causal structures generating the observational data. Despite its success in certain problems, in many real-world scenarios the observed variables are not the target variables of interest, but the imperfect measures of the target variables. Causal discovery under measurement error aims to recover the causal graph among unobserved target variables from observations made with measurement error. We consider a specific formulation of the problem, where the unobserved target variables follow a linear non-Gaussian acyclic model, and the measurement process follows the random measurement error model. Existing methods on this formulation rely on non-scalable over-complete independent component analysis (OICA). In this work, we propose the Transformed Independent Noise (TIN) condition, which checks for independence between a specific linear transformation of some measured variables and certain other measured variables. By leveraging the non-Gaussianity and higher-order statistics of data, TIN is informative about the graph structure among the unobserved target variables. By utilizing TIN, the ordered group decomposition of the causal model is identifiable. In other words, we could achieve what once required OICA to achieve by only conducting independence tests. Experimental results on both synthetic and real-world data demonstrate the effectiveness and reliability of our method.