AITopics | igma

SigmaCollab: An Application-Driven Dataset for Physically Situated Collaboration

Bohus, Dan, Andrist, Sean, Paradiso, Ann, Saw, Nick, Schoonbeek, Tim, Stiber, Maia

arXiv.org Artificial Intelligence, Nov-5-2025

We introduce SigmaCollab, a dataset enabling research on physically situated human-AI collaboration. The dataset consists of 85 sessions in which untrained participants were guided by a mixed-reality assistive AI agent in performing procedural tasks in the physical world. SigmaCollab includes a set of rich, multimodal data streams, such as the participant and system audio, egocentric camera views from the head-mounted device, depth maps, head, hand, and gaze tracking information, as well as additional annotations performed post hoc. While the dataset is relatively small (~14 hours), its application-driven and interactive nature brings to the fore novel research challenges for human-AI collaboration, and provides more realistic testing grounds for various AI models operating in this space. In future work, we plan to use the dataset to construct a set of benchmarks for physically situated collaboration in mixed-reality task-assistive scenarios. SigmaCollab is available at https://github.com/microsoft/SigmaCollab.

artificial intelligence, machine learning, natural language, (19 more...)

2511.0256

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Human Computer Interaction > Interfaces (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Lin, Zhenghao, Tang, Zihao, Liu, Xiao, Gong, Yeyun, Cheng, Yi, Chen, Qi, Li, Hang, Xin, Ying, Yang, Ziyue, Yang, Kailai, Yan, Yu, Liang, Xiao, Lu, Shuai, Huang, Yiming, Luo, Zheheng, Qu, Lei, Feng, Xuan, Wang, Yaoxiang, Xia, Yuqing, Chen, Feiyang, Jiang, Yuting, Hu, Yasen, Ni, Hao, Li, Binyang, Zhao, Guoshuai, Chiang, Jui-Hao, Guo, Zhongxin, Lin, Chen, Kuang, Kun, Li, Wenjie, Shen, Yelong, Jiao, Jian, Cheng, Peng, Yang, Mao

arXiv.org Artificial Intelligence, Feb-10-2025

We introduce Sigma, an efficient large language model specialized for the system domain, empowered by a novel architecture including DiffQKV attention, and pre-trained on our meticulously collected system domain data. DiffQKV attention significantly enhances the inference efficiency of Sigma by optimizing the Query (Q), Key (K), and Value (V) components in the attention mechanism differentially, based on their varying impacts on the model performance and efficiency indicators. Specifically, we (1) conduct extensive experiments that demonstrate the model's varying sensitivity to the compression of K and V components, leading to the development of differentially compressed KV, and (2) propose augmented Q to expand the Q head dimension, which enhances the model's representation capacity with minimal impact on inference speed. Rigorous theoretical and empirical analyses reveal that DiffQKV attention significantly enhances efficiency, achieving up to a 33.36% improvement in inference speed over the conventional grouped-query attention (GQA) in long-context scenarios. We pre-train Sigma on 6T tokens from various sources, including 19.5B tokens of system domain data that we carefully collect and 1T tokens of synthesized and rewritten data. In general domains, Sigma achieves comparable performance to other state-of-the-art models. In the system domain, we introduce the first comprehensive benchmark AIMicius, where Sigma demonstrates remarkable performance across all tasks, significantly outperforming GPT-4 with an absolute improvement of up to 52.5%.
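To make the idea of differentially compressed KV more concrete, the following is a minimal sketch of how the per-token KV cache shrinks when K and V are given different head counts. All head counts and dimensions are hypothetical values chosen for illustration, not figures from the paper, and the choice to compress K more than V is an assumption of this sketch. The names `AttentionShape` and `kv_cache_elements_per_token` are likewise invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionShape:
    """Hypothetical per-layer KV shape; values are illustrative, not from the paper."""
    n_k_heads: int
    n_v_heads: int
    head_dim: int


def kv_cache_elements_per_token(shape: AttentionShape) -> int:
    """Elements cached per token per layer.

    Only K and V are cached during decoding; Q is recomputed for each new
    token, so widening Q does not change this number.
    """
    return (shape.n_k_heads + shape.n_v_heads) * shape.head_dim


# Baseline: grouped-query attention, where K and V share the same head count.
gqa = AttentionShape(n_k_heads=8, n_v_heads=8, head_dim=128)

# Assumed differential setting: K is compressed to fewer heads than V.
diff_kv = AttentionShape(n_k_heads=4, n_v_heads=8, head_dim=128)

gqa_cache = kv_cache_elements_per_token(gqa)       # (8 + 8) * 128 = 2048
diff_cache = kv_cache_elements_per_token(diff_kv)  # (4 + 8) * 128 = 1536
saving = 1 - diff_cache / gqa_cache                # 0.25

print(f"GQA cache per token: {gqa_cache} elements")
print(f"Differential KV cache per token: {diff_cache} elements")
print(f"Relative cache reduction: {saving:.0%}")
```

Under these assumed shapes the cache is 25% smaller per token. This is only the memory arithmetic; it says nothing about the speed or quality results reported in the abstract.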

large language model, machine learning, natural language, (18 more...)

2501.13629

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

SIGMA: An Open-Source Interactive System for Mixed-Reality Task Assistance Research

Bohus, Dan, Andrist, Sean, Saw, Nick, Paradiso, Ann, Chakraborty, Ishani, Rad, Mahdi

arXiv.org Artificial Intelligence, May-16-2024

We introduce an open-source system called SIGMA (short for "Situated Interactive Guidance, Monitoring, and Assistance") as a platform for conducting research on task-assistive agents in mixed-reality scenarios. The system leverages the sensing and rendering affordances of a head-mounted mixed-reality device in conjunction with large language and vision models to guide users step by step through procedural tasks. We present the system's core capabilities, discuss its overall design and implementation, and outline directions for future research enabled by the system. SIGMA is easily extensible and provides a useful basis for future research at the intersection of mixed reality and AI. By open-sourcing an end-to-end implementation, we aim to lower the barrier to entry, accelerate research in this space, and chart a path towards community-driven end-to-end evaluation of large language, vision, and multimodal models in the context of real-world interactive applications.

application, igma, instruction, (17 more...)

2405.13035

Country:

North America > United States > Hawaii (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Middle East > Malta (0.04)
(2 more...)

Genre:

Workflow (0.93)
Research Report (0.64)

Industry:

Health & Medicine (0.68)
Education (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)
