AITopics | xyz

xyz

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins

Mu, Yao, Chen, Tianxing, Chen, Zanxin, Peng, Shijia, Lan, Zhiqian, Gao, Zeyu, Liang, Zhixuan, Yu, Qiaojun, Zou, Yude, Xu, Mingkun, Lin, Lunkai, Xie, Zhiqiang, Ding, Mingyu, Luo, Ping

arXiv.org Artificial Intelligence, Apr-18-2025

In the rapidly advancing field of robotics, dual-arm coordination and complex object manipulation are essential capabilities for developing advanced autonomous systems. However, the scarcity of diverse, high-quality demonstration data and real-world-aligned evaluation benchmarks severely limits such development. To address this, we introduce RoboTwin, a generative digital twin framework that uses 3D generative foundation models and large language models to produce diverse expert datasets and provide a real-world-aligned evaluation platform for dual-arm robotic tasks. Specifically, RoboTwin creates varied digital twins of objects from single 2D images, generating realistic and interactive scenarios. It also introduces a spatial relation-aware code generation framework that combines object annotations with large language models to break down tasks, determine spatial constraints, and generate precise robotic movement code. Our framework offers a comprehensive benchmark with both simulated and real-world data, enabling standardized evaluation and better alignment between simulated training and real-world performance. We validated our approach using the open-source COBOT Magic Robot platform. Policies pre-trained on RoboTwin-generated data and fine-tuned with limited real-world samples demonstrate significant potential for enhancing dual-arm robotic manipulation systems by improving success rates by over 70% for single-arm tasks and over 40% for dual-arm tasks compared to models trained solely on real-world data.

actor, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2504.13059

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry:

Education (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.66)

Privacy-Preserving Edge Speech Understanding with Tiny Foundation Models

Benazir, Afsara, Lin, Felix Xiaozhu

arXiv.org Artificial Intelligence, Jan-29-2025

Robust speech recognition systems rely on cloud service providers for inference. Such a system must ensure that an untrustworthy provider cannot deduce the sensitive content in speech. Speech content can be sanitized, but sanitization must not compromise transcription accuracy. Recognizing the underutilized capabilities of tiny speech foundation models (FMs), we propose, for the first time, a novel use: enhancing speech privacy on resource-constrained devices. We introduce XYZ, an edge/cloud privacy-preserving speech inference engine that can filter sensitive entities without compromising transcript accuracy. We use a timestamp-based on-device masking approach that relies on a token-to-entity prediction model to filter sensitive entities. Our choice of mask strategically conceals parts of the input and hides sensitive data. The masked input is sent to a trusted cloud service or to a local hub to generate the masked output. The effectiveness of XYZ hinges on how well the entity time segments are masked. Our recovery is a confidence-score-based approach that chooses the best prediction between the cloud and on-device models. We implement XYZ on a 64-bit Raspberry Pi 4B. Experiments show that our solution leads to robust speech recognition without forsaking privacy. With less than 100 MB of memory, XYZ achieves state-of-the-art (SOTA) speech transcription performance while filtering about 83% of private entities directly on-device. XYZ is 16x smaller in memory and 17x more compute efficient than prior privacy-preserving speech frameworks and achieves a relative reduction in word error rate (WER) of 38.8-77.5% compared to existing offline transcription services.

artificial intelligence, speech recognition, tiny foundation model, (16 more...)

arXiv.org Artificial Intelligence

2502.01649

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.89)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)

UNet: A Generic and Reliable Multi-UAV Communication and Networking Architecture for Heterogeneous Applications

Roy, Sanku Kumar, Samshad, Mohamed, Rajawat, Ketan

arXiv.org Artificial Intelligence, Nov-5-2024

The rapid growth of UAV applications necessitates a robust communication and networking architecture capable of addressing the diverse requirements of various applications concurrently, rather than relying on application-specific solutions. This paper proposes a generic and reliable multi-UAV communication and networking architecture designed to support the varying demands of heterogeneous applications, including short-range and long-range communication, star and mesh topologies, different data rates, and multiple wireless standards. Our architecture accommodates both ad hoc and infrastructure networks, ensuring seamless connectivity throughout the network. Additionally, we present the design of a multi-protocol UAV gateway that enables interoperability among various communication protocols. Furthermore, we introduce a data processing and service layer framework with a graphical user interface for a ground control station that facilitates remote control and monitoring from any location at any time. We implemented the proposed architecture in practice and evaluated its performance using different metrics, demonstrating its effectiveness.

application, architecture, communication, (16 more...)

arXiv.org Artificial Intelligence

2411.03048

Country:

Asia > India > Uttar Pradesh > Kanpur (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology (1.00)
Telecommunications > Networks (0.68)
Transportation > Air (0.47)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)

On Triangular versus Edge Representations -- Towards Scalable Modeling of Networks

Neural Information Processing Systems, Mar-14-2024, 10:48:39 GMT

In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations.

triangle, triangular motif, vertex, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Singapore (0.04)

Industry:

Telecommunications > Networks (0.34)
Information Technology > Networks (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Leveraging cache to enable SLU on tiny devices

Benazir, Afsara, Xu, Zhiming, Lin, Felix Xiaozhu

arXiv.org Artificial Intelligence, Dec-12-2023

This paper addresses spoken language understanding (SLU) on microcontroller-like embedded devices, integrating on-device execution with cloud offloading in a novel fashion. We exploit temporal locality in a device's speech inputs and accordingly reuse recent SLU inferences. Our idea is simple: let the device match new inputs against cached results, and only offload unmatched inputs to the cloud for full inference. Realization of this idea, however, is non-trivial: the device needs to compare acoustic features in a robust, low-cost way. To this end, we present XYZ, a speech cache for tiny devices. It matches speech inputs at two levels of representation: first as clustered sequences of raw sound units, then as sequences of phonemes. Working in tandem, the two representations offer complementary cost/accuracy tradeoffs. To further boost accuracy, our cache learns: using the mismatched and then offloaded inputs, it continuously fine-tunes the device's feature extractors (with the assistance of the cloud). We implement XYZ on an off-the-shelf STM32 microcontroller. The resultant implementation has a small memory footprint of 2 MB. Evaluated on challenging speech benchmarks, our system resolves 45-90% of inputs on device, reducing the average latency by up to 80% compared to offloading to popular cloud speech services. Our benefit is pronounced even in adversarial settings: noisy environments, cold cache, or one device shared by a number of users.
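
The match-or-offload idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `InferenceCache`, `edit_distance`, `offload`, and `max_distance` are hypothetical, inputs are assumed to be already converted into sequences of discrete sound units, and a single edit-distance comparison stands in for the paper's two-level matching and for its fine-tuning of feature extractors.

```python
from typing import Callable, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two unit sequences (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


class InferenceCache:
    """Hypothetical cache: reuse a stored result when a new input is close
    enough to a cached one, otherwise call the offload function."""

    def __init__(self, offload: Callable[[Tuple[str, ...]], str],
                 max_distance: int = 1) -> None:
        self.offload = offload            # stand-in for full cloud inference
        self.max_distance = max_distance  # assumed matching threshold
        self.entries: List[Tuple[Tuple[str, ...], str]] = []

    def infer(self, units: Sequence[str]) -> Tuple[str, bool]:
        """Return (result, served_from_cache)."""
        key = tuple(units)
        best = min(self.entries,
                   key=lambda entry: edit_distance(key, entry[0]),
                   default=None)
        if best is not None and edit_distance(key, best[0]) <= self.max_distance:
            return best[1], True
        result = self.offload(key)        # cache miss: run full inference
        self.entries.append((key, result))
        return result, False
```

In this sketch a miss both returns the offloaded result and stores it, so a later input that differs by at most `max_distance` units is answered locally. The linear scan over entries is chosen for clarity only; a real device would need a cheaper lookup.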

accuracy, sequence, utterance, (14 more...)

arXiv.org Artificial Intelligence

2311.18188

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > Virginia (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Smart Houses & Appliances (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Simulating Opinion Dynamics with Networks of LLM-based Agents

Chuang, Yun-Shiuan, Goyal, Agam, Harlalka, Nikunj, Suresh, Siddharth, Hawkins, Robert, Yang, Sijia, Shah, Dhavan, Hu, Junjie, Rogers, Timothy T.

arXiv.org Artificial Intelligence, Nov-16-2023

Accurately simulating human opinion dynamics is crucial for understanding a variety of societal phenomena, including polarization and the spread of misinformation. However, the agent-based models (ABMs) commonly used for such simulations lack fidelity to human behavior. We propose a new approach to simulating opinion dynamics based on populations of Large Language Models (LLMs). Our findings reveal a strong inherent bias in LLM agents towards accurate information, leading to consensus in line with scientific reality. However, this bias limits the simulation of individuals with resistant views on issues like climate change. After inducing confirmation bias through prompt engineering, we observed opinion fragmentation in line with existing agent-based research. These insights highlight the promise and limitations of LLM agents in this domain and suggest a path forward: refining LLMs with real-world discourse to better simulate the evolution of human beliefs.

agent, interaction, tweet, (14 more...)

arXiv.org Artificial Intelligence

2311.09618

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Media (0.88)
Health & Medicine > Therapeutic Area > Immunology (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Twitch's AI-Generated, 'Seinfeld' Like Show Gets Weird - usalive.xyz

#artificialintelligence, Feb-13-2023, 23:41:36 GMT

Artificial intelligence's take on a classic sitcom is more than a load of "yada yada yada." "Nothing, Forever" is an AI-generated, "Seinfeld"-like show on streaming platform Twitch that's set to never stop broadcasting. The 24/7 show, which has been streaming since December, has grown in popularity over the past week as thousands have tuned in to watch the adventures of animated characters Larry Feinberg, Fred Kastopolous, Yvonne Torres and Zoltan Kalker. As of Saturday morning, "Nothing, Forever" had over 131,000 Twitch followers. The show plays out in a similar fashion to the TV classic: it includes stand-up sequences, laugh tracks and conversations among AI friends similar to Jerry, Elaine, George and Kramer inside an apartment.

ai-generated, seinfeld, twitch, (3 more...)

#artificialintelligence

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.77)

Negative Shannon Information Hides Networks

Luo, Ming-Xing

arXiv.org Artificial Intelligence, Dec-11-2022

Shannon information was defined to characterize the uncertainty of classical probability distributions. As an uncertainty measure, it is generally believed to be positive. This holds for any information quantity from two random variables because of the polymatroidal axioms. However, it is unknown why there is negative information for more than two random variables on finite-dimensional spaces. We first show that negative tripartite Shannon mutual information implies specific Bayesian network representations of its joint distribution. We then show that negative Shannon information is obtained from general tripartite Bayesian networks with quantum realizations. This provides a device-independent witness of negative Shannon information. We finally extend the result to general networks. The present result offers new insights into network compatibility from non-Shannon information inequalities.
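
A standard textbook case shows how tripartite Shannon mutual information can be negative. The Python sketch below is not taken from the paper: it assumes the common inclusion-exclusion definition I(X;Y;Z) = H(X) + H(Y) + H(Z) - H(X,Y) - H(X,Z) - H(Y,Z) + H(X,Y,Z), and the function names `entropy` and `tripartite_mutual_information` are illustrative. The distribution used is two independent fair bits X and Y with Z = X XOR Y.

```python
import itertools
import math
from collections import defaultdict
from typing import Dict, Tuple

Joint = Dict[Tuple[int, int, int], float]


def entropy(joint: Joint, idx: Tuple[int, ...]) -> float:
    """Shannon entropy in bits of the marginal over the variables in idx."""
    marginal = defaultdict(float)
    for outcome, p in joint.items():
        marginal[tuple(outcome[i] for i in idx)] += p
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


def tripartite_mutual_information(joint: Joint) -> float:
    """I(X;Y;Z) by inclusion-exclusion over marginal entropies."""
    singles = sum(entropy(joint, (i,)) for i in range(3))
    pairs = sum(entropy(joint, pair)
                for pair in itertools.combinations(range(3), 2))
    return singles - pairs + entropy(joint, (0, 1, 2))


# X and Y are independent fair bits and Z = X XOR Y.
xor_joint: Joint = {(x, y, x ^ y): 0.25
                    for x, y in itertools.product((0, 1), repeat=2)}

print(tripartite_mutual_information(xor_joint))  # -1.0 (bits)
```

Each single variable carries 1 bit and every pair, like the full triple, carries 2 bits, so the sum is 3 - 6 + 2 = -1 bit. Equivalently, X and Y share no information on their own (I(X;Y) = 0) but share a full bit once Z is known (I(X;Y|Z) = 1).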

artificial intelligence, information, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2206.0432

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)
(3 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

The War between AI and the Blockchain

#artificialintelligence, Sep-24-2020, 16:40:08 GMT

Deepfakes are developing fast, and although faking video and audio is not new, experts agree that we can't win this fight. Machines will be able to create digital media that a normal human consumer cannot recognize as fake. We have written about this threat because it spells disaster. We expect the result to be chaos in any media or public relations setting, whether the motive is malice, a desire to have fun, or a desire to exploit. Fake news is already a problem, leading to lynchings in some countries, based only on accusations.

artificial intelligence, blockchain, fingerprint, (7 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (0.53)

Technology:

Information Technology > e-Commerce > Financial Technology (0.66)
Information Technology > Artificial Intelligence (0.61)
Information Technology > Security & Privacy (0.53)

The Woman Worked as a Babysitter: On Biases in Language Generation

Sheng, Emily, Chang, Kai-Wei, Natarajan, Premkumar, Peng, Nanyun

arXiv.org Artificial Intelligence, Sep-3-2019

We present a systematic study of biases in natural language generation (NLG) by analyzing text generated from prompts that contain mentions of different demographic groups. In this work, we introduce the notion of the regard towards a demographic, use the varying levels of regard towards different demographics as a defining metric for bias in NLG, and analyze the extent to which sentiment scores are a relevant proxy metric for regard. To this end, we collect strategically-generated text from language models and manually annotate the text with both sentiment and regard scores. Additionally, we build an automatic regard classifier through transfer learning, so that we can analyze biases in unseen text. Together, these methods reveal the extent of the biased nature of language model generations. Our analysis provides a study of biases in NLG, bias metrics and correlated human judgments, and empirical evidence on the usefulness of our annotated dataset.

machine learning, natural language, xyz, (16 more...)

arXiv.org Artificial Intelligence

1909.01326

Country:

North America > United States (0.46)
Europe (0.29)

Genre: Research Report (0.50)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)
