LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Automation Task Evaluation
Zhang, Li, Wang, Shihe, Jia, Xianqing, Zheng, Zhihan, Yan, Yunhe, Gao, Longxi, Li, Yuanchun, Xu, Mengwei
Emerging large language and multimodal models are accelerating the evolution of mobile agents, especially for mobile UI automation. However, existing evaluation approaches, which rely on human validation or established datasets to compare agent-predicted actions with predefined ones, are unscalable and unfaithful. To overcome these limitations, this paper presents LlamaTouch, a testbed for on-device agent execution and faithful, scalable agent evaluation. Observing that task execution only transitions between UI states, LlamaTouch employs a novel evaluation approach that assesses only whether an agent traverses all manually annotated, essential application/system states. LlamaTouch comprises three key techniques: (1) on-device task execution that enables mobile agents to interact with real mobile environments for task completion; (2) fine-grained UI component annotation that merges pixel-level screenshots and textual screen hierarchies to explicitly identify and precisely annotate essential UI components with a rich set of designed annotation primitives; and (3) a multi-level state matching algorithm that uses exact and fuzzy matching to accurately detect critical information in each screen despite unpredictable UI layout/content dynamics. LlamaTouch currently incorporates four mobile agents and 495 UI automation tasks, encompassing both tasks from widely used datasets and self-constructed ones covering more diverse mobile applications. Evaluation results demonstrate LlamaTouch's high evaluation faithfulness in real environments and its better scalability than human validation. LlamaTouch also enables easy task annotation and integration of new mobile agents. Code and dataset are publicly available at https://github.com/LlamaTouch/LlamaTouch.
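The "traverse all essential states" criterion can be made concrete with a small sketch. This is only an illustration of the exact-then-fuzzy matching idea described above; the function names, similarity measure, and threshold are assumptions, not the paper's actual implementation.

```python
# Sketch of multi-level UI state matching: try exact matching first, then
# fall back to fuzzy string similarity to tolerate dynamic screen content.
from difflib import SequenceMatcher

def exact_match(observed: str, annotated: str) -> bool:
    """Exact matching: the annotated text must appear verbatim on screen."""
    return annotated in observed

def fuzzy_match(observed: str, annotated: str, threshold: float = 0.8) -> bool:
    """Fuzzy matching: tolerate UI content dynamics via string similarity."""
    return SequenceMatcher(None, observed, annotated).ratio() >= threshold

def state_matches(observed: str, annotated: str) -> bool:
    """Multi-level matching: exact first, fuzzy as a fallback."""
    return exact_match(observed, annotated) or fuzzy_match(observed, annotated)

def task_completed(trajectory: list[str], essential_states: list[str]) -> bool:
    """A task succeeds iff every annotated essential state is traversed,
    in order (the shared iterator consumes the trajectory left to right)."""
    it = iter(trajectory)
    return all(any(state_matches(obs, ann) for obs in it)
               for ann in essential_states)
```

Under this sketch, an agent's full screen trajectory is reduced to a pass/fail check against the annotated states, which is what makes the evaluation scalable.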
Large Language User Interfaces: Voice Interactive User Interfaces powered by LLMs
Wasti, Syed Mekael, Pu, Ken Q., Neshati, Ali
The modern world relies on and is driven by software. Embedded systems, command-line interfaces, and user interface (UI) software are present across systems all around the world. Their ease of use and intuitive nature have made UI systems a staple and a crucial tool in modern software and beyond. UI systems serve as a visually appealing packaging of function calls and event handlers, allowing complex event pipelines and data flows to be abstracted by buttons, text fields, menus, etc. The evolution of large language models (LLMs) over the past year has exhibited true "cognitive" potential. This potent ability has unveiled innumerable new opportunities to revolutionize the way our contemporary software systems operate. In this paper, we explore our vision and progress toward developing a UI architectural paradigm that employs a multimodal engine powered by LLMs and state-of-the-art transformer models. This framework aims to abstract monotonous UI interactions with "cognitively aware" prompting mechanisms, powering automated function calling and data-flow pipelines that translate into full speech-based, intelligent control over visual UI systems.
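The core of such a pipeline is the mapping from a speech transcript to a UI function call. The sketch below stubs out the LLM with a hard-coded parser so the dispatch plumbing is visible; every name here (the handler registry, the command grammar) is an illustrative assumption, not the paper's architecture.

```python
# Sketch: a transcript is turned into a structured call (normally by the
# LLM engine, stubbed here), then dispatched to a registered UI handler.
UI_HANDLERS = {}

def ui_handler(name):
    """Register a function as a voice-invokable UI action."""
    def wrap(fn):
        UI_HANDLERS[name] = fn
        return fn
    return wrap

@ui_handler("set_text_field")
def set_text_field(field: str, value: str) -> str:
    """A toy event handler standing in for a real UI widget update."""
    return f"{field} <- {value!r}"

def llm_to_call(transcript: str) -> dict:
    """Stand-in for the multimodal LLM engine: map speech to a JSON-like call.
    A real system would prompt the LLM with the UI's schema of handlers."""
    if transcript.startswith("type "):
        _, value, _, field = transcript.split(" ", 3)
        return {"name": "set_text_field", "args": {"field": field, "value": value}}
    raise ValueError("unrecognized command")

def dispatch(transcript: str) -> str:
    call = llm_to_call(transcript)
    return UI_HANDLERS[call["name"]](**call["args"])
```

The registry-plus-dispatch shape is what lets the LLM stay decoupled from the UI code: handlers are ordinary functions, and the model only emits names and arguments.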
Development of a Legal Document AI-Chatbot
Devaraj, Pranav Nataraj, P, Rakesh Teja V, Gangrade, Aaryav, R, Manoj Kumar
With the exponential growth of digital data and the increasing complexity of legal documentation, there is a pressing need for efficient and intelligent tools to streamline the handling of legal documents. Given recent developments in AI, and in chatbots especially, they are a compelling solution to this problem. We present an insight into the process of creating a Legal Documentation AI Chatbot with as many relevant features as possible within the given time frame. The development of each component of the chatbot is presented in detail, and each component's workings and functionality are discussed, from the build of the Android app and the Langchain query-processing code through to the integration of both via a Flask backend and REST API methods.
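The app-to-backend integration the abstract describes can be sketched in a framework-agnostic way: the Android app POSTs a JSON question, the backend runs it through the query-processing chain, and a JSON answer comes back. The endpoint shape, field names, and the keyword-lookup stub below are all assumptions for illustration; a real deployment would mount the handler in Flask and call the actual Langchain chain.

```python
# Sketch of the REST glue: JSON request in, (stubbed) query chain, JSON out.
import json

def run_query_chain(question: str, documents: dict[str, str]) -> str:
    """Stand-in for the Langchain query-processing code: naive keyword lookup."""
    for title, text in documents.items():
        if any(word.lower() in text.lower() for word in question.split()):
            return f"See '{title}': {text}"
    return "No relevant clause found."

def handle_ask(request_body: str, documents: dict[str, str]) -> str:
    """Body of a hypothetical /ask endpoint, mountable under a Flask route."""
    payload = json.loads(request_body)
    answer = run_query_chain(payload["question"], documents)
    return json.dumps({"answer": answer})
```

Keeping the handler a plain function of `request body -> response body` makes it testable without spinning up the Flask server or the Android client.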
Pairwise GUI Dataset Construction Between Android Phones and Tablets
Hu, Han, Zhan, Haolan, Huang, Yujin, Liu, Di
In the current landscape of pervasive smartphones and tablets, apps frequently exist across both platforms. Although apps share most graphical user interfaces (GUIs) and functionalities across phones and tablets, developers often rebuild tablet versions from scratch, escalating costs and squandering existing design resources. Researchers are attempting to collect data and employ deep learning in automated GUI development to enhance developers' productivity. There are currently several publicly accessible GUI page datasets for phones, but none for pairwise GUIs between phones and tablets. This poses a significant barrier to the employment of deep learning in automated GUI development. In this paper, we introduce the Papt dataset, a pioneering pairwise GUI dataset tailored for Android phones and tablets, encompassing 10,035 phone-tablet GUI page pairs sourced from 5,593 unique app pairs. We propose novel pairwise GUI collection approaches for constructing this dataset and delineate its advantages over currently prevailing datasets in the field. Through preliminary experiments on this dataset, we analyze the present challenges of utilizing deep learning in automated GUI development.
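To make the notion of a "phone-tablet GUI page pair" concrete, here is a toy record layout and pairing step, matching captures of the same app page across devices. The field names and the match-by-(app, page) rule are illustrative assumptions, not the Papt dataset's actual schema or collection approach.

```python
# Sketch: pair phone and tablet captures of the same app page.
from dataclasses import dataclass

@dataclass(frozen=True)
class GUIPage:
    app_id: str
    page: str        # e.g. an activity or screen name
    device: str      # "phone" or "tablet"
    screenshot: str  # path to the captured image

def pair_pages(phone_pages: list[GUIPage],
               tablet_pages: list[GUIPage]) -> list[tuple[GUIPage, GUIPage]]:
    """Match phone and tablet captures of the same app page into GUI pairs."""
    tablet_index = {(p.app_id, p.page): p for p in tablet_pages}
    return [(p, tablet_index[(p.app_id, p.page)])
            for p in phone_pages if (p.app_id, p.page) in tablet_index]
```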
UI Layers Merger: Merging UI layers via Visual Learning and Boundary Prior
Chen, Yun-nong, Zhen, Yan-kun, Shi, Chu-ning, Li, Jia-zhi, Chen, Liu-qing, Li, Ze-jian, Sun, Ling-yun, Zhou, Ting-ting, Chang, Yan-fang
With the fast-growing GUI development workload in the Internet industry, some work on intelligent methods has attempted to generate maintainable front-end code from UI screenshots. Utilizing UI design drafts, which contain UI metadata, can be more suitable for this purpose. However, fragmented layers inevitably appear in UI design drafts, which greatly reduces the quality of code generation. None of the existing GUI automation techniques detects and merges fragmented layers to improve the accessibility of the generated code. In this paper, we propose UI Layers Merger (UILM), a vision-based method that can automatically detect and merge fragmented layers into UI components. UILM comprises a Merging Area Detector (MAD) and a layers-merging algorithm. MAD incorporates boundary prior knowledge to accurately detect the boundaries of UI components. The layers-merging algorithm then searches out the associated layers within a component's boundary and merges them into a whole. We present a dynamic data augmentation approach to boost the performance of MAD, and we construct a large-scale UI dataset for training MAD and testing the performance of UILM. Experiments show that the proposed method outperforms the best baseline on merging-area detection and achieves decent accuracy on layers merging.
UIBert: Learning Generic Multimodal Representations for UI Understanding
Bai, Chongyang, Zang, Xiaoxue, Xu, Ying, Sunkara, Srinivas, Rastogi, Abhinav, Chen, Jindong, Arcas, Blaise Aguera y
To improve the accessibility of smart devices and to simplify their usage, it is critical to build models that understand user interfaces (UIs) and assist users in completing their tasks. However, UI-specific characteristics pose unique challenges, such as how to effectively leverage multimodal UI features that involve image, text, and structural metadata, and how to achieve good performance when high-quality labeled data is unavailable. To address these challenges we introduce UIBert, a transformer-based joint image-text model trained through novel pre-training tasks on large-scale unlabeled UI data to learn generic feature representations for a UI and its components. Our key intuition is that the heterogeneous features in a UI are self-aligned, i.e., the image and text features of UI components are predictive of each other. We propose five pretraining tasks utilizing this self-alignment among different features of a UI component and across various components in the same UI. We evaluate our method on nine real-world downstream UI tasks, where UIBert outperforms strong multimodal baselines by up to 9.26% accuracy.
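The self-alignment intuition above can be illustrated with a toy check: within a UI, a component's image embedding should be most similar to its own text embedding rather than to other components'. The hand-made vectors and cosine criterion below are stand-ins for learned features and a contrastive objective, not UIBert's actual pretraining tasks.

```python
# Toy self-alignment: each image feature should best match its own text feature.
import math

def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def align(image_feats: list[list[float]],
          text_feats: list[list[float]]) -> list[int]:
    """For each component's image feature, pick the best-matching text feature."""
    return [max(range(len(text_feats)), key=lambda j: cosine(img, text_feats[j]))
            for img in image_feats]
```

In a pretraining setup, the prediction `align(...) == [0, 1, 2, ...]` (each component matched to itself) is the self-supervised signal: no labels are needed, only the co-occurrence of a component's image and text within the same UI.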