UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning

Lu, Zhengxi, Ye, Jiabo, Tang, Fei, Shen, Yongliang, Xu, Haiyang, Zheng, Ziwei, Lu, Weiming, Yan, Ming, Huang, Fei, Xiao, Jun, Zhuang, Yueting

arXiv.org Artificial Intelligence

Graphical User Interface (GUI) agents have demonstrated remarkable progress in automating complex user interface interactions through reinforcement learning. However, current approaches face a fundamental dilemma: offline RL enables stable training on pre-collected trajectories, but struggles with multi-step task execution for lack of trajectory-level reward signals; online RL captures these signals through environment interaction, but suffers from sparse rewards and prohibitive deployment costs. To address this, we present Semi-online Reinforcement Learning, a novel paradigm that simulates online RL on offline trajectories. During each rollout process, we preserve the original model output within the multi-turn dialogue, where a Patch Module adaptively recovers the divergence between rollout and expert trajectories. To capture long-term training signals, Semi-online RL introduces discounted future returns into the reward computation and optimizes the policy with weighted step-level and episode-level advantages. We further introduce Semi-Online Performance (SOP), a metric that aligns better with true online performance, serving as a practical and effective proxy for real-world evaluation. Experiments show that our UI-S1-7B achieves SOTA performance among 7B models across four dynamic benchmarks, with significant gains over the base model (e.g., +12.0% on AndroidWorld, +23.8% on AITW), demonstrating significant progress in bridging the gap between offline training efficiency and online multi-turn reasoning. Our proposed Semi-online RL simulates online RL on offline static trajectories, which enhances multi-turn agent capabilities more efficiently. Work done during internship at Tongyi Lab, Alibaba Group.
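The abstract mentions discounted future returns and a weighted mix of step-level and episode-level advantages without giving formulas. The following is a minimal sketch of the generic textbook versions of those two ideas, not the authors' implementation: the function names, the discount factor `gamma`, and the `episode_weight` parameter are all assumptions made for illustration.

```python
from typing import List


def discounted_returns(step_rewards: List[float], gamma: float = 0.5) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one trajectory.

    `gamma` is a hypothetical discount factor; the abstract does not state one.
    """
    returns = [0.0] * len(step_rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(step_rewards))):
        running = step_rewards[t] + gamma * running
        returns[t] = running
    return returns


def combine_advantages(step_adv: float, episode_adv: float,
                       episode_weight: float = 1.0) -> float:
    """Assumed linear mix of a step-level and an episode-level advantage."""
    return step_adv + episode_weight * episode_adv


if __name__ == "__main__":
    # Three-step trajectory where the first and last actions are rewarded.
    print(discounted_returns([1.0, 0.0, 1.0], gamma=0.5))
    print(combine_advantages(0.2, 1.0, episode_weight=0.5))
```

With these inputs the first step's return includes a discounted share of the final reward, which is how a later success can credit an earlier action.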


AI-powered 'Death Clock' predicts how and when you'll die, down to the second... so how long do YOU have left?

Daily Mail - Science & tech

If you could find out exactly how and when you'll die, would you want to know? A new AI-powered death clock claims to be able to do just that, predicting the method and age at which you will die, right down to the second. The free website, called the Death Clock, uses AI to analyze age, weight, and general outlook on life to 'accurately' predict how long you have left to live. It also asks users to input information on lifestyle habits like drinking, smoking, diet, and exercise. Users can also reveal their alleged cause of death and see how their life expectancy compares to other people of the same sex and body mass index (BMI).


QAnswer: Towards Question Answering Search over Websites

Guo, Kunpeng, Defretiere, Clement, Diefenbach, Dennis, Gravier, Christophe, Gourru, Antoine

arXiv.org Artificial Intelligence

Question Answering (QA) is increasingly used by search engines to provide results to their end-users, yet very few websites currently use QA technologies for their search functionality. To illustrate the potential of QA technologies for the website search practitioner, we demonstrate web searches that combine QA over knowledge graphs and QA over free text -- two tasks that are usually tackled separately. We also discuss the benefits and drawbacks of both approaches for website search. As case studies, we use websites hosted by the Wikimedia Foundation (namely Wikipedia and Wikidata). Unlike a search engine (e.g., Google or Bing), we index the data integrally, i.e., we do not index only a subset, and exclusively, i.e., we index only data available on the corresponding website.


Unravel the knowledge in Slack workspaces with intelligent search using the Amazon Kendra Slack connector

#artificialintelligence

Organizations use messaging platforms like Slack to bring the right people together to securely communicate with each other and collaborate to get work done. A Slack workspace captures invaluable organizational knowledge in the form of the information that flows through it as the users collaborate. However, making this knowledge easily and securely available to users is challenging due to the fragmented structure of Slack workspaces. Additionally, the conversational nature of Slack communication renders a traditional keyword-based approach to search ineffective. You can now use the Amazon Kendra Slack connector to index Slack messages and documents, and search this content using intelligent search in Amazon Kendra, powered by machine learning (ML).


Control formality in machine translated text using Amazon Translate

#artificialintelligence

Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and customizable language translation. Amazon Translate now supports formality customization. This feature allows you to customize the level of formality in your translation output. At the time of writing, the formality customization feature is available for six target languages: French, German, Hindi, Italian, Japanese, and Spanish. You can customize the formality of your translated output to suit your communication needs.
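As a rough illustration of how the formality setting might be passed from code, the sketch below builds the arguments for the boto3 `translate_text` call, which accepts a `Settings` dictionary with a `Formality` value of `FORMAL` or `INFORMAL`. The helper names and the mapping of the six listed languages to the codes `fr`, `de`, `hi`, `it`, `ja`, and `es` are assumptions for this example, and the live call needs AWS credentials.

```python
from typing import Any, Dict

# Assumed language codes for the six target languages named in the post.
FORMALITY_TARGETS = {"fr", "de", "hi", "it", "ja", "es"}


def build_translate_request(text: str, source_lang: str, target_lang: str,
                            formality: str = "FORMAL") -> Dict[str, Any]:
    """Build keyword arguments for translate_text (hypothetical helper)."""
    if formality not in ("FORMAL", "INFORMAL"):
        raise ValueError("formality must be 'FORMAL' or 'INFORMAL'")
    request: Dict[str, Any] = {
        "Text": text,
        "SourceLanguageCode": source_lang,
        "TargetLanguageCode": target_lang,
    }
    # Only attach the setting for targets the post lists as supported.
    if target_lang in FORMALITY_TARGETS:
        request["Settings"] = {"Formality": formality}
    return request


def translate(text: str, source_lang: str, target_lang: str,
              formality: str = "FORMAL") -> str:
    """Send the request to Amazon Translate; requires boto3 and credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("translate")
    response = client.translate_text(
        **build_translate_request(text, source_lang, target_lang, formality)
    )
    return response["TranslatedText"]
```

Keeping the request builder separate from the network call makes the formality logic easy to check without an AWS account.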


Amazon Lookout for Vision now supports visual inspection of product defects at the edge

#artificialintelligence

Discrete and continuous manufacturing lines generate a high volume of products at low latency, ranging from milliseconds to a few seconds. To identify defects at the same throughput of production, camera streams of images must be processed at low latency. Additionally, factories may have low network bandwidth or intermittent cloud connectivity. In such scenarios, you may need to run the defect detection system on your on-premises compute infrastructure, and upload the processed results for further development and monitoring purposes to the AWS Cloud. This hybrid approach with both local edge hardware and the cloud can address the low latency requirements and help reduce storage and network transfer costs to the cloud.


Extend model lineage to include ML features using Amazon SageMaker Feature Store

#artificialintelligence

Feature engineering is expensive and time-consuming, which may lead you to adopt a feature store for managing features across teams and models. Unfortunately, machine learning (ML) lineage solutions have yet to adapt to this new concept of feature management. To achieve the full benefits of a feature store by enabling feature reuse, you need to be able to answer fundamental questions about features. For example, how were these features built? What models are using these features?


Detect manufacturing defects in real time using Amazon Lookout for Vision

#artificialintelligence

In this post, we look at how we can automate the detection of anomalies in a manufactured product using Amazon Lookout for Vision. Using Amazon Lookout for Vision, you can notify operators in real time when defects are detected, provide dashboards for monitoring the workload, and get visual insights from the process for business users. Amazon Lookout for Vision is a machine learning (ML) service that spots defects and anomalies in visual representations using computer vision (CV). With Amazon Lookout for Vision, manufacturing companies can increase quality and reduce operational costs by quickly identifying differences in images of objects at scale. Defect and anomaly detection during manufacturing processes is a vital step to ensure the quality of the products. The timely detection of faults or defects and taking appropriate actions is important to reduce operational and quality-related costs. According to Aberdeen's research, "Many organizations will have true quality-related costs as high as 15 to 20 percent of sales revenue, in extreme cases some going as high as 40 percent." Manual inspection, either in-line or end-of-line, is a time-consuming and expensive task.


Build an intelligent search solution with automated content enrichment

#artificialintelligence

Unstructured data belonging to the enterprise continues to grow, making it a challenge for customers and employees to get the information they need. Amazon Kendra is a highly accurate intelligent search service powered by machine learning (ML). It helps you easily find the content you're looking for, even when it's scattered across multiple locations and content repositories. Amazon Kendra leverages deep learning and reading comprehension to deliver precise answers. It offers natural language search for a user experience that's like interacting with a human expert.


Translate All: Automating multiple file type batch translation with AWS CloudFormation

#artificialintelligence

This is a guest post by Cyrus Wong, an AWS Machine Learning Hero. You can learn more about and connect with AWS Machine Learning Heroes at the community page. On July 29, 2020, AWS announced that Amazon Translate now supports Microsoft Office documents, including .docx, The world is full of bilingual countries and cities like Hong Kong. I find myself always needing to prepare Office documents and presentation slides in both English and Chinese.