action word
Generalizable Skill Learning for Construction Robots with Crowdsourced Natural Language Instructions, Composable Skills Standardization, and Large Language Model
Yu, Hongrui, Kamat, Vineet R., Menassa, Carol C.
The quasi-repetitive nature of construction work and the resulting lack of generalizability in programming construction robots present persistent challenges to the broad adoption of robots in the construction industry. Robots cannot achieve generalist capabilities because skills learned in one work domain cannot readily transfer to another or be directly used to perform a different set of tasks. Human workers have to arduously reprogram the robots' scene-understanding, path-planning, and manipulation components to enable them to perform alternate work tasks. The methods presented in this paper remove a significant proportion of this reprogramming workload by proposing a generalizable learning architecture that directly teaches robots versatile task-performance skills through crowdsourced online natural language instructions. A Large Language Model (LLM), a standardized and modularized hierarchical modeling approach, and a Building Information Modeling-Robot semantic data pipeline are developed to address the multi-task skill transfer problem. The proposed skill standardization scheme and LLM-based hierarchical skill learning framework were tested with a long-horizon drywall installation experiment using a full-scale industrial robotic manipulator. The resulting robot task learning scheme achieves multi-task reprogramming with minimal effort and high quality.
- North America > United States > Michigan (0.04)
- North America > United States > Virginia (0.04)
- Europe (0.04)
- Workflow (1.00)
- Research Report > New Finding (0.67)
- Construction & Engineering (1.00)
- Education > Educational Setting (0.67)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
ESALE: Enhancing Code-Summary Alignment Learning for Source Code Summarization
Fang, Chunrong, Sun, Weisong, Chen, Yuchen, Chen, Xiao, Wei, Zhao, Zhang, Quanjun, You, Yudu, Luo, Bin, Liu, Yang, Chen, Zhenyu
(Source) code summarization aims to automatically generate succinct natural language summaries for given code snippets. Such summaries play a significant role in helping developers understand and maintain code. Inspired by neural machine translation, deep learning-based code summarization techniques widely adopt an encoder-decoder framework, where the encoder transforms given code snippets into context vectors, and the decoder decodes context vectors into summaries. Recently, large-scale pre-trained models for source code have been equipped with encoders capable of producing general context vectors and have achieved substantial improvements on code summarization. However, although they are trained mainly on code-focused tasks and can capture general code features, they still fall short in capturing the specific features that need to be summarized. This paper proposes a novel approach to improve code summarization based on summary-focused tasks. Specifically, we exploit a multi-task learning paradigm to train the encoder on three summary-focused tasks to enhance its ability to learn code-summary alignment: unidirectional language modeling (ULM), masked language modeling (MLM), and action word prediction (AWP). Unlike pre-trained models that mainly predict masked tokens in code snippets, we design ULM and MLM to predict masked words in summaries. Intuitively, predicting words based on given code snippets helps the encoder learn the code-summary alignment. Additionally, we introduce the domain-specific task AWP to enhance the ability of the encoder to learn the alignment between action words and code snippets. Extensive experiments on four datasets demonstrate that our approach, called ESALE, significantly outperforms baselines on all three widely used metrics: BLEU, METEOR, and ROUGE-L.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.88)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.86)
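To make the three summary-focused tasks in the ESALE abstract concrete, the following is a minimal sketch of how training targets for them might be derived from a single summary. It is not the authors' implementation: the function names, the `[MASK]` token, the masking ratio, whitespace tokenization, and the choice of the first summary token as the action word are all assumptions made for illustration, and the encoder and code-snippet input are left out entirely.

```python
import random

MASK = "[MASK]"  # hypothetical mask token, not taken from the paper


def action_word(summary: str) -> str:
    """AWP target. Assumption: the first token of the summary is its action word."""
    return summary.split()[0].lower()


def mask_summary(summary: str, ratio: float = 0.15, seed: int = 0):
    """MLM target: hide a fraction of summary tokens and remember the originals."""
    rng = random.Random(seed)
    tokens = summary.split()
    n_masked = max(1, round(len(tokens) * ratio))
    positions = set(rng.sample(range(len(tokens)), n_masked))
    masked = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
    targets = {i: tokens[i] for i in sorted(positions)}
    return masked, targets


def ulm_pairs(summary: str):
    """ULM targets: predict each summary token from the tokens to its left."""
    tokens = summary.split()
    return [(tokens[:i], tokens[i]) for i in range(len(tokens))]
```

In a real multi-task setup, each of these targets would be predicted from the encoder's representation of the paired code snippet, and the three losses would be combined. How they are weighted is not stated in the abstract.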
Macbeth
Computer users today are demanding greater performance from systems that understand and respond intelligently to human language as input. In the past, researchers proposed and built conceptual analysis systems that attempted to understand language in depth by decomposing a text into structures representing complex combinations of primitive acts, events, and state changes in the world the way people conceive them. However, these systems have traditionally been time-consuming and costly to build and maintain by hand. This paper presents two studies of crowdsourcing a parallel corpus to build conceptual analysis systems through machine learning. In the first study, we found that crowdworkers can view simple English sentences built around specific action words, and build conceptual structures that represent decompositions of the meaning of that action word into simple and complex combinations of conceptual primitives. The conceptual structures created by crowdworkers largely agree with a set of gold standard conceptual structures built by experts, but are often missing parts of the gold standard conceptualization. In the second study, we developed and tested a novel method for improving the corpus through a subsequent round of crowdsourcing. In this "refinement" step, we presented only conceptual structures to a second set of crowdworkers, and found that when crowdworkers could identify the action word in the original sentence based only on the conceptual structure, the conceptual structure was a stronger match to the gold standard structure for that sentence. We also found a statistically significant correlation between the number of crowdworkers who identified the original action word for a conceptual structure and the degree of matching between the conceptual structure and a gold standard conceptual structure. This indicates that crowdsourcing may be used not only to generate the conceptual structures, but also to select only those of the highest quality for a parallel corpus linking them to natural language.
Course Difficulty Estimation Based on Mapping of Bloom's Taxonomy and ABET Criteria
M, Premalatha, G, Suganya, V, Viswanathan, Chowdary, G Jignesh
The current educational system uses grades or marks to assess student performance. The marks or grades a student scores depend on different parameters, the main one being the difficulty level of a course. Computing this difficulty level may help both students and teachers determine the level of training needed for successful completion of the course. In this paper, we propose a methodology that estimates the difficulty level of a course by mapping Bloom's Taxonomy action words to Accreditation Board for Engineering and Technology (ABET) criteria and learning outcomes. The estimated difficulty level is validated against the history of grades secured by the students.
- Asia > India > Tamil Nadu > Chennai (0.05)
- North America > United States > New York > Westchester County > White Plains (0.04)
- North America > United States > Kentucky (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
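As a rough illustration of the idea in the abstract above, the sketch below maps action words in learning outcomes to levels of the revised Bloom's Taxonomy and averages them into a difficulty score. This is not the paper's method: the verb table, the averaging rule, the normalization to the range 0 to 1, and the function names are assumptions, and the ABET criteria mapping is omitted because the abstract does not describe it.

```python
from typing import Iterable, Optional

# Revised Bloom's Taxonomy levels, lowest (1) to highest (6).
BLOOM_LEVELS = ["remember", "understand", "apply", "analyze", "evaluate", "create"]

# Small illustrative verb table (assumption, far from exhaustive).
ACTION_WORD_LEVEL = {
    "define": 1, "list": 1,
    "explain": 2, "summarize": 2,
    "apply": 3, "solve": 3,
    "analyze": 4, "differentiate": 4,
    "evaluate": 5, "justify": 5,
    "design": 6, "create": 6,
}


def outcome_level(outcome: str) -> Optional[int]:
    """Return the Bloom level of the first known action word in a learning outcome."""
    for word in outcome.lower().split():
        level = ACTION_WORD_LEVEL.get(word.strip(".,;:"))
        if level is not None:
            return level
    return None


def course_difficulty(outcomes: Iterable[str]) -> Optional[float]:
    """Hypothetical score: mean Bloom level of the outcomes, scaled to (0, 1]."""
    levels = [lvl for lvl in map(outcome_level, outcomes) if lvl is not None]
    if not levels:
        return None
    return sum(levels) / len(levels) / len(BLOOM_LEVELS)
```

For example, a course whose outcomes are "Define basic terms" and "Design a compiler" would average levels 1 and 6. A score like this could then be compared against historical grades, as the abstract describes for validation.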
Moment in Time: The Biggest Short Video Dataset For Data Scientists
Moment in Time is one of the largest human-annotated video datasets, capturing visual and audible short events produced by people, animals, objects and nature. It was developed in 2018 by the researchers Mathew Monfort, Alex Andonian, Bolei Zhou and Kandan Ramakrishnan. The dataset comprises more than 1,000,000 three-second videos corresponding to 339 distinct action words. Each action word is associated with more than 1,000 videos, yielding a large, balanced dataset for learning dynamic events from videos. The everyday activities covered by the dataset include falling on the floor, opening the mouth or eyes, swimming, and bouncing.
- Information Technology > Artificial Intelligence (0.54)
- Information Technology > Communications (0.41)
- Information Technology > Data Science (0.40)
Extracting Action and Event Semantics from Web Text
Sil, Avirup (Temple University) | Huang, Fei (Temple University) | Yates, Alexander (Temple University)
Most information extraction research identifies the state of the world in text, including the entities and the relationships that exist between them. Much less attention has been paid to the understanding of dynamics, or how the state of the world changes over time. Because intelligent behavior seeks to change the state of the world in rational and utility-maximizing ways, common-sense knowledge about dynamics is essential for intelligent agents. In this paper, we describe a novel system, Prepost, that tackles the problem of extracting the preconditions and effects of actions and events, two important kinds of knowledge for connecting world state and the actions that affect it. In experiments on Web text, Prepost improves by 79% over a baseline technique for identifying the effects of actions (and by 64% for preconditions).
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.70)
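The kind of knowledge Prepost targets, preconditions and effects of an action, is commonly represented in planning as a STRIPS-style operator. The sketch below shows that representation and how it connects world state to actions. It illustrates only the form of the extracted knowledge, not how Prepost extracts it from text; the class, the function names, and the example facts are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """STRIPS-style action: facts required before, and facts added or removed after."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset = frozenset()


def applicable(action: Action, state: frozenset) -> bool:
    """An action can run only if every precondition holds in the current state."""
    return action.preconditions <= state


def apply_action(action: Action, state: frozenset) -> frozenset:
    """Return the successor state: remove deleted facts, then add new ones."""
    if not applicable(action, state):
        raise ValueError(f"preconditions of {action.name!r} are not satisfied")
    return (state - action.delete_effects) | action.add_effects


# Hypothetical example facts, invented for illustration.
boil = Action(
    name="boil",
    preconditions=frozenset({"water_in_pot", "stove_on"}),
    add_effects=frozenset({"water_hot"}),
    delete_effects=frozenset({"water_cold"}),
)
```

Given a state containing `water_in_pot`, `stove_on`, and `water_cold`, `apply_action(boil, state)` replaces `water_cold` with `water_hot`, while a state lacking `stove_on` raises an error.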
Representations of Time in Symbol Grounding Systems
Förster, Frank (University of Hertfordshire) | Nehaniv, Chrystopher L. (University of Hertfordshire)
This paper gives a short overview of time representations in current symbol grounding architectures. Furthermore, we report on a recently developed embodied language acquisition system that acquires object words from a linguistically unconstrained human-robot dialogue. Conceptual issues in the future development of the system towards the acquisition of action words are discussed briefly.