AITopics

2304.06248

Country:

North America > United States > California (0.04)
Asia > Singapore (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Sharma, Mandar, Gogineni, Ajay, Ramakrishnan, Naren

Innovations in Neural Data-to-text Generation: A Survey

arXiv.org Artificial IntelligenceApr-12-2023

The neural boom that has sparked natural language processing (NLP) research through the last decade has similarly led to significant innovations in data-to-text generation (DTG). This survey offers a consolidated view into the neural DTG paradigm with a structured examination of the approaches, benchmark datasets, and evaluation protocols. This survey draws boundaries separating DTG from the rest of the natural language generation (NLG) landscape, encompassing an up-to-date synthesis of the literature, and highlighting the stages of technological adoption from within and outside the greater NLG umbrella. With this holistic view, we highlight promising avenues for DTG research that not only focus on the design of linguistically capable systems but also systems that exhibit fairness and accountability.

large language model, machine learning, natural language, (18 more...)

2207.12571

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Kansas (0.04)
(7 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment > Sports (1.00)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(7 more...)

arXiv.org Artificial IntelligenceApr-11-2023

Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond

Shi, Ensheng, Wang, Yanlin, Zhang, Hongyu, Du, Lun, Han, Shi, Zhang, Dongmei, Sun, Hongbin

Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning the pre-trained parameters incurs a large computational cost. In this paper, we conduct an extensive experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning. We then propose efficient alternatives to fine-tune the large pre-trained code model based on the above findings. Our experimental study shows that (1) lexical, syntactic and structural properties of source code are encoded in the lower, intermediate, and higher layers, respectively, while the semantic property spans across the entire model. (2) The process of fine-tuning preserves most of the code properties. Specifically, the basic code properties captured by lower and intermediate layers are still preserved during fine-tuning. Furthermore, we find that only the representations of the top two layers change most during fine-tuning for various downstream tasks. (3) Based on the above findings, we propose Telly to efficiently fine-tune pre-trained code models via layer freezing. The extensive experimental results on five various downstream tasks demonstrate that training parameters and the corresponding time cost are greatly reduced, while performances are similar or better. Replication package including source code, datasets, and online Appendix is available at: \url{https://github.com/DeepSoftwareAnalytics/Telly}.

artificial intelligence, machine learning, natural language, (22 more...)

2304.05216

Country:

North America > United States (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.93)
(2 more...)

arXiv.org Artificial IntelligenceApr-10-2023

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise

Chen, Jiaao, Zhang, Aston, Li, Mu, Smola, Alex, Yang, Diyi

Diffusion models that are based on iterative denoising have been recently proposed and leveraged in various generation tasks like image generation. Whereas, as a way inherently built for continuous data, existing diffusion models still have some limitations in modeling discrete data, e.g., languages. For example, the generally used Gaussian noise can not handle the discrete corruption well, and the objectives in continuous spaces fail to be stable for textual data in the diffusion process especially when the dimension is high. To alleviate these issues, we introduce a novel diffusion model for language modeling, Masked-Diffuse LM, with lower training cost and better performances, inspired by linguistic features in languages. Specifically, we design a linguistic-informed forward process which adds corruptions to the text through strategically soft-masking to better noise the textual data. Also, we directly predict the categorical distribution with cross-entropy loss function in every diffusion step to connect the continuous space and discrete space in a more efficient and straightforward way. Through experiments on 5 controlled generation tasks, we demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.

artificial intelligence, machine learning, natural language, (18 more...)

2304.04746

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Dominican Republic (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(3 more...)

Genre: Research Report (0.52)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.95)

arXiv.org Artificial IntelligenceApr-10-2023

RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL

Li, Haoyang, Zhang, Jing, Li, Cuiping, Chen, Hong

One of the recent best attempts at Text-to-SQL is the pre-trained language model. Due to the structural property of the SQL queries, the seq2seq model takes the responsibility of parsing both the schema items (i.e., tables and columns) and the skeleton (i.e., SQL keywords). Such coupled targets increase the difficulty of parsing the correct SQL queries especially when they involve many schema items and logic operators. This paper proposes a ranking-enhanced encoding and skeleton-aware decoding framework to decouple the schema linking and the skeleton parsing. Specifically, for a seq2seq encoder-decode model, its encoder is injected by the most relevant schema items instead of the whole unordered ones, which could alleviate the schema linking effort during SQL parsing, and its decoder first generates the skeleton and then the actual SQL query, which could implicitly constrain the SQL parsing. We evaluate our proposed framework on Spider and its three robustness variants: Spider-DK, Spider-Syn, and Spider-Realistic. The experimental results show that our framework delivers promising performance and robustness. Our code is available at https://github.com/RUCKBReasoning/RESDSQL.

artificial intelligence, natural language, schema item, (16 more...)

2302.05965

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

arXiv.org Artificial IntelligenceApr-10-2023

Scallop: A Language for Neurosymbolic Programming

Li, Ziyang, Huang, Jiani, Naik, Mayur

We present Scallop, a language which combines the benefits of deep learning and logical reasoning. Scallop enables users to write a wide range of neurosymbolic applications and train them in a data- and compute-efficient manner. It achieves these goals through three key features: 1) a flexible symbolic representation that is based on the relational data model; 2) a declarative logic programming language that is based on Datalog and supports recursion, aggregation, and negation; and 3) a framework for automatic and efficient differentiable reasoning that is based on the theory of provenance semirings. We evaluate Scallop on a suite of eight neurosymbolic applications from the literature. Our evaluation demonstrates that Scallop is capable of expressing algorithmic reasoning in diverse and challenging AI tasks, provides a succinct interface for machine learning programmers to integrate logical domain knowledge, and yields solutions that are comparable or superior to state-of-the-art models in terms of accuracy. Furthermore, Scallop's solutions outperform these models in aspects such as runtime and data efficiency, interpretability, and generalizability.

logic & formal reasoning, machine learning, natural language, (21 more...)

2304.04812

Country:

North America > United States > Pennsylvania (0.04)
Europe > Switzerland > Geneva > Geneva (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Software (1.00)
Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Neural Information Processing SystemsApr-6-2023, 19:48:20 GMT

Incremental Parsing by Modular Recurrent Connectionist Networks

We present a novel, modular, recurrent connectionist network architec(cid:173) ture which learns to robustly perform incremental parsing of complex sentences. From sequential input, one word at a time, our networks learn to do semantic role assignment, noun phrase attachment, and clause structure recognition for sentences with passive constructions and center embedded clauses. The networks make syntactic and semantic predictions at every point in time, and previous predictions are revised as expectations are affirmed or violated with the arrival of new informa(cid:173) tion. Our networks induce their own "grammar rules" for dynamically transforming an input sequence of words into a syntactic/semantic in(cid:173) terpretation. These networks generalize and display tolerance to input which has been corrupted in ways common in spoken language.

incremental parsing, modular recurrent connectionist network, prediction, (1 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Neural Information Processing SystemsApr-6-2023, 19:18:59 GMT

Generalization Performance in PARSEC - A Structured Connectionist Parsing Architecture

This paper presents PARSEC-a system for generating connectionist parsing networks from example parses. PARSEC is not based on formal grammar systems and is geared toward spoken language tasks. PARSEC networks exhibit three strengths important for application to speech pro(cid:173) cessing: 1) they learn to parse, and generalize well compared to hand(cid:173) coded grammars; 2) they tolerate several types of noise; 3) they can learn to use multi-modal input. Presented are the PARSEC architecture and performance analyses along several dimensions that demonstrate PARSEC's features. PARSEC's performance is compared to that of tra(cid:173) ditional grammar-based parsing systems.

generalization performance, parsec, structured connectionist parsing architecture, (1 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Neural Information Processing SystemsApr-6-2023, 18:47:37 GMT

Grammar Learning by a Self-Organizing Network

This paper presents the design and simulation results of a self(cid:173) organizing neural network which induces a grammar from exam(cid:173) ple sentences. Input sentences are generated from a simple phrase structure grammar including number agreement, verb transitiv(cid:173) ity, and recursive noun phrase construction rules. The network induces a grammar explicitly in the form of symbol categorization rules and phrase structure rules. The purpose of this research is to show that a self-organizing network with a certain structure can acquire syntactic knowledge from only positive (i.e. There has been research on supervised neural network models of language acquisi(cid:173) tion tasks [Elman, 1991, Miikkulainen and Dyer, 1988, John and McClelland, 1988].

grammar learning, phrase construction rule, self-organizing network, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.84)

Neural Information Processing SystemsApr-6-2023, 18:26:27 GMT

Harmony Networks Do Not Work

Harmony networks have been proposed as a means by which con(cid:173) nectionist models can perform symbolic computation. Indeed, pro(cid:173) ponents claim that a harmony network can be built that constructs parse trees for strings in a context free language. This paper shows that harmony networks do not work in the following sense: they construct many outputs that are not valid parse trees. In order to show that the notion of systematicity is compatible with connectionism, Paul Smolensky, Geraldine Legendre and Yoshiro Miyata (Smolensky, Legendre, and Miyata 1992; Smolen sky 1993; Smolen sky, Legendre, and Miyata 1994) pro(cid:173) posed a mechanism, "Harmony Theory," by which connectionist models purportedly perform structure sensitive operations without implementing classical algorithms. Harmony theory describes a "harmony network" which, in the course of reaching a stable equilibrium, apparently computes parse trees that are valid according to the rules of a particular context-free grammar.

activation vector, harmony network, parse tree, (3 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)