parrot
Text-to-Pipeline: Bridging Natural Language and Data Preparation Pipelines
Ge, Yuhang, Liu, Yachuan, Ye, Zhangyan, Mao, Yuren, Gao, Yunjun
Data preparation (DP) transforms raw data into a form suitable for downstream applications, typically by composing operations into executable pipelines. Building such pipelines is time-consuming and requires sophisticated programming skills, posing a significant barrier for non-experts. To lower this barrier, we introduce Text-to-Pipeline, a new task that translates natural-language data preparation instructions into DP pipelines, and PARROT, a large-scale benchmark that supports systematic evaluation. To ensure realistic DP scenarios, PARROT is built by mining transformation patterns from production pipelines and instantiating them on 23,009 real-world tables, yielding roughly 18,000 tasks spanning 16 core operators. Our empirical evaluation on PARROT reveals a critical failure mode in cutting-edge LLMs: they struggle not only with multi-step compositional logic but also with semantic parameter grounding. We therefore establish a strong baseline with Pipeline-Agent, an execution-aware agent that iteratively reflects on intermediate states. While it achieves state-of-the-art performance, a significant gap remains, underscoring the deep, unsolved challenges PARROT exposes. PARROT thus provides an essential, large-scale testbed for developing and evaluating the next generation of autonomous data preparation agents.
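The task above can be made concrete with a toy sketch: a natural-language instruction is translated into a sequence of (operator, parameters) steps that is then executed over a table. The operator names and the tiny operator library below are hypothetical illustrations, not PARROT's actual 16-operator schema.

```python
from typing import Any

Table = list[dict[str, Any]]

# Stand-in operators (illustrative, not PARROT's real operator set).
def filter_rows(table: Table, column: str, value: Any) -> Table:
    return [row for row in table if row[column] == value]

def select_columns(table: Table, columns: list[str]) -> Table:
    return [{c: row[c] for c in columns} for row in table]

OPERATORS = {"filter": filter_rows, "select": select_columns}

def run_pipeline(table: Table, steps: list[tuple[str, dict]]) -> Table:
    # Execute each (operator, parameters) step in order.
    for op_name, params in steps:
        table = OPERATORS[op_name](table, **params)
    return table

# Instruction: "Keep only 2023 orders and show the customer names."
raw = [
    {"customer": "Ada", "year": 2023, "total": 120},
    {"customer": "Bob", "year": 2022, "total": 80},
]
pipeline = [
    ("filter", {"column": "year", "value": 2023}),
    ("select", {"columns": ["customer"]}),
]
print(run_pipeline(raw, pipeline))  # [{'customer': 'Ada'}]
```

The hard parts the benchmark measures, composing the right operator sequence and grounding parameters like `"year"` and `2023` in the table's actual schema, are exactly what the executor above takes as given.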
Bird or droid? Starlings nail R2-D2 beeps and boops.
The songbirds are even better at mimicking the 'Star Wars' robot than parrots are. Parrots and parakeets might be well known for squawking out embarrassing one-liners and certain four-letter words, but those aren't the only sounds birds can mimic. Birds have been observed copying dog barks, car alarms, and even chainsaws. But it turns out some species are better equipped to copy the droid's high-pitched beeps and boops than others.
Parrot: A Training Pipeline Enhances Both Program CoT and Natural Language CoT for Reasoning
Jin, Senjie, Chen, Lu, Xi, Zhiheng, Wang, Yuhui, Song, Sirui, Zhou, Yuhao, Zhang, Xinbo, Sun, Peng, Lu, Hong, Gui, Tao, Zhang, Qi, Huang, Xuanjing
Natural language chain-of-thought (N-CoT) and program chain-of-thought (P-CoT) have emerged as two primary paradigms for large language models (LLMs) to solve mathematical reasoning problems. Current research typically pursues unidirectional enhancement: P-CoT enhancing N-CoT, or N-CoT enhancing P-CoT. In this paper, we seek to fully unleash the strengths of the two paradigms for mutual enhancement and ultimately achieve simultaneous improvements. We conduct a detailed analysis of the error types across the two paradigms, based on which we propose Parrot, a novel training pipeline for mathematical problems: 1) three target-designed subtasks that integrate sequential P-CoT and N-CoT generation; 2) a subtask hybrid training strategy that facilitates natural-language semantic transferability; and 3) a converted N-CoT auxiliary reward designed to alleviate sparse rewards in P-CoT optimization. Extensive experiments demonstrate that Parrot significantly enhances the performance of both N-CoT and P-CoT, especially N-CoT. With Parrot SFT, the N-CoT performance of LLaMA2 and CodeLLaMA achieves gains of +21.87 and +21.48 on MathQA over the resource-intensive RL baseline.
PARROT: An Open Multilingual Radiology Reports Dataset
Guellec, Bastien Le, Adambounou, Kokou, Adams, Lisa C, Agripnidis, Thibault, Ahn, Sung Soo, Chalal, Radhia Ait, Antonoli, Tugba Akinci D, Amouyel, Philippe, Andersson, Henrik, Bentegeac, Raphael, Benzoni, Claudio, Blandino, Antonino Andrea, Busch, Felix, Can, Elif, Cau, Riccardo, Cavallo, Armando Ugo, Chavihot, Christelle, Chiquete, Erwin, Cuocolo, Renato, Divjak, Eugen, Ivanac, Gordana, Macek, Barbara Dziadkowiec, Elogne, Armel, Fanni, Salvatore Claudio, Ferrarotti, Carlos, Fossataro, Claudia, Fossataro, Federica, Fulek, Katarzyna, Fulek, Michal, Gac, Pawel, Gachowska, Martyna, Juarez, Ignacio Garcia, Gatti, Marco, Gorelik, Natalia, Goulianou, Alexia Maria, Hamroun, Aghiles, Herinirina, Nicolas, Kraik, Krzysztof, Krupka, Dominik, Holay, Quentin, Kitamura, Felipe, Klontzas, Michail E, Kompanowska, Anna, Kompanowski, Rafal, Lefevre, Alexandre, Lemke, Tristan, Lindholz, Maximilian, Muller, Lukas, Macek, Piotr, Makowski, Marcus, Mannacio, Luigi, Meddeb, Aymen, Natale, Antonio, Edzang, Beatrice Nguema, Ojeda, Adriana, Park, Yae Won, Piccione, Federica, Ponsiglione, Andrea, Poreba, Malgorzata, Poreba, Rafal, Prucker, Philipp, Pruvo, Jean Pierre, Pugliesi, Rosa Alba, Rabemanorintsoa, Feno Hasina, Rafailidis, Vasileios, Resler, Katarzyna, Rotkegel, Jan, Saba, Luca, Siebert, Ezann, Stanzione, Arnaldo, Tekin, Ali Fuat, Yanchapaxi, Liz Toapanta, Triantafyllou, Matthaios, Tsaoulia, Ekaterini, Vassalou, Evangelia, Vernuccio, Federica, Wasselius, Johan, Wang, Weilang, Urban, Szymon, Wlodarczak, Adrian, Wlodarczak, Szymon, Wysocki, Andrzej, Xu, Lina, Zatonski, Tomasz, Zhang, Shuhang, Ziegelmayer, Sebastian, Kuchcinski, Gregory, Bressem, Keno K
Rationale and Objectives: To develop and validate PARROT (Polyglottal Annotated Radiology Reports for Open Testing), a large, multicentric, open-access dataset of fictional radiology reports spanning multiple languages for testing natural language processing applications in radiology. Materials and Methods: From May to September 2024, radiologists were invited to contribute fictional radiology reports following their standard reporting practices. Contributors provided at least 20 reports with associated metadata including anatomical region, imaging modality, clinical context, and for non-English reports, English translations. All reports were assigned ICD-10 codes. A human vs. AI report differentiation study was conducted with 154 participants (radiologists, healthcare professionals, and non-healthcare professionals) assessing whether reports were human-authored or AI-generated. Results: The dataset comprises 2,658 radiology reports from 76 authors across 21 countries and 13 languages. Reports cover multiple imaging modalities (CT: 36.1%, MRI: 22.8%, radiography: 19.0%, ultrasound: 16.8%) and anatomical regions, with chest (19.9%), abdomen (18.6%), head (17.3%), and pelvis (14.1%) being most prevalent. In the differentiation study, participants achieved 53.9% accuracy (95% CI: 50.7%-57.1%) in distinguishing between human and AI-generated reports, with radiologists performing significantly better (56.9%, 95% CI: 53.3%-60.6%, p<0.05) than other groups. Conclusion: PARROT represents the largest open multilingual radiology report dataset, enabling development and validation of natural language processing applications across linguistic, geographic, and clinical boundaries without privacy constraints.
Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement
Nastase, Vivi, Jiang, Chunyang, Samo, Giuseppe, Merlo, Paola
In this paper, our goal is to investigate to what degree multilingual pretrained language models capture cross-linguistically valid abstract linguistic representations. We take the approach of developing curated synthetic data on a large scale, with specific properties, and using it to study sentence representations built with pretrained language models. We use a new multiple-choice task and datasets, Blackbird Language Matrices (BLMs), to focus on a specific grammatical structural phenomenon -- subject-verb agreement across a variety of sentence structures -- in several languages. Solving this task requires detecting complex linguistic patterns and paradigms in text representations. Using a two-level architecture that solves the problem in two steps -- detect syntactic objects and their properties in individual sentences, then find patterns across an input sequence of sentences -- we show that despite having been trained on multilingual texts in a consistent manner, multilingual pretrained language models exhibit language-specific differences, and syntactic structure is not shared, even across closely related languages.
The improbable voyage of Starship Titanic, the 1998 Douglas Adams video game filled with 'unhinged' chatbots
Douglas Adams originally devoted just half a page to eulogizing Starship Titanic in the tenth chapter of Life, the Universe and Everything. A "sensationally beautiful, staggeringly huge" cruise liner resembling a "silver Arcturan Megavoidwhale," the luxury ship did not even complete its first radio message (an SOS) before its prototype "Improbability Field" engine triggered a "sudden and gratuitous total existence failure" shortly after launch. It was one of many in-world anecdotes scattered through the third part of the late sci-fi author's revered "trilogy in five parts," The Hitchhiker's Guide to the Galaxy. More than 17 years after its publication, most readers had probably forgotten about Starship Titanic. In 1998, however, the ill-fated intergalactic cruise liner's tale suddenly expanded to include a video game featuring tens of thousands of lines of scripted dialogue, hours of recorded vocal performances, and a standalone 223-page novel written by Monty Python's Terry Jones.
Enhanced Intrusion Detection System for Multiclass Classification in UAV Networks
Menssouri, Safaa, Delamou, Mamady, Ibrahimi, Khalil, Amhoud, El Mehdi
Unmanned aerial vehicles (UAVs) have become increasingly popular in various applications, especially with the emergence of 6G systems and networks. However, their widespread adoption has also raised concerns about security vulnerabilities, making reliable intrusion detection systems (IDS) essential for ensuring UAV safety and mission success. This paper presents a new IDS for UAV networks. A binary-tuple representation is used to encode class labels, along with a deep-learning-based approach for classification. The proposed system enhances intrusion detection by capturing complex class relationships and temporal network patterns. Moreover, a cross-correlation study between common features of different UAVs was conducted to discard correlated features that might mislead the classification of the proposed IDS. The full study was carried out on the UAV-IDS-2020 dataset, and we assessed the performance of the proposed IDS using several evaluation metrics. The experimental results highlight the effectiveness of the proposed multiclass classifier, which reaches an accuracy of 95%.
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Lin, Chaofan, Han, Zhenhua, Zhang, Chengruidong, Yang, Yuqing, Yang, Fan, Chen, Chen, Qiu, Lili
The rise of large language models (LLMs) has enabled LLM-based applications (a.k.a. AI agents or co-pilots), a new software paradigm that combines the strengths of LLMs and conventional software. Diverse LLM applications from different tenants can design complex workflows using multiple LLM requests to accomplish one task. However, they have to use the over-simplified request-level API provided by today's public LLM services, losing essential application-level information. Public LLM services therefore blindly optimize individual LLM requests, leading to sub-optimal end-to-end performance of LLM applications. This paper introduces Parrot, an LLM service system that focuses on the end-to-end experience of LLM-based applications. Parrot proposes the Semantic Variable, a unified abstraction that exposes application-level knowledge to public LLM services. A Semantic Variable annotates an input or output variable in the prompt of a request and creates a data pipeline when connecting multiple LLM requests, providing a natural way to program LLM applications. Exposing Semantic Variables to the public LLM service allows it to perform conventional data-flow analysis to uncover correlations across multiple LLM requests. These correlations open a brand-new optimization space for the end-to-end performance of LLM-based applications. Extensive evaluations demonstrate that Parrot achieves up to an order-of-magnitude improvement for popular and practical use cases of LLM applications.
Domain Bridge: Generative model-based domain forensic for black-box models
Zhang, Jiyi, Fang, Han, Chang, Ee-Chien
In forensic investigations of machine learning models, techniques that determine a model's data domain play an essential role, with prior work relying on large-scale corpora like ImageNet to approximate the target model's domain. Although such methods are effective in finding broad domains, they often struggle to identify finer-grained classes within those domains. In this paper, we introduce an enhanced approach that determines not just the general data domain (e.g., human face) but also its specific attributes (e.g., wearing glasses). Our approach uses an image embedding model as the encoder and a generative model as the decoder. Beginning with a coarse-grained description, the decoder generates a set of images, which are then presented to the unknown target model. Successful classifications by the model guide the encoder to refine the description, which, in turn, is used to produce a more specific set of images in the subsequent iteration. This iterative refinement narrows down the exact class of interest. A key strength of our approach lies in leveraging the expansive dataset, LAION-5B, on which the generative model Stable Diffusion is trained. This enlarges our search space beyond traditional corpora such as ImageNet. Empirical results showcase our method's performance in identifying specific attributes of a model's input domain, paving the way for more detailed forensic analyses of deep learning models.
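The iterative refinement loop can be sketched in a few lines. The proposer and scorer below are toy stand-ins: in the paper's setup, the decoder generating candidates would be Stable Diffusion rendering images from a description, and the scorer would be the black-box target model classifying them.

```python
def refine_description(start: str, propose, target_score, rounds: int = 5) -> str:
    """Iteratively specialize a description by querying a black-box target."""
    desc, score = start, target_score(start)
    for _ in range(rounds):
        candidates = propose(desc)                # decoder: more specific variants
        best = max(candidates, key=target_score)  # query the unknown target model
        if target_score(best) <= score:
            break                                 # no candidate improved; stop
        desc, score = best, target_score(best)
    return desc

# Toy black-box target that "recognizes" faces wearing glasses.
target_score = lambda d: ("face" in d) + ("glasses" in d)
propose = lambda d: [d + " face", d + " glasses", d + " car"]
print(refine_description("human", propose, target_score))  # human face glasses
```

The loop converges once no more specific description raises the target's acceptance, which is the stopping intuition behind "narrowing down the exact class of interest."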
ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback
Jiao, Wenxiang, Huang, Jen-tse, Wang, Wenxuan, He, Zhiwei, Liang, Tian, Wang, Xing, Shi, Shuming, Tu, Zhaopeng
Large language models (LLMs) like ChatGPT have exhibited remarkable abilities on a wide range of natural language processing (NLP) tasks, including various machine translation abilities accomplished during chat. However, these models are only accessible through restricted APIs, which creates barriers to new research and advancements in the field. Therefore, we propose ParroT, a framework to enhance and regulate the translation abilities during chat based on open-source LLMs (e.g., LLaMA), human-written translations, and feedback data. Specifically, ParroT reformulates translation data into the instruction-following style and introduces a "Hint" field for incorporating extra requirements to regulate the translation process. Accordingly, we propose three instruction types for finetuning ParroT models: translation instruction, contrastive instruction, and error-guided instruction. Experiments on Flores subsets and WMT22 test sets suggest that translation instruction significantly improves the translation performance of vanilla LLMs, while error-guided instruction leads to further improvement, demonstrating the importance of learning from low-quality translations annotated by humans. We also demonstrate the potential of automatic evaluation tools to provide quality information about translations when constructing error-guided instructions for directions that lack human annotation data. Please refer to our GitHub project for more implementation details: https://github.com/wxjiao/ParroT