Uzbekistan
Harnessing LLMs for Educational Content-Driven Italian Crossword Generation
Zeinalipour, Kamyar, Fusco, Achille, Zanollo, Asya, Maggini, Marco, Gori, Marco
In this work, we unveil a novel tool for generating Italian crossword puzzles from text, utilizing advanced language models such as GPT-4o, Mistral-7B-Instruct-v0.3, and Llama3-8b-Instruct. Crafted specifically for educational applications, this cutting-edge generator makes use of the comprehensive Italian-Clue-Instruct dataset, which comprises over 30,000 entries including diverse text, solutions, and types of clues. This carefully assembled dataset is designed to facilitate the creation of contextually relevant clues in various styles associated with specific texts and keywords. The study delves into four distinctive styles of crossword clues: those without format constraints, those formed as definite determiner phrases, copular sentences, and bare noun phrases. Each style introduces unique linguistic structures to diversify clue presentation. Given the lack of sophisticated educational tools tailored to the Italian language, this project seeks to enhance learning experiences and cognitive development through an engaging, interactive platform. By meshing state-of-the-art AI with contemporary educational strategies, our tool can dynamically generate crossword puzzles from Italian educational materials, thereby providing an enjoyable and interactive learning environment. This technological advancement not only redefines educational paradigms but also sets a new benchmark for interactive and cognitive language learning solutions.
Guaranteed Generation from Large Language Models
Kim, Minbeom, Thonet, Thibaut, Rozen, Jos, Lee, Hwaran, Jung, Kyomin, Dymetman, Marc
As large language models (LLMs) are increasingly used across various applications, there is a growing need to control text generation to satisfy specific constraints or requirements. This raises a crucial question: Is it possible to guarantee strict constraint satisfaction in generated outputs while preserving the distribution of the original model as much as possible? We first define the ideal distribution - the one closest to the original model, which also always satisfies the expressed constraint - as the ultimate goal of guaranteed generation. We then state a fundamental limitation, namely that it is impossible to reach that goal through autoregressive training alone. This motivates the necessity of combining training-time and inference-time methods to enforce such guarantees. Based on this insight, we propose GUARD, a simple yet effective approach that combines an autoregressive proposal distribution with rejection sampling. Through GUARD's theoretical properties, we show how controlling the KL divergence between a specific proposal and the target ideal distribution simultaneously optimizes inference speed and distributional closeness. To validate these theoretical concepts, we conduct extensive experiments on two text generation settings with hard-to-satisfy constraints: a lexical constraint scenario and a sentiment reversal scenario. These experiments show that GUARD achieves perfect constraint satisfaction while almost preserving the ideal distribution with highly improved inference efficiency. GUARD provides a principled approach to enforcing strict guarantees for LLMs without compromising their generative capabilities.
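The combination of a proposal distribution with rejection sampling described in the abstract can be illustrated with a toy sketch. This is not the authors' GUARD implementation: the vocabulary, `toy_proposal`, and `contains_keyword` are hypothetical stand-ins, and a uniform word sampler replaces a real autoregressive model. It only shows why every accepted sample satisfies the constraint, and why the acceptance rate of the proposal governs inference cost.

```python
import random

# Hypothetical toy vocabulary; a real proposal would be an LLM.
VOCAB = ["the", "film", "was", "amazing", "dull", "long"]


def toy_proposal(rng, length=4):
    """Stand-in for an autoregressive proposal: sample tokens one at a time."""
    return [rng.choice(VOCAB) for _ in range(length)]


def contains_keyword(tokens, keyword="amazing"):
    """Hard lexical constraint: the keyword must appear in the output."""
    return keyword in tokens


def rejection_sample(propose, satisfies, max_tries=10_000):
    """Draw from `propose` until `satisfies` accepts.

    Returns the accepted sample and the number of draws it took.
    Accepted samples follow the proposal conditioned on the constraint.
    """
    for tries in range(1, max_tries + 1):
        candidate = propose()
        if satisfies(candidate):
            return candidate, tries
    raise RuntimeError("no candidate satisfied the constraint")


rng = random.Random(0)
sample, tries = rejection_sample(lambda: toy_proposal(rng), contains_keyword)
print(sample, "accepted after", tries, "draw(s)")
```

The closer the proposal already is to the constrained target, the fewer draws are rejected, which is the intuition behind linking distributional closeness to inference speed.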
FeruzaSpeech: A 60 Hour Uzbek Read Speech Corpus with Punctuation, Casing, and Context
This paper introduces FeruzaSpeech, a read speech corpus of the Uzbek language, containing transcripts in both Cyrillic and Latin alphabets, freely available for academic research purposes. This corpus includes 60 hours of high-quality recordings from a single native female speaker from Tashkent, Uzbekistan. These recordings consist of short excerpts from a book and BBC News. This paper discusses the improvement in Word Error Rates (WERs) on CommonVoice 16.1's Uzbek data, Uzbek Speech Corpus data, and FeruzaSpeech data upon integrating FeruzaSpeech.
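For readers unfamiliar with the metric, WER is conventionally the word-level edit distance between a reference transcript and a recognizer's hypothesis, divided by the number of reference words. The sketch below assumes whitespace tokenization and no text normalization; the function name `word_error_rate` is illustrative and not taken from the paper, whose exact scoring setup the abstract does not describe.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution (b -> x) and one deletion (d) over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

Because punctuation and casing count as differences under this naive tokenization, a corpus that preserves them changes what a given WER figure means.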
Review on Application of Drone in Spraying Pesticides and Fertilizers
Today's agriculture involves a great many innovations. One of the emerging technologies is pesticide spraying using drones. Manual pesticide spraying has a number of negative consequences for the people involved in the spraying operation. Exposure symptoms can range from minor skin inflammation to birth abnormalities, tumors, genetic changes, nerve and blood diseases, endocrine disruption, coma, or death. Drones, however, can be used to automate fertilizer application, pesticide spraying, and field tracking. This paper provides a concise overview of the use of drones for field inspection and pesticide spraying. It also presents different methodologies and controllers for agricultural drones and explains some essential drone hardware and software elements and applications.
Inter Subject Emotion Recognition Using Spatio-Temporal Features From EEG Signal
Asif, Mohammad, Srivastava, Diya, Gupta, Aditya, Tiwary, Uma Shanker
Inter-subject, or subject-independent, emotion recognition has been a challenging task in affective computing. This work presents an easy-to-implement emotion recognition model that classifies emotions from EEG signals subject-independently. It is based on the well-known EEGNet architecture, which is used in EEG-related BCIs. We used the 'Dataset on Emotion using Naturalistic Stimuli' (DENS) dataset. The dataset contains the 'Emotional Events', the precise information on the timing of the emotions that participants felt. The model combines regular, depthwise, and separable convolution layers of a CNN to classify the emotions. It has the capacity to learn the spatial features of the EEG channels and the temporal features of the EEG signals' variability over time. The model is evaluated on the valence space ratings, where it achieved an accuracy of 73.04%.
Uzbek text's correspondence with the educational potential of pupils: a case study of the School corpus
Madatov, Khabibulla, Matlatipov, Sanatbek, Aripov, Mersaid
One of the major challenges of an educational system is choosing appropriate content considering pupils' age and intellectual potential. This article describes an experiment on primary school grades (1st to 4th) that automatically determines whether educational materials recommended for pupils are suitable, using the School corpus, which includes a dataset of 25 school textbooks approved by the Ministry of Preschool and School Education of the Republic of Uzbekistan. TF-IDF scores of the texts are computed and converted into a vector representation, and the given educational materials are compared with the corresponding class of the School corpus using cosine similarity. Based on the results of the calculation, it is determined whether or not the given educational material is appropriate for the pupils' educational potential.
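The pipeline named in the abstract (TF-IDF vectors compared by cosine similarity) can be sketched in a few lines of standard-library Python. Everything specific here is an assumption made for illustration: whitespace tokenization, the smoothed IDF formula, the toy English texts, and the helper names `tfidf_vectors`, `cosine_similarity`, and `best_matching_grade`. The abstract does not state the authors' preprocessing or decision threshold.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) so no term gets zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (f / len(tokens)) * idf[t] for t, f in tf.items()})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_matching_grade(grade_texts, material):
    """Score `material` against each grade's text; return (best grade, scores)."""
    grades = list(grade_texts)
    vectors = tfidf_vectors([grade_texts[g] for g in grades] + [material])
    material_vec = vectors[-1]
    scores = {g: cosine_similarity(v, material_vec)
              for g, v in zip(grades, vectors)}
    return max(scores, key=scores.get), scores


# Hypothetical toy "corpus": one short text per grade.
grade_texts = {
    1: "cat dog ball sun",
    4: "fraction equation energy planet",
}
best, scores = best_matching_grade(grade_texts, "the cat and the dog play ball")
print(best, scores)
```

In a real setting each grade would be represented by its full set of textbooks, and a similarity threshold would have to be chosen to decide "appropriate" versus "not appropriate".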
Text classification dataset and analysis for Uzbek language
Kuriyozov, Elmurod, Salaev, Ulugbek, Matlatipov, Sanatbek, Matlatipov, Gayrat
Text classification is an important task in Natural Language Processing (NLP), where the goal is to categorize text data into predefined classes. In this study, we analyze the dataset creation steps and evaluation techniques of a multi-label news categorisation task as part of text classification. We first present a newly obtained dataset for Uzbek text classification, which was collected from 10 different news and press websites and covers 15 categories of news, press and law texts. We also present a comprehensive evaluation of different models, ranging from traditional bag-of-words models to deep learning architectures, on this newly created dataset. Our experiments show that the Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN) based models outperform the rule-based models. The best performance is achieved by the BERTbek model, which is a transformer-based BERT model trained on the Uzbek corpus. Our findings provide a good baseline for further research in Uzbek text classification.
Creating a morphological and syntactic tagged corpus for the Uzbek language
Sharipov, Maksud, Mattiev, Jamolbek, Sobirov, Jasur, Baltayev, Rustam
The creation of tagged corpora is becoming one of the most important tasks in Natural Language Processing (NLP). There are not enough tagged corpora to build machine learning models for the low-resource Uzbek language. In this paper, we try to fill that gap by developing a novel Part Of Speech (POS) and syntactic tagset for creating a syntactically and morphologically tagged corpus of the Uzbek language. This work also includes a detailed description and presentation of a web-based application for tagging. Based on the developed annotation tool and software, we share our experience and results from the first stage of the tagged corpus creation.
Artificial intelligence expert moves to Montreal because it's an AI hub – IAM Network
Irina Rish, now a renowned expert in the field of artificial intelligence, first became drawn to the topic as a teenager in the former Soviet republic of Uzbekistan. At 14, she was fascinated by the notion that machines might have their own thought processes. ... "I didn't know the word yet (algorithm) but that's essentially what it was. How do you solve tough problems?" ... But the other interesting part of that is that you hope that by doing so, you can also better understand the human mind and hopefully achieve better human intelligence.
Artificial intelligence expert moves to Montreal because it's an AI hub
Irina Rish, now a renowned expert in the field of artificial intelligence, first became drawn to the topic as a teenager in the former Soviet republic of Uzbekistan. At 14, she was fascinated by the notion that machines might have their own thought processes. "I was interested in math in school and I was looking at how you improve problem solving and how you come up with algorithms," Rish said in a phone interview Friday afternoon. "I didn't know the word yet (algorithm) but that's essentially what it was. How do you solve tough problems?"