 Kazakhstan


Sentiment analysis of texts from social networks based on machine learning methods for monitoring public sentiment

arXiv.org Artificial Intelligence

A sentiment analysis system powered by machine learning was created in this study to improve real-time social network public opinion monitoring. For sophisticated sentiment identification, the suggested approach combines cutting-edge transformer-based architectures (DistilBERT, RoBERTa) with traditional machine learning models (Logistic Regression, SVM, Naive Bayes). The system achieved an accuracy of up to 80-85% using transformer models in real-world scenarios after being tested using both deep learning techniques and standard machine learning processes on annotated social media datasets. According to experimental results, deep learning models perform noticeably better than lexicon-based and conventional rule-based classifiers, lowering misclassification rates and enhancing the ability to recognize nuances like sarcasm. According to feature importance analysis, context tokens, sentiment-bearing keywords, and part-of-speech structure are essential for precise categorization. The findings confirm that AI-driven sentiment frameworks can provide a more adaptive and efficient approach to modern sentiment challenges. Despite the system's impressive performance, issues with computing overhead, data quality, and domain-specific terminology still exist. In order to monitor opinions on a broad scale, future research will investigate improving computing performance, extending coverage to various languages, and integrating real-time streaming APIs. The results demonstrate that governments, corporations, and social researchers looking for more in-depth understanding of public mood on digital platforms can find a reliable and adaptable answer in AI-powered sentiment analysis.
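As a rough illustration of the traditional baselines named in the abstract above, the following is a minimal sketch of a multinomial Naive Bayes sentiment classifier in plain Python. It is not the study's implementation: the whitespace tokenizer, the add-one smoothing, the function names, and the four-sentence toy dataset are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    class_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # per-label token frequencies
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()   # naive whitespace tokenizer (assumption)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for `text`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the study's datasets.
toy_data = [
    ("great service love it", "positive"),
    ("love this phone", "positive"),
    ("terrible support hate it", "negative"),
    ("hate the delays", "negative"),
]
model = train_naive_bayes(toy_data)
print(predict(model, "love the service"))
print(predict(model, "hate it"))
```

A bag-of-words model like this ignores word order and context, which is one reason such baselines tend to miss nuances like sarcasm that the abstract says the transformer models handle better.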


Russia-Ukraine war: List of key events – day 1,091

Al Jazeera

Kyiv also said that Russian forces launched two missile strikes and 72 air strikes, and used 1,024 kamikaze drones, along with 4,200 artillery attacks that targeted Ukrainian positions and settlements, AA reports. In Ukraine's Kharkiv region, Ukrainian forces said they prevented Russian advances towards Mala Shapkivka and Topoli, while Moscow's troops launched 16 attacks in Ukraine's Kupiansk region, with Kyiv's forces claiming to have repelled 14, as battles continue, Anadolu reports. Russia said oil flows through the Caspian Pipeline Consortium, a major route for supplying Kazakhstan and exporting to the global market, have been reduced by 30 to 40 percent after a Ukrainian drone attack on a pumping station. The Caspian pipeline, which ships more than 1 percent of daily global oil supplies, stretches over 1,500km (939 miles) and carries crude oil from Kazakhstan's Tengiz oilfield on Russia's northeastern shores of the Caspian Sea as well as from Russian producers. Freedom in Russia and the end of Russian President Vladimir Putin's government depend on Ukraine winning the war, former chess world champion and Kremlin critic Garry Kasparov said.


Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh

arXiv.org Artificial Intelligence

Instruction tuning in low-resource languages remains underexplored due to limited text data, particularly in government and cultural domains. To address this, we introduce and open-source a large-scale (10,600 samples) instruction-following (IFT) dataset, covering key institutional and cultural knowledge relevant to Kazakhstan. Our dataset enhances LLMs' understanding of procedural, legal, and structural governance topics. We employ LLM-assisted data generation, comparing open-weight and closed-weight models for dataset construction, and select GPT-4o as the backbone. Each entity of our dataset undergoes full manual verification to ensure high quality. We also show that fine-tuning Qwen, Falcon, and Gemma on our dataset leads to consistent performance improvements in both multiple-choice and generative tasks, demonstrating the potential of LLM-assisted instruction tuning for low-resource languages.


Qorgau: Evaluating LLM Safety in Kazakh-Russian Bilingual Contexts

arXiv.org Artificial Intelligence

Large language models (LLMs) are known to have the potential to generate harmful content, posing risks to users. While significant progress has been made in developing taxonomies for LLM risks and safety evaluation prompts, most studies have focused on monolingual contexts, primarily in English. However, language- and region-specific risks in bilingual contexts are often overlooked, and core findings can diverge from those in monolingual settings. In this paper, we introduce Qorgau, a novel dataset specifically designed for safety evaluation in Kazakh and Russian, reflecting the unique bilingual context in Kazakhstan, where both Kazakh (a low-resource language) and Russian (a high-resource language) are spoken. Experiments with both multilingual and language-specific LLMs reveal notable differences in safety performance, emphasizing the need for tailored, region-specific datasets to ensure the responsible and safe deployment of LLMs in countries like Kazakhstan. Warning: this paper contains example data that may be offensive, harmful, or biased.


KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan

arXiv.org Artificial Intelligence

Despite having a population of twenty million, Kazakhstan's culture and language remain underrepresented in the field of natural language processing. Although large language models (LLMs) continue to advance worldwide, progress in the Kazakh language has been limited, as seen in the scarcity of dedicated models and benchmark evaluations. To address this gap, we introduce KazMMLU, the first MMLU-style dataset specifically designed for the Kazakh language. KazMMLU comprises 23,000 questions that cover various educational levels, including STEM, humanities, and social sciences, sourced from authentic educational materials and manually validated by native speakers and educators. The dataset includes 10,969 Kazakh questions and 12,031 Russian questions, reflecting Kazakhstan's bilingual education system and rich local context. Our evaluation of several state-of-the-art multilingual models (Llama-3.1, Qwen-2.5, GPT-4, and DeepSeek V3) demonstrates substantial room for improvement, as even the best-performing models struggle to achieve competitive performance in Kazakh and Russian. These findings underscore significant performance gaps compared to high-resource languages. We hope that our dataset will enable further research and development of Kazakh-centric LLMs. Data and code will be made available upon acceptance.


LLM Modules: Knowledge Transfer from a Large to a Small Model using Enhanced Cross-Attention

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated outstanding performance in natural language processing tasks; however, their training and deployment require significant computational resources. This has led to the need for methods that transfer knowledge from large pre-trained models to smaller models. Such approaches are especially relevant for applied tasks with limited computational resources. In this work, we propose a modular LLM architecture in which a large model serves as a knowledge source, while a smaller model receives external representations via Enhanced Cross-Attention and generates responses. This method significantly reduces training costs while remaining effective for solving specific business tasks.
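The abstract above does not describe what makes its cross-attention "enhanced", so the following is only a minimal sketch of ordinary scaled dot-product cross-attention, the general mechanism by which one model's states can attend over another model's representations. The function names, the tiny 2-dimensional vectors, and the use of the large model's states directly as keys and values (with no learned projections) are assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over keys/values.

    queries: m vectors of size d (hypothetically from the small model)
    keys, values: n vectors each (hypothetically from the large model)
    """
    d = len(queries[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the value vectors
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Hypothetical toy states; real models would use learned projections.
small_states = [[1.0, 0.0], [0.0, 1.0]]
large_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = cross_attention(small_states, large_states, large_states)
print(result)
```

In this arrangement only the query side belongs to the smaller model, so the large model can in principle stay frozen while the small model learns to read from its representations, which is consistent with the reduced training cost the abstract claims.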


Kazakhstan plane crash survivors say they heard bangs before aircraft went down

FOX News

Fox News correspondent Stephanie Bennett has the latest on the aftermath of the Kazakhstan plane crash on 'Special Report.' Crew members and survivors of the Azerbaijan Airlines plane that crashed in Kazakhstan on Christmas Day say they heard at least one loud bang before the aircraft crashed in a ball of fire, heightening speculation that a Russian anti-aircraft missile may have been responsible for the tragedy. The Embraer 190 passenger jet flying from Azerbaijan to Russia crashed near the city of Aktau in Kazakhstan after diverting from an area of southern Russia where Moscow has repeatedly used air defense systems against Ukrainian attack drones. At least 38 people were killed while 29 survived. Subhonkul Rakhimov, one of the passengers aboard Flight J2-8243, told Reuters from the hospital that he had begun to recite prayers and prepare for the end after hearing a bang.


Did Russian air defence down the Azerbaijani plane in Kazakhstan?

Al Jazeera

Kyiv, Ukraine – Russian air defence officials could very possibly have struck an Azerbaijani passenger jet over Chechnya after panicking during a Ukrainian drone attack, analysts and experts from Ukraine, Kazakhstan and Azerbaijan have told Al Jazeera. Moscow might have also compounded what one expert described as a "crime" by not letting the damaged plane land nearby and instead forcing it to fly to Kazakhstan. The analysis by these experts comes amid mounting reports quoting unnamed Azerbaijani officials and other analysts pointing fingers at Russia for the crash, in which at least 38 people were killed. The Kremlin claimed that the AZAL 8432 flight with 67 passengers on board hit a flock of birds early Wednesday after it entered Russian airspace to land in Grozny, Chechnya's administrative capital. But within hours, photos and videos of the plane surfaced, apparently showing deep holes and multiple pockmarks on its tail.


Development of a Service Robot for Hospital Environments in Rehabilitation Medicine with LiDAR Based Simultaneous Localization and Mapping

arXiv.org Artificial Intelligence

This paper presents the development and evaluation of a medical service robot equipped with 3D LiDAR and advanced localization capabilities for use in hospital environments. The robot employs LiDAR-based Simultaneous Localization and Mapping (SLAM) to navigate autonomously and interact effectively within complex and dynamic healthcare settings. A comparative analysis with established 3D SLAM technology in Autoware version 1.14.0, under a Linux ROS framework, provided a benchmark for evaluating our system's performance. The adaptation of Normal Distribution Transform (NDT) Matching to indoor navigation allowed for precise real-time mapping and enhanced obstacle avoidance capabilities. Empirical validation was conducted through manual maneuvers in various environments, supplemented by ROS simulations to test the system's response to simulated challenges. The findings demonstrate that the robot's integration of 3D LiDAR and NDT Matching significantly improves navigation accuracy and operational reliability in a healthcare context. This study highlights the robot's ability to perform essential tasks with high efficiency and identifies potential areas for further improvement, particularly in sensor performance under diverse environmental conditions. The successful deployment of this technology in a hospital setting illustrates its potential to support medical staff and contribute to patient care, suggesting a promising direction for future research and development in healthcare robotics.


NUSense: Robust Soft Optical Tactile Sensor

arXiv.org Artificial Intelligence

While most tactile sensors rely on measuring pressure, insights from continuum mechanics suggest that measuring shear strain provides critical information for tactile sensing. In this work, we introduce an optical tactile sensing principle based on shear strain detection. A silicone rubber layer, dyed with color inks, is used to quantify the shear magnitude of the sensing layer. This principle was validated using the NUSense camera-based tactile sensor. The wide-angle camera captures the elongation of the soft pad under mechanical load, a phenomenon attributed to the Poisson effect. The physical and optical properties of the inked pad are essential and should ideally remain stable over time. We tested the robustness of the sensor by subjecting the outermost layer to multiple load cycles using a robot arm. Additionally, we discussed potential applications of this sensor in force sensing and contact localization.