Sarajevo
Illegal Waste Detection in Remote Sensing Images: A Case Study
Gibellini, Federico, Fraternali, Piero, Boracchi, Giacomo, Morandini, Luca, Diecidue, Andrea, Malegori, Simona
Environmental crime currently represents the third-largest criminal activity worldwide while threatening ecosystems as well as human health. Among the crimes related to this activity, improper waste management can nowadays be countered more easily thanks to the increasing availability and decreasing cost of Very-High-Resolution Remote Sensing images, which enable semi-automatic territory scanning in search of illegal landfills. This paper proposes a pipeline, developed in collaboration with professionals from a local environmental agency, for detecting candidate illegal dumping sites leveraging a classifier of Remote Sensing images. To identify the best configuration for such a classifier, an extensive set of experiments was conducted and the impact of diverse image characteristics and training settings was thoroughly analyzed. The local environmental agency was then involved in an experimental exercise where outputs from the developed classifier were integrated into the experts' everyday work, resulting in time savings with respect to manual photo-interpretation. The classifier was eventually run with valuable results on a location outside of the training area, highlighting potential for cross-border applicability of the proposed pipeline.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Italy > Lombardy (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Research Report (1.00)
- Overview (0.93)
- Water & Waste Management > Solid Waste Management (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.93)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Deep Learning Models for Physical Layer Communications
The increased availability of data and computing resources has enabled researchers to successfully adopt machine learning (ML) techniques and make significant contributions in several engineering areas. ML and in particular deep learning (DL) algorithms have been shown to perform better in tasks where a physical bottom-up description of the phenomenon is lacking and/or is mathematically intractable. Indeed, they take advantage of the observations of natural phenomena to automatically acquire knowledge and learn internal relations. Despite the historical model-based mindset, communications engineering recently started shifting the focus towards top-down data-driven learning models, especially in domains such as channel modeling and physical layer design, where in most cases no general optimal strategies are known. In this thesis, we aim at solving some fundamental open challenges in physical layer communications exploiting new DL paradigms. In particular, we mathematically formulate, under ML terms, classic problems such as channel capacity and optimal coding-decoding schemes, for any arbitrary communication medium. We design and develop the architecture, algorithm and code necessary to train the equivalent DL model, and finally, we propose novel solutions to long-standing problems in the field.
- Africa > Chad > Salamat (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Spain > Andalusia > Málaga Province > Málaga (0.04)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Research Report > Promising Solution (0.87)
- Energy > Power Industry (1.00)
- Aerospace & Defense (1.00)
- Information Technology (0.92)
Diffusion Instruction Tuning
Jin, Chen, Tanno, Ryutaro, Saseendran, Amrutha, Diethe, Tom, Teare, Philip
We introduce Lavender, a simple supervised fine-tuning (SFT) method that boosts the performance of advanced vision-language models (VLMs) by leveraging state-of-the-art image generation models such as Stable Diffusion. Specifically, Lavender aligns the text-vision attention in the VLM transformer with the equivalent used by Stable Diffusion during SFT, instead of adapting separate encoders. This alignment enriches the model's visual understanding and significantly boosts performance across in- and out-of-distribution tasks. Lavender requires just 0.13 million training examples, 2.5% of typical large-scale SFT datasets, and fine-tunes on standard hardware (8 GPUs) in a single day. It consistently improves state-of-the-art open-source multimodal LLMs (e.g., Llama-3.2-11B, MiniCPM-Llama3-v2.5), achieving up to 30% gains and a 68% boost on challenging out-of-distribution medical QA tasks. By efficiently transferring the visual expertise of image generators with minimal supervision, Lavender offers a scalable solution for more accurate vision-language systems. All code, training data, and models will be shared at https://astrazeneca.github.io/vlm/.
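The attention alignment described above can be illustrated with a minimal sketch. It assumes the alignment is a mean-squared-error penalty between a VLM attention map and a diffusion-model attention map, added to the usual SFT loss; the abstract does not state the loss form, so the function names, the `weight` parameter, and the toy attention values below are all hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equally shaped 2-D maps (lists of rows)."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def combined_loss(sft_loss, vlm_attention, diffusion_attention, weight=1.0):
    """SFT loss plus a weighted attention-alignment penalty.

    Assumption: the alignment term is a plain MSE; `weight` is a
    hypothetical hyperparameter, not taken from the abstract.
    """
    return sft_loss + weight * mse(vlm_attention, diffusion_attention)


# Toy 2x2 attention maps (rows: text tokens, columns: image patches).
vlm_attn = [[0.1, 0.9], [0.5, 0.5]]
diffusion_attn = [[0.2, 0.8], [0.5, 0.5]]

alignment = mse(vlm_attn, diffusion_attn)
total = combined_loss(2.0, vlm_attn, diffusion_attn, weight=0.5)
```

In this sketch the diffusion attention acts as a fixed target, so only the VLM side would receive gradients in a real training loop.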
- Asia > China (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Real-Time Sampling-Based Safe Motion Planning for Robotic Manipulators in Dynamic Environments
Covic, Nermin, Lacevic, Bakir, Osmankovic, Dinko, Uzunovic, Tarik
In this paper, we present the main features of the Dynamic Rapidly-exploring Generalized Bur Tree (DRGBT) algorithm, a sampling-based planner for dynamic environments. We provide a detailed time analysis and appropriate scheduling to facilitate real-time operation. To this end, an extensive analysis is conducted to identify the time-critical routines and their dependence on the number of obstacles. Furthermore, information about the distance to obstacles is used to compute a structure called dynamic expanded bubble of free configuration space, which is then utilized to establish sufficient conditions for a guaranteed safe motion of the robot while satisfying all kinematic constraints. An extensive randomized simulation trial is conducted to compare the proposed algorithm to a competing state-of-the-art method. Finally, an experimental study on a real robot is carried out covering a variety of scenarios including those with human presence. The results show the effectiveness and feasibility of real-time execution of the proposed motion planning algorithm within a typical sensor-based arrangement, using cheap hardware and sequential architecture, without the necessity for GPUs or heavy parallelization.
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.05)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- North America > United States > New York (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.86)
Knowledge Distillation for Real-Time Classification of Early Media in Voice Communications
Altwlkany, Kemal, Hadžić, Hadžem, Kurić, Amar, Lacic, Emanuel
This paper investigates the industrial setting of real-time classification of early media exchanged during the initialization phase of voice calls. We explore the application of state-of-the-art audio tagging models and highlight some limitations when applied to the classification of early media. While most existing approaches leverage convolutional neural networks, we propose a novel approach for low-resource requirements based on gradient-boosted trees. Our approach not only demonstrates a substantial improvement in runtime performance, but also exhibits comparable accuracy. We show that leveraging knowledge distillation and class aggregation techniques to train a simpler and smaller model accelerates the classification of early media in voice calls. We provide a detailed analysis of the results on a proprietary dataset and a publicly available one, regarding accuracy and runtime performance. We additionally report a case study of the achieved performance improvements at a regional data center in India.
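As a rough illustration of how knowledge distillation and class aggregation can fit together, the sketch below sums a teacher model's fine-grained class probabilities into coarser groups and derives the soft and hard labels a smaller student (for example, a gradient-boosted tree model) could be trained on. The class names, the grouping, and the probability values are hypothetical and not taken from the paper.

```python
from collections import defaultdict

# Hypothetical mapping from fine-grained teacher classes to coarse groups.
CLASS_TO_GROUP = {
    "ringback_tone": "tone",
    "busy_tone": "tone",
    "voicemail_greeting": "speech",
    "operator_announcement": "speech",
    "music": "music",
}


def aggregate_classes(teacher_probs, class_to_group):
    """Sum fine-grained teacher probabilities into their coarse groups."""
    grouped = defaultdict(float)
    for label, prob in teacher_probs.items():
        grouped[class_to_group[label]] += prob
    return dict(grouped)


def distillation_target(teacher_probs, class_to_group):
    """Return the aggregated soft label and its most probable group."""
    soft = aggregate_classes(teacher_probs, class_to_group)
    hard = max(soft, key=soft.get)
    return soft, hard


# Made-up teacher output for one audio clip.
teacher_output = {
    "ringback_tone": 0.30,
    "busy_tone": 0.25,
    "voicemail_greeting": 0.20,
    "operator_announcement": 0.15,
    "music": 0.10,
}

soft_label, hard_label = distillation_target(teacher_output, CLASS_TO_GROUP)
```

Aggregating before distillation shrinks the student's label space, which is one plausible reason a smaller model can stay accurate; whether the paper aggregates in exactly this way is an assumption.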
- Asia > India (0.25)
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.05)
- Europe > Croatia > Zagreb County > Zagreb (0.04)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
- Telecommunications (0.68)
- Information Technology (0.49)
- Media (0.47)
- Leisure & Entertainment (0.47)
Towards Probabilistic Planning of Explanations for Robot Navigation
Halilovic, Amar, Krivic, Senka
In robotics, ensuring that autonomous systems are comprehensible and accountable to users is essential for effective human-robot interaction. This paper introduces a novel approach that integrates user-centered design principles directly into the core of robot path planning processes. We propose a probabilistic framework for automated planning of explanations for robot navigation, where the preferences of different users regarding explanations are probabilistically modeled to tailor the stochasticity of the real-world human-robot interaction and the communication of decisions of the robot and its actions towards humans. This approach aims to enhance the transparency of robot path planning and adapt to diverse user explanation needs by anticipating the types of explanations that will satisfy individual users.
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.05)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Europe > Germany (0.04)
News-Driven Stock Price Forecasting in Indian Markets: A Comparative Study of Advanced Deep Learning Models
Attaluri, Kaushal, Tripathi, Mukesh, Reddy, Srinithi, Shivendra
Forecasting stock market prices remains a complex challenge for traders, analysts, and engineers due to the multitude of factors that influence price movements. Recent advancements in artificial intelligence (AI) and natural language processing (NLP) have significantly enhanced stock price prediction capabilities. AI's ability to process vast and intricate data sets has led to more sophisticated forecasts. However, achieving consistently high accuracy in stock price forecasting remains elusive. In this paper, we leverage 30 years of historical data from national banks in India, sourced from the National Stock Exchange, to forecast stock prices. Our approach utilizes state-of-the-art deep learning models, including multivariate multi-step Long Short-Term Memory (LSTM), Facebook Prophet with LightGBM optimized through Optuna, and Seasonal Auto-Regressive Integrated Moving Average (SARIMA). We further integrate sentiment analysis from tweets and reliable financial sources such as Business Standard and Reuters, acknowledging their crucial influence on stock price fluctuations.
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
- Asia > India > Telangana > Hyderabad (0.05)
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- Asia > India > Bihar (0.04)
A Recurrent Neural Network Approach to the Answering Machine Detection Problem
Altwlkany, Kemal, Delalic, Sead, Selmanovic, Elmedin, Alihodzic, Adis, Lovric, Ivica
In telecommunications and cloud communications, detecting accurately and in real time whether a human or an answering machine has answered an outbound call is of paramount importance. This problem is particularly significant during campaigns, as precise caller identification enhances service quality and efficiency and reduces cost. Despite its significance, the problem remains inadequately explored in the existing literature. This paper presents an innovative approach to answering machine detection that leverages transfer learning through the YAMNet model for feature extraction. The YAMNet architecture facilitates the training of a recurrent-based classifier, enabling real-time processing of audio streams, as opposed to fixed-length recordings. The results demonstrate an accuracy of over 96% on the test set. Furthermore, we conduct an in-depth analysis of misclassified samples and reveal that an accuracy exceeding 98% can be achieved with the integration of a silence detection algorithm, such as the one provided by FFmpeg.
- North America > United States (0.14)
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- South America > Peru (0.04)
Multi-Class Plant Leaf Disease Detection: A CNN-based Approach with Mobile App Integration
Foysal, Md Aziz Hosen, Ahmed, Foyez, Haque, Md Zahurul
Prompt and accurate detection is crucial for the efficient management and mitigation of plant diseases. This study investigates advanced techniques in plant disease detection, emphasizing the integration of image processing, machine learning, deep learning methods, and mobile technologies. High-resolution images of plant leaves were captured and analyzed using convolutional neural networks (CNNs) to detect symptoms of various diseases, such as blight, mildew, and rust. This study explores 14 classes of plants and diagnoses 26 unique plant diseases. We focus on common diseases affecting various crops. The model was trained on a diverse dataset encompassing multiple crops and disease types, achieving 98.14% accuracy in disease diagnosis. Finally, we integrated this model into mobile apps for real-time disease diagnosis.
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- Asia > India > West Bengal > Kolkata (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- Health & Medicine (1.00)
- Food & Agriculture > Agriculture (1.00)
The Language of Trauma: Modeling Traumatic Event Descriptions Across Domains with Explainable AI
Schirmer, Miriam, Leemann, Tobias, Kasneci, Gjergji, Pfeffer, Jürgen, Jurgens, David
Psychological trauma can manifest following various distressing events and is captured in diverse online contexts. However, studies traditionally focus on a single aspect of trauma, often neglecting the transferability of findings across different scenarios. We address this gap by training language models of increasing complexity on trauma-related datasets, including genocide-related court data, a Reddit dataset on post-traumatic stress disorder (PTSD), counseling conversations, and Incel forum posts. Our results show that the fine-tuned RoBERTa model excels in predicting traumatic events across domains, slightly outperforming large language models like GPT-4. Additionally, SLALOM-feature scores and conceptual explanations effectively differentiate and cluster trauma-related language, highlighting different trauma aspects and identifying sexual abuse and experiences related to death as common traumatic events across all datasets. This transferability is crucial as it allows for the development of tools to enhance trauma detection and intervention in diverse populations and settings.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Cambodia (0.04)
- North America > United States > Michigan (0.04)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)