AITopics | Philippines

Supplementary Materials

Neural Information Processing SystemsMar-23-2025, 12:26:19 GMT

We provide the supplements of "Contextual Gaussian Process Bandits with Neural Networks" here. Specifically, we discuss alternative acquisition functions that can be incorporated with the neural network-accompanied Gaussian process (NN-AGP) model in Section 6. In Section 7, we discuss the bandit algorithm with NN-AGP, where the neural network approximation error is considered. In Section 8, we provide the detailed proof of theorems. We provide the experimental details and include additional numerical experiments in Section 9. Last we discuss the limitations of NN-AGP and propose the potential approaches to addressing the limitations for future work, including sparse NN-AGP for alleviating computational burdens and transfer learning with NN-AGP to address cold-start issue; see Section 10. In the main text, we employ the upper confidence bound function as the acquisition function in the contextual Bayesian optimization approach. Here, we provide two alternative choices: Thompson sampling (TS) and knowledge gradient (KG). We describe the two procedures of the contextual GP bandit problems with NN-AGP, where the acquisition function is replaced by TS or KG. It chooses the action that maximizes the expected reward with respect to a random belief that is drawn for a posterior distribution.

data mining, machine learning, optimization, (19 more...)

Neural Information Processing Systems

Country: Asia > Philippines > Luzon > National Capital Region (0.45)

Industry:

Health & Medicine (0.49)
Education (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

USAM-Net: A U-Net-based Network for Improved Stereo Correspondence and Scene Depth Estimation using Features from a Pre-trained Image Segmentation network

Dayo, Joseph Emmanuel DL, Naval, Prospero C. Jr

arXiv.org Artificial IntelligenceMar-19-2025

The increasing demand for high-accuracy depth estimation in autonomous driving and augmented reality applications necessitates advanced neural architectures capable of effectively leveraging multiple data modalities. In this context, we introduce the Unified Segmentation Attention Mechanism Network (USAM-Net), a novel convolutional neural network that integrates stereo image inputs with semantic segmentation maps and attention to enhance depth estimation performance. USAM-Net employs a dual-pathway architecture, which combines a pre-trained segmentation model (SAM) and a depth estimation model. The segmentation pathway preprocesses the stereo images to generate semantic masks, which are then concatenated with the stereo images as inputs to the depth estimation pathway. This integration allows the model to focus on important features such as object boundaries and surface textures which are crucial for accurate depth perception. Empirical evaluation on the DrivingStereo dataset demonstrates that USAM-Net achieves superior performance metrics, including a Global Difference (GD) of 3.61\% and an End-Point Error (EPE) of 0.88, outperforming traditional models such as CFNet, SegStereo, and iResNet. These results underscore the effectiveness of integrating segmentation information into stereo depth estimation tasks, highlighting the potential of USAM-Net in applications demanding high-precision depth data.

artificial intelligence, image understanding, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2503.1495

Country:

Asia > Philippines (0.14)
Europe > Germany > Baden-Württemberg (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology (0.49)
Transportation (0.35)
Automobiles & Trucks (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Reinforcement Learning Environment with LLM-Controlled Adversary in D&D 5th Edition Combat

Dayo, Joseph Emmanuel DL, Ogbinar, Michel Onasis S., Naval, Prospero C. Jr

arXiv.org Artificial IntelligenceMar-19-2025

The objective of this study is to design and implement a reinforcement learning (RL) environment using D\&D 5E combat scenarios to challenge smaller RL agents through interaction with a robust adversarial agent controlled by advanced Large Language Models (LLMs) like GPT-4o and LLaMA 3 8B. This research employs Deep Q-Networks (DQN) for the smaller agents, creating a testbed for strategic AI development that also serves as an educational tool by simulating dynamic and unpredictable combat scenarios. We successfully integrated sophisticated language models into the RL framework, enhancing strategic decision-making processes. Our results indicate that while RL agents generally outperform LLM-controlled adversaries in standard metrics, the strategic depth provided by LLMs significantly enhances the overall AI capabilities in this complex, rule-based setting. The novelty of our approach and its implications for mastering intricate environments and developing adaptive strategies are discussed, alongside potential innovations in AI-driven interactive simulations. This paper aims to demonstrate how integrating LLMs can create more robust and adaptable AI systems, providing valuable insights for further research and educational applications.

agent, llm, rl agent, (12 more...)

arXiv.org Artificial Intelligence

2503.15726

Country: Asia > Philippines (0.14)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education (0.84)
Government > Military (0.70)
Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An Enhancement of Jiang, Z., et al.s Compression-Based Classification Algorithm Applied to News Article Categorization

Benavides, Sean Lester C., Masapol, Cid Antonio F., Morano, Jonathan C., Cortez, Dan Michael A.

arXiv.org Artificial IntelligenceFeb-20-2025

This study enhances Jiang et al.'s compression-based classification algorithm by addressing its limitations in detecting semantic similarities between text documents. The proposed improvements focus on unigram extraction and optimized concatenation, eliminating reliance on entire document compression. By compressing extracted unigrams, the algorithm mitigates sliding window limitations inherent to gzip, improving compression efficiency and similarity detection. The optimized concatenation strategy replaces direct concatenation with the union of unigrams, reducing redundancy and enhancing the accuracy of Normalized Compression Distance (NCD) calculations. Experimental results across datasets of varying sizes and complexities demonstrate an average accuracy improvement of 5.73%, with gains of up to 11% on datasets containing longer documents. Notably, these improvements are more pronounced in datasets with high-label diversity and complex text structures. The methodology achieves these results while maintaining computational efficiency, making it suitable for resource-constrained environments. This study provides a robust, scalable solution for text classification, emphasizing lightweight preprocessing techniques to achieve efficient compression, which in turn enables more accurate classification.

machine learning, natural language, text classification, (20 more...)

arXiv.org Artificial Intelligence

2502.14444

Country: Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)

Genre: Research Report > Experimental Study (0.34)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.39)

Add feedback

Semantic Decomposition and Selective Context Filtering -- Text Processing Techniques for Context-Aware NLP-Based Systems

Villardar, Karl John

arXiv.org Artificial IntelligenceFeb-19-2025

In this paper, we present two techniques for use in context-aware systems: Semantic Decomposition, which sequentially decomposes input prompts into a structured and hierarchal information schema in which systems can parse and process easily, and Selective Context Filtering, which enables systems to systematically filter out specific irrelevant sections of contextual information that is fed through a system's NLP-based pipeline. We will explore how context-aware systems and applications can utilize these two techniques in order to implement dynamic LLM-to-system interfaces, improve an LLM's ability to generate more contextually cohesive user-facing responses, and optimize complex automated workflows and pipelines.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.14048

Country: Asia > Philippines (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine (1.00)
Banking & Finance (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Add feedback

Supplementary Materials

Neural Information Processing SystemsFeb-9-2025, 20:27:22 GMT

We provide the supplements of "Contextual Gaussian Process Bandits with Neural Networks" here. Specifically, we discuss alternative acquisition functions that can be incorporated with the neural network-accompanied Gaussian process (NN-AGP) model in Section 6. In Section 7, we discuss the bandit algorithm with NN-AGP, where the neural network approximation error is considered. In Section 8, we provide the detailed proof of theorems. We provide the experimental details and include additional numerical experiments in Section 9. Last we discuss the limitations of NN-AGP and propose the potential approaches to addressing the limitations for future work, including sparse NN-AGP for alleviating computational burdens and transfer learning with NN-AGP to address cold-start issue; see Section 10. In the main text, we employ the upper confidence bound function as the acquisition function in the contextual Bayesian optimization approach. Here, we provide two alternative choices: Thompson sampling (TS) and knowledge gradient (KG). We describe the two procedures of the contextual GP bandit problems with NN-AGP, where the acquisition function is replaced by TS or KG. It chooses the action that maximizes the expected reward with respect to a random belief that is drawn for a posterior distribution.

data mining, machine learning, optimization, (19 more...)

Neural Information Processing Systems

Country: Asia > Philippines > Luzon > National Capital Region (0.45)

Industry:

Health & Medicine (0.49)
Education (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Advancing Vehicle Plate Recognition: Multitasking Visual Language Models with VehiclePaliGemma

AlDahoul, Nouar, Tan, Myles Joshua Toledo, Tera, Raghava Reddy, Karim, Hezerul Abdul, Lim, Chee How, Mishra, Manish Kumar, Zaki, Yasir

arXiv.org Artificial IntelligenceDec-14-2024

License plate recognition (LPR) involves automated systems that utilize cameras and computer vision to read vehicle license plates. Such plates collected through LPR can then be compared against databases to identify stolen vehicles, uninsured drivers, crime suspects, and more. The LPR system plays a significant role in saving time for institutions such as the police force. In the past, LPR relied heavily on Optical Character Recognition (OCR), which has been widely explored to recognize characters in images. Usually, collected plate images suffer from various limitations, including noise, blurring, weather conditions, and close characters, making the recognition complex. Existing LPR methods still require significant improvement, especially for distorted images. To fill this gap, we propose utilizing visual language models (VLMs) such as OpenAI GPT4o, Google Gemini 1.5, Google PaliGemma (Pathways Language and Image model + Gemma model), Meta Llama 3.2, Anthropic Claude 3.5 Sonnet, LLaVA, NVIDIA VILA, and moondream2 to recognize such unclear plates with close characters. This paper evaluates the VLM's capability to address the aforementioned problems. Additionally, we introduce ``VehiclePaliGemma'', a fine-tuned Open-sourced PaliGemma VLM designed to recognize plates under challenging conditions. We compared our proposed VehiclePaliGemma with state-of-the-art methods and other VLMs using a dataset of Malaysian license plates collected under complex conditions. The results indicate that VehiclePaliGemma achieved superior performance with an accuracy of 87.6\%. Moreover, it is able to predict the car's plate at a speed of 7 frames per second using A100-80GB GPU. Finally, we explored the multitasking capability of VehiclePaliGemma model to accurately identify plates containing multiple cars of various models and colors, with plates positioned and oriented in different directions.

large language model, machine learning, recognition, (20 more...)

arXiv.org Artificial Intelligence

2412.14197

Country: Asia > Philippines (0.28)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.93)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

An Enhancement of CNN Algorithm for Rice Leaf Disease Image Classification in Mobile Applications

Rodrigo, Kayne Uriel K., Marcial, Jerriane Hillary Heart S., Brillo, Samuel C., Mata, Khatalyn E., Morano, Jonathan C.

arXiv.org Artificial IntelligenceDec-9-2024

This study focuses on enhancing rice leaf disease image classification algorithms, which have traditionally relied on Convolutional Neural Network (CNN) models. We employed transfer learning with MobileViTV2_050 using ImageNet-1k weights, a lightweight model that integrates CNN's local feature extraction with Vision Transformers' global context learning through a separable self-attention mechanism. Our approach resulted in a significant 15.66% improvement in classification accuracy for MobileViTV2_050-A, our first enhanced model trained on the baseline dataset, achieving 93.14%. Furthermore, MobileViTV2_050-B, our second enhanced model trained on a broader rice leaf dataset, demonstrated a 22.12% improvement, reaching 99.6% test accuracy. Additionally, MobileViTV2-A attained an F1-score of 93% across four rice labels and a Receiver Operating Characteristic (ROC) curve ranging from 87% to 97%. In terms of resource consumption, our enhanced models reduced the total parameters of the baseline CNN model by up to 92.50%, from 14 million to 1.1 million. These results indicate that MobileViTV2_050 not only improves computational efficiency through its separable self-attention mechanism but also enhances global context learning. Consequently, it offers a lightweight and robust solution suitable for mobile deployment, advancing the interpretability and practicality of models in precision agriculture.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2412.07182

Country: Asia > Philippines > Luzon > National Capital Region > City of Manila (0.16)

Genre: Research Report > New Finding (0.46)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AI TrackMate: Finally, Someone Who Will Give Your Music More Than Just "Sounds Great!"

Jiang, Yi-Lin, Hsiung, Chia-Ho, Yeh, Yen-Tung, Chen, Lu-Rong, Chen, Bo-Yu

arXiv.org Artificial IntelligenceDec-9-2024

The rise of "bedroom producers" has democratized music creation, while challenging producers to objectively evaluate their work. To address this, we present AI TrackMate, an LLM-based music chatbot designed to provide constructive feedback on music productions. By combining LLMs' inherent musical knowledge with direct audio track analysis, AI TrackMate offers production-specific insights, distinguishing it from text-only approaches. Our framework integrates a Music Analysis Module, an LLM-Readable Music Report, and Music Production-Oriented Feedback Instruction, creating a plug-and-play, training-free system compatible with various LLMs and adaptable to future advancements. We demonstrate AI TrackMate's capabilities through an interactive web interface and present findings from a pilot study with a music producer. By bridging AI capabilities with the needs of independent producers, AI TrackMate offers on-demand analytical feedback, potentially supporting the creative process and skill development in music production. This system addresses the growing demand for objective self-assessment tools in the evolving landscape of independent music production.

artificial intelligence, large language model, natural language, (14 more...)

arXiv.org Artificial Intelligence

2412.06617

Country: Asia > Philippines (0.28)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Drone footage shows huge fire engulfing Manila shanty town

BBC NewsNov-25-2024, 03:28:52 GMT

Huge flames can be seen engulfing a closely-built shanty community in the port area of Manila, in drone footage released by the city's disaster management office. Hundreds of residents have been left without homes, after around 1,000 houses were destroyed in the blaze on Sunday, Manila Fire District said. Dozens of fire engines and fire boats were deployed to tackle the blaze, and the Philippine Air Force sent two helicopters which scooped up up water from Manila Bay to help extinguish the fire. Emergency services have reported no casualties so far, and are yet to identify the cause. The incident comes months after 11 people died in residential fire in the Chinatown district of the Philippine capital.

artificial intelligence, drone footage show huge fire, manila shanty town, (1 more...)

BBC News

Country: Asia > Philippines > Luzon > National Capital Region > City of Manila (1.00)

Industry: Law Enforcement & Public Safety > Fire & Emergency Services (0.32)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.68)

Add feedback

Filters

Collaborating Authors

Philippines

Supplementary Materials

USAM-Net: A U-Net-based Network for Improved Stereo Correspondence and Scene Depth Estimation using Features from a Pre-trained Image Segmentation network

Reinforcement Learning Environment with LLM-Controlled Adversary in D&D 5th Edition Combat

An Enhancement of Jiang, Z., et al.s Compression-Based Classification Algorithm Applied to News Article Categorization

Semantic Decomposition and Selective Context Filtering -- Text Processing Techniques for Context-Aware NLP-Based Systems

Supplementary Materials

Advancing Vehicle Plate Recognition: Multitasking Visual Language Models with VehiclePaliGemma

An Enhancement of CNN Algorithm for Rice Leaf Disease Image Classification in Mobile Applications

AI TrackMate: Finally, Someone Who Will Give Your Music More Than Just "Sounds Great!"

Drone footage shows huge fire engulfing Manila shanty town