
 Roder, Mateus


Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation

arXiv.org Artificial Intelligence

This study evaluates the performance of large language models (LLMs) as medical agents in Portuguese, aiming to develop a reliable and relevant virtual assistant for healthcare professionals. The HealthCareMagic-100k-en and MedQuAD datasets, translated from English using GPT-3.5, were used to fine-tune the ChatBode-7B model with the PEFT-QLoRA method. The InternLM2 model, which had prior training on medical data, presented the best overall performance, with high precision and adequacy in metrics such as accuracy, completeness, and safety. The DrBode models, derived from ChatBode, exhibited catastrophic forgetting of previously acquired medical knowledge; even so, they frequently performed as well as or better in aspects such as grammaticality and coherence. A significant challenge was low inter-rater agreement, highlighting the need for more robust assessment protocols. This work paves the way for future research, such as evaluating multilingual models specific to the medical field, improving the quality of training data, and developing more consistent evaluation methodologies.
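The abstract names PEFT-QLoRA as the fine-tuning method. Below is a minimal sketch of that recipe with Hugging Face transformers and peft: 4-bit quantization of the base model plus low-rank adapters. The model id and every hyperparameter here are illustrative assumptions, not the paper's actual configuration.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL_ID = "org/chatbode-7b"  # hypothetical placeholder, not the real checkpoint id

# Load the base model with 4-bit NF4 quantization (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; only these small matrices are trained.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # common choice; the paper's targets may differ
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # confirms the tiny trainable fraction
```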


Feature Selection and Hyperparameter Fine-tuning in Artificial Neural Networks for Wood Quality Classification

arXiv.org Artificial Intelligence

Quality classification of wood boards is an essential task in the sawmill industry, which is still usually performed by human operators in small to medium-sized companies in developing countries. Machine learning algorithms have been successfully employed to investigate the problem, offering a more affordable alternative compared to other solutions. However, such approaches usually present some drawbacks regarding the proper selection of their hyperparameters. Moreover, the models are sensitive to the features extracted from wood board images, which influence the induction of the model and, consequently, its generalization power. Therefore, in this paper, we investigate the problem of simultaneously tuning the hyperparameters of an artificial neural network (ANN) and selecting a subset of features that better describes wood board quality. Experiments were conducted over a private dataset composed of images obtained from a sawmill industry and described using different feature descriptors. The predictive performance of the model was compared against five baseline methods as well as a random search performing either ANN hyperparameter tuning or feature selection. Experimental results suggest that hyperparameters should be adjusted according to the feature set, or the features should be selected considering the hyperparameter values. In summary, the best predictive performance, i.e., a balanced accuracy of $0.80$, was achieved in two distinct scenarios: (i) performing only feature selection, and (ii) performing both tasks concomitantly. Thus, we suggest that at least one of the two approaches should be considered in the context of industrial applications.
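As a concrete illustration of the joint search space the abstract describes, the sketch below samples a binary feature mask and ANN hyperparameters together and scores each candidate by balanced accuracy. It uses a plain random search on synthetic data, one of the paper's baselines rather than its main method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

best_score, best_candidate = -np.inf, None
for _ in range(30):
    mask = rng.random(X.shape[1]) < 0.5  # binary feature-selection mask
    if not mask.any():
        continue
    hidden = int(rng.integers(8, 128))   # hyperparameters sampled jointly with the mask
    lr = 10 ** rng.uniform(-4, -1)
    clf = MLPClassifier(hidden_layer_sizes=(hidden,), learning_rate_init=lr, max_iter=300)
    score = cross_val_score(clf, X[:, mask], y, cv=3, scoring="balanced_accuracy").mean()
    if score > best_score:
        best_score, best_candidate = score, (mask, hidden, lr)

print(f"best balanced accuracy: {best_score:.2f}")
```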


Facial Point Graphs for Amyotrophic Lateral Sclerosis Identification

arXiv.org Artificial Intelligence

Identifying Amyotrophic Lateral Sclerosis (ALS) in its early stages is essential for establishing the beginning of treatment, improving the prognosis, and enhancing the overall well-being of affected individuals. However, early diagnosis and detection of the disease's signs are not straightforward. A simpler and cheaper alternative is to analyze the patient's facial expressions through computational methods. When a patient with ALS engages in specific actions, e.g., opening their mouth, the movement of specific facial muscles differs from that observed in a healthy individual. This paper proposes Facial Point Graphs to learn information from the geometry of facial images and identify ALS automatically. Experimental results on the Toronto Neuroface dataset show that the proposed approach outperforms state-of-the-art results, fostering promising developments in the area.
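To make the geometry-as-graph idea concrete, here is a hedged sketch that treats facial landmarks as graph nodes and classifies the graph with a small GNN via PyTorch Geometric. The 68-point layout, the fully connected edges, and the two-layer GCN are all assumptions for illustration; the paper's Facial Point Graph architecture is more elaborate.

```python
import torch
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv, global_mean_pool

landmarks = torch.rand(68, 2)  # 68 (x, y) facial points, dummy values
edge_index = torch.combinations(torch.arange(68)).t()  # fully connected for simplicity

class PointGraphNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(2, 32)   # node features are the 2-D coordinates
        self.conv2 = GCNConv(32, 32)
        self.head = torch.nn.Linear(32, 2)  # ALS vs. healthy

    def forward(self, data):
        h = self.conv1(data.x, data.edge_index).relu()
        h = self.conv2(h, data.edge_index).relu()
        batch = torch.zeros(h.size(0), dtype=torch.long)  # single-graph batch
        return self.head(global_mean_pool(h, batch))

logits = PointGraphNet()(Data(x=landmarks, edge_index=edge_index))
```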


Enhancing Hyper-To-Real Space Projections Through Euclidean Norm Meta-Heuristic Optimization

arXiv.org Artificial Intelligence

The continuous growth in computational power over the last decades has made solving several optimization problems significant to humankind a tractable task; however, tackling some of them remains a challenge due to the overwhelming number of candidate solutions to be evaluated, even by sophisticated algorithms. In such a context, a set of nature-inspired stochastic methods, called meta-heuristic optimization, can provide robust approximate solutions to different kinds of problems with a small computational burden, such as derivative-free real function optimization. Nevertheless, these methods may converge to inadequate solutions if the function landscape is too harsh, e.g., enclosing too many local optima. Previous works addressed this issue by employing a hypercomplex representation of the search space, like quaternions, where the landscape becomes smoother and supposedly easier to optimize. Under this approach, meta-heuristic computations happen in the hypercomplex space, whereas variables are mapped back to the real domain before function evaluation. Although this latter operation is usually performed with the Euclidean norm, we have found that, after the optimization procedure has finished, it is usually possible to obtain even better solutions by employing the Minkowski $p$-norm instead and fine-tuning $p$ through an auxiliary sub-problem with negligible additional cost and no extra hyperparameters. Such behavior was observed in eight well-established benchmark functions, fostering a new research direction for hypercomplex meta-heuristic optimization.
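The mapping the abstract describes can be written in a few lines. In the sketch below, each decision variable is a quaternion in $[0,1]^4$ squashed into the real search bounds by a norm, and the auxiliary sub-problem of fine-tuning $p$ reduces to a cheap 1-D search once the optimizer has finished. The sphere function and the grid over $p$ are illustrative choices, not the paper's exact setup.

```python
import numpy as np

def to_real(q, p, lb, ub):
    """Map (n_vars, 4) quaternions into [lb, ub] using the Minkowski p-norm."""
    # Dividing by 4**(1/p), the norm of the all-ones quaternion, keeps values in [0, 1].
    unit = (np.abs(q) ** p).sum(axis=1) ** (1.0 / p) / 4 ** (1.0 / p)
    return lb + (ub - lb) * unit

def sphere(x):
    """Simple benchmark function: global optimum at the origin."""
    return float((x ** 2).sum())

rng = np.random.default_rng(0)
q_best = rng.random((10, 4))            # pretend this came out of the meta-heuristic
candidates = np.linspace(1.0, 10.0, 91)  # auxiliary 1-D sub-problem over p
best_p = min(candidates, key=lambda p: sphere(to_real(q_best, p, -10.0, 10.0)))
print(f"p = {best_p:.1f}, f = {sphere(to_real(q_best, best_p, -10.0, 10.0)):.4f}")
```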


Improving Pre-Trained Weights Through Meta-Heuristics Fine-Tuning

arXiv.org Artificial Intelligence

Machine learning algorithms have been extensively researched throughout the last decade, leading to unprecedented advances in a broad range of applications, such as image classification and reconstruction, object recognition, and text categorization. Nonetheless, most machine learning algorithms are trained via derivative-based optimizers, such as Stochastic Gradient Descent, which can become trapped in local optima and thus prevent the model from achieving proper performance. A bio-inspired alternative to traditional optimization techniques, known as meta-heuristics, has received significant attention due to its simplicity and ability to escape local optima. In this work, we propose using meta-heuristic techniques to fine-tune pre-trained weights, exploring additional regions of the search space and improving their effectiveness. The experimental evaluation comprises two classification tasks (image and text) and is assessed on four literature datasets. Experimental results show the capacity of nature-inspired algorithms to explore the neighborhood of pre-trained weights, achieving superior results compared to their pre-trained counterparts. Additionally, a thorough analysis of distinct architectures, such as Multi-Layer Perceptrons and Recurrent Neural Networks, attempts to visualize and provide more precise insights into the most critical weights to be fine-tuned in the learning process.
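A minimal sketch of the core idea: take gradient-trained weights and explore their neighborhood with a derivative-free search, keeping any candidate that scores better. The (1+1)-style random perturbation below stands in for the nature-inspired algorithms used in the paper, and the model and data are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=400, random_state=0)
clf = LogisticRegression(max_iter=200).fit(X, y)  # the "pre-trained" weights

rng = np.random.default_rng(0)
w, best = clf.coef_.copy(), clf.score(X, y)
for _ in range(200):
    cand = w + rng.normal(scale=0.05, size=w.shape)  # explore the weight neighborhood
    clf.coef_ = cand
    acc = clf.score(X, y)
    if acc > best:                                   # greedy acceptance rule
        w, best = cand, acc
clf.coef_ = w
print(f"accuracy after derivative-free fine-tuning: {best:.3f}")
```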


From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks

arXiv.org Artificial Intelligence

In the last decade, exponential data growth has expanded the capacity of machine learning-based algorithms and enabled their use in daily-life activities. This improvement is partially explained by the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors produce outstanding results, they also pose drawbacks regarding the learning process, as training complex models over large datasets is expensive and time-consuming. Such a problem is even more evident when dealing with video analysis. Some works have considered transfer learning or domain adaptation, i.e., approaches that map the knowledge from one domain to another, to ease the training burden, yet most of them operate over individual or small blocks of frames. This paper proposes a novel approach to map the knowledge from action recognition to event recognition using an energy-based model, denoted the Spectral Deep Belief Network. Such a model can process all frames simultaneously, carrying spatial and temporal information through the learning process. Experimental results conducted over two public video datasets, HMDB-51 and UCF-101, depict the effectiveness of the proposed model and its reduced computational burden compared to traditional energy-based models, such as Restricted Boltzmann Machines and Deep Belief Networks.
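As a rough illustration of processing all frames at once, the sketch below compresses a whole clip into a spectral representation and feeds it to an energy-based model. sklearn's BernoulliRBM stands in for the paper's Spectral Deep Belief Network, and the FFT-based featurization is an assumption made for illustration.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

video = np.random.rand(32, 24, 24)      # (frames, height, width), dummy clip
spectrum = np.abs(np.fft.fftn(video))   # one spectral view over space and time
features = spectrum.reshape(1, -1)
features /= features.max()              # BernoulliRBM expects values in [0, 1]

rbm = BernoulliRBM(n_components=64, n_iter=10)
rbm.fit(np.repeat(features, 8, axis=0))  # toy "dataset" of repeated clips
codes = rbm.transform(features)          # hidden activations as video features
```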


Comparative Study Between Distance Measures On Supervised Optimum-Path Forest Classification

arXiv.org Artificial Intelligence

Machine learning has attracted considerable attention throughout the past decade due to its potential to solve far-reaching tasks, such as image classification, object recognition, anomaly detection, and data forecasting. A standard approach to such applications is supervised learning, which relies on large sets of labeled data and is conducted by so-called classifiers, such as Logistic Regression, Decision Trees, Random Forests, and Support Vector Machines, among others. An alternative to traditional classifiers is the parameterless Optimum-Path Forest (OPF), which uses a graph-based methodology and a distance measure to create arcs between nodes and hence sets of trees, responsible for conquering the nodes, defining their labels, and shaping the forests. Nevertheless, its performance is strongly tied to an appropriate distance measure, which may vary according to the dataset's nature. Therefore, this work proposes a comparative study over a wide range of distance measures applied to supervised Optimum-Path Forest classification. Experiments are conducted using well-known literature datasets, and results are compared against benchmark classifiers, illustrating OPF's ability to adapt to distinct domains.
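The shape of the experiment is easy to reproduce in miniature: fix the data, swap the distance measure that defines the arc weights, and observe how the decision changes. In the hedged sketch below, a 1-NN rule over the same distance matrix stands in for the full Optimum-Path Forest classifier.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for metric in ["euclidean", "cityblock", "chebyshev", "canberra", "cosine"]:
    d = cdist(X_te, X_tr, metric=metric)  # candidate arc weights under each measure
    pred = y_tr[d.argmin(axis=1)]         # nearest-prototype decision as an OPF proxy
    print(f"{metric:10s} accuracy: {(pred == y_te).mean():.3f}")
```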


A Layer-Wise Information Reinforcement Approach to Improve Learning in Deep Belief Networks

arXiv.org Artificial Intelligence

With the advent of deep learning, the number of works proposing new methods or improving existing ones has grown exponentially in recent years. In this scenario, "very deep" models emerged, since they were expected to extract more intrinsic and abstract features while supporting better performance. However, such models suffer from the vanishing gradient problem, i.e., backpropagated values become too close to zero in their shallower layers, ultimately causing learning to stagnate. This issue was overcome in the context of convolutional neural networks by creating "shortcut connections" between layers, in the so-called deep residual learning framework. Nonetheless, a very popular deep learning technique, the Deep Belief Network, still suffers from vanishing gradients when dealing with discriminative tasks. Therefore, this paper proposes the Residual Deep Belief Network, which applies layer-by-layer information reinforcement to improve feature extraction and knowledge retention, supporting better discriminative performance. Experiments conducted over three public datasets demonstrate its robustness concerning the task of binary image classification.
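A conceptual sketch of layer-wise reinforcement, under the assumption that it can be approximated by folding each layer's input back into its output, in the spirit of residual shortcuts. This is not the paper's exact formulation; it only shows where the reinforcement enters a greedy RBM stack.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

X = np.random.rand(100, 64)  # toy data scaled to [0, 1]
h = X
for n_hidden in (64, 64, 64):  # equal widths so the residual addition is well-defined
    rbm = BernoulliRBM(n_components=n_hidden, n_iter=5).fit(h)
    out = rbm.transform(h)
    h = np.clip(out + h, 0.0, 1.0)  # reinforce with the previous representation
```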


Energy-based Dropout in Restricted Boltzmann Machines: Why not go random

arXiv.org Artificial Intelligence

Deep learning architectures have been widely adopted in recent years, being used in a wide range of applications, such as object recognition, image reconstruction, and signal processing. Nevertheless, such models suffer from a common problem known as overfitting, which limits the network's ability to predict unseen data effectively. Regularization approaches arise in an attempt to address this shortcoming. Among them, one can refer to the well-known Dropout, which tackles the problem by randomly shutting down a set of neurons and their connections according to a certain probability; this approach therefore does not consider any additional knowledge when deciding which units should be disconnected. In this paper, we propose an energy-based Dropout (E-Dropout) that makes informed decisions about whether a neuron should be dropped or not. Specifically, we design this regularization method by correlating neurons with the model's energy as an importance level, for further application to energy-based models, such as Restricted Boltzmann Machines (RBMs). Experimental results over several benchmark datasets revealed the proposed approach's suitability compared to the traditional Dropout and standard RBMs.
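To illustrate the contrast with random Dropout, the sketch below weights each hidden unit's survival probability by an importance score instead of a uniform coin flip. The mean-activation proxy for importance is an assumption made for brevity; the paper derives importance from the RBM's energy.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_probs = rng.random((32, 100))     # batch of hidden-unit activations
importance = hidden_probs.mean(axis=0)   # proxy importance per unit (assumption)
keep_prob = importance / importance.max()  # more important units survive more often
mask = rng.random(100) < keep_prob
dropped = hidden_probs * mask            # importance-informed dropout mask
```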


MaxDropout: Deep Neural Network Regularization Based on Maximum Output Values

arXiv.org Machine Learning

Different techniques have emerged in the deep learning scenario, such as Convolutional Neural Networks, Deep Belief Networks, and Long Short-Term Memory Networks, to cite a few. In lockstep, regularization methods, which aim to prevent overfitting by penalizing the weight connections or turning off some units, have also been widely studied. In this paper, we present a novel approach called MaxDropout, a regularizer for deep neural network models that works in a supervised fashion by removing (shutting off) the most prominent (i.e., most active) neurons in each hidden layer. By forcing fewer activated units to learn more representative information, the model gains sparsity. In our experiments, we show that existing neural networks can be improved and provide better results when Dropout is replaced by MaxDropout. The proposed method was evaluated on image classification, achieving results comparable to existing regularizers, such as Cutout and RandomErasing, while also improving the accuracy of neural networks that use Dropout by replacing the existing layer with MaxDropout.
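The abstract's description translates almost directly into code: normalize the layer's activations to $[0, 1]$ and zero out the units that land in the top range. Below is a minimal PyTorch sketch of that rule; the rate and tensor shapes are illustrative.

```python
import torch

def max_dropout(x: torch.Tensor, rate: float = 0.3) -> torch.Tensor:
    """Shut off the most active units instead of random ones."""
    norm = (x - x.min()) / (x.max() - x.min() + 1e-8)  # rescale activations to [0, 1]
    mask = (norm < 1.0 - rate).float()                 # zero everything in the top `rate`
    return x * mask

activations = torch.rand(8, 128)  # dummy hidden-layer outputs
out = max_dropout(activations)
```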