dado
Leveraging Discrete Function Decomposability for Scientific Design
Bowden, James C., Levine, Sergey, Listgarten, Jennifer
In the era of AI-driven science and engineering, we often want to design discrete objects in silico according to user-specified properties. For example, we may wish to design a protein to bind its target, arrange components within a circuit to minimize latency, or find materials with certain properties. Given a property predictive model, in silico design typically involves training a generative model over the design space (e.g., protein sequence space) to concentrate on designs with the desired properties. Distributional optimization -- which can be formalized as an estimation of distribution algorithm or as reinforcement learning policy optimization -- finds the generative model that maximizes an objective function in expectation. Optimizing a distribution over discrete-valued designs is in general challenging because of the combinatorial nature of the design space. However, many property predictors in scientific applications are decomposable in the sense that they can be factorized over design variables in a way that could in principle enable more effective optimization. For example, amino acids at a catalytic site of a protein may only loosely interact with amino acids of the rest of the protein to achieve maximal catalytic activity. Current distributional optimization algorithms are unable to make use of such decomposability structure. Herein, we propose and demonstrate use of a new distributional optimization algorithm, Decomposition-Aware Distributional Optimization (DADO), that can leverage any decomposability defined by a junction tree on the design variables, to make optimization more efficient. At its core, DADO employs a soft-factorized "search distribution" -- a learned generative model -- for efficient navigation of the search space, invoking graph message-passing to coordinate optimization across linked factors.
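To make the idea of distributional optimization concrete, the following is a minimal sketch of a plain estimation of distribution algorithm on a toy decomposable objective. It is not DADO: the search distribution here is fully factorized and uses no junction tree or message passing. The objective, the function names (`objective`, `sample`, `eda`), and all hyperparameters are assumptions chosen for illustration.

```python
import random

def objective(x):
    # Hypothetical decomposable objective: a sum of local terms, each of
    # which depends only on one pair of adjacent design variables.
    return sum(1.0 for a, b in zip(x, x[1:]) if a != b)

def sample(probs, rng):
    # Draw one design from a fully factorized categorical distribution.
    return [rng.choices(range(len(p)), weights=p)[0] for p in probs]

def eda(n_vars=8, n_states=3, iters=30, pop=200, elite_frac=0.2, seed=0):
    rng = random.Random(seed)
    # Start from the uniform distribution over every design variable.
    probs = [[1.0 / n_states] * n_states for _ in range(n_vars)]
    best = None
    for _ in range(iters):
        designs = [sample(probs, rng) for _ in range(pop)]
        designs.sort(key=objective, reverse=True)
        if best is None or objective(designs[0]) > objective(best):
            best = designs[0]
        elite = designs[: max(1, int(pop * elite_frac))]
        # Refit each per-variable marginal to the elite designs, with a
        # small pseudo-count so that no state's probability reaches zero.
        for i in range(n_vars):
            counts = [0.5] * n_states
            for d in elite:
                counts[d[i]] += 1.0
            total = sum(counts)
            probs[i] = [c / total for c in counts]
    return probs, best

if __name__ == "__main__":
    probs, best = eda()
    print(best, objective(best))
```

Because the objective couples neighbouring variables, a fully factorized distribution can only represent one high-scoring pattern at a time. That limitation is the kind of structure a decomposition-aware search distribution is meant to exploit.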
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Analysis of Machine Unlearning in Medical Image Classification Models (Analise de Desaprendizado de Maquina em Modelos de Classificacao de Imagens Medicas)
Falcao, Andreza M. C., Cordeiro, Filipe R.
Machine unlearning aims to remove private or sensitive data from a pre-trained model while preserving the model's robustness. Despite recent advances, this technique has not been explored in medical image classification. This work evaluates the SalUn unlearning method through experiments on the PathMNIST, OrganAMNIST, and BloodMNIST datasets. We also analyze the impact of data augmentation on the quality of unlearning. Results show that SalUn achieves performance close to full retraining, indicating an efficient solution for use in medical applications.
- Health & Medicine (0.55)
- Information Technology > Security & Privacy (0.47)
Semi-automated Fact-checking in Portuguese: Corpora Enrichment using Retrieval with Claim extraction
Gomes, Juliana Resplande Sant'anna, Filho, Arlindo Rodrigues Galvão
The accelerated dissemination of disinformation often outpaces the capacity for manual fact-checking, highlighting the urgent need for Semi-Automated Fact-Checking (SAFC) systems. Within the Portuguese language context, there is a noted scarcity of publicly available datasets (corpora) that integrate external evidence, an essential component for developing robust AFC systems, as many existing resources focus solely on classification based on intrinsic text features. This dissertation addresses this gap by developing, applying, and analyzing a methodology to enrich Portuguese news corpora (Fake.Br, COVID19.BR, MuMiN-PT) with external evidence. The approach simulates a user's verification process, employing Large Language Models (LLMs, specifically Gemini 1.5 Flash) to extract the main claim from texts and search engine APIs (Google Search API, Google FactCheck Claims Search API) to retrieve relevant external documents (evidence). Additionally, a data validation and pre-processing framework, including near-duplicate detection, is introduced to enhance the quality of the base corpora. The main results demonstrate the methodology's viability, providing enriched corpora and analyses that confirm the utility of claim extraction, the influence of original data characteristics on the process, and the positive impact of enrichment on the performance of classification models (BERTimbau and Gemini 1.5 Flash), especially with fine-tuning. This work contributes valuable resources and insights for advancing SAFC in Portuguese.
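The abstract does not say how near-duplicates are detected. One common approach, shown here only as a minimal sketch under that assumption, compares word-shingle sets with Jaccard similarity. The function names, the shingle size `k`, and the threshold are all hypothetical.

```python
def shingles(text, k=3):
    # Set of overlapping k-word windows; short texts yield one shingle.
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    # Jaccard similarity |A & B| / |A | B| between two shingle sets.
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def near_duplicates(texts, threshold=0.7):
    # Return index pairs (i, j), i < j, whose similarity meets the threshold.
    sets = [shingles(t) for t in texts]
    return [
        (i, j)
        for i in range(len(texts))
        for j in range(i + 1, len(texts))
        if jaccard(sets[i], sets[j]) >= threshold
    ]

if __name__ == "__main__":
    docs = [
        "the vaccine was approved by the agency today",
        "the vaccine was approved by the agency yesterday",
        "football results from the weekend",
    ]
    print(near_duplicates(docs))
```

The pairwise comparison is quadratic in the number of texts, so a corpus-scale pipeline would typically add an approximate index such as MinHash; that step is omitted here.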
- South America > Brazil (1.00)
- Asia > Middle East > UAE (0.45)
- North America > United States > Minnesota (0.27)
- Europe > Spain > Galicia (0.27)
- Research Report (0.70)
- Overview (0.67)
- Information Technology > Services (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.94)
- Media > News (0.70)
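Automatic Determination of the Attack Detection Threshold in Computer Networks Using Autoencoders (Determinação Automática de Limiar de Detecção de Ataques em Redes de Computadores Utilizando Autoencoders)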
Miranda, Luan Gonçalves, da Cruz, Pedro Ivo, Loiola, Murilo Bellezoni
Currently, digital security mechanisms such as Anomaly Detection Systems based on Autoencoders (AEs) show great potential for circumventing problems intrinsic to the data, such as data imbalance. Because AEs rely on a non-trivial, non-standardized separation threshold to classify the extracted reconstruction error, the choice of this threshold directly affects the performance of the detection process. This work therefore proposes defining the threshold automatically using machine learning algorithms. Three algorithms were evaluated: K-Nearest Neighbors, K-Means, and Support Vector Machine.
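As an illustration of how a clustering algorithm can set such a threshold, the sketch below runs one-dimensional K-Means with two clusters on reconstruction errors and places the threshold midway between the two centroids. It is an assumption about one plausible setup, not the authors' procedure. The function name `kmeans_threshold` and the sample error values are hypothetical.

```python
def kmeans_threshold(errors, iters=100):
    # 1-D K-Means with K=2 on a non-empty list of reconstruction errors.
    # Centroids start at the smallest and largest error.
    lo, hi = min(errors), max(errors)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        low = [e for e in errors if e <= mid]   # assigned to "normal"
        high = [e for e in errors if e > mid]   # assigned to "attack"
        if not low or not high:
            break  # degenerate case: every error falls in one cluster
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # assignments stopped changing
        lo, hi = new_lo, new_hi
    # Decision threshold: midpoint between the two centroids.
    return (lo + hi) / 2.0

if __name__ == "__main__":
    errs = [0.01, 0.02, 0.015, 0.03, 0.5, 0.6, 0.55]
    t = kmeans_threshold(errs)
    print(t, [e > t for e in errs])
```

Samples whose reconstruction error exceeds the returned value would be flagged as anomalous.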
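Psoriasis Detection Using Computer Vision: A Comparative Approach Between CNNs and Vision Transformers (Detecção da Psoríase Utilizando Visão Computacional: Uma Abordagem Comparativa Entre CNNs e Vision Transformers)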
Lucena, Natanael, da Silva, Fábio S., Rios, Ricardo
This paper compares the performance of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) on the multi-class classification of images containing lesions of psoriasis and similar diseases. Models pre-trained on ImageNet were adapted to a specific dataset. Both achieved high predictive metrics, but the ViTs stood out for their superior performance with smaller models. Dual Attention Vision Transformer-Base (DaViT-B) obtained the best results, with an F1-score of 96.4%, and is recommended as the most efficient architecture for automated psoriasis detection. This article reinforces the potential of ViTs for medical image classification tasks.
Comparative Analysis of Deepfake Detection Models: New Approaches and Perspectives
The growing threat posed by deepfake videos, capable of manipulating realities and disseminating misinformation, drives the urgent need for effective detection methods. This work investigates and compares different approaches for identifying deepfakes, focusing on the GenConViT model and its performance relative to other architectures present in the DeepfakeBenchmark. To contextualize the research, the social and legal impacts of deepfakes are addressed, as well as the technical fundamentals of their creation and detection, including digital image processing, machine learning, and artificial neural networks, with emphasis on Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), and Transformers. The performance evaluation of the models was conducted using relevant metrics and new datasets established in the literature, such as WildDeepfake and DeepSpeak, aiming to identify the most effective tools in the battle against misinformation and media manipulation. The results indicate that GenConViT, after fine-tuning, exhibited superior performance in terms of accuracy (93.82%) and generalization capacity, surpassing other architectures in the DeepfakeBenchmark on the DeepSpeak dataset. This study contributes to the advancement of deepfake detection techniques and supports the development of more robust and effective solutions against the dissemination of false information.
- North America > United States (0.14)
- South America > Brazil > Minas Gerais > Itajubá (0.04)
- South America > Brazil > Rio Grande do Sul > Porto Alegre (0.04)
- Media (1.00)
- Information Technology > Security & Privacy (1.00)
Deep Learning-Based Transfer Learning for Classification of Cassava Disease
Junior, Ademir G. Costa, da Silva, Fábio S., Rios, Ricardo
This paper presents a performance comparison among four Convolutional Neural Network architectures (EfficientNet-B3, InceptionV3, ResNet50, and VGG16) for classifying cassava disease images. The images were sourced from an imbalanced dataset from a competition. Appropriate metrics were employed to address class imbalance. The results indicate that EfficientNet-B3 achieved an accuracy of 87.7%, precision of 87.8%, recall of 87.8%, and F1-score of 87.7% on this task. These findings suggest that EfficientNet-B3 could be a valuable tool to support Digital Agriculture.
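The abstract does not specify how precision, recall, and F1-score were aggregated across classes. As a minimal sketch, the code below computes them per class and takes the unweighted (macro) average, which is one common choice for imbalanced data. The averaging scheme and the function name `macro_metrics` are assumptions.

```python
def macro_metrics(y_true, y_pred):
    # Per-class precision, recall, and F1, averaged with equal class weight.
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

if __name__ == "__main__":
    # Toy imbalanced example: four samples of class 0, two of class 1.
    print(macro_metrics([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1]))
```

Macro averaging gives rare classes the same weight as frequent ones. A support-weighted average would instead track overall accuracy more closely.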
- South America > Brazil > Amazonas > Manaus (0.04)
- North America > United States (0.04)
- Africa > Uganda > Central Region > Kampala (0.04)
Hybrid model of the kernel method for quantum computers
de Borba, Jhordan Silveira, Maziero, Jonas
The field of quantum machine learning is a promising route to a revolution in intelligent data processing methods. To this end, a hybrid learning method based on classical kernel methods is proposed. This proposal also requires the development of a quantum algorithm for computing inner products between vectors of continuous values. To make this possible, the classical kernel method had to be adapted to account for the limitations imposed by the Hilbert space of the quantum processor. As a test case, we applied the new algorithm to learn to classify whether new, randomly generated points in a finite square on a plane lie inside or outside a circle contained in that square. The algorithm correctly classified new points in 99% of the samples tested, with the small discrepancy attributable to a radius slightly larger than the ideal one. Nonetheless, the kernel method performed the classifications correctly, and the inner product algorithm successfully computed the inner products using quantum resources. Thus, the present work contributes to the area by proposing a new machine learning model accessible to both physicists and computer scientists.
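To show why inner products are the only primitive a kernel classifier needs, here is a purely classical sketch: a kernel perceptron with a degree-2 polynomial kernel, trained on points inside and outside a circle. It is not the authors' hybrid algorithm. In their setting the inner product inside `kernel` would be supplied by the quantum routine. The kernel choice, function names, and training points are assumptions.

```python
def kernel(u, v):
    # Degree-2 polynomial kernel; it touches the data only through u . v.
    dot = sum(a * b for a, b in zip(u, v))
    return (1.0 + dot) ** 2

def decision(alphas, xs, ys, x):
    # Dual-form score: weighted sum of kernel values against training points.
    return sum(a * y * kernel(xi, x) for a, y, xi in zip(alphas, ys, xs))

def train(xs, ys, max_epochs=1000):
    # Kernel perceptron: bump alpha_i whenever training point i is
    # misclassified (a zero score also counts as a mistake).
    alphas = [0] * len(xs)
    for _ in range(max_epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(xs, ys)):
            if y * decision(alphas, xs, ys, x) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alphas

def predict(alphas, xs, ys, x):
    # +1 means "inside the circle", -1 means "outside".
    return 1 if decision(alphas, xs, ys, x) > 0 else -1

if __name__ == "__main__":
    xs = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
    ys = [1, -1, -1, -1, -1]
    alphas = train(xs, ys)
    print(predict(alphas, xs, ys, (0.1, 0.1)), predict(alphas, xs, ys, (0.9, 0.9)))
```

A circular boundary is linear in the degree-2 feature space, so this kernel can separate the two classes even though no straight line in the plane can.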
- South America > Brazil > Rio Grande do Sul > Porto Alegre (0.04)
- North America > United States (0.04)
Neural Networks with LSTM and GRU in Modeling Active Fires in the Amazon
This study presents a comprehensive methodology for modeling and forecasting the historical time series of active fire spots detected by the AQUA_M-T satellite in the Amazon, Brazil. The approach employs a mixed Recurrent Neural Network (RNN) model, combining Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures to predict the monthly accumulations of daily detected active fire spots. Data analysis revealed a consistent seasonality over time, with annual maximum and minimum values tending to repeat at the same periods each year. The primary objective is to verify whether the forecasts capture this inherent seasonality through machine learning techniques. The methodology involved careful data preparation, model configuration, and training using cross-validation with two seeds, ensuring that the model generalizes well to both the test and validation sets for both seeds. The results indicate that the combined LSTM and GRU model delivers excellent forecasting performance, demonstrating its effectiveness in capturing complex temporal patterns and modeling the observed time series. This research significantly contributes to the application of deep learning techniques in environmental monitoring, specifically in forecasting active fire spots. The proposed approach highlights the potential for adaptation to other time series forecasting challenges, opening new opportunities for research and development in machine learning and prediction of natural phenomena. Keywords: Time Series Forecasting; Recurrent Neural Networks; Deep Learning.
- South America > Brazil (0.34)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Asia > Singapore (0.04)
Mind-reading AI recreates what you're looking at with amazing accuracy
Artificial intelligence systems can now create remarkably accurate reconstructions of what someone is looking at based on recordings of their brain activity. These reconstructed images are greatly improved when the AI learns which parts of the brain to pay attention to. "As far as I know, these are the closest, most accurate reconstructions," says Umut Güçlü at Radboud University in the Netherlands. Güçlü's team is one of several around the world using AI systems to work out what animals or people are seeing from brain recordings and scans. In one previous study, his team used a functional MRI (fMRI) scanner to record the brain activity of three people as they were shown a series of photographs.
- Health & Medicine > Therapeutic Area > Neurology (0.81)
- Health & Medicine > Health Care Technology (0.59)