 South America


A quest for a fair schedule: The Young Physicists' Tournament

arXiv.org Artificial Intelligence

The Young Physicists Tournament is an established team-oriented scientific competition between high school students from 37 countries on 5 continents. The competition consists of scientific discussions called Fights. Three or four teams participate in each Fight, each of whom presents a problem while rotating the roles of Presenter, Opponent, Reviewer, and Observer among them. The rules of a few countries require that each team announce in advance 3 problems they will present at the national tournament. The task of the organizers is to choose the composition of Fights in such a way that each team presents each of its chosen problems exactly once and within a single Fight no problem is presented more than once. Besides formalizing these feasibility conditions, in this paper we formulate several additional fairness conditions for tournament schedules. We show that the fulfillment of some of them can be ensured by constructing suitable edge colorings in bipartite graphs. To find fair schedules, we propose integer linear programs and test them on real as well as randomly generated data.
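The two feasibility conditions stated in the abstract can be checked mechanically. The following is a minimal sketch, not the authors' formulation: the schedule representation (a list of Fights, each a list of `(team, problem)` pairs), the function name `is_feasible`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter

def is_feasible(schedule, chosen):
    """Check the two feasibility conditions from the abstract.

    schedule: list of Fights; each Fight is a list of (team, problem) pairs
              (hypothetical representation, not taken from the paper).
    chosen:   dict mapping each team to the set of problems it announced.
    """
    presented = Counter()
    for fight in schedule:
        problems = [problem for _, problem in fight]
        # Within a single Fight, no problem may be presented more than once.
        if len(problems) != len(set(problems)):
            return False
        for team, problem in fight:
            # A team may only present problems it announced in advance.
            if problem not in chosen.get(team, set()):
                return False
            presented[(team, problem)] += 1
    # Each team presents each of its chosen problems exactly once.
    return all(
        presented[(team, problem)] == 1
        for team, problems in chosen.items()
        for problem in problems
    )

# Toy data: three teams, three announced problems each (illustrative only).
chosen = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {1, 3, 5}}
good = [
    [("A", 1), ("B", 2), ("C", 3)],
    [("A", 2), ("B", 3), ("C", 1)],
    [("A", 3), ("B", 4), ("C", 5)],
]
bad = [
    [("A", 1), ("B", 2), ("C", 1)],  # problem 1 appears twice in one Fight
    [("A", 2), ("B", 3), ("C", 3)],
    [("A", 3), ("B", 4), ("C", 5)],
]
print(is_feasible(good, chosen), is_feasible(bad, chosen))
```

The sketch does not check the fairness conditions or the Fight size (three or four teams); those would need additional constraints.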


Prediction of short and long-term droughts using artificial neural networks and hydro-meteorological variables

arXiv.org Machine Learning

Drought is a creeping natural hazard with numerous damaging effects on many aspects of human life. Accurate drought prediction is a promising step toward helping policy makers set drought risk management strategies. To this end, choosing appropriate models plays an important role in the prediction approach. In this study, different Artificial Neural Network (ANN) models are employed to predict short- and long-term droughts using the Standardized Precipitation Index (SPI) at different time scales, including 3, 6, 12, 24 and 48 months, in Tabriz city, Iran. Different combinations of the calculated SPI and time series of various hydro-meteorological variables, such as precipitation, wind velocity, relative humidity and sunshine hours for the years 1992 to 2010, are used to train the ANN models. To compare the models' performance, some well-known measures, namely Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Correlation Coefficient (CC), are utilized. The results illustrate that using all hydro-meteorological variables significantly improves the prediction of SPI at different time scales.
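The three comparison measures named in the abstract have standard definitions, sketched below in plain Python. The function names and the sample values are illustrative assumptions, not data from the study, and CC is taken here to be the Pearson correlation coefficient.

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean Absolute Error between two equal-length sequences."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def correlation(observed, predicted):
    """Pearson correlation coefficient (assumed meaning of CC)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Made-up SPI-like values: predictions are the observations shifted by 0.5.
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.5, 2.5, 3.5]
print(rmse(observed, predicted), mae(observed, predicted),
      correlation(observed, predicted))
```

A constant offset leaves the correlation at 1 while RMSE and MAE both report the 0.5 error, which is one reason to report error measures and correlation together.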


Adaptive quadrature schemes for Bayesian inference via active learning

arXiv.org Machine Learning

Numerical integration and emulation are fundamental topics across scientific fields. We propose novel adaptive quadrature schemes based on an active learning procedure. We consider an interpolative approach for building a surrogate posterior density, combining it with Monte Carlo sampling methods and other quadrature rules. The nodes of the quadrature are sequentially chosen by maximizing a suitable acquisition function, which takes into account the current approximation of the posterior and the positions of the nodes. This maximization does not require additional evaluations of the true posterior. We introduce two specific schemes based on Gaussian and Nearest Neighbors (NN) bases. For the Gaussian case, we also provide a novel procedure for fitting the bandwidth parameter, in order to build a suitable emulator of a density function. With both techniques, we always obtain a positive estimation of the marginal likelihood (a.k.a., Bayesian evidence). An equivalent importance sampling interpretation is also described, which allows the design of extended schemes. Several theoretical results are provided and discussed. Numerical results show the advantage of the proposed approach, including a challenging inference problem in an astronomic dynamical model, with the goal of revealing the number of planets orbiting a star.


AI systems trained on data skewed by sex are worse at diagnosing disease

#artificialintelligence

The artificial intelligence model showed great promise in predicting which patients treated in U.S. Veterans Affairs hospitals would experience a sudden decline in kidney function. But it also came with a crucial caveat: Women represented only about 6% of the patients whose data were used to train the algorithm, and it performed worse when tested on women. The shortcomings of that high-profile algorithm, built by the Google sister company DeepMind, highlight a problem that machine learning researchers working in medicine are increasingly worried about. And it's an issue that may be more pervasive -- and more insidious -- than experts previously realized, new research suggests. The study, led by researchers in Argentina and published Monday in the journal PNAS, found that when female patients were excluded from or significantly underrepresented in the training data used to develop a machine learning model, the algorithm performed worse in diagnosing them when tested across a wide range of medical conditions affecting the chest area.


Descriptor Revision for Conditionals: Literal Descriptors and Conditional Preservation

arXiv.org Artificial Intelligence

Descriptor revision by Hansson is a framework for addressing the problem of belief change. In descriptor revision, different kinds of change processes are dealt with in a joint framework. Individual change requirements are qualified by specific success conditions expressed by a belief descriptor, and belief descriptors can be combined by logical connectives. This is in contrast to the currently dominating AGM paradigm shaped by Alchourrón, Gärdenfors, and Makinson, where different kinds of changes, like a revision or a contraction, are dealt with separately. In this article, we investigate the realisation of descriptor revision for a conditional logic while restricting descriptors to the conjunction of literal descriptors. We apply the principle of conditional preservation developed by Kern-Isberner to descriptor revision for conditionals, show how descriptor revision for conditionals under these restrictions can be characterised by a constraint satisfaction problem, and implement it using constraint logic programming. Since our conditional logic subsumes propositional logic, our approach also realises descriptor revision for propositional logic.


Depth-Optimized Delay-Aware Tree (DO-DAT) for Virtual Network Function Placement

arXiv.org Artificial Intelligence

With the constant increase in demand for data connectivity, network service providers are faced with the task of reducing their capital and operational expenses while ensuring continual improvements to network performance. Although Network Function Virtualization (NFV) has been identified as a solution, several challenges must be addressed to ensure its feasibility. In this paper, we present a machine learning-based solution to the Virtual Network Function (VNF) placement problem. This paper proposes the Depth-Optimized Delay-Aware Tree (DO-DAT) model by using the particle swarm optimization technique to optimize decision tree hyper-parameters. Using the Evolved Packet Core (EPC) as a use case, we evaluate the performance of the model and compare it to a previously proposed model and a heuristic placement strategy.


Saber Pro success prediction model using decision tree based learning

arXiv.org Artificial Intelligence

The primary objective of this report is to determine what influences the success rates of students who have studied in Colombia, by analyzing the Saber 11 (the test taken in the last school year) and some socioeconomic aspects, and by comparing the Saber Pro results with the national average. The problem addressed is to find what influences success, but the analysis also provides insight into the country's education dynamics and predicts one's opportunities to be prosperous. The opposite situation to the one presented in this paper could be the desertion levels, in the sense that detecting what makes someone outstanding can also indicate what makes one unsuccessful. The proposed solution was to implement a CART decision tree algorithm that helps to predict the probability that a student scores higher than the mean value, based on different socioeconomic and academic factors, such as the profession of the subject's parents and the results obtained on the Saber 11. It was discovered that one of the most influential factors is the Saber 11 score on the topic of Social Studies, and that the gender of the subject is not as influential as it is usually portrayed. The algorithm provided significant insight into which factors most affect the probability of success of any given person and, if further pursued, could be used in many situations, such as deciding which school subjects should be given more intensity and shaping the academic curriculum in general.
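CART classification trees commonly choose splits by minimizing Gini impurity. The sketch below shows that criterion on a single threshold split; it is an illustration under stated assumptions, not the report's implementation. The feature (a Saber 11 score), the threshold, and the labels (1 for above the mean, 0 otherwise) are made up.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def split_impurity(scores, labels, threshold):
    """Weighted Gini impurity after splitting on score <= threshold."""
    left = [y for x, y in zip(scores, labels) if x <= threshold]
    right = [y for x, y in zip(scores, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Hypothetical data: four students, score and above-mean indicator.
scores = [40, 45, 70, 80]
labels = [0, 0, 1, 1]
print(gini(labels), split_impurity(scores, labels, threshold=50))
```

A tree learner would evaluate `split_impurity` for every candidate feature and threshold and keep the split with the lowest value, then recurse on each side.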


Epileptic seizure prediction using Pearson's product-moment correlation coefficient of a linear classifier from generalized Gaussian modeling

arXiv.org Machine Learning

To predict an epileptic event means to determine in advance the time of the seizure with the highest possible accuracy. A correct prediction benchmark for epileptic events in clinical applications is a typical problem in biomedical signal processing, and it contributes to an appropriate diagnosis and treatment of this disease. In this work, we use Pearson's product-moment correlation coefficient from generalized Gaussian distribution parameters coupled with a linear-based classifier to discriminate between seizure and non-seizure events in epileptic EEG signals. Tested on 36 epileptic events from 9 patients, the method shows good performance, with 100% sensitivity and specificity greater than 83% for seizure events in all brain rhythms. Pearson's test suggests that all brain rhythms are highly correlated in non-seizure events but not during seizure events. This suggests that our model can be scaled with Pearson's product-moment correlation coefficient for the detection of epileptic seizures.
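Sensitivity and specificity, the two figures reported above, are computed from the counts of correct and incorrect binary decisions. The following sketch uses the standard definitions; the function name and the sample labels (1 for seizure, 0 for non-seizure) are illustrative assumptions and are not the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = seizure."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # seizures detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # seizures missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct rejections
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)

# Made-up example: both seizures detected, one false alarm in four non-seizures.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

The function assumes both classes are present in `y_true`; otherwise one of the denominators is zero.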


Multi-Stage Transfer Learning with an Application to Selection Process

arXiv.org Machine Learning

In multi-stage processes, decisions happen in an ordered sequence of stages. Many of them have the structure of a dual funnel problem: as the sample size decreases from one stage to the next, the information increases. A related example is a selection process, where applicants apply for a position, prize, or grant. In each stage, more applicants are evaluated and filtered out, and from the remaining ones, more information is collected. In the last stage, decision-makers use all available information to make their final decision. Training a separate classifier for each stage becomes impracticable, as classifiers can underfit due to the low dimensionality in early stages or overfit due to the small sample size in the later stages. In this work, we propose a Multi-StaGe Transfer Learning (MSGTL) approach that uses knowledge from simple classifiers trained in early stages to improve the performance of classifiers in the later stages. By transferring weights from simpler neural networks trained on larger datasets, we are able to fine-tune more complex neural networks in the later stages without overfitting due to the small sample size. We show that it is possible to control the trade-off between conserving knowledge and fine-tuning using a simple probabilistic map. Experiments using real-world data demonstrate the efficacy of our approach, as it outperforms other state-of-the-art methods for transfer learning and regularization.


A multimodal approach for multi-label movie genre classification

arXiv.org Machine Learning

Movie genre classification is a challenging task that has increasingly attracted the attention of researchers. In this paper, we address the multi-label classification of movie genres in a multimodal way. For this purpose, we created a dataset composed of trailer video clips, subtitles, synopses, and movie posters taken from 152,622 movie titles from The Movie Database. The dataset was carefully curated and organized, and it was also made available as a contribution of this work. Each movie in the dataset was labeled according to a set of eighteen genre labels. We extracted features from these data using different kinds of descriptors, namely Mel Frequency Cepstral Coefficients, Statistical Spectrum Descriptor, Local Binary Pattern with spectrograms, Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNN). The descriptors were evaluated using different classifiers, such as BinaryRelevance and ML-kNN. We also investigated the performance of combinations of different classifiers/features using a late fusion strategy, which obtained encouraging results. Based on the F-Score metric, our best result, 0.628, was obtained by the fusion of a classifier created using LSTM on the synopses and a classifier created using CNN on movie trailer frames. When considering the AUC-PR metric, the best result, 0.673, was also achieved by combining those representations, with the addition of an LSTM-based classifier created from the subtitles. These results corroborate the existence of complementarity among classifiers based on different sources of information in this field of application. As far as we know, this is the most comprehensive study developed in terms of the diversity of multimedia sources of information to perform movie genre classification.
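Late fusion combines the per-label scores of separately trained classifiers after each has made its prediction. The abstract does not state which fusion rule was used, so the sketch below assumes the simplest one: averaging per-label probabilities and thresholding at 0.5. The function name, threshold, and probabilities are hypothetical.

```python
def late_fusion(per_classifier_probs, threshold=0.5):
    """Average per-label probabilities across classifiers, then threshold.

    per_classifier_probs: list of probability lists, one per classifier,
    all of the same length (one entry per genre label).
    Averaging is an assumed fusion rule, not necessarily the paper's.
    """
    n_classifiers = len(per_classifier_probs)
    n_labels = len(per_classifier_probs[0])
    fused = [
        sum(probs[i] for probs in per_classifier_probs) / n_classifiers
        for i in range(n_labels)
    ]
    predicted = [int(score >= threshold) for score in fused]
    return fused, predicted

# Made-up scores for three genre labels from two modalities.
synopsis_probs = [0.9, 0.2, 0.4]  # e.g. a classifier on synopses
trailer_probs = [0.7, 0.4, 0.8]   # e.g. a classifier on trailer frames
fused, predicted = late_fusion([synopsis_probs, trailer_probs])
print(fused, predicted)
```

In this toy case the third label is rejected by the synopsis scores alone (0.4) but accepted after fusion, which illustrates the complementarity between sources that the abstract describes.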