AITopics | Pontevedra

Nonparametric independence tests in high-dimensional settings, with applications to the genetics of complex disease

arXiv.org Machine Learning, Jul-28-2024

[PhD thesis of FCP.] Nowadays, genetics studies large amounts of very diverse variables. Mathematical statistics has evolved in parallel to its applications, with much recent interest in high-dimensional settings. In the genetics of human common disease, a number of relevant problems can be formulated as tests of independence. We show how defining adequate premetric structures on the support spaces of the genetic data allows for novel approaches to such testing. This yields a solid theoretical framework, which reflects the underlying biology, and allows for computationally efficient implementations. For each problem, we provide mathematical results, simulations and the application to real data.

generalised distance covariance, mathematical statistics, nonparametric independence test, (17 more...)

arXiv.org Machine Learning

2407.19624

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.65)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(3 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(5 more...)


A System for Automatic English Text Expansion

Méndez, Silvia García, Gavilanes, Milagros Fernández, Montenegro, Enrique Costa, Martínez, Jonathan Juncal, Castaño, Francisco Javier González, Reiter, Ehud

arXiv.org Artificial Intelligence, May-28-2024

We present an automatic text expansion system to generate English sentences, which performs automatic Natural Language Generation (NLG) by combining linguistic rules with statistical approaches. Here, "automatic" means that the system can generate coherent and correct sentences from a minimum set of words. From its inception, the design is modular and adaptable to other languages. This adaptability is one of its greatest advantages. For English, we have created the highly precise aLexiE lexicon with wide coverage, which represents a contribution on its own. We have evaluated the resulting NLG library in an Augmentative and Alternative Communication (AAC) proof of concept, both directly (by regenerating corpus sentences) and manually (from annotations) using a popular corpus in the NLG field. We performed a second analysis by comparing the quality of text expansion in English to Spanish, using an ad-hoc Spanish-English parallel corpus. The system might also be applied to other domains such as report and news generation.

computational linguistic, input word, proceedings, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ACCESS.2019.2937505

2405.1835

Country:

Europe > Montenegro (0.04)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)
North America > United States > New York > New York County > New York City (0.04)
(10 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)


Explainable automatic industrial carbon footprint estimation from bank transaction classification using natural language processing

González-González, Jaime, García-Méndez, Silvia, de Arriba-Pérez, Francisco, González-Castaño, Francisco J., Barba-Seara, Óscar

arXiv.org Artificial Intelligence, May-23-2024

Concerns about the effect of greenhouse gases have motivated the development of certification protocols to quantify the industrial carbon footprint (CF). These protocols are manual, work-intensive, and expensive. All of the above have led to a shift towards automatic data-driven approaches to estimate the CF, including Machine Learning (ML) solutions. Unfortunately, the decision-making processes involved in these solutions lack transparency from the end user's point of view, who must blindly trust their outcomes compared to intelligible traditional manual approaches. In this research, manual and automatic methodologies for CF estimation were reviewed, taking into account their transparency limitations. This analysis led to the proposal of a new explainable ML solution for automatic CF calculations through bank transaction classification. Consideration should be given to the fact that no previous research has considered the explainability of bank transaction classification for this purpose. For classification, different ML models have been employed based on their promising performance in the literature, such as Support Vector Machine, Random Forest, and Recursive Neural Networks. The results obtained were in the 90 % range for accuracy, precision, and recall evaluation metrics. From their decision paths, the proposed solution estimates the CO2 emissions associated with bank transactions. The explainability methodology is based on an agnostic evaluation of the influence of the input terms extracted from the descriptions of transactions using locally interpretable models. The explainability terms were automatically validated using a similarity metric over the descriptions of the target categories. Conclusively, the explanation performance is satisfactory in terms of the proximity of the explanations to the associated activity sector descriptions.

bank transaction classification, classification, explanation, (11 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ACCESS.2022.3226324

2405.14505

Country:

North America > United States (0.14)
Europe > Austria > Vienna (0.14)
Europe > Spain > Galicia > Madrid (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Energy (1.00)
Banking & Finance (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


Identifying Banking Transaction Descriptions via Support Vector Machine Short-Text Classification Based on a Specialized Labelled Corpus

García-Méndez, Silvia, Fernández-Gavilanes, Milagros, Juncal-Martínez, Jonathan, González-Castaño, Francisco J., Seara, Oscar Barba

arXiv.org Artificial Intelligence, Mar-29-2024

Short texts are omnipresent in real-time news, social network commentaries, etc. Traditional text representation methods have been successfully applied to self-contained documents of medium size. However, information in short texts is often insufficient, due, for example, to the use of mnemonics, which makes them hard to classify. Therefore, the particularities of specific domains must be exploited. In this article we describe a novel system that combines Natural Language Processing techniques with Machine Learning algorithms to classify banking transaction descriptions for personal finance management, a problem that was not previously considered in the literature. We trained and tested that system on a labelled dataset with real customer transactions that will be available to other researchers on request. Motivated by existing solutions in spam detection, we also propose a short text similarity detector to reduce training set size based on the Jaccard distance. Experimental results with a two-stage classifier combining this detector with an SVM indicate a high accuracy in comparison with alternative approaches, taking into account complexity and computing time. Finally, we present a use case with a personal finance application, CoinScrap, which is available at Google Play and App Store.
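
As a rough illustration of the Jaccard-distance idea mentioned in this abstract, the sketch below removes near-duplicate short texts before training. It is not the authors' implementation: the whitespace tokenisation, the function names `jaccard_distance` and `drop_near_duplicates`, and the 0.2 threshold are all assumptions made for the example.

```python
def jaccard_distance(a: str, b: str) -> float:
    """Jaccard distance between the lowercase word sets of two short texts."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        # Two empty texts are treated as identical (assumption).
        return 0.0
    return 1.0 - len(tokens_a & tokens_b) / len(union)


def drop_near_duplicates(texts, threshold=0.2):
    """Keep a text only if it is farther than `threshold` from every text kept so far.

    The threshold is a hypothetical value chosen for illustration.
    """
    kept = []
    for text in texts:
        if all(jaccard_distance(text, k) > threshold for k in kept):
            kept.append(text)
    return kept


if __name__ == "__main__":
    samples = [
        "card payment supermarket",
        "CARD PAYMENT SUPERMARKET",
        "atm withdrawal",
    ]
    print(drop_near_duplicates(samples))
```

In a two-stage setup like the one the abstract describes, the reduced set returned by a filter of this kind would then be passed to a classifier such as an SVM; the quadratic comparison loop here is kept only for readability.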

classification, identifying banking transaction description, proceedings, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ACCESS.2020.2983584

2404.08664

Country:

Europe > Spain > Galicia > Madrid (0.04)
Europe > Spain > Galicia > Pontevedra Province > Pontevedra (0.04)
Asia > Pakistan (0.04)
(8 more...)

Genre: Overview (0.68)

Industry:

Banking & Finance (1.00)
Information Technology > Services (0.54)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)


Meteorologists and Students: A resource for language grounding of geographical descriptors

Ramos-Soto, Alejandro, Reiter, Ehud, van Deemter, Kees, Alonso, Jose M., Gatt, Albert

arXiv.org Artificial Intelligence, Sep-7-2018

We present a data resource which can be useful for research purposes on language grounding tasks in the context of geographical referring expression generation. The resource is composed of two data sets that encompass 25 different geographical descriptors and a set of associated graphical representations, drawn as polygons on a map by two groups of human subjects: teenage students and expert meteorologists.

artificial intelligence, descriptor, natural language, (16 more...)

arXiv.org Artificial Intelligence

1809.02494

Country:

North America > United States > Montana (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain > Galicia > Pontevedra Province > Pontevedra (0.04)
(3 more...)

Genre: Research Report (0.40)

Industry: Education (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Generation (0.33)
