AITopics | Karamanolakis, Giannis

Collaborating Authors

Karamanolakis, Giannis

MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs

Zhou, Yuhang, Karamanolakis, Giannis, Soto, Victor, Rumshisky, Anna, Kulkarni, Mayank, Huang, Furong, Ai, Wei, Lu, Jianhua

arXiv.org Artificial Intelligence, Feb-17-2025

The recent success of specialized Large Language Models (LLMs) in domains such as mathematical reasoning and coding has led to growing interest in methods for merging these expert LLMs into a unified Mixture-of-Experts (MoE) model, with the goal of enhancing performance in each domain while retaining effectiveness on general tasks. However, the effective merging of expert models remains an open challenge, especially for models with highly divergent weight parameters or different architectures. State-of-the-art MoE merging methods only work with homogeneous model architectures and rely on simple unweighted averaging to merge expert layers, which does not address parameter interference and requires extensive fine-tuning of the merged MoE to restore performance. To address these limitations, this paper introduces new MoE merging techniques, including strategies to mitigate parameter interference, routing heuristics to reduce the need for MoE fine-tuning, and a novel method for merging experts with different architectures. Extensive experiments across multiple domains demonstrate the effectiveness of our proposed methods, reducing fine-tuning costs, improving performance over state-of-the-art methods, and expanding the applicability of MoE merging.
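For illustration, the "simple unweighted averaging" baseline that the abstract criticizes can be sketched as below. This is not the paper's proposed method; the parameter names (`ffn.w1`, `ffn.w2`) and the toy values are hypothetical, and real expert layers would be tensors rather than short lists.

```python
def average_experts(experts):
    """Merge expert parameter dicts by unweighted element-wise averaging.

    Assumes every expert has the same parameter names and shapes
    (the homogeneous case described in the abstract).
    """
    merged = {}
    for name in experts[0]:
        vectors = [expert[name] for expert in experts]
        # Average position by position across all experts.
        merged[name] = [sum(vals) / len(vals) for vals in zip(*vectors)]
    return merged


# Hypothetical toy parameters for two domain experts.
math_expert = {"ffn.w1": [1.0, 2.0], "ffn.w2": [0.0, 4.0]}
code_expert = {"ffn.w1": [3.0, 0.0], "ffn.w2": [2.0, 0.0]}

merged = average_experts([math_expert, code_expert])
print(merged)
```

Because every expert contributes equally at every position, weights that disagree in sign or magnitude partly cancel, which is one way to picture the parameter interference the abstract mentions.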

large language model, machine learning, natural language, (16 more...)

arXiv: 2502.00997

Country: North America > United States (0.67)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Flood Event Extraction from News Media to Support Satellite-Based Flood Insurance

Pabari, Tejit, Tellman, Beth, Karamanolakis, Giannis, Thomas, Mitchell, Mauerman, Max, Wu, Eugene, Lall, Upmanu, Tedesco, Marco, Steckler, Michael S, Colosio, Paolo, Osgood, Daniel E, Braun, Melody, de Bruijn, Jens, Islam, Shammun

arXiv.org Artificial Intelligence, Dec-5-2023

Floods cause large losses to property, life, and livelihoods across the world every year, hindering sustainable development. Safety nets to help absorb financial shocks in disasters, such as insurance, are often unavailable in regions of the world most vulnerable to floods, like Bangladesh. Index-based insurance has emerged as an affordable solution, which considers weather data or information from satellites to create a "flood index" that should correlate with the damage insured. However, existing flood event databases are often incomplete, and satellite sensors are not reliable under extreme weather conditions (e.g., because of clouds), which limits the spatial and temporal resolution of current approaches for index-based insurance. In this work, we explore a novel approach for supporting satellite-based flood index insurance by extracting high-resolution spatio-temporal information from news media. First, we publish a dataset consisting of 40,000 news articles covering flood events in Bangladesh by 10 prominent news sources, and inundated area estimates for each division in Bangladesh collected from a satellite radar sensor. Second, we show that keyword-based models are not adequate for this novel application, while context-based classifiers cover complex and implicit flood-related patterns. Third, we show that time series extracted from news media have substantial correlation (Spearman's $\rho = 0.70$) with satellite estimates of inundated area. Our work demonstrates that news media is a promising source for improving the temporal resolution and expanding the spatial coverage of the available flood damage data.
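As a reminder of what the reported statistic measures, Spearman's rank correlation between two time series can be computed as in the minimal sketch below. The two series (`news_counts`, `inundated_km2`) are made-up toy values, not data from the paper, and the helper names are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy example: weekly flood-article counts vs. inundated area estimates.
news_counts = [2, 5, 9, 4, 1]
inundated_km2 = [10.0, 30.0, 80.0, 40.0, 5.0]
rho = spearman_rho(news_counts, inundated_km2)
print(round(rho, 2))
```

Because only ranks matter, the statistic is insensitive to the very different scales of article counts and flooded area.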

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 2312.14943

Country: Asia > Bangladesh (0.93)

Genre:

Research Report (1.00)
Overview > Innovation (0.55)

Industry:

Media > News (1.00)
Information Technology (0.93)
Banking & Finance > Insurance (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Self-Training with Weak Supervision

Karamanolakis, Giannis, Mukherjee, Subhabrata, Zheng, Guoqing, Awadallah, Ahmed Hassan

arXiv.org Machine Learning, Apr-12-2021

State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks. Weak supervision in the form of domain-specific rules has been shown to be useful in such settings to automatically generate weakly labeled training data. However, learning with weak rules is challenging due to their inherent heuristic and noisy nature. An additional challenge is rule coverage and overlap, where prior work on weak supervision only considers instances that are covered by weak rules, thus leaving valuable unlabeled data behind. In this work, we develop a weak supervision framework (ASTRA) that leverages all the available data for a given task. To this end, we leverage task-specific unlabeled data through self-training with a model (student) that considers contextualized representations and predicts pseudo-labels for instances that may not be covered by weak rules. We further develop a rule attention network (teacher) that learns how to aggregate student pseudo-labels with weak rule labels, conditioned on their fidelity and the underlying context of an instance. Finally, we construct a semi-supervised learning objective for end-to-end training with unlabeled data, domain-specific rules, and a small amount of labeled data. Extensive experiments on six benchmark datasets for text classification demonstrate the effectiveness of our approach with significant improvements over state-of-the-art baselines.

dataset, deep learning, neural network, (19 more...)

arXiv: 2104.05514

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types

Dong, Xin Luna, He, Xiang, Kan, Andrey, Li, Xian, Liang, Yan, Ma, Jun, Xu, Yifan Ethan, Zhang, Chenwei, Zhao, Tong, Saldana, Gabriel Blanco, Deshpande, Saurabh, Manduca, Alexandre Michetti, Ren, Jay, Singh, Surender Pal, Xiao, Fan, Chang, Haw-Shiuan, Karamanolakis, Giannis, Mao, Yuning, Wang, Yaqing, Faloutsos, Christos, McCallum, Andrew, Han, Jiawei

arXiv.org Artificial Intelligence, Jun-24-2020

Can one build a knowledge graph (KG) for all products in the world? Knowledge graphs have firmly established themselves as valuable sources of information for search and question answering, and it is natural to wonder if a KG can contain information about products offered at online retail sites. There have been several successful examples of generic KGs, but organizing information about products poses many additional challenges, including sparsity and noise of structured data for products, complexity of the domain with millions of product types and thousands of attributes, heterogeneity across a large number of categories, as well as a large and constantly growing number of products. We describe AutoKnow, our automatic (self-driving) system that addresses these challenges. The system includes a suite of novel techniques for taxonomy construction, product property identification, knowledge extraction, anomaly detection, and synonym discovery. AutoKnow is (a) automatic, requiring little human intervention, (b) multi-scalable, scalable in multiple dimensions (many domains, many products, and many attributes), and (c) integrative, exploiting rich customer behavior logs. AutoKnow has been operational in collecting product knowledge for over 11K product types.

artificial intelligence, product type, text processing, (21 more...)

doi: 10.1145/3394486.3403323

arXiv: 2006.13473

Country: North America > United States (1.00)

Genre: Research Report > Promising Solution (0.34)

Industry:

Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.93)
Retail (0.88)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.70)
(4 more...)

Weakly Supervised Attention Networks for Fine-Grained Opinion Mining and Public Health

Karamanolakis, Giannis, Hsu, Daniel, Gravano, Luis

arXiv.org Machine Learning, Sep-30-2019

In many review classification applications, a fine-grained analysis of the reviews is desirable, because different segments (e.g., sentences) of a review may focus on different aspects of the entity in question. However, training supervised models for segment-level classification requires segment labels, which may be more difficult or expensive to obtain than review labels. In this paper, we employ Multiple Instance Learning (MIL) and use only weak supervision in the form of a single label per review. First, we show that when inappropriate MIL aggregation functions are used, then MIL-based networks are outperformed by simpler baselines. Second, we propose a new aggregation function based on the sigmoid attention mechanism and show that our proposed model outperforms the state-of-the-art models for segment-level sentiment classification (by up to 9.8% in F1). Finally, we highlight the importance of fine-grained predictions in an important public-health application: finding actionable reports of foodborne illness. We show that our model achieves 48.6% higher recall compared to previous models, thus increasing the chance of identifying previously unknown foodborne outbreaks.
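To make the idea of an attention-based MIL aggregation function concrete, here is a minimal sketch that combines segment-level class probabilities into one review-level prediction using sigmoid attention weights. The normalization by the sum of weights, the function names, and the toy numbers are assumptions for illustration; the paper's exact formulation may differ.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def aggregate(segment_probs, attention_scores):
    """Combine per-segment class probabilities into a review-level prediction.

    Each segment gets an independent weight in (0, 1) from a sigmoid,
    so several segments can be important at once. The weighted average
    is normalized by the total weight (an assumption of this sketch).
    """
    weights = [sigmoid(score) for score in attention_scores]
    total = sum(weights)
    n_classes = len(segment_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, segment_probs)) / total
        for c in range(n_classes)
    ]


# Toy review with three segments and two classes (positive, negative).
segment_probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
attention_scores = [2.0, 0.0, -2.0]  # hypothetical unnormalized scores

review_probs = aggregate(segment_probs, attention_scores)
print([round(p, 3) for p in review_probs])
```

During training only the review-level output is compared with the single review label, while the per-segment probabilities remain available for fine-grained predictions.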

aggregation function, deep learning, neural network, (25 more...)

arXiv: 1910.00054

Country:

North America > United States > Illinois (0.14)
Europe > United Kingdom > England (0.14)

Genre:

Research Report > Experimental Study (0.46)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Public Health (0.61)
Health & Medicine > Epidemiology (0.53)
Health & Medicine > Therapeutic Area (0.47)
Food & Agriculture > Food Processing (0.37)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

Leveraging Just a Few Keywords for Fine-Grained Aspect Detection Through Weakly Supervised Co-Training

Karamanolakis, Giannis, Hsu, Daniel, Gravano, Luis

arXiv.org Machine Learning, Sep-1-2019

User-generated reviews can be decomposed into fine-grained segments (e.g., sentences, clauses), each evaluating a different aspect of the principal entity (e.g., price, quality, appearance). Automatically detecting these aspects can be useful for both users and downstream opinion mining applications. Current supervised approaches for learning aspect classifiers require many fine-grained aspect labels, which are labor-intensive to obtain. And, unfortunately, unsupervised topic models often fail to capture the aspects of interest. In this work, we consider weakly supervised approaches for training aspect classifiers that only require the user to provide a small set of seed words (i.e., weakly positive indicators) for the aspects of interest. First, we show that current weakly supervised approaches do not effectively leverage the predictive power of seed words for aspect detection. Next, we propose a student-teacher approach that effectively leverages seed words in a bag-of-words classifier (teacher); in turn, we use the teacher to train a second model (student) that is potentially more powerful (e.g., a neural network that uses pre-trained word embeddings). Finally, we show that iterative co-training can be used to cope with noisy seed words, leading to both improved teacher and student models. Our proposed approach consistently outperforms previous weakly supervised approaches (by 14.1 absolute F1 points on average) in six different domains of product reviews and six multilingual datasets of restaurant reviews.
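A seed-word teacher of the kind the abstract describes can be pictured with the count-based sketch below: each aspect's score is the number of its seed words found in the segment. The seed lists, the uniform fallback, and the function name are hypothetical, and the paper's teacher may weight seed words differently.

```python
def teacher_predict(segment, seed_words):
    """Return a distribution over aspects from seed-word counts.

    Falls back to a uniform distribution when no seed word matches
    (an assumption of this sketch).
    """
    tokens = segment.lower().split()
    counts = {
        aspect: sum(tokens.count(word) for word in words)
        for aspect, words in seed_words.items()
    }
    total = sum(counts.values())
    if total == 0:
        return {aspect: 1.0 / len(counts) for aspect in counts}
    return {aspect: count / total for aspect, count in counts.items()}


# Hypothetical seed words for two aspects of product reviews.
seed_words = {
    "price": ["cheap", "expensive", "price"],
    "quality": ["sturdy", "broke", "quality"],
}

teacher_probs = teacher_predict("Very cheap price for what you get", seed_words)
print(teacher_probs)
```

The teacher's soft predictions on unlabeled segments would then serve as training targets for the student, which can generalize beyond the seed words themselves.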

deep learning, neural network, seed word, (20 more...)

arXiv: 1909.00415

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.88)

Item Recommendation with Variational Autoencoders and Heterogenous Priors

Karamanolakis, Giannis, Cherian, Kevin Raji, Narayan, Ananth Ravi, Yuan, Jie, Tang, Da, Jebara, Tony

arXiv.org Machine Learning, Oct-6-2018

In recent years, Variational Autoencoders (VAEs) have been shown to be highly effective in both standard collaborative filtering applications and extensions such as incorporation of implicit feedback. We extend VAEs to collaborative filtering with side information, for instance when ratings are combined with explicit text feedback from the user. Instead of using a user-agnostic standard Gaussian prior, we incorporate user-dependent priors in the latent VAE space to encode users' preferences as functions of the review text. Taking into account both the rating and the text information to represent users in this multimodal latent space is promising to improve recommendation quality. Our proposed model is shown to outperform the existing VAE models for collaborative filtering (up to 29.41% relative improvement in ranking metric) along with other baselines that incorporate both user ratings and text for item recommendation.
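One way to see the effect of replacing the standard Gaussian prior with a user-dependent one is through the KL term of the VAE objective. The sketch below assumes a diagonal Gaussian posterior and a unit-variance Gaussian prior whose mean would come from the user's review text; that prior form, the function name, and the toy vectors are assumptions for illustration, not details confirmed by the abstract.

```python
import math


def kl_to_prior(mu_q, log_var_q, mu_p):
    """KL( N(mu_q, diag(exp(log_var_q))) || N(mu_p, I) ).

    With mu_p set to all zeros this reduces to the usual VAE KL term
    against a standard Gaussian prior.
    """
    return 0.5 * sum(
        math.exp(log_var) + (m_q - m_p) ** 2 - 1.0 - log_var
        for m_q, log_var, m_p in zip(mu_q, log_var_q, mu_p)
    )


# Toy 2-dimensional latent posterior for one user.
mu_q = [1.0, 0.0]
log_var_q = [0.0, 0.0]  # unit variance

standard_prior_mean = [0.0, 0.0]
user_prior_mean = [1.0, 0.0]  # hypothetical mean derived from review text

kl_standard = kl_to_prior(mu_q, log_var_q, standard_prior_mean)
kl_user = kl_to_prior(mu_q, log_var_q, user_prior_mean)
print(kl_standard, kl_user)
```

In this toy case the penalty vanishes when the prior mean already matches the posterior, so the latent code is pulled toward the user's text-derived representation rather than toward the origin.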

deep learning, neural network, variational autoencoder, (20 more...)

doi: 10.1145/3270323.327032

arXiv: 1807.06651

Country: North America > United States (0.29)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
