AITopics | Grabowicz, Przemyslaw A.

Collaborating Authors

Grabowicz, Przemyslaw A.

A Multilingual Similarity Dataset for News Article Frame

Chen, Xi, Samory, Mattia, Hale, Scott, Jurgens, David, Grabowicz, Przemyslaw A.

arXiv.org Artificial Intelligence, May 21, 2024

Understanding the writing frame of news articles is vital for addressing social issues, and thus has attracted notable attention in the field of communication studies. Yet assessing such news article frames remains a challenge because no concrete, unified standard dataset captures the full nuance of news content. To address this gap, we introduce an extended version of a large labeled news article dataset with 16,687 new labeled pairs. By relying on pairwise comparison of news articles, our method removes the need to manually identify frame classes, as is done in traditional news frame analysis studies. Overall, we introduce the most extensive cross-lingual news article similarity dataset available to date, with 26,555 labeled news article pairs across 10 languages. Each data point has been annotated according to a codebook detailing eight critical aspects of news content, under a human-in-the-loop framework. Application examples demonstrate its potential for uncovering country communities within global news coverage, exposing media bias among news outlets, and quantifying the factors related to news creation. We envision that this news similarity dataset will broaden our understanding of the media ecosystem in terms of news coverage of events and perspectives across countries, locations, languages, and other social constructs. In doing so, it can catalyze advances in social science research and applied methodologies.

data mining, machine learning, natural language, (22 more...)

arXiv:2405.13272

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Massachusetts (0.14)

Genre: Research Report > Experimental Study (0.94)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.47)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Automated Model Selection for Tabular Data

Amballa, Avinash, Mekala, Anmol, Akkinapalli, Gayathri, Madine, Manas, Yarrabolu, Naga Pavana Priya, Grabowicz, Przemyslaw A.

arXiv.org Artificial Intelligence, Jan 1, 2024

Structured data in the form of tabular datasets contain features that are distinct and discrete, with varying individual and relative importance to the target. Combinations of features may be more predictive and meaningful than individual feature contributions alone. R's mixed-effects linear models library allows users to specify such feature interactions in the model design. However, given many features and possible interactions to select from, the number of candidate models grows exponentially, which makes model selection difficult. We aim to automate the model selection process for predictions on tabular datasets incorporating feature interactions while keeping computational costs small. The framework includes two distinct approaches for feature selection: a Priority-based Random Grid Search and a Greedy Search method. The Priority-based approach efficiently explores feature combinations using prior probabilities to guide the search. The Greedy method builds the solution iteratively by adding or removing features based on their impact. Experiments on synthetic data demonstrate the ability to effectively capture predictive feature combinations.
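The greedy idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes ordinary least squares in place of R's mixed-effects models, uses the Bayesian information criterion as the measure of "impact", restricts candidates to single features and pairwise products, and only adds terms (it never removes them). The function names `design_matrix`, `bic`, and `greedy_select` and the synthetic data are hypothetical.

```python
import itertools
import numpy as np

def design_matrix(X, terms):
    """Intercept plus one column per term; a term is a tuple of column
    indices and its column is the product of those features."""
    cols = [np.ones(X.shape[0])]
    for term in terms:
        cols.append(np.prod(X[:, list(term)], axis=1))
    return np.column_stack(cols)

def bic(X, y, terms):
    """Bayesian information criterion of an OLS fit (lower is better)."""
    D = design_matrix(X, terms)
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    rss = float(np.sum((y - D @ coef) ** 2))
    n, k = D.shape
    return n * np.log(rss / n) + k * np.log(n)

def greedy_select(X, y):
    """Forward selection: repeatedly add the term that lowers BIC the most,
    and stop when no remaining term improves the score."""
    n_features = X.shape[1]
    candidates = [(j,) for j in range(n_features)]
    candidates += list(itertools.combinations(range(n_features), 2))
    selected = []
    best = bic(X, y, selected)
    while True:
        scored = [(bic(X, y, selected + [t]), t)
                  for t in candidates if t not in selected]
        if not scored:
            break
        score, term = min(scored)
        if score >= best:
            break
        selected.append(term)
        best = score
    return selected, best

# Hypothetical synthetic data: the target depends on the interaction
# x0 * x1 and on the main effect of x2; x3 is irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 0] * X[:, 1] + 2.0 * X[:, 2] + rng.normal(scale=0.1, size=500)

selected, score = greedy_select(X, y)
print(selected)
```

Each pass refits one model per remaining candidate, so the cost grows with the number of candidate terms times the number of accepted terms rather than with the exponential number of possible subsets.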

artificial intelligence, interaction, machine learning, (18 more...)

arXiv:2401.00961

Country: North America > United States (0.14)

Genre: Research Report > Experimental Study (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Learning from Discriminatory Training Data

Grabowicz, Przemyslaw A., Perello, Nicholas, Takatsu, Kenta

arXiv.org Artificial Intelligence, Apr 20, 2023

Supervised learning systems are trained using historical data and, if the data was tainted by discrimination, they may unintentionally learn to discriminate against protected groups. We propose that fair learning methods, despite training on potentially discriminatory datasets, should perform well on fair test datasets. Such dataset shifts crystallize application scenarios for specific fair learning methods. For instance, the removal of direct discrimination can be represented as a particular dataset shift problem. For this scenario, we propose a learning method that provably minimizes model error on fair datasets, while blindly training on datasets poisoned with direct additive discrimination. The method is compatible with existing legal systems and provides a solution to the widely discussed issue of protected groups' intersectionality by striking a balance between the protected groups. Technically, the method applies probabilistic interventions, has causal and counterfactual formulations, and is computationally lightweight: it can be used with any supervised learning model to prevent discrimination via proxies while maximizing model accuracy for business necessity.
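One simple way to picture a probabilistic intervention on a protected attribute is sketched below. This is a simplified illustration under stated assumptions, not the method from the paper: it assumes a linear model, a single binary protected attribute `a`, and direct additive discrimination in synthetic data. The full model is trained with `a` included, and predictions then average over the marginal distribution of `a`, so the output does not depend on an individual's group. All names and numbers are hypothetical.

```python
import numpy as np

def fit_ols(features, y):
    """Least-squares fit with an intercept; returns [intercept, coefs...]."""
    D = np.column_stack([np.ones(len(y)), features])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    return coef

def predict_with_intervention(coef, x, a_values, a_probs):
    """Average the full model's prediction over the marginal distribution
    of the protected attribute instead of using each person's own value."""
    x = np.asarray(x, dtype=float)
    preds = np.zeros_like(x)
    for a, p in zip(a_values, a_probs):
        preds += p * (coef[0] + coef[1] * x + coef[2] * a)
    return preds

# Hypothetical data: x is a legitimate feature correlated with group a,
# and the historical outcome y carries an additive penalty for a == 1.
rng = np.random.default_rng(0)
n = 5000
a = rng.integers(0, 2, size=n)
x = a + rng.normal(size=n)
y = 2.0 * x - 1.5 * a + rng.normal(scale=0.1, size=n)

# Full model: sees both x and a, so it can attribute the penalty to a.
full = fit_ols(np.column_stack([x, a]), y)

# Naive baseline: drops a, so x partly absorbs the penalty as a proxy.
naive = fit_ols(x.reshape(-1, 1), y)

a_values, counts = np.unique(a, return_counts=True)
a_probs = counts / counts.sum()

grid = np.array([0.0, 1.0])
fair_preds = predict_with_intervention(full, grid, a_values, a_probs)
fair_slope = fair_preds[1] - fair_preds[0]
naive_slope = naive[1]
print(fair_slope, naive_slope)
```

In this toy setting, simply dropping `a` lets the correlated feature `x` act as a proxy and distorts its coefficient, whereas averaging over `a` keeps the effect of `x` estimated by the full model.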

artificial intelligence, inductive learning, machine learning, (19 more...)

arXiv:1912.08189

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Industry:

Banking & Finance (1.00)
Law > Civil Rights & Constitutional Law (0.93)
Law > Labor & Employment Law (0.67)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Leveraging Browsing Patterns for Topic Discovery and Photostream Recommendation

Chiarandini, Luca (Universitat Pompeu Fabra and Yahoo! Research) | Grabowicz, Przemyslaw A. (IFISC (CSIC-UIB)) | Trevisiol, Michele (Universitat Pompeu Fabra and Yahoo! Research) | Jaimes, Alejandro (Yahoo! Research)

AAAI Conferences, Jul 5, 2013

In photo-sharing websites and in social networks, photographs are most often browsed as a sequence: users who view a photo are likely to click on those that follow. The sequences of photos (which we call photostreams), as opposed to individual images, can therefore be considered to be very important content units in their own right. In spite of their importance, those sequences have received little attention even though they are at the core of how people consume image content. In this paper, we focus on photostreams. First, we perform an analysis of a large dataset of user logs containing over 100 million pageviews, examining navigation patterns between photostreams. Based on observations from the analysis, we build a stream transition graph to analyze common stream topic transitions (e.g., users often view “train” photostreams followed by “firetruck” photostreams). We then implement two stream recommendation algorithms, based on collaborative filtering and on photo tags, and report the results of a user study involving 40 participants. Our analysis yields interesting insights into how people navigate between photostreams, while the results of the user study provide useful feedback for evaluating the performance and characteristics of the recommendation algorithms.
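A stream transition graph of the kind mentioned in the abstract can be sketched as a weighted directed graph built from browsing sessions. The sketch below is an assumption about the general idea, not the authors' pipeline: each session is taken to be an ordered list of photostream topic labels, edge weights count consecutive transitions, and self-transitions are ignored. The session data and function names are hypothetical.

```python
from collections import Counter

def build_transition_graph(sessions):
    """Count directed transitions between consecutive topics in each session."""
    edges = Counter()
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            if src != dst:  # skip repeated views of the same topic
                edges[(src, dst)] += 1
    return edges

def transition_probabilities(edges):
    """Normalize edge counts by the total outgoing count of each source."""
    out_totals = Counter()
    for (src, _), count in edges.items():
        out_totals[src] += count
    return {(src, dst): count / out_totals[src]
            for (src, dst), count in edges.items()}

# Hypothetical browsing sessions, one list of topic labels per user visit.
sessions = [
    ["train", "firetruck", "train"],
    ["train", "firetruck"],
    ["train", "bridge"],
]

edges = build_transition_graph(sessions)
probs = transition_probabilities(edges)

# List the most frequent transitions first.
for (src, dst), count in edges.most_common():
    print(src, "->", dst, count, round(probs[(src, dst)], 2))
```

Normalizing by the outgoing total turns raw counts into an estimate of how likely a viewer of one topic is to move to another, which is the quantity a transition-based recommender would rank on.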

leveraging browsing pattern, topic discovery and photostream recommendation

AAAI Conferences

Seventh International AAAI Conference on Weblogs and Social Media

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.73)
