AITopics | Serdyukov, Pavel

Collaborating Authors

Serdyukov, Pavel


Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information

Shishkin, Alexander, Bezzubtseva, Anastasia, Drutsa, Alexey, Shishkov, Ilia, Gladkikh, Ekaterina, Gusev, Gleb, Serdyukov, Pavel

Neural Information Processing Systems | Feb-14-2020, 16:25:50 GMT

This study introduces a novel feature selection approach, CMICOT, which is a further evolution of filter methods with sequential forward selection (SFS) whose scoring functions are based on conditional mutual information (MI). We state and study a novel saddle point (max-min) optimization problem to build a scoring function that is able to identify joint interactions between several features. This method fills the gap of MI-based SFS techniques with high-order dependencies. In this high-dimensional case, the estimation of MI has prohibitively high sample complexity. We mitigate this cost using a greedy approximation and binary representatives, which makes our technique usable in practice.

artificial intelligence, conditional mutual information, machine learning, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.88)


Latent Distribution Assumption for Unbiased and Consistent Consensus Modelling

Fedorova, Valentina, Gusev, Gleb, Serdyukov, Pavel

arXiv.org Machine Learning | Jun-20-2019

We study the problem of aggregating noisy labels. Usually, it is solved by proposing a stochastic model for the process of generating noisy labels and then estimating the model parameters using the observed noisy labels. A traditional assumption underlying previously introduced generative models is that each object has one latent true label. In contrast, we introduce a novel latent distribution assumption, implying that a unique true label for an object might not exist, but rather each object might have a specific distribution generating a latent subjective label each time the object is observed. Our experiments showed that the novel assumption is more suitable for difficult tasks, when there is ambiguity in choosing a "true" label for certain objects.
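The contrast between the two assumptions can be illustrated with a minimal sketch. It is not the generative model from the paper: it ignores worker reliability entirely and uses plain vote counts. The function name `aggregate` and the `(object_id, worker_id, label)` triple layout are assumptions made for this example.

```python
from collections import Counter, defaultdict


def aggregate(labels):
    """Aggregate noisy labels given as (object_id, worker_id, label) triples.

    Returns two views of the same votes:
      majority     - one label per object (the "single latent true label" view)
      distribution - empirical label frequencies per object
                     (a crude stand-in for a per-object label distribution)
    """
    votes = defaultdict(Counter)
    for obj, _worker, label in labels:
        votes[obj][label] += 1

    # Single-label view: keep only the most frequent label.
    majority = {obj: c.most_common(1)[0][0] for obj, c in votes.items()}

    # Distribution view: keep the relative frequency of every observed label.
    distribution = {
        obj: {lab: n / sum(c.values()) for lab, n in c.items()}
        for obj, c in votes.items()
    }
    return majority, distribution
```

For an ambiguous object, the majority view discards the minority votes, while the distribution view keeps them as evidence that no unique label may exist.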

artificial intelligence, bayesian inference, noisy label, (20 more...)

arXiv.org Machine Learning

1906.08776

Country:

Europe > Russia (0.14)
Asia (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)


Finding Influential Training Samples for Gradient Boosted Decision Trees

Sharchilev, Boris, Ustinovsky, Yury, Serdyukov, Pavel, de Rijke, Maarten

arXiv.org Machine Learning | Mar-12-2018

We address the problem of finding influential training samples for a particular case of tree ensemble-based models, e.g., Random Forest (RF) or Gradient Boosted Decision Trees (GBDT). A natural way of formalizing this problem is studying how the model's predictions change upon leave-one-out retraining, leaving out each individual training sample. Recent work has shown that, for parametric models, this analysis can be conducted in a computationally efficient way. We propose several ways of extending this framework to non-parametric GBDT ensembles under the assumption that tree structures remain fixed. Furthermore, we introduce a general scheme of obtaining further approximations to our method that balance the trade-off between performance and computational complexity. We evaluate our approaches on various experimental setups and use-case scenarios and demonstrate both the quality of our approach to finding influential training samples in comparison to the baselines and its computational efficiency.
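Leave-one-out influence under a fixed tree structure can be sketched on a toy case. The example below assumes a single regression tree with one fixed split whose leaf values are the mean targets in each leaf; it is a brute-force illustration, not the GBDT approximation proposed in the paper. All function names are hypothetical.

```python
def leaf_of(x, threshold):
    """Fixed tree structure: a single split on one numeric feature."""
    return "left" if x <= threshold else "right"


def fit_leaf_values(xs, ys, threshold, skip=None):
    """Leaf value = mean target of the training samples in that leaf,
    optionally leaving out the sample with index `skip`."""
    sums, counts = {}, {}
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        leaf = leaf_of(x, threshold)
        sums[leaf] = sums.get(leaf, 0.0) + y
        counts[leaf] = counts.get(leaf, 0) + 1
    return {leaf: sums[leaf] / counts[leaf] for leaf in sums}


def loo_influence(xs, ys, threshold, x_test):
    """Change in the prediction at x_test when each training sample is
    removed and the leaf values are refit, keeping the split fixed."""
    leaf = leaf_of(x_test, threshold)
    base = fit_leaf_values(xs, ys, threshold)[leaf]
    influences = []
    for i in range(len(xs)):
        refit = fit_leaf_values(xs, ys, threshold, skip=i)
        # If removal empties the leaf, this sketch reports zero influence.
        influences.append(refit.get(leaf, base) - base)
    return influences
```

Only samples that fall in the same leaf as the test point can change its prediction, which is why fixing the structure makes the analysis cheap.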

artificial intelligence, decision tree learning, leaf value, (16 more...)

arXiv.org Machine Learning

1802.0664

Country:

Europe > Netherlands (0.14)
North America > United States (0.14)
Europe > Russia (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)


Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information

Shishkin, Alexander, Bezzubtseva, Anastasia, Drutsa, Alexey, Shishkov, Ilia, Gladkikh, Ekaterina, Gusev, Gleb, Serdyukov, Pavel

Neural Information Processing Systems | Dec-31-2016

This study introduces a novel feature selection approach, CMICOT, which is a further evolution of filter methods with sequential forward selection (SFS) whose scoring functions are based on conditional mutual information (MI). We state and study a novel saddle point (max-min) optimization problem to build a scoring function that is able to identify joint interactions between several features. This method fills the gap of MI-based SFS techniques with high-order dependencies. In this high-dimensional case, the estimation of MI has prohibitively high sample complexity. We mitigate this cost using a greedy approximation and binary representatives, which makes our technique usable in practice. The superiority of our approach is demonstrated by comparison with recently proposed interaction-aware filters and several interaction-agnostic state-of-the-art ones on ten publicly available benchmark datasets.
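As background for the family of methods the abstract refers to, here is a minimal sketch of sequential forward selection with a conditional-MI score. It uses a simpler CMIM-style max-min criterion that conditions on one already selected feature at a time, so it is not CMICOT and does not capture high-order interactions. The estimator is a plain plug-in estimate for discrete data, and all names are assumptions of this example.

```python
from collections import Counter
from math import log


def entropy(*columns):
    """Plug-in joint entropy (in nats) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum(c / n * log(c / n) for c in Counter(rows).values())


def cond_mutual_info(x, y, z=None):
    """Plug-in estimate of I(x; y | z); with z=None this is plain I(x; y)."""
    if z is None:
        return entropy(x) + entropy(y) - entropy(x, y)
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def forward_select(features, target, k):
    """Greedy SFS over a dict {name: column}.

    At each step, add the feature whose worst-case conditional MI with
    the target, given any single already selected feature, is largest.
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return cond_mutual_info(features[name], target)
            return min(
                cond_mutual_info(features[name], target, features[s])
                for s in selected
            )
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

The min over selected features penalizes redundancy: a copy of an already selected feature scores roughly zero, while a feature that explains the remaining uncertainty scores high.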

artificial intelligence, interaction, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Europe > Spain (0.14)
Europe > Russia (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


User Model-Based Intent-Aware Metrics for Multilingual Search Evaluation

Drutsa, Alexey, Shutovich, Andrey, Pushnyakov, Philipp, Krokhalyov, Evgeniy, Gusev, Gleb, Serdyukov, Pavel

arXiv.org Machine Learning | Dec-13-2016

Despite the growing importance of the multilingual aspect of web search, no appropriate offline metrics to evaluate its quality have been proposed so far. At the same time, personal language preferences can be regarded as intents of a query. This approach translates the multilingual search problem into a particular task of search diversification. Furthermore, the standard intent-aware approach could be adopted to build a diversified metric for multilingual search on the basis of a classical IR metric such as ERR. The intent-aware approach estimates user satisfaction under a user behavior model. We show, however, that the underlying user behavior models are not realistic in the multilingual case, and the produced intent-aware metrics do not appropriately estimate user satisfaction. We develop a novel approach to building intent-aware user behavior models, which overcome these limitations and convert to quality metrics that better correlate with standard online metrics of user satisfaction.
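The standard intent-aware construction mentioned in the abstract can be sketched as follows, treating each language preference as an intent. This shows only the baseline that the abstract argues is unrealistic in the multilingual case, not the authors' proposed user models. ERR is computed in its usual cascade form with satisfaction probability (2^g - 1) / 2^g_max; the function names and the intent keys are assumptions of this example.

```python
def err(grades, max_grade):
    """Expected Reciprocal Rank of one ranked list of relevance grades."""
    score, p_continue = 0.0, 1.0
    for rank, g in enumerate(grades, start=1):
        # Probability that the user is satisfied by the document at this rank.
        p_satisfied = (2 ** g - 1) / 2 ** max_grade
        score += p_continue * p_satisfied / rank
        p_continue *= 1.0 - p_satisfied
    return score


def intent_aware_err(grades_by_intent, intent_probs, max_grade):
    """Intent-aware ERR: per-intent ERR averaged with weights P(intent | query).

    grades_by_intent maps each intent to the grades of the same ranked
    list as judged under that intent.
    """
    return sum(
        p * err(grades_by_intent[intent], max_grade)
        for intent, p in intent_probs.items()
    )
```

With binary grades, a ranking that puts a document relevant to the dominant language preference first scores higher than the reverse order, since each intent's ERR is weighted by its probability.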

artificial intelligence, click model, information management, (20 more...)

arXiv.org Machine Learning

1612.04418

Country:

North America > United States (0.16)
Europe > Spain (0.14)
Europe > Russia (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Information Management > Search (0.91)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.34)


Prediction of Video Popularity in the Absence of Reliable Data from Video Hosting Services: Utility of Traces Left by Users on the Web

Drutsa, Alexey, Gusev, Gleb, Serdyukov, Pavel

arXiv.org Machine Learning | Nov-28-2016

With the growth of user-generated content, we observe a constant rise in the number of companies, such as search engines and content aggregators, that operate with tremendous amounts of web content without being the services that host it. Thus, aiming to locate the most important content and promote it to users, they face the need to estimate the current and predict the future popularity of content. In this paper, we approach the problem of video popularity prediction not from the side of a video hosting service, as done in all previous studies, but from the side of an operating company, which provides a popular video search service that aggregates content from different video hosting websites. We investigate video popularity prediction based on features from three primary sources available to a typical operating company: first, the content hosting provider may deliver its data via its API; second, the operating company makes use of its own search and browsing logs; third, the company crawls information about embeds of a video and links to a video page from publicly available resources on the Web. We show that video popularity prediction based on the embed and link data coupled with the internal search and browsing data significantly improves on video popularity prediction based only on the data provided by the video hosting service and can even adequately replace the API data in cases when it is partly or completely unavailable.

information management, social media, video, (19 more...)

arXiv.org Machine Learning

1611.09083

Country:

Europe > Russia (0.14)
Asia (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Industry:

Media (1.00)
Information Technology > Services (0.93)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.34)


Personalized Landmark Recommendation Based on Geotags from Photo Sharing Sites

Shi, Yue (Delft University of Technology) | Serdyukov, Pavel (Yandex) | Hanjalic, Alan (Delft University of Technology) | Larson, Martha (Delft University of Technology)

AAAI Conferences | Jul-12-2011

Geotagged photos of users on social media sites provide abundant location-based data, which can be exploited for various location-based services, such as travel recommendation. In this paper, we propose a novel approach to a new application, i.e., personalized landmark recommendation based on users’ geotagged photos. We formulate the landmark recommendation task as a collaborative filtering problem, for which we propose a category-regularized matrix factorization approach that integrates both user-landmark preference and category-based landmark similarity. We collected geotagged photos from Flickr and landmark categories from Wikipedia for our experiments. Our experimental results demonstrate that the proposed approach outperforms popularity-based landmark recommendation and a basic matrix factorization approach in recommending personalized landmarks that are less visited by the population as a whole.

artificial intelligence, landmark, social media, (17 more...)

AAAI Conferences

Fifth International AAAI Conference on Weblogs and Social Media

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry: Information Technology (0.36)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
