Overview
Explainable Predictive Maintenance
Pashami, Sepideh, Nowaczyk, Slawomir, Fan, Yuantao, Jakubowski, Jakub, Paiva, Nuno, Davari, Narjes, Bobek, Szymon, Jamshidi, Samaneh, Sarmadi, Hamid, Alabdallah, Abdallah, Ribeiro, Rita P., Veloso, Bruno, Sayed-Mouchaweh, Moamar, Rajaoarisoa, Lala, Nalepa, Grzegorz J., Gama, João
Explainable Artificial Intelligence (XAI) fills the role of a critical interface fostering interactions between sophisticated intelligent systems and diverse individuals, including data scientists, domain experts, end-users, and more. It aids in deciphering the intricate internal mechanisms of ``black box'' Machine Learning (ML), rendering the reasons behind their decisions more understandable. However, current research in XAI primarily focuses on two aspects; ways to facilitate user trust, or to debug and refine the ML model. The majority of it falls short of recognising the diverse types of explanations needed in broader contexts, as different users and varied application areas necessitate solutions tailored to their specific needs. One such domain is Predictive Maintenance (PdM), an exploding area of research under the Industry 4.0 \& 5.0 umbrella. This position paper highlights the gap between existing XAI methodologies and the specific requirements for explanations within industrial applications, particularly the Predictive Maintenance field. Despite explainability's crucial role, this subject remains a relatively under-explored area, making this paper a pioneering attempt to bring relevant challenges to the research community's attention. We provide an overview of predictive maintenance tasks and accentuate the need and varying purposes for corresponding explanations. We then list and describe XAI techniques commonly employed in the literature, discussing their suitability for PdM tasks. Finally, to make the ideas and claims more concrete, we demonstrate XAI applied in four specific industrial use cases: commercial vehicles, metro trains, steel plants, and wind farms, spotlighting areas requiring further research.
Towards Fairness in Personalized Ads Using Impression Variance Aware Reinforcement Learning
Timmaraju, Aditya Srinivas, Mashayekhi, Mehdi, Chen, Mingliang, Zeng, Qi, Fettes, Quintin, Cheung, Wesley, Xiao, Yihan, Kannadasan, Manojkumar Rangasamy, Tripathi, Pushkar, Gahagan, Sean, Bogen, Miranda, Roudani, Rob
Variances in ad impression outcomes across demographic groups are increasingly considered to be potentially indicative of algorithmic bias in personalized ads systems. While there are many definitions of fairness that could be applicable in the context of personalized systems, we present a framework which we call the Variance Reduction System (VRS) for achieving more equitable outcomes in Meta's ads systems. VRS seeks to achieve a distribution of impressions with respect to selected protected class (PC) attributes that more closely aligns the demographics of an ad's eligible audience (a function of advertiser targeting criteria) with the audience who sees that ad, in a privacy-preserving manner. We first define metrics to quantify fairness gaps in terms of ad impression variances with respect to PC attributes including gender and estimated race. We then present the VRS for re-ranking ads in an impression variance-aware manner. We evaluate VRS via extensive simulations over different parameter choices and study the effect of the VRS on the chosen fairness metric. We finally present online A/B testing results from applying VRS to Meta's ads systems, concluding with a discussion of future work. We have deployed the VRS to all users in the US for housing ads, resulting in significant improvement in our fairness metric. VRS is the first large-scale deployed framework for pursuing fairness for multiple PC attributes in online advertising.
Internet of Things Meets Robotics: A Survey of Cloud-based Robots
This work presents a survey of existing literature on the fusion of the Internet of Things (IoT) with robotics and explores the integration of these technologies for the development of the Internet of Robotics Things (IoRT). The survey focuses on the applications of IoRT in healthcare and agriculture, while also addressing key concerns regarding the adoption of IoT and robotics. Additionally, an online survey was conducted to examine how companies utilize IoT technology in their organizations. The findings highlight the benefits of IoT in improving customer experience, reducing costs, and accelerating product development. However, concerns regarding unauthorized access, data breaches, and privacy need to be addressed for successful IoT deployment.
Queer In AI: A Case Study in Community-Led Participatory AI
QueerInAI, Organizers Of, :, null, Ovalle, Anaelia, Subramonian, Arjun, Singh, Ashwin, Voelcker, Claas, Sutherland, Danica J., Locatelli, Davide, Breznik, Eva, Klubička, Filip, Yuan, Hang, J, Hetvi, Zhang, Huan, Shriram, Jaidev, Lehman, Kruno, Soldaini, Luca, Sap, Maarten, Deisenroth, Marc Peter, Pacheco, Maria Leonor, Ryskina, Maria, Mundt, Martin, Agarwal, Milind, McLean, Nyx, Xu, Pan, Pranav, A, Korpan, Raj, Ray, Ruchira, Mathew, Sarah, Arora, Sarthak, John, ST, Anand, Tanvi, Agrawal, Vishakha, Agnew, William, Long, Yanan, Wang, Zijie J., Talat, Zeerak, Ghosh, Avijit, Dennler, Nathaniel, Noseworthy, Michael, Jha, Sharvani, Baylor, Emi, Joshi, Aditya, Bilenko, Natalia Y., McNamara, Andrew, Gontijo-Lopes, Raphael, Markham, Alex, Dǒng, Evyn, Kay, Jackie, Saraswat, Manu, Vytla, Nikhil, Stark, Luke
We present Queer in AI as a case study for community-led participatory design in AI. We examine how participatory design and intersectional tenets started and shaped this community's programs over the years. We discuss different challenges that emerged in the process, look at ways this organization has fallen short of operationalizing participatory and intersectional principles, and then assess the organization's impact. Queer in AI provides important lessons and insights for practitioners and theorists of participatory methods broadly through its rejection of hierarchy in favor of decentralization, success at building aid and programs by and for the queer community, and effort to change actors and institutions outside of the queer community. Finally, we theorize how communities like Queer in AI contribute to the participatory design in AI more broadly by fostering cultures of participation in AI, welcoming and empowering marginalized participants, critiquing poor or exploitative participatory practices, and bringing participation to institutions outside of individual research projects. Queer in AI's work serves as a case study of grassroots activism and participatory methods within AI, demonstrating the potential of community-led participatory methods and intersectional praxis, while also providing challenges, case studies, and nuanced insights to researchers developing and using participatory methods.
Graph-based Time-Series Anomaly Detection: A Survey
Ho, Thi Kieu Khanh, Karami, Ali, Armanfard, Narges
With the recent advances in technology, a wide range of systems continue to collect a large amount of data over time and thus generate time series. Time-Series Anomaly Detection (TSAD) is an important task in various time-series applications such as e-commerce, cybersecurity, vehicle maintenance, and healthcare monitoring. However, this task is very challenging as it requires considering both the intra-variable dependency and the inter-variable dependency, where a variable can be defined as an observation in time series data. Recent graph-based approaches have made impressive progress in tackling the challenges of this field. In this survey, we conduct a comprehensive and up-to-date review of Graph-based TSAD (G-TSAD). First, we explore the significant potential of graph representation learning for time-series data. Then, we review state-of-the-art graph anomaly detection techniques in the context of time series and discuss their strengths and drawbacks. Finally, we discuss the technical challenges and potential future directions for possible improvements in this research field.
A Systematic Review of Automated Query Reformulations in Source Code Search
Rahman, Mohammad Masudur, Roy, Chanchal K.
Fixing software bugs and adding new features are two of the major maintenance tasks. Software bugs and features are reported as change requests. Developers consult these requests and often choose a few keywords from them as an ad hoc query. Then they execute the query with a search engine to find the exact locations within software code that need to be changed. Unfortunately, even experienced developers often fail to choose appropriate queries, which leads to costly trials and errors during a code search. Over the years, many studies attempt to reformulate the ad hoc queries from developers to support them. In this systematic literature review, we carefully select 70 primary studies on query reformulations from 2,970 candidate studies, perform an in-depth qualitative analysis (e.g., Grounded Theory), and then answer seven research questions with major findings. First, to date, eight major methodologies (e.g., term weighting, term co-occurrence analysis, thesaurus lookup) have been adopted to reformulate queries. Second, the existing studies suffer from several major limitations (e.g., lack of generalizability, vocabulary mismatch problem, subjective bias) that might prevent their wide adoption. Finally, we discuss the best practices and future opportunities to advance the state of research in search query reformulations.
A Systematic Literature Review on Client Selection in Federated Learning
With the arising concerns of privacy within machine learning, federated learning (FL) was invented in 2017, in which the clients, such as mobile devices, train a model and send the update to the centralized server. Choosing clients randomly for FL can harm learning performance due to different reasons. Many studies have proposed approaches to address the challenges of client selection of FL. However, no systematic literature review (SLR) on this topic existed. This SLR investigates the state of the art of client selection in FL and answers the challenges, solutions, and metrics to evaluate the solutions. We systematically reviewed 47 primary studies. The main challenges found in client selection are heterogeneity, resource allocation, communication costs, and fairness. The client selection schemes aim to improve the original random selection algorithm by focusing on one or several of the aforementioned challenges. The most common metric used is testing accuracy versus communication rounds, as testing accuracy measures the successfulness of the learning and preferably in as few communication rounds as possible, as they are very expensive. Although several possible improvements can be made with the current state of client selection, the most beneficial ones are evaluating the impact of unsuccessful clients and gaining a more theoretical understanding of the impact of fairness in FL.
Unified Embedding Based Personalized Retrieval in Etsy Search
Jha, Rishikesh, Subramaniyam, Siddharth, Benjamin, Ethan, Taula, Thrivikrama
Embedding-based neural retrieval is a prevalent approach to address the semantic gap problem which often arises in product search on tail queries. In contrast, popular queries typically lack context and have a broad intent where additional context from users historical interaction can be helpful. In this paper, we share our novel approach to address both: the semantic gap problem followed by an end to end trained model for personalized semantic retrieval. We propose learning a unified embedding model incorporating graph, transformer and term-based embeddings end to end and share our design choices for optimal tradeoff between performance and efficiency. We share our learnings in feature engineering, hard negative sampling strategy, and application of transformer model, including a novel pre-training strategy and other tricks for improving search relevance and deploying such a model at industry scale. Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate, aggregated across multiple A/B tests - on live traffic.
SKG: A Versatile Information Retrieval and Analysis Framework for Academic Papers with Semantic Knowledge Graphs
Tu, Yamei, Qiu, Rui, Shen, Han-Wei
The number of published research papers has experienced exponential growth in recent years, which makes it crucial to develop new methods for efficient and versatile information extraction and knowledge discovery. To address this need, we propose a Semantic Knowledge Graph (SKG) that integrates semantic concepts from abstracts and other meta-information to represent the corpus. The SKG can support various semantic queries in academic literature thanks to the high diversity and rich information content stored within. To extract knowledge from unstructured text, we develop a Knowledge Extraction Module that includes a semi-supervised pipeline for entity extraction and entity normalization. We also create an ontology to integrate the concepts with other meta information, enabling us to build the SKG. Furthermore, we design and develop a dataflow system that demonstrates how to conduct various semantic queries flexibly and interactively over the SKG. To demonstrate the effectiveness of our approach, we conduct the research based on the visualization literature and provide real-world use cases to show the usefulness of the SKG. The dataset and codes for this work are available at https://osf.io/aqv8p/?view_only=2c26b36e3e3941ce999df47e4616207f.
Recent applications of machine learning, remote sensing, and iot approaches in yield prediction: a critical review
Bassine, Fatima Zahra, Epule, Terence Epule, Kechchour, Ayoub, Chehbouni, Abdelghani
The integration of remote sensing and machine learning in agriculture is transforming the industry by providing insights and predictions through data analysis. This combination leads to improved yield prediction and water management, resulting in increased efficiency, better yields, and more sustainable agricultural practices. Achieving the United Nations' Sustainable Development Goals, especially "zero hunger," requires the investigation of crop yield and precipitation gaps, which can be accomplished through, the usage of artificial intelligence (AI), machine learning (ML), remote sensing (RS), and the internet of things (IoT). By integrating these technologies, a robust agricultural mobile or web application can be developed, providing farmers and decision-makers with valuable information and tools for improving crop management and increasing efficiency. Several studies have investigated these new technologies and their potential for diverse tasks such as crop monitoring, yield prediction, irrigation management, etc. Through a critical review, this paper reviews relevant articles that have used RS, ML, cloud computing, and IoT in crop yield prediction. It reviews the current state-of-the-art in this field by critically evaluating different machine-learning approaches proposed in the literature for crop yield prediction and water management. It provides insights into how these methods can improve decision-making in agricultural production systems. This work will serve as a compendium for those interested in yield prediction in terms of primary literature but, most importantly, what approaches can be used for real-time and robust prediction.