Personal Assistant Systems
Bumble's Founder Wants to Make Dating Apps Even Worse Than They Already Are
Bumble, the company that distinguished itself from apps like Tinder by creating a "feminist dating app," hasn't done too many favors for that brand recently. Yes, there was the ad campaign that appeared to shame women who choose celibacy--which the company wisely retracted this week. There was also the tentative announcement that Bumble may roll back its defining "women make the first move" ethos. Then there were the strange remarks last week from Bumble founder and #girlboss icon Whitney Wolfe Herd, who informed the audience at Bloomberg's Tech Summit of "a world where your dating concierge could go and date for you with another dating concierge." Naturally, these "concierges" would make use of artificial intelligence software, which users could train by "shar[ing] your insecurities" and thus help to "train yourself into a better way of thinking about yourself," Wolfe Herd claimed.
The Download: rapid DNA analysis for disasters, and supercharged AI assistants
Last August, a wildfire tore through the Hawaiian island of Maui. The list of missing residents climbed into the hundreds, as friends and families desperately searched for their missing loved ones. But while some were rewarded with tearful reunions, others weren't so lucky. Over the past several years, as fires and other climate-change-fueled disasters have become more common and more cataclysmic, the way their aftermath is processed and their victims identified has been transformed. The grim work following a disaster remains--surveying rubble and ash, distinguishing a piece of plastic from a tiny fragment of bone--but landing a positive identification can now take just a fraction of the time it once did, which may in turn bring families some semblance of peace swifter than ever before.
Positional encoding is not the same as context: A study on positional encoding for Sequential recommendation
Lopez-Avila, Alejo, Du, Jinhua, Shimary, Abbas, Li, Ze
The expansion of streaming media and e-commerce has led to a boom in recommendation systems, including sequential recommendation systems, which consider the user's previous interactions with items. In recent years, research has focused on architectural improvements such as transformer blocks and on feature extraction that can augment model information. Among these features are context and attributes. Of particular importance is the temporal footprint, which is often considered part of the context and has been treated in previous publications as interchangeable with positional information. Other publications use positional encodings but pay little attention to them. In this paper, we analyse positional encodings, showing that they provide relative information between items that is not inferable from the temporal footprint. Furthermore, we evaluate different encodings and how they affect metrics and stability using Amazon datasets. Along the way, we add some new encodings to help with these problems. We find that we can reach new state-of-the-art results by finding the correct positional encoding and, more importantly, that certain encodings stabilise the training.
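To make the idea of "relative information between items" concrete, here is a minimal sketch using the standard sinusoidal positional encoding from the original transformer architecture. This is an assumption chosen for illustration: the abstract does not say which encodings the paper evaluates, and the function names below are hypothetical. The sketch shows that the dot product of two position vectors depends only on the distance between the positions, not on where they sit in the sequence.

```python
import math


def sinusoidal_positional_encoding(seq_len, dim):
    """Return a seq_len x dim table of sinusoidal position vectors.

    Illustrative only: the standard transformer encoding, not necessarily
    one of the encodings studied in the paper.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, dim, 2):
            # Each (sin, cos) pair shares one frequency.
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


pe = sinusoidal_positional_encoding(seq_len=16, dim=8)
# Both pairs are three steps apart, so the two similarities agree
# up to floating-point error.
near_start = dot(pe[2], pe[5])
further_on = dot(pe[10], pe[13])
print(abs(near_start - further_on) < 1e-9)
```

The equality holds because sin(a)sin(b) + cos(a)cos(b) = cos(a - b) for each frequency, so the similarity is a function of the offset alone. A raw timestamp feature does not offer this property by construction.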
How Far Are We From AGI
Feng, Tao, Jin, Chuanyang, Liu, Jingyu, Zhu, Kunlun, Tu, Haoqin, Cheng, Zirui, Lin, Guanyu, You, Jiaxuan
The evolution of artificial intelligence (AI) has profoundly impacted human society, driving significant advancements in multiple sectors. Yet, the escalating demands on AI have highlighted the limitations of AI's current offerings, catalyzing a movement towards Artificial General Intelligence (AGI). AGI, distinguished by its ability to execute diverse real-world tasks with efficiency and effectiveness comparable to human intelligence, represents a paramount milestone in AI evolution. While existing works have summarized specific recent advancements of AI, they lack a comprehensive discussion of AGI's definitions, goals, and developmental trajectories. Unlike existing survey papers, this paper delves into the pivotal questions of our proximity to AGI and the strategies necessary for its realization through extensive surveys, discussions, and original perspectives. We start by articulating the requisite capability frameworks for AGI, integrating the internal, interface, and system dimensions. As the realization of AGI requires more advanced capabilities and adherence to stringent constraints, we further discuss necessary AGI alignment technologies to harmonize these factors. Notably, we emphasize the importance of approaching AGI responsibly by first defining the key levels of AGI progression, followed by the evaluation framework that situates the status quo, and finally giving our roadmap of how to reach the pinnacle of AGI. Moreover, to give tangible insights into the ubiquitous impact of the integration of AI, we outline existing challenges and potential pathways toward AGI in multiple domains. In sum, serving as a pioneering exploration into the current state and future trajectory of AGI, this paper aims to foster a collective comprehension and catalyze broader public discussions among researchers and practitioners on AGI.
Wisdom of Committee: Distilling from Foundation Model to Specialized Application Model
Liu, Zichang, Liu, Qingyun, Li, Yuening, Liu, Liang, Shrivastava, Anshumali, Bi, Shuchao, Hong, Lichan, Chi, Ed H., Zhao, Zhe
Recent advancements in foundation models have yielded impressive performance across a wide range of tasks. Meanwhile, for specific applications, practitioners have been developing specialized application models. To enjoy the benefits of both kinds of models, one natural path is to transfer the knowledge in foundation models into specialized application models, which are generally more efficient for serving. Techniques from knowledge distillation may be applied here, where the application model learns to mimic the foundation model. However, specialized application models and foundation models have substantial gaps in capacity, employing distinct architectures, using different input features from different modalities, and being optimized on different distributions. These differences in model characteristics lead to significant challenges for distillation methods. In this work, we propose creating a teaching committee comprising both foundation model teachers and complementary teachers. Complementary teachers possess model characteristics akin to the student's, aiming to bridge the gap between the foundation model and specialized application models for a smoother knowledge transfer. Further, to accommodate the dissimilarity among the teachers in the committee, we introduce DiverseDistill, which allows the student to understand the expertise of each teacher and extract task knowledge. Our evaluations demonstrate that adding complementary teachers enhances student performance. Finally, DiverseDistill consistently outperforms baseline distillation methods, regardless of the teacher choices, resulting in significantly improved student performance.
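The distillation setup described above, in which a student mimics several teachers, can be sketched as a weighted sum of temperature-softened KL divergences. This is a generic multi-teacher distillation loss written under stated assumptions, not DiverseDistill itself: the fixed per-teacher weights, the temperature value, and all function names are hypothetical, and the abstract does not specify how the committee is combined.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def committee_distillation_loss(student_logits, teacher_logits_list,
                                weights, temperature=2.0):
    """Weighted sum of KL(teacher || student) over a committee of teachers.

    Assumption: the weights are fixed by the caller. A real method could
    instead learn how much to trust each teacher.
    """
    student_probs = softmax(student_logits, temperature)
    loss = 0.0
    for w, teacher_logits in zip(weights, teacher_logits_list):
        teacher_probs = softmax(teacher_logits, temperature)
        loss += w * kl_divergence(teacher_probs, student_probs)
    return loss


# Hypothetical logits: one foundation-model teacher, one complementary teacher.
student = [0.5, 1.0, 2.0]
foundation_teacher = [0.0, 1.0, 4.0]
complementary_teacher = [0.4, 1.1, 2.2]
loss = committee_distillation_loss(
    student, [foundation_teacher, complementary_teacher], weights=[0.5, 0.5]
)
print(loss > 0)
```

The loss is zero only when the student's softened distribution matches every teacher with non-zero weight, which is why a teacher whose outputs are closer to what the student can represent contributes a smaller, easier-to-reduce term.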
Google Project Astra hands-on: Full of potential, but it's going to be a while
At I/O 2024, Google's teaser for Project Astra gave us a glimpse at where AI assistants are going in the future. It's a multi-modal feature that combines the smarts of Gemini with the kind of image recognition abilities you get in Google Lens, as well as powerful natural language responses. However, while the promo video was slick, after getting to try it out in person, it's clear there's a long way to go before something like Astra lands on your phone. So here are three takeaways from our first experience with Google's next-gen AI. Currently, most people interact with digital assistants using their voice, so right away Astra's multi-modality (i.e. using sight and sound in addition to text/speech) to communicate with an AI is relatively novel.
How to Surprisingly Consider Recommendations? A Knowledge-Graph-based Approach Relying on Complex Network Metrics
Baumann, Oliver, Nandini, Durgesh, Rossanez, Anderson, Schoenfeld, Mirco, Reis, Julio Cesar dos
Traditional recommendation proposals, including content-based and collaborative filtering, usually focus on similarity between items or users. Existing approaches lack ways of introducing unexpectedness into recommendations, prioritizing globally popular items over exposing users to unforeseen items. This investigation aims to design and evaluate a novel layer on top of recommender systems suited to incorporate relational information and suggest items with a user-defined degree of surprise. We propose a Knowledge Graph (KG) based recommender system by encoding user interactions on item catalogs. Our study explores whether network-level metrics on KGs can influence the degree of surprise in recommendations. We hypothesize that surprisingness correlates with certain network metrics, treating user profiles as subgraphs within a larger catalog KG. The achieved solution reranks recommendations based on their impact on structural graph metrics. Our research contributes to optimizing recommendations to reflect the metrics. We experimentally evaluate our approach on two datasets of LastFM listening histories and synthetic Netflix viewing profiles. We find that reranking items based on complex network metrics leads to a more unexpected and surprising composition of recommendation lists.
A Click-Through Rate Prediction Method Based on Cross-Importance of Multi-Order Features
Most current click-through rate (CTR) prediction models create explicit or implicit high-order feature crosses through the Hadamard product or inner product, with little attention to the importance of feature crossing. The few models that do consider it are either limited to second-order explicit feature crossing, model high-order feature crossing only implicitly, or can learn the importance of high-order explicit feature crossing but fail to provide good interpretability for the model. This paper proposes a new model, FiiNet (Multiple Order Feature Interaction Importance Neural Networks). The model first uses the selective kernel network (SKNet) to explicitly construct multi-order feature crosses. It dynamically learns the importance of feature interaction combinations in a fine-grained manner, increasing the attention weight of important feature cross combinations and reducing the weight of unimportant ones. To verify that the FiiNet model can dynamically learn the importance of feature interaction combinations in a fine-grained manner and improve the model's recommendation performance and interpretability, this paper compares it with many click-through rate prediction models on two real datasets, showing that the FiiNet model incorporating the selective kernel network can effectively improve the recommendation effect and provide better interpretability. FiiNet model implementations are available in PyTorch.
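For readers unfamiliar with the two crossing operators named above, the following minimal sketch contrasts a Hadamard (element-wise) cross, which keeps a vector per feature pair, with an inner-product cross, which collapses the pair to a single scalar. It covers only plain second-order crossing; it is not FiiNet or SKNet, and the field embeddings and function names are hypothetical.

```python
from itertools import combinations


def hadamard(u, v):
    """Element-wise product: keeps one value per embedding dimension."""
    return [a * b for a, b in zip(u, v)]


def inner(u, v):
    """Inner product: collapses the cross to a single scalar."""
    return sum(a * b for a, b in zip(u, v))


def second_order_crosses(embeddings):
    """Return {(i, j): Hadamard cross} for every pair of feature fields."""
    return {
        (i, j): hadamard(embeddings[i], embeddings[j])
        for i, j in combinations(range(len(embeddings)), 2)
    }


# Hypothetical 2-dimensional embeddings for three feature fields.
fields = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]
crosses = second_order_crosses(fields)
print(len(crosses))  # 3 pairs from 3 fields
print(crosses[(0, 1)], inner(fields[0], fields[1]))
```

Summing a Hadamard cross over its dimensions gives the inner product, so the Hadamard form is the richer of the two: a model that wants to weight individual crosses, as the abstract describes, has a full vector per pair to attend over.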
The Power of Combined Modalities in Interactive Robot Learning
Beierling, Helen, Vollmer, Anna-Lisa
This study contributes to the evolving field of robot learning in interaction with humans, examining the impact of diverse input modalities on learning outcomes. It introduces the concept of "meta-modalities" which encapsulate additional forms of feedback beyond the traditional preference and scalar feedback mechanisms. Unlike prior research that focused on individual meta-modalities, this work evaluates their combined effect on learning outcomes. Through a study with human participants, we explore user preferences for these modalities and their impact on robot learning performance. Our findings reveal that while individual modalities are perceived differently, their combination significantly improves learning behavior and usability. This research not only provides valuable insights into the optimization of human-robot interactive task learning but also opens new avenues for enhancing the interactive freedom and scaffolding capabilities provided to users in such settings.
DynLLM: When Large Language Models Meet Dynamic Graph Recommendation
Zhao, Ziwei, Lin, Fake, Zhu, Xi, Zheng, Zhi, Xu, Tong, Shen, Shitian, Li, Xueying, Yin, Zikai, Chen, Enhong
The past year has witnessed considerable interest in Large Language Models (LLMs) for their potential applications in recommender systems, where they may mitigate the persistent issue of data sparsity. Although considerable effort has gone into user-item graph augmentation to improve graph-based recommendation performance, existing methods may fail to deal with the dynamic graph recommendation task, which involves both structural and temporal graph dynamics with inherent complexity in processing time-evolving data. To bridge this gap, in this paper, we propose a novel framework, called DynLLM, to deal with the dynamic graph recommendation task with LLMs. Specifically, DynLLM harnesses the power of LLMs to generate multi-faceted user profiles based on the rich textual features of historical purchase records, including crowd segments, personal interests, preferred categories, and favored brands, which in turn supplement and enrich the underlying relationships between users and items. Along this line, to fuse the multi-faceted profiles with temporal graph embedding, we engage LLMs to derive corresponding profile embeddings, and further employ a distilled attention mechanism to refine the LLM-generated profile embeddings for alleviating noisy signals, while also assessing and adjusting the relevance of each distilled facet embedding for seamless integration with temporal graph embedding from continuous time dynamic graphs (CTDGs). Extensive experiments on two real e-commerce datasets have validated the superior improvements of DynLLM over a wide range of state-of-the-art baseline methods.