 in-depth investigation


An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models

Neural Information Processing Systems

Deep neural networks have long been criticized as black boxes. To unveil the inner workings of modern neural architectures, a recent work proposed an information-theoretic objective function called Sparse Rate Reduction (SRR) and interpreted its unrolled optimization as a Transformer-like model called Coding Rate Reduction Transformer (CRATE). However, that study focused primarily on the basic implementation, and it remains unclear whether this objective is optimized in practice and how it relates causally to generalization. Going beyond this study, we derive different implementations by analyzing layer-wise behaviors of CRATE, both theoretically and empirically. To reveal the predictive power of SRR on generalization, we collect a set of model variants induced by varied implementations and hyperparameters and evaluate SRR as a complexity measure based on its correlation with generalization.
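As a rough illustration of what evaluating a complexity measure by its correlation with generalization can look like, the sketch below computes a Kendall rank correlation between a measure and a generalization gap across model variants. The choice of Kendall's tau, the function name `kendall_tau`, and all numeric values are assumptions made for illustration; the abstract does not state which correlation statistic is used, and the numbers are not taken from the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall rank correlation (tau-a) between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # A pair is concordant when both quantities order the two models the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values for four model variants (illustrative only).
measure = [0.8, 1.1, 1.5, 2.0]      # complexity measure per variant
gen_gap = [0.05, 0.07, 0.06, 0.12]  # generalization gap per variant

tau = kendall_tau(measure, gen_gap)
print(f"Kendall tau: {tau:.3f}")
```

A value near 1 would mean the measure ranks the variants in the same order as their generalization gaps; a value near 0 would mean the ranking carries little information.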


Data Exposure from LLM Apps: An In-depth Investigation of OpenAI's GPTs

arXiv.org Artificial Intelligence

LLM app ecosystems are quickly maturing and supporting a wide range of use cases, which requires them to collect excessive user data. Given that LLM apps are developed by third parties and that anecdotal evidence suggests LLM platforms currently do not strictly enforce their policies, user data shared with arbitrary third parties poses a significant privacy risk. In this paper we aim to bring transparency to the data practices of LLM apps. As a case study, we study OpenAI's GPT app ecosystem. We develop an LLM-based framework to conduct static analysis of the natural language-based source code of GPTs and their Actions (external services) to characterize their data collection practices. Our findings indicate that Actions collect expansive data about users, including sensitive information prohibited by OpenAI, such as passwords. We find that some Actions, including those related to advertising and analytics, are embedded in multiple GPTs, which allows them to track user activities across GPTs. Additionally, co-occurrence of Actions exposes as much as 9.5x more data to them than is exposed to individual Actions. Lastly, we develop an LLM-based privacy policy analysis framework to automatically check the consistency of data collection by Actions with disclosures in their privacy policies. Our measurements indicate that disclosures for most of the collected data types are omitted from privacy policies, with only 5.8% of Actions clearly disclosing their data collection practices.
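One way to read the co-occurrence claim is as a ratio: the number of data types collected by all Actions embedded in the same GPT, divided by the number an individual Action collects on its own. The sketch below works through that arithmetic under this assumed interpretation. The function name `cooccurrence_exposure`, the Action names, and the data types are hypothetical and are not drawn from the paper's measurements.

```python
def cooccurrence_exposure(collected, action):
    """Ratio of data types exposed across co-occurring Actions to one Action's own.

    `collected` maps each Action embedded in a single GPT to the set of
    data types it collects (hypothetical structure, for illustration).
    """
    own = collected[action]
    if not own:
        raise ValueError("action collects no data types; ratio is undefined")
    # Data types visible when every co-occurring Action's collection is pooled.
    combined = set().union(*collected.values())
    return len(combined) / len(own)


# Hypothetical GPT embedding three Actions (illustrative only).
gpt_actions = {
    "ads": {"email"},
    "analytics": {"email", "location"},
    "login": {"name", "password"},
}

ratio = cooccurrence_exposure(gpt_actions, "ads")
print(f"Exposure ratio for 'ads': {ratio:.1f}x")
```

Here the pooled set has four data types while `ads` alone collects one, so the ratio for `ads` is 4.0.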


EU to Further Probe Microsoft's Deal for Activision

WSJ.com: WSJD - Technology

The European Union's competition watchdog said it would pursue an in-depth investigation into Microsoft Corp.'s planned $75 billion acquisition of Activision Blizzard Inc., adding to the global scrutiny of whether the deal could harm competition in the videogame industry. The European Commission, which opened its initial, formal probe of the deal in late September, said it is concerned that the deal may reduce competition in the markets for console and personal computer distribution, videogames and PC operating systems. It said it was concerned that Microsoft may block other game distributors' access to Activision Blizzard games, especially the publisher's most successful franchises such as Call of Duty. "We must ensure that opportunities remain for future and existing distributors of PC and console videogames, as well as for rival suppliers of PC operating systems," the Commission said. "The point is to ensure that the gaming ecosystem remains vibrant to the benefit of users in a sector that is evolving at a fast pace."