Plotting

 Information Technology


List-Decodable Sparse Mean Estimation

Neural Information Processing Systems

In this paper, we consider that the underlying distribution D is Gaussian with k-sparse mean. Our main contribution is the first polynomial-time algorithm that enjoys sample complexity O poly(k, log d), i.e. poly-logarithmic in the dimension. One of our core algorithmic ingredients is using low-degree sparse polynomials to filter outliers, which may find more applications.


Generative Forests

Neural Information Processing Systems

We focus on generative AI for a type of data that still represent one of the most prevalent form of data: tabular data. Our paper introduces two key contributions: a new powerful class of forest-based models fit for such tasks and a simple training algorithm with strong convergence guarantees in a boosting model that parallels that of the original weak / strong supervised learning setting. This algorithm can be implemented by a few tweaks to the most popular induction scheme for decision tree induction (i.e.



Task-Free Continual Learning via Online Discrepancy Distance Learning

Neural Information Processing Systems

Learning from non-stationary data streams, also called Task-Free Continual Learning (TFCL) remains challenging due to the absence of explicit task information in most applications. Even though recently some algorithms have been proposed for TFCL, these methods lack theoretical guarantees. Moreover, there are no theoretical studies about forgetting during TFCL. This paper develops a new theoretical analysis framework that derives generalization bounds based on the discrepancy distance between the visited samples and the entire information made available for training the model. This analysis provides new insights into the forgetting behaviour in classification tasks. Inspired by this theoretical model, we propose a new approach enabled with the dynamic component expansion mechanism for a mixture model, namely Online Discrepancy Distance Learning (ODDL). ODDL estimates the discrepancy between the current memory and the already accumulated knowledge as an expansion signal aiming to ensure a compact network architecture with optimal performance. We then propose a new sample selection approach that selectively stores the samples into the memory buffer through the discrepancybased measure, further improving the performance. We perform several TFCL experiments with the proposed methodology, which demonstrate that the proposed approach achieves the state of the art performance.


Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization

Neural Information Processing Systems

Despite the significant interests and many progresses in decentralized multi-player multi-armed bandits (MP-MAB) problems in recent years, the regret gap to the natural centralized lower bound in the heterogeneous MP-MAB setting remains open. In this paper, we propose BEACON - Batched Exploration with Adaptive COmmunicatioN - that closes this gap. BEACON accomplishes this goal with novel contributions in implicit communication and efficient exploration. For the former, we propose a novel adaptive differential communication (ADC) design that significantly improves the implicit communication efficiency. For the latter, a carefully crafted batched exploration scheme is developed to enable incorporation of the combinatorial upper confidence bound (CUCB) principle. We then generalize the existing linear-reward MP-MAB problems, where the system reward is always the sum of individually collected rewards, to a new MP-MAB problem where the system reward is a general (nonlinear) function of individual rewards. We extend BEACON to solve this problem and prove a logarithmic regret. BEACON bridges the algorithm design and regret analysis of combinatorial MAB (CMAB) and MP-MAB, two largely disjointed areas in MAB, and the results in this paper suggest that this previously ignored connection is worth further investigation.


Hidden city built 5,000 years ago by lost advanced civilization discovered underneath vast desert

Daily Mail - Science & tech

For centuries, the Rub' al-Khali desert near Saudi Arabia and Dubai -- known as the Empty Quarter -- was dismissed as a lifeless sea of sand. In 2002, Sheikh Mohammed bin Rashid Al Maktoum, ruler of Dubai, spotted unusual dune formations and a large black deposit while flying over the desert. That led to the discovery of Saruq Al-Hadid, an archaeological site rich in remnants of copper and iron smelting, which is now believed to be part of a 5,000-year-old civilization buried beneath the sands. Researchers have now found traces of this ancient society approximately 10 feet beneath the desert surface, hidden in plain sight and long overlooked due to the harsh environment and shifting dunes of the Empty Quarter. This discovery brings fresh life to the legend of a mythical city known as'Atlantis of the Sands.'


Chatbots will be able to teach children TWICE as fast as teachers in the next 10 years, says the 'godfather of AI'

Daily Mail - Science & tech

Chatbots will be able to teach children more than twice as fast as teachers can within the next decade, the so-called godfather of AI has predicted. Geoffrey Hinton, who won a Nobel Prize for his work on the technology, also claimed AI personal tutors would'be much more efficient and less boring'. Speaking at Gitex Europe, the British computer scientist said: 'It's not there yet, but it's coming, and so we'll get much better education at many levels.' AI personal tutors are already being trialled in UK schools, with the technology now able to talk directly to the student and adapt lesson plans to their knowledge level. The government has already funnelled millions of pounds into AI education initiatives โ€“ though it has claimed the technology will'absolutely not' replace teachers.


Amazons latest AI shopping feature produces quick audio product summaries

Mashable

Amazon is aiming to make shopping just a bit easier. This week, Amazon launched a new generative AI feature that produces short audio summaries, detailing everything you need to know about a product. The audio descriptions, which Amazon is calling "hear the highlights", are created from on-page product summaries, reviews, and information from other websites, crafting short snippets that deliver everything you need to know about a product. The product summaries are now available on a limited number of items on Amazon and for US customers only. To access "Hear the highlights", you can do so in the Amazon app.


JD Vance calls dating apps destructive

Mashable

Dating apps are getting a lot of flak lately. Daters are opting for in-person events -- even dungeon sound baths -- and moving away from increasing AI features and apps that seem to be copying each other. Vice President JD Vance also has no love for dating apps, apparently. In an interview on the New York Times's "Interesting Times" podcast, Vance spoke about his "noneconomic" concerns with AI and tech. He told host and Times opinion columnist Ross Douthat, "If you look at basic dating behavior among young people -- and I think a lot of this is that the dating apps are probably more destructive than we fully appreciate."


On the Benefits of Public Representations for Private Transfer Learning under Distribution Shift

Neural Information Processing Systems

Public pretraining is a promising approach to improve differentially private model training. However, recent work has noted that many positive research results studying this paradigm only consider in-distribution tasks, and may not apply to settings where there is distribution shift between the pretraining and finetuning data--a scenario that is likely when finetuning private tasks due to the sensitive nature of the data. In this work, we show empirically across three tasks that even in settings with large distribution shift, where both zero-shot performance from public data and training from scratch with private data give unusably weak results, public features can in fact improve private training accuracy by up to 67% over private training from scratch. We provide a theoretical explanation for this phenomenon, showing that if the public and private data share a low-dimensional representation, public representations can improve the sample complexity of private training even if it is impossible to learn the private task from the public data alone. Altogether, our results provide evidence that public data can indeed make private training practical in realistic settings of extreme distribution shift.