compendium
Friend or Foe
Cherendichenko, Oleksandr, Solowiej-Wedderburn, Josephine, Carroll, Laura M., Libby, Eric
A fundamental challenge in microbial ecology is determining whether bacteria compete or cooperate in different environmental conditions. With recent advances in genome-scale metabolic models, we are now capable of simulating interactions between thousands of pairs of bacteria in thousands of different environmental settings at a scale infeasible experimentally. These approaches can generate tremendous amounts of data that can be exploited by state-of-the-art machine learning algorithms to uncover the mechanisms driving interactions. Here, we present Friend or Foe, a compendium of 64 tabular environmental datasets, consisting of more than 26M shared environments for more than 10K pairs of bacteria sampled from two of the largest collections of metabolic models. The Friend or Foe datasets are curated for a wide range of machine learning tasks -- supervised, unsupervised, and generative -- to address specific questions underlying bacterial interactions. We benchmarked a selection of the most recent models for each of these tasks and our results indicate that machine learning can be successful in this application to microbial ecology. Going beyond, analyses of the Friend or Foe compendium can shed light on the predictability of bacterial interactions and highlight novel research directions into how bacteria infer and navigate their relationships.
- North America > United States (0.15)
- Europe > Sweden > Västerbotten County > Umeå (0.05)
Federated In-Context LLM Agent Learning
Wu, Panlong, Li, Kangshuo, Nan, Junbao, Wang, Fangxin
Large Language Models (LLMs) have revolutionized intelligent services by enabling logical reasoning, tool use, and interaction with external systems as agents. The advancement of LLMs is frequently hindered by the scarcity of high-quality data, much of which is inherently sensitive. Federated learning (FL) offers a potential solution by facilitating the collaborative training of distributed LLMs while safeguarding private data. However, FL frameworks face significant bandwidth and computational demands, along with challenges from heterogeneous data distributions. The emerging in-context learning capability of LLMs offers a promising approach by aggregating natural language rather than bulky model parameters. Yet, this method risks privacy leakage, as it necessitates the collection and presentation of data samples from various clients during aggregation. In this paper, we propose a novel privacy-preserving Federated In-Context LLM Agent Learning (FICAL) algorithm, which, to the best of our knowledge, is the first work to unleash the power of in-context learning to train diverse LLM agents through FL. In our design, knowledge compendiums generated by a novel LLM-enhanced Knowledge Compendiums Generation (KCG) module are transmitted between clients and the server instead of the model parameters used in previous FL methods. In addition, a Retrieval Augmented Generation (RAG) based Tool Learning and Utilizing (TLU) module is designed, and we incorporate the aggregated global knowledge compendium as a teacher to teach LLM agents the usage of tools. We conducted extensive experiments, and the results show that FICAL has competitive performance compared to other SOTA baselines while reducing communication cost by a factor of $3.33\times10^5$.
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > Singapore (0.04)
A compendium of data sources for data science, machine learning, and artificial intelligence
Bilokon, Paul, Bilokon, Oleksandr, Amen, Saeed
Recent advances in data science, machine learning, and artificial intelligence, such as the emergence of large language models, are leading to an increasing demand for data that can be processed by such models. While data sources are application-specific, and it is impossible to produce an exhaustive list of such data sources, it seems that a comprehensive, rather than complete, list would still benefit data scientists and machine learning experts of all levels of seniority. The goal of this publication is to provide just such an (inevitably incomplete) list -- or compendium -- of data sources across multiple areas of application, including finance and economics, legal (laws and regulations), life sciences (medicine and drug discovery), news sentiment and social media, retail and ecommerce, satellite imagery, shipping and logistics, and sports.
Live a Live review: a lost Japanese RPG gem from the 1990s
In a year where Kate Bush and Metallica re-entered the charts, it's fitting that 2022's most intriguing game so far has been plucked from the past. Directed by Takashi Tokita of Chrono Trigger fame, for decades Live a Live appeared destined to remain the RPG that time forgot. Its initial Japanese release on the Super Famicom (SNES) in 1994 was a commercial flop, ensuring it never left its homeland – until now. Resurrected for Nintendo's fittingly anachronistic current console, the Switch, this eyebrow-raising relic has been reanimated using Square Enix's gorgeous 2D-HD engine, a graphical style that melds rich high-definition backgrounds with retro 16-bit sprites. The results are glorious, injecting once-flat environments with a playful, eye-catching charm that never quite loses its magic.
The Machine & Deep Learning Compendium Open Book - KDnuggets
Nearly a year ago, I announced the Machine & Deep Learning Compendium, a Google document that I have been writing for the last 4 years. The ML Compendium contains over 500 topics, and it is over 400 pages long. Today, I'm announcing that the Compendium is fully open. It is now a project on GitBook and GitHub (please star it!). I believe in knowledge sharing, and the Compendium will always be free to everyone.
Population sequencing data reveal a compendium of mutational processes in the human germ line
It has become increasingly clear that mutation affects phenotypic variation and disease risk across humans. However, there are many different types of mutation. Seplyarskiy et al. applied a matrix factorization method to large human genomic datasets to identify germline mutational processes in an unsupervised manner. From this survey, nine robust mutational components were identified and specific mechanisms generating seven of these processes were proposed from correlations with genomic features. These results confirm and improve upon our understanding of mutational processes and reveal likely mechanisms of mutation in the human genome. Science, aba7408, this issue p. 1030. Biological mechanisms underlying human germline mutations remain largely unknown. We statistically decompose variation in the rate and spectra of mutations along the genome using volume-regularized nonnegative matrix factorization. The analysis of a sequencing dataset (TOPMed) reveals nine processes that explain the variation in mutation properties between loci. We provide a biological interpretation for seven of these processes. We associate one process with bulky DNA lesions that are resolved asymmetrically with respect to transcription and replication. Two processes track direction of replication fork and replication timing, respectively. We identify a mutagenic effect of active demethylation primarily acting in regulatory regions and a mutagenic effect of long interspersed nuclear elements. We localize a mutagenic process specific to oocytes from population sequencing data. This process appears transcriptionally asymmetric. DOI: 10.1126/science.aba7408
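To make the decomposition idea in the abstract above concrete, here is a minimal sketch of plain nonnegative matrix factorization with multiplicative updates, in pure Python. It is not the volume-regularized variant the authors describe, and the function names, the rank, and the layout of `V` (rows as genomic loci, columns as mutation types) are assumptions made for illustration only.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    B_cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in B_cols] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of the residual V - W H."""
    WH = matmul(W, H)
    return sum(
        (v - wh) ** 2 for row_v, row_wh in zip(V, WH) for v, wh in zip(row_v, row_wh)
    ) ** 0.5


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (n x m) as W (n x rank) times H (rank x m).

    Plain multiplicative updates for the Frobenius objective; no volume
    regularization is applied here (assumption: illustration only).
    Returns W, H and the reconstruction error after each iteration.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(V, W, H)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
        errors.append(frobenius_error(V, W, H))
    return W, H, errors
```

In this reading, each row of `H` would play the role of one mutational component (a spectrum over mutation types) and each row of `W` would give the intensity of every component at one locus. Choosing the rank and interpreting the components is where the actual analysis lies, and this sketch does not attempt either.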
Deloitte AI Institute Unveils the AI Dossier, a Compendium of the Top Business Use Cases for AI
The Deloitte AI Institute unveiled a new report that examines the most compelling business use cases for artificial intelligence (AI) across six major industries. The report, "The AI Dossier," helps business leaders understand the value AI can deliver today and in the future so that they can make smarter decisions about when, where and how to deploy AI within their organizations. "The AI Dossier" illustrates use cases across six industries, including consumer; energy, resources and industrial; financial services; government and public services; life sciences and health care; and technology, media and telecommunications. For each industry, the report highlights the most valuable, business-ready use cases for AI-related technologies – examining the key business issues and opportunities, how AI can help, and the benefits that are likely to be achieved. The report also highlights the top emerging AI use cases that are expected to have a major impact on the industry's future.
- Banking & Finance > Financial Services (0.73)
- Professional Services (0.71)
Real-time Interpretation: The next frontier in radiology AI - MedCity News
In the nine years since AlexNet spawned the age of deep learning, artificial intelligence (AI) has made significant technological progress in medical imaging, with more than 80 deep-learning algorithms approved by the U.S. FDA since 2012 for clinical applications in image detection and measurement. A 2020 survey found that more than 82% of imaging providers believe AI will improve diagnostic imaging over the next 10 years and the market for AI in medical imaging is expected to grow 10-fold in the same period. Despite this optimistic outlook, AI still falls short of widespread clinical adoption in radiology. A 2020 survey by the American College of Radiology (ACR) revealed that only about a third of radiologists use AI, mostly to enhance image detection and interpretation; of the two thirds who did not use AI, the majority said they saw no benefit to it. In fact, most radiologists would say that AI has not transformed image reading or improved their practices.
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Energy Expenditure Estimation Through Daily Activity Recognition Using a Smart-phone
De Bois, Maxime, Amroun, Hamdi, Ammi, Mehdi
This paper presents a 3-step system that estimates the real-time energy expenditure of an individual in a non-intrusive way. First, using the user's smart-phone's sensors, we build a Decision Tree model to recognize his physical activity (running, standing, ...). Then, we use the detected physical activity, the time and the user's speed to infer his daily activity (watching TV, going to the bathroom, ...) through the use of a reinforcement learning environment, the Partially Observable Markov Decision Process framework. Once the daily activities are recognized, we translate this information into energy expenditure using the compendium of physical activities. By successfully detecting 8 physical activities at 90%, we reached an overall accuracy of 80% in recognizing 17 different daily activities. This result leads us to estimate the energy expenditure of the user with a mean error of 26% of the expected estimation.
- Research Report (0.65)
- Workflow (0.46)
- Health & Medicine > Consumer Health (1.00)
- Education > Health & Safety > School Nutrition (1.00)
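The last step of the energy expenditure system listed above, translating recognized daily activities into energy, can be sketched with the usual MET convention (roughly 1 kcal per kilogram of body mass per hour at 1 MET). This is an illustration under stated assumptions, not the paper's implementation: the MET values in `MET_TABLE` are placeholders rather than figures taken from the compendium of physical activities, and all function names are hypothetical.

```python
# Placeholder MET values for illustration only (assumed, not from the paper
# or from the compendium of physical activities).
MET_TABLE = {
    "watching TV": 1.0,
    "walking": 3.5,
    "running": 8.0,
}


def energy_kcal(activity, weight_kg, minutes, met_table=MET_TABLE):
    """Energy for one activity: MET * body mass (kg) * duration (hours)."""
    return met_table[activity] * weight_kg * (minutes / 60.0)


def daily_energy_kcal(activity_log, weight_kg, met_table=MET_TABLE):
    """Sum the energy over a log of (activity, minutes) pairs."""
    return sum(
        energy_kcal(activity, weight_kg, minutes, met_table)
        for activity, minutes in activity_log
    )


def relative_error(estimated, expected):
    """Absolute error of an estimate as a fraction of the expected value."""
    return abs(estimated - expected) / expected
```

For example, `daily_energy_kcal([("watching TV", 120), ("walking", 30)], weight_kg=70)` adds 140 kcal for the two hours of television to 122.5 kcal for the half hour of walking. A misrecognized activity changes the MET value used for its whole duration, which is one way recognition errors carry over into the energy estimate.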
Singapore And World Economic Forum Driving AI Adoption And Innovation - dotlah!
Fifteen global companies have taken up Singapore's AI Model Governance Framework; Practical examples for organisations to follow suit. Singapore sees Artificial Intelligence ("AI") as an important and fundamental technology for the Digital Economy, with AI-powered products offering a level of personalised service at scale that was previously unimaginable. In the global discourse on AI ethics and governance issues, Singapore believes that its balanced approach can facilitate innovation, safeguard consumer interests, and serve as a common global reference point. These initiatives follow Singapore's launch of the Model AI Governance Framework in Davos in 2019, as well as the announcement of Singapore's National AI Strategy in November 2019, and demonstrate the progress made in supporting organisations in deploying responsible AI. The new initiatives were announced by Mr S Iswaran, Singapore's Minister for Communications and Information, and Ms Kay Firth-Butterfield, AI Portfolio Lead at the World Economic Forum, at a joint press conference with the WEF's Centre for the Fourth Industrial Revolution ("WEF C4IR") at WEF's Annual Meeting in Davos.