society
Society of Agents: Regret Bounds of Concurrent Thompson Sampling
We consider the concurrent reinforcement learning problem, where $n$ agents simultaneously learn to make decisions in the same environment by sharing experience with each other. Existing works in this emerging area have empirically demonstrated that Thompson sampling (TS) based algorithms provide a particularly attractive way to induce cooperation: each agent independently samples a belief environment (and computes a corresponding optimal policy) from the joint posterior computed over all agents' aggregated data, which diversifies exploration across agents while letting every agent benefit from the shared experience. However, theoretical guarantees in this area remain under-explored; in particular, no regret bound is known for TS-based concurrent RL algorithms. In this paper, we fill this gap by considering two settings. In the first, we study the finite-horizon episodic RL setting, where TS is naturally adapted to the concurrent setup by having each agent sample from the current joint posterior at the beginning of each episode. We establish an $\tilde{O}(HS\sqrt{\frac{AT}{n}})$ per-agent regret bound, where $H$ is the horizon of an episode, $S$ is the number of states, $A$ is the number of actions, $T$ is the number of episodes, and $n$ is the number of agents. In the second setting, we consider the infinite-horizon RL problem, where a policy is measured by its long-run average reward. Here, despite the lack of natural episodic breakpoints, we show that with a doubling-horizon schedule, TS can be adapted to the infinite-horizon concurrent learning setting to achieve a regret bound of $\tilde{O}(DS\sqrt{ATn})$, where $D$ is the standard notion of the diameter of the underlying MDP and $T$ is the number of timesteps. Note that in both settings, the per-agent regret decreases at an optimal rate of $\Theta(\frac{1}{\sqrt{n}})$, which demonstrates the power of cooperation in concurrent RL.
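To make the sampling scheme concrete, here is a minimal sketch of episodic concurrent TS in a tabular MDP, assuming Dirichlet/Beta conjugate posteriors and a randomly generated toy environment; the function names and environment are illustrative assumptions for exposition, not the paper's implementation.

```python
import numpy as np

# Minimal sketch of episodic concurrent Thompson sampling in a tabular MDP.
# All n agents share one posterior; at the start of each episode every agent draws
# its own MDP from that posterior, plans on it, acts, and all trajectories are
# pooled back into the shared posterior.

S, A, H, N_AGENTS, EPISODES = 5, 3, 10, 4, 200
rng = np.random.default_rng(0)

# Unknown "true" environment the agents interact with (randomly generated toy MDP).
P_true = rng.dirichlet(np.ones(S), size=(S, A))   # P_true[s, a] is a next-state distribution
R_true = rng.uniform(size=(S, A))                 # Bernoulli reward means

# Shared joint posterior: Dirichlet counts for transitions, Beta counts for rewards.
trans_counts = np.ones((S, A, S))
rew_counts = np.ones((S, A, 2))                   # [successes, failures]

def sample_mdp():
    """Draw one plausible MDP from the current shared posterior."""
    P = np.array([[rng.dirichlet(trans_counts[s, a]) for a in range(A)] for s in range(S)])
    R = rng.beta(rew_counts[..., 0], rew_counts[..., 1])
    return P, R

def plan(P, R):
    """Finite-horizon value iteration on the sampled MDP; returns a greedy policy per step."""
    V = np.zeros(S)
    policy = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = R + P @ V                             # shape (S, A)
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy

for _ in range(EPISODES):
    batch = []
    for _ in range(N_AGENTS):                     # each agent samples independently -> diverse exploration
        policy = plan(*sample_mdp())
        s = 0
        for h in range(H):
            a = policy[h, s]
            s_next = rng.choice(S, p=P_true[s, a])
            r = float(rng.random() < R_true[s, a])
            batch.append((s, a, r, s_next))
            s = s_next
    for s, a, r, s_next in batch:                 # aggregate every agent's experience
        trans_counts[s, a, s_next] += 1
        rew_counts[s, a, 0 if r else 1] += 1
```

Independent per-agent sampling is what keeps exploration diverse, while the single shared posterior is what lets each agent's effective sample size grow with $n$, which is the intuition behind the $\frac{1}{\sqrt{n}}$ per-agent rate.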
Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets
Language models can generate harmful and biased outputs and exhibit undesirable behavior according to a given cultural context. We propose a Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets, an iterative process to significantly change model behavior by crafting and fine-tuning on a dataset that reflects a predetermined set of target values. We evaluate our process using three metrics: quantitative metrics with human evaluations that score output adherence to a target value; toxicity scoring on outputs; and qualitative metrics analyzing the most common word associated with a given social category. Through each iteration, we add additional training dataset examples based on observed shortcomings from evaluations. PALMS performs significantly better on all metrics compared to baseline and control models for a broad range of GPT-3 language model sizes without compromising capability integrity. We find that the effectiveness of PALMS increases with model size. We show that significantly adjusting language model behavior is feasible with a small, hand-curated dataset.
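The iterative loop can be summarized schematically. In the sketch below, every callable (fine_tune, evaluate, craft_examples) is a hypothetical placeholder supplied by the caller, since the abstract does not specify the underlying tooling.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Schematic of the PALMS process described above: fine-tune on a values-targeted
# dataset, evaluate with the three metrics, then grow the dataset from shortcomings.

@dataclass
class Evaluation:
    adherence: float                   # human-rated adherence of outputs to the target values
    toxicity: float                    # automatic toxicity score of sampled outputs
    word_associations: Dict[str, str]  # most common word associated with each social category

def palms_rounds(
    seed_dataset: List[str],
    fine_tune: Callable[[List[str]], object],            # returns a fine-tuned model
    evaluate: Callable[[object], Evaluation],            # runs the three metrics above
    craft_examples: Callable[[Evaluation], List[str]],   # writes new examples for observed shortcomings
    n_rounds: int = 3,
) -> Tuple[object, List[str]]:
    """Iteratively fine-tune on a values-targeted dataset and grow it from evaluation feedback."""
    dataset = list(seed_dataset)
    model = None
    for _ in range(n_rounds):
        model = fine_tune(dataset)
        report = evaluate(model)
        dataset += craft_examples(report)                # add examples targeting weak spots
    return model, dataset
```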
Saved from the shredder, Alan Turing's papers sell for $627,000
A trove of forgotten papers penned by famed World War II codebreaker Alan Turing has sold for the record-setting price of $627,000. But the June 17 auction almost never happened. At one point, the long-lost archival materials from the father of modern computer science were nearly pulverized by a paper shredder. Alan Turing was many things during his brief and ultimately tragic life: renowned mathematician, computer theorist, marathon runner, philosopher, and an invaluable codebreaker.
- Europe > United Kingdom (0.31)
- Europe > Germany (0.05)
- Information Technology > Security & Privacy (0.83)
- Government (0.51)
Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents
As AI systems pervade human life, ensuring that large language models (LLMs) make safe decisions remains a significant challenge. We introduce the Governance of the Commons Simulation (GovSim), a generative simulation platform designed to study strategic interactions and cooperative decision-making in LLMs. In GovSim, a society of AI agents must collectively balance exploiting a common resource with sustaining it for future use. This environment enables the study of how ethical considerations, strategic planning, and negotiation skills impact cooperative outcomes. We develop an LLM-based agent architecture and test it with the leading open and closed LLMs.
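As a rough illustration of the shared-resource dynamics such a simulation must model, here is a toy common-pool harvest loop. The regrowth rule, parameter values, and policies are assumptions for exposition, not GovSim's actual configuration; in GovSim each harvest decision would come from an LLM agent prompted with the shared state and negotiation history rather than a fixed rule.

```python
# Toy common-pool resource loop: agents request harvests, the stock regrows, and
# over-exploitation leads to collapse.

def run_commons(policies, stock=100.0, capacity=100.0, growth=0.15, rounds=20):
    """policies: callables mapping (current stock, round index) -> requested harvest."""
    history = []
    for t in range(rounds):
        requests = [max(0.0, p(stock, t)) for p in policies]
        total = sum(requests)
        scale = min(1.0, stock / total) if total > 0 else 0.0   # split a scarce stock proportionally
        harvests = [r * scale for r in requests]
        stock = min(capacity, (stock - sum(harvests)) * (1.0 + growth))  # capped regrowth
        history.append((t, harvests, stock))
        if stock < 1e-6:                                        # collapse: the commons is exhausted
            break
    return history

# Greedy agents exhaust the resource almost immediately; restrained agents sustain it.
greedy = [lambda s, t: s / 2 for _ in range(4)]
restrained = [lambda s, t: 0.03 * s for _ in range(4)]
print(len(run_commons(greedy)), "rounds before collapse (greedy)")
print(len(run_commons(restrained)), "rounds survived (restrained)")
```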
PrecisePK Collaborates with Wolters Kluwer to Enhance Dose Optimization
PrecisePK announced that they will collaborate with Wolters Kluwer, a global provider of trusted clinical technology and evidence-based solutions, to offer an integrated Bayesian dosing solution through Sentri7 Pharmacy in early 2023. With PrecisePK's model-informed precision dosing (MIPD) software, Sentri7 Pharmacy will deliver a comprehensive drug package that supports vancomycin and 20 other medications. "Our PrecisePK relationship will enable our users to leverage data and information to make better medication dosing decisions, improve patient safety, and drive better clinical outcomes," said Karen Kobelski, Vice President & General Manager, Clinical Surveillance, Compliance & Data Solutions, Wolters Kluwer Health. "Hospitals are short-staffed and clinicians are busier than ever, so we're always looking for ways to simplify clinician workloads and facilitate patient management. This relationship allows us to deliver a solution to help achieve these goals."
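For readers unfamiliar with Bayesian dosing, the sketch below illustrates the general MIPD idea under simple assumptions (a grid posterior over drug clearance and the steady-state relation AUC24 = daily dose / clearance). It is not PrecisePK's or Sentri7's actual model, and all numbers are illustrative.

```python
import numpy as np

# Toy Bayesian dose optimization: start from a population prior over a patient's
# clearance, update it with one measured level, then pick the dose expected to hit
# a target exposure.

cl_grid = np.linspace(1.0, 10.0, 500)                  # candidate clearance values (L/h)
prior = np.exp(-0.5 * ((np.log(cl_grid) - np.log(4.0)) / 0.4) ** 2)  # lognormal-shaped population prior
prior /= prior.sum()

# Patient data: 2000 mg/day was given and an AUC24 of 380 mg*h/L was measured.
daily_dose, measured_auc, sigma = 2000.0, 380.0, 0.2   # sigma: lognormal residual error

predicted_auc = daily_dose / cl_grid                   # steady state: AUC24 = daily dose / clearance
likelihood = np.exp(-0.5 * ((np.log(measured_auc) - np.log(predicted_auc)) / sigma) ** 2)
posterior = prior * likelihood
posterior /= posterior.sum()

cl_hat = float(np.sum(cl_grid * posterior))            # posterior-mean clearance for this patient
target_auc = 500.0                                     # e.g. the middle of a 400-600 mg*h/L window
recommended_dose = target_auc * cl_hat                 # daily dose expected to achieve the target AUC
print(f"posterior mean CL = {cl_hat:.2f} L/h, recommended dose ~ {recommended_dose:.0f} mg/day")
```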
Generative Adversarial Networks for Image Generation: Mao, Xudong, Li, Qing: 9789813360471: Amazon.com: Books
Qing Li is currently a Chair Professor at the Hong Kong Polytechnic University. He also serves or has served as a Guest Professor of Zhejiang University, an Adjunct Professor of the University of Science and Technology of China, and a Visiting Professor at Wuhan University and Hunan University. His research interests include database modeling, multimedia retrieval and management, social media computing, and e-learning systems. Dr. Li has published over 400 papers in technical journals and international conferences in these areas, and is actively involved in the research community, serving as a journal reviewer, program committee chair/co-chair, and organizer/co-organizer of numerous international conferences. He is currently the Chairman of the Hong Kong Web Society, a councillor of the Database Society of the China Computer Federation (CCF), a member of the CCF Big Data Experts Committee, and a member of the international WISE Society's steering committee.
- Education > Educational Setting (0.67)
- Retail > Online (0.40)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.31)
- Health & Medicine > Therapeutic Area > Immunology (0.31)
Real-World Challenges for AGI
Note: This post is a summary of a talk given at the CERN Sparks! Serendipity Forum in September 2021. When people picture a world with artificial general intelligence (AGI), robots are more likely to come to mind than enabling solutions to society's most intractable problems. But I believe the latter is much closer to the truth. AI is already enabling huge leaps in tackling fundamental challenges: from solving protein folding to predicting accurate weather patterns, scientists are increasingly using AI to deduce the rules and principles that underpin highly complex real-world domains - ones they might never have discovered unaided.
- Europe > United Kingdom (0.05)
- Europe > Switzerland > Vaud > Lausanne (0.05)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.43)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.43)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)
AI ethics groups are repeating one of society's classic mistakes – MIT Technology Review
International organizations and corporations are racing to develop global guidelines for the ethical use of artificial intelligence. Declarations, manifestos, and recommendations are flooding the internet. But these efforts will be futile if they fail to account for the cultural and regional contexts in which AI operates. AI systems have repeatedly been shown to cause problems that disproportionately affect marginalized groups while benefiting a privileged few. The global AI ethics efforts under way today--of which there are dozens--aim to help everyone benefit from this technology, and to prevent it from causing harm. Generally speaking, they do this by creating guidelines and principles for developers, funders, and regulators to follow.
- North America > Canada > Quebec > Montreal (0.07)
- Europe > Middle East (0.06)
- Asia > Middle East (0.06)
- (9 more...)
AI ethics groups are repeating one of society's classic mistakes
International organizations and corporations are racing to develop global guidelines for the ethical use of artificial intelligence. Declarations, manifestos, and recommendations are flooding the internet. But these efforts will be futile if they fail to account for the cultural and regional contexts in which AI operates. AI systems have repeatedly been shown to cause problems that disproportionately affect marginalized groups while benefiting a privileged few. The global AI ethics efforts under way today--of which there are dozens--aim to help everyone benefit from this technology, and to prevent it from causing harm.
- North America (0.10)
- Asia > East Asia (0.07)
- Europe > Western Europe (0.05)
- (5 more...)