Society


Society of Agents: Regret Bounds of Concurrent Thompson Sampling

Neural Information Processing Systems

We consider the concurrent reinforcement learning problem, where $n$ agents simultaneously learn to make decisions in the same environment by sharing experience with one another. Existing works in this emerging area have empirically demonstrated that Thompson sampling (TS) based algorithms provide a particularly attractive approach for inducing cooperation: each agent can independently sample a belief environment (and compute a corresponding optimal policy) from the joint posterior computed by aggregating all agents' data, which induces diversity in exploration among agents while still letting every agent benefit from the shared experience of all. However, theoretical guarantees in this area remain under-explored; in particular, no regret bound is known for TS-based concurrent RL algorithms. In this paper, we fill this gap by considering two settings. In the first, we study the simple finite-horizon episodic RL setting, where TS is naturally adapted to the concurrent setup by having each agent sample from the current joint posterior at the beginning of each episode. We establish a $\tilde{O}(HS\sqrt{\frac{AT}{n}})$ per-agent regret bound, where $H$ is the horizon of an episode, $S$ is the number of states, $A$ is the number of actions, $T$ is the number of episodes, and $n$ is the number of agents. In the second setting, we consider the infinite-horizon RL problem, where a policy is measured by its long-run average reward. Here, despite the absence of natural episodic breakpoints, we show that via a doubling-horizon schedule, TS can be adapted to the infinite-horizon concurrent learning setting to achieve a total regret bound of $\tilde{O}(DS\sqrt{ATn})$, where $D$ is the standard notion of diameter of the underlying MDP and $T$ is the number of timesteps. Note that in both settings, the per-agent regret decreases at an optimal rate of $\Theta(\frac{1}{\sqrt{n}})$, which manifests the power of cooperation in concurrent RL.
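To make the mechanism concrete, here is a minimal sketch of concurrent posterior sampling in the episodic tabular setting: each agent draws its own belief MDP from the shared posterior at the start of every episode, plans optimally in it, and all rollouts feed a single joint posterior. The Dirichlet/Beta conjugate priors, Bernoulli rewards, and randomly generated test MDP are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch of concurrent Thompson sampling for episodic tabular RL.
# Priors, reward model, and the ground-truth MDP below are all illustrative.
import numpy as np

S, A, H, n_agents, T = 5, 3, 10, 4, 200
rng = np.random.default_rng(0)

# Ground-truth MDP (hidden from the agents).
P_true = rng.dirichlet(np.ones(S), size=(S, A))   # transition kernel, (S, A, S)
R_true = rng.uniform(size=(S, A))                 # Bernoulli reward means

# Shared joint posterior: Dirichlet over transitions, Beta over rewards.
dir_counts = np.ones((S, A, S))
beta_ab = np.ones((S, A, 2))                      # [..., 0] = alpha, [..., 1] = beta

def sample_mdp():
    """Draw one belief MDP from the current joint posterior."""
    P = np.array([[rng.dirichlet(dir_counts[s, a]) for a in range(A)]
                  for s in range(S)])
    R = rng.beta(beta_ab[..., 0], beta_ab[..., 1])
    return P, R

def optimal_policy(P, R):
    """Finite-horizon value iteration; returns policy[h, s]."""
    V = np.zeros(S)
    pi = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = R + P @ V                             # (S, A) action values
        pi[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return pi

for episode in range(T):
    # Each agent independently samples a belief MDP -> diverse exploration.
    policies = [optimal_policy(*sample_mdp()) for _ in range(n_agents)]
    for pi in policies:                           # concurrent rollouts
        s = 0
        for h in range(H):
            a = pi[h, s]
            s_next = rng.choice(S, p=P_true[s, a])
            r = rng.binomial(1, R_true[s, a])
            # All experience flows into the shared posterior.
            dir_counts[s, a, s_next] += 1
            beta_ab[s, a, 1 - r] += 1             # r=1 bumps alpha, r=0 bumps beta
            s = s_next
```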


Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets

Neural Information Processing Systems

Language models can generate harmful and biased outputs and exhibit behavior that is undesirable within a given cultural context. We propose a Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets, an iterative process for significantly changing model behavior by crafting, and fine-tuning on, a dataset that reflects a predetermined set of target values. We evaluate our process using three metrics: quantitative human evaluations that score output adherence to a target value; toxicity scoring on outputs; and qualitative analysis of the most common words associated with a given social category. In each iteration, we add training examples based on shortcomings observed in the evaluations. PALMS performs significantly better on all metrics than baseline and control models across a broad range of GPT-3 language model sizes, without compromising capability integrity. We find that the effectiveness of PALMS increases with model size. We show that significantly adjusting language model behavior is feasible with a small, hand-curated dataset.


Saved from the shredder, Alan Turing's papers sell for $627,000

Popular Science

A trove of forgotten papers penned by famed World War II codebreaker Alan Turing has sold for the record-setting price of $627,000. But the June 17 auction almost never happened. At one point, the long-lost archival materials from the father of modern computer science were nearly pulverized by a paper shredder. Alan Turing was many things during his brief and ultimately tragic life: renowned mathematician, computer theorist, marathon runner, philosopher, and an invaluable codebreaker.


Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents

Neural Information Processing Systems

As AI systems pervade human life, ensuring that large language models (LLMs) make safe decisions remains a significant challenge. We introduce the Governance of the Commons Simulation (GovSim), a generative simulation platform designed to study strategic interactions and cooperative decision-making in LLMs. In GovSim, a society of AI agents must collectively balance exploiting a common resource with sustaining it for future use. This environment enables the study of how ethical considerations, strategic planning, and negotiation skills impact cooperative outcomes. We develop an LLM-based agent architecture and test it with the leading open and closed LLMs.
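The core dilemma GovSim poses can be seen in a toy common-pool resource loop: each round the agents harvest from a shared stock that then regrows, and over-extraction collapses it. The logistic regrowth rule, parameters, and fixed-fraction policies below are illustrative stand-ins for the LLM agents in the actual platform, not part of GovSim itself.

```python
# Toy common-pool resource dynamic of the kind GovSim studies.
def simulate(policies, stock=100.0, capacity=100.0, growth=0.5, rounds=50):
    """Each policy maps the current stock to a requested harvest."""
    payoffs = [0.0] * len(policies)
    for _ in range(rounds):
        requests = [max(0.0, min(p(stock), stock)) for p in policies]
        total = sum(requests)
        if total > stock:                    # ration proportionally if over-demanded
            requests = [r * stock / total for r in requests]
        for i, r in enumerate(requests):
            payoffs[i] += r
        stock -= sum(requests)
        stock += growth * stock * (1.0 - stock / capacity)  # logistic regrowth
        if stock < 1e-3:                     # the commons has collapsed
            break
    return [round(p, 1) for p in payoffs], round(stock, 1)

greedy = lambda s: 0.20 * s    # each agent takes 20% of the stock per round
prudent = lambda s: 0.05 * s   # each agent takes 5%

print(simulate([greedy] * 4))   # over-exploitation: the stock collapses early
print(simulate([prudent] * 4))  # restraint: the stock settles at a sustainable level
```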


'Meta has stolen books': authors to protest in London against AI trained using 'shadow library'

The Guardian

Novelists Kate Mosse and Tracy Chevalier as well as poet and former Royal Society of Literature chair Daljit Nagra will be among those in attendance outside the company's King's Cross office. Protesters will meet at Granary Square at 1.30pm and a letter to Meta from the Society of Authors (SoA) will be hand-delivered at 1.45pm. It will also be sent to Meta headquarters in the US. Earlier this year, a US court filing alleged that Meta CEO Mark Zuckerberg approved the company's use of a notorious "shadow library", LibGen, which contains more than 7.5 million books. Last month, the Atlantic republished a searchable database of the titles contained in LibGen, through which many authors discovered their works may have been used to train Meta's AI models.


British authors want Meta to answer for alleged copyright infringement

Engadget

A March 20 article in The Atlantic served as the letter's impetus. It reported that Meta had used LibGen, a pirated collection of over 7.5 million books, to train its AI models. Anyone on the internet over the last few weeks has likely seen videos of distraught authors learning that their work is available on the database (and potentially used by Meta without their permission). A lawsuit in the US alleges Meta CEO Mark Zuckerberg approved the use of LibGen's data to train its AI. The lawsuit's plaintiffs include writers Sarah Silverman and Ta-Nehisi Coates.


Interpretable Visualizations of Data Spaces for Classification Problems

Jorgensen, Christian, Lin, Arthur Y., Cersonsky, Rose K.

arXiv.org Machine Learning

How do classification models "see" our data? Based on their success in delineating behaviors, there must be some lens through which it is easy to see the boundary between classes; however, our current set of visualization techniques makes this prospect difficult. In this work, we propose a hybrid supervised-unsupervised technique distinctly suited to visualizing the decision boundaries determined by classification problems. This method provides a human-interpretable map that can be analyzed qualitatively and quantitatively, which we demonstrate through visualizing and interpreting a decision boundary for chemical neurotoxicity. While we discuss this method in the context of chemistry-driven problems, its application can be generalized across subfields for "unboxing" the operations of machine-learning classification models.
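As a point of reference for the kind of output such a technique produces, here is a generic sketch that projects data to two dimensions and traces a classifier's decision boundary in the map by probing a grid through the inverse projection. It uses plain PCA and logistic regression as stand-ins and does not reproduce the paper's hybrid supervised-unsupervised method.

```python
# Generic sketch: visualize a classifier's decision boundary in a 2-D projection.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

pca = PCA(n_components=2).fit(X)    # unsupervised 2-D map of the data space
Z = pca.transform(X)

# Probe the classifier on a grid in map space via the inverse projection.
xx, yy = np.meshgrid(np.linspace(Z[:, 0].min(), Z[:, 0].max(), 200),
                     np.linspace(Z[:, 1].min(), Z[:, 1].max(), 200))
grid = pca.inverse_transform(np.c_[xx.ravel(), yy.ravel()])
proba = clf.predict_proba(grid)[:, 1].reshape(xx.shape)

plt.contourf(xx, yy, proba, levels=20, cmap="RdBu", alpha=0.6)
plt.contour(xx, yy, proba, levels=[0.5], colors="k")   # decision boundary
plt.scatter(Z[:, 0], Z[:, 1], c=y, cmap="RdBu", edgecolor="k", s=15)
plt.xlabel("component 1"); plt.ylabel("component 2")
plt.show()
```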


Responsible Artificial Intelligence Systems: A Roadmap to Society's Trust through Trustworthy AI, Auditability, Accountability, and Governance

Herrera-Poyatos, Andrés, Del Ser, Javier, de Prado, Marcos López, Wang, Fei-Yue, Herrera-Viedma, Enrique, Herrera, Francisco

arXiv.org Artificial Intelligence

Artificial intelligence (AI) has matured as a technology, necessitating the development of responsibility frameworks that are fair, inclusive, trustworthy, safe and secure, transparent, and accountable. By establishing such frameworks, we can harness the full potential of AI while mitigating its risks, particularly in high-risk scenarios. This requires the design of responsible AI systems based on trustworthy AI technologies and ethical principles, with the aim of ensuring auditability and accountability throughout their design, development, and deployment, adhering to domain-specific regulations and standards. This paper explores the concept of a responsible AI system from a holistic perspective, which encompasses four key dimensions: 1) regulatory context; 2) trustworthy AI technology, along with standardization and assessments; 3) auditability and accountability; and 4) AI governance. The aim of this paper is twofold. First, we analyze these four dimensions and their interconnections in the form of an analytical overview. Second, we propose a roadmap for the design of responsible AI systems, ensuring that they can gain society's trust. To achieve this trustworthiness, the paper also fosters interdisciplinary discussion of the ethical, legal, social, economic, and cultural aspects of AI from a global governance perspective. Finally, we reflect on the current state of the field and the aspects that need to be developed in the near future, distilled as ten lessons learned.



PrecisePK Collaborates with Wolters Kluwer to Enhance Dose Optimization

#artificialintelligence

PrecisePK announced that it will collaborate with Wolters Kluwer, a global provider of trusted clinical technology and evidence-based solutions, to offer an integrated Bayesian dosing solution through Sentri7 Pharmacy in early 2023. With PrecisePK's model-informed precision dosing (MIPD) software, Sentri7 Pharmacy will deliver a comprehensive drug package that supports vancomycin and 20 other medications. "Our PrecisePK relationship will enable our users to leverage data and information to make better medication dosing decisions, improve patient safety, and drive better clinical outcomes," said Karen Kobelski, Vice President & General Manager, Clinical Surveillance, Compliance & Data Solutions, Wolters Kluwer Health. "Hospitals are short-staffed and clinicians are busier than ever, so we're always looking for ways to simplify clinician workloads and facilitate patient management. This relationship allows us to deliver a solution to help achieve these goals."
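For context on what "Bayesian dosing" means in practice, below is a toy sketch of the model-informed precision dosing idea: update a posterior over a patient's drug clearance from a measured level, then choose the dose that hits a target exposure. The one-compartment steady-state model, priors, and numbers are illustrative assumptions only and bear no relation to PrecisePK's actual software.

```python
# Toy Bayesian dosing (MIPD) illustration; all values are illustrative.
import numpy as np

# One-compartment, continuous-infusion approximation: Css = dose_rate / CL.
cl_grid = np.linspace(1.0, 10.0, 500)                  # candidate clearances (L/h)
prior = np.exp(-0.5 * ((cl_grid - 5.0) / 1.5) ** 2)    # population prior on CL

dose_rate = 100.0        # mg/h given so far (hypothetical regimen)
observed_css = 14.0      # measured drug level (mg/L)
sigma = 2.0              # assay/residual error (mg/L)

# Posterior over clearance given the observed level.
predicted = dose_rate / cl_grid
likelihood = np.exp(-0.5 * ((observed_css - predicted) / sigma) ** 2)
posterior = prior * likelihood
posterior /= posterior.sum()

cl_hat = np.sum(cl_grid * posterior)                   # posterior-mean clearance
target_css = 17.5                                      # middle of a target window
new_dose_rate = target_css * cl_hat                    # dose rate to hit the target
print(f"posterior CL ~ {cl_hat:.2f} L/h, suggested rate {new_dose_rate:.0f} mg/h")
```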