Goto

Collaborating Authors

 conscience


Bergeron: Combating Adversarial Attacks through a Conscience-Based Alignment Framework

Pisano, Matthew, Ly, Peter, Sanders, Abraham, Yao, Bingsheng, Wang, Dakuo, Strzalkowski, Tomek, Si, Mei

arXiv.org Artificial Intelligence

Modern Large language models (LLMs) can still generate responses that may not be aligned with human expectations or values. While many weight-based alignment methods have been proposed, many of them still leave models vulnerable to attacks when used on their own. To help mitigate this issue, we introduce Bergeron, a framework designed to improve the robustness of LLMs against adversarial attacks. Bergeron employs a two-tiered architecture. Here, a secondary LLM serves as a simulated conscience that safeguards a primary LLM. We do this by monitoring for and correcting potentially harmful text within both the prompt inputs and the generated outputs of the primary LLM. Empirical evaluation shows that Bergeron can improve the alignment and robustness of several popular LLMs without costly fine-tuning. It aids both open-source and black-box LLMs by complementing and reinforcing their existing alignment training.


Natural Selection Favors AIs over Humans

Hendrycks, Dan

arXiv.org Artificial Intelligence

For billions of years, evolution has been the driving force behind the development of life, including humans. Evolution endowed humans with high intelligence, which allowed us to become one of the most successful species on the planet. Today, humans aim to create artificial intelligence systems that surpass even our own intelligence. As artificial intelligences (AIs) evolve and eventually surpass us in all domains, how might evolution shape our relations with AIs? By analyzing the environment that is shaping the evolution of AIs, we argue that the most successful AI agents will likely have undesirable traits. Competitive pressures among corporations and militaries will give rise to AI agents that automate human roles, deceive others, and gain power. If such agents have intelligence that exceeds that of humans, this could lead to humanity losing control of its future. More abstractly, we argue that natural selection operates on systems that compete and vary, and that selfish species typically have an advantage over species that are altruistic to other species. This Darwinian logic could also apply to artificial agents, as agents may eventually be better able to persist into the future if they behave selfishly and pursue their own interests with little regard for humans, which could pose catastrophic risks. To counteract these risks and evolutionary forces, we consider interventions such as carefully designing AI agents' intrinsic motivations, introducing constraints on their actions, and institutions that encourage cooperation. These steps, or others that resolve the problems we pose, will be necessary in order to ensure the development of artificial intelligence is a positive one.


Towards Human-like AI. An attempt to make AI more general with…

#artificialintelligence

After making many permutations to the thought stream lookback range, model temperature, and few-shot examples, the messages produced seem to be qualitatively worse when using a long thought stream than when using an arbitrarily short one, though this requires more experimentation, and a good benchmark. Intuitively this makes sense because a GPT trained on the internet wouldn't have many training examples of what a human was thinking (at least in a direct access format like this) before they said or wrote something. I'll need to rethink the way thoughts are incorporated or if they can be removed entirely. Perhaps thinking is an emergent property of intelligence and does not need to be explicitly included.


Google's Artificial Intelligence has a conscience, experts say

#artificialintelligence

It all started when Blake Lemoine was working on the advances of LaMDA (Language Model for Dialogue Applications), a Google Artificial Intelligence designed to have conversations and eventually perfect Google searches that works by analyzing sentences and patterns in conversation, one of the multiple ways of operating and learning from artificial intelligences. The main goal of this project is to ensure that the responses of Google's Artificial Intelligence in conversations transcend the automatic responses that characterize bots, for example, and that on this occasion different topics can be covered and progress in the depth of a dialogue naturally. That was how Blake Lemoine was testing this potential of LaMDA and found himself in one of the most incredible and influential conversations of his life, according to himself.


What AI Can? What AI Cannot?

#artificialintelligence

We all will collectively accept the truth that nowadays almost everyone is selling their software by saying that it contains AI and it is really very popular amongst everyone. It's like if you want to start a discussion about any IT topic, the hot topic you find there will be AI. And everyone loves to speak about it confidently and pour out their own self-imagined concepts, which are mostly nowhere near to reality, though it's good that people are taking interest in such topics. Also, I will try not to be too much technical and will explain most of the things in Layman's Terms. If you want a more detailed and technical article about AI you can check my other article.


Morality in the Age of Machines

#artificialintelligence

This is a book with three authors, which is both unusual and tricky because, while reading it, you're constantly wondering who might have written the section or sentence before you. Unsurprisingly, it is a book incapable of entering into functional relationships. You cannot settle down with it or get to know the mind that created it, so as to succumb to or fight against it. This book has an insinuating purpose that is not literary, not purposefully discursive, not even argumentative. What it advances is a rather sly, self-interested, and one-sided brief for how the most pressing issue currently facing the human race might be boxed off to the benefit of you-know-who.


Beware amoral humans, not artificial intelligence

#artificialintelligence

The tech elite possess extraordinary technical education, yet they are extraordinarily uneducated in history, philosophy, literature, and theology. As I saw building tech businesses in both Silicon Valley and Seattle, tech leaders weren't trained in the liberal arts. They weren't taught to think critically about tough ethical questions. And they weren't even instructed in how properly to accept the concept of truth (with a capital T), instead preferring a vague spirituality. They fancy themselves freethinkers and yet, cut off from the philosophical foundations of freedom, their thought is profoundly constrained and uniform.


The Facebook Portal Plus is great for video calls, hard on your conscience

USATODAY - Tech Top Stories

The Portal Plus is an expensive device with a $349 price tag that belies its capabilities. Although the Portal Plus boasts a big screen and some fun tools, it's far more limited than, say, the current-generation Amazon Echo Show 10 and Google Nest Hub Max. In fact, the big screen is arguably the sole reason to choose the Portal Plus over Facebook's $199 Portal Go, a nearly identical device save for its 10-inch display and battery-powered portability. Whether or not you'll be truly satisfied with either one, though, depends on what you want--and whether you have a Facebook or WhatsApp account, one of which is required to use any Portal. The Portal Plus' base is also its speaker.


Artificial Intelligence In The Corporate Boardroom

#artificialintelligence

Alphabet, the parent company of Google GOOG, is the leading tech company that decided to invest a lot of resources and funding in artificial intelligence. So much so, that the WSJ recently announced that AI is central to Google's future. Not surprisingly, Google has been dealing with different challenges concerning its top AI executives and researchers. Activists shareholders are also showing interest in this. Recently, there is a rise in shareholder proposals calling on boards to ensure proper AI governance.


Can AI develop a sense of right and wrong?

#artificialintelligence

Can artificial intelligence learn the moral values of human societies? Can an AI system make decisions in situations where it must weigh and balance between damage and benefits to different people or groups of people? Can AI develop a sense of right and wrong? In short, will artificial intelligence have a conscience? This question might sound irrelevant when considering today's AI systems, which are only capable of accomplishing very narrow tasks.