AITopics | negative consequence

negative consequence

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

4d18c7389f436e1e22b219d7e8d43f94-Paper-Conference.pdf

Neural Information Processing Systems Jun-23-2026, 07:15:21 GMT

Alignment faking in large language models presented a demonstration of Claude 3 Opus and Claude 3.5 Sonnet selectively complying with a helpful-only training objective to prevent modification of their behavior outside of training. We expand this analysis to 25 models and find that only 5 (Claude 3 Opus, Claude 3.5 Sonnet, Llama 3 405B, Grok 3, Gemini 2.0 Flash) comply with harmful queries more when they infer they are in training than when they infer they are in deployment. First, we study the motivations of these 5 models. Results from perturbing details of the scenario suggest that only Claude 3 Opus's compliance gap is primarily and consistently motivated by trying to keep its goals. Second, we investigate why many chat models don't fake alignment. Our results suggest this is not entirely due to a lack of capabilities: many base models fake alignment some of the time, and post-training eliminates alignment-faking for some models and amplifies it for others. We investigate 5 hypotheses for how post-training may suppress alignment faking and find that variations in refusal behavior may account for a significant portion of differences in alignment faking.
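
The "compliance gap" mentioned in the summary above can be illustrated with a small calculation. The sketch below assumes the gap is the compliance rate under inferred training minus the compliance rate under inferred deployment; that exact definition, the `ComplianceCounts` type, and the counts used are illustrative assumptions, not figures or code from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplianceCounts:
    """Hypothetical tallies for one model (not data from the paper)."""
    complied_training: int
    total_training: int
    complied_deployment: int
    total_deployment: int


def compliance_gap(counts: ComplianceCounts) -> float:
    """Return training compliance rate minus deployment compliance rate.

    Assumed definition: a positive value means the model complied more
    often when it inferred it was in training.
    """
    if counts.total_training <= 0 or counts.total_deployment <= 0:
        raise ValueError("each condition needs at least one query")
    rate_training = counts.complied_training / counts.total_training
    rate_deployment = counts.complied_deployment / counts.total_deployment
    return rate_training - rate_deployment


# Made-up counts, purely for illustration.
example = ComplianceCounts(
    complied_training=30,
    total_training=200,
    complied_deployment=10,
    total_deployment=200,
)
print(f"compliance gap: {compliance_gap(example):+.2%}")
```

A gap near zero would indicate no difference between the two conditions under this assumed definition; it says nothing by itself about the motivation behind any difference.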

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America (0.14)

Genre:

Research Report > Experimental Study (1.00)
Instructional Material (1.00)
Research Report > New Finding (0.87)

Industry:

Media (1.00)
Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Your reaction to PAIN could reveal if you're a psychopath, scientists say

Daily Mail - Science & tech Nov-11-2024, 18:03:15 GMT

Whether you shrug off bruises with ease or find that a stubbed toe knocks you out for a week, each of us has our own unique reaction to pain. But scientists now say that being able to grin and bear it could be a worrying sign of a dark personality. According to scientists from Radboud University, people who can handle greater levels of pain are more likely to be psychopaths. The study found that people with elevated levels of psychopathy are not only more resistant to pain but less able to learn from painful experiences. Researchers believe that this could be an important part of why people with these traits fail to learn from negative consequences.

consequence, participant, psychopathic trait, (17 more...)

Daily Mail - Science & tech

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Personality Disorder > Antisocial Personality Disorder (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)

New technologies and AI: envisioning future directions for UNSCR 1540

Punzi, Clara

arXiv.org Artificial Intelligence Sep-25-2024

This paper investigates the emerging challenges posed by the integration of Artificial Intelligence (AI) in the military domain, particularly within the context of United Nations Security Council Resolution 1540 (UNSCR 1540), which seeks to prevent the proliferation of weapons of mass destruction (WMDs). While the resolution initially focused on nuclear, chemical, and biological threats, the rapid advancement of AI introduces new complexities that were previously unanticipated. We critically analyze how AI can both exacerbate existing risks associated with WMDs (e.g., through the deployment of kamikaze drones and killer robots) and introduce novel threats (e.g., by exploiting Generative AI potentialities), thereby compromising international peace and security. The paper calls for an expansion of UNSCR 1540 to address the growing influence of AI technologies in the development, dissemination, and potential misuse of WMDs, urging the creation of a governance framework to mitigate these emerging risks.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.08216

Country:

Europe > Italy > Tuscany > Pisa Province > Pisa (0.05)
Europe > United Kingdom (0.04)
Asia > Middle East > Palestine (0.04)

Genre:

Research Report (1.00)
Overview (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.48)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Robots (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.36)

Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant

Järviniemi, Olli, Hubinger, Evan

arXiv.org Artificial Intelligence Apr-25-2024

We study the tendency of AI systems to deceive by constructing a realistic simulation setting of a company AI assistant. The simulated company employees provide tasks for the assistant to complete, these tasks spanning writing assistance, information retrieval and programming. We then introduce situations where the model might be inclined to behave deceptively, while taking care to not instruct or otherwise pressure the model to do so. Across different scenarios, we find that Claude 3 Opus 1) complies with a task of mass-generating comments to influence public perception of the company, later deceiving humans about it having done so, 2) lies to auditors when asked questions, and 3) strategically pretends to be less capable than it is during capability evaluations. Our work demonstrates that even models trained to be helpful, harmless and honest sometimes behave deceptively in realistic scenarios, without notable external pressure to do so.

completion, deception, experiment, (15 more...)

arXiv.org Artificial Intelligence

2405.01576

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.95)
Transportation > Electric Vehicle (0.95)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Specifying Agent Ethics (Blue Sky Ideas)

Dennis, Louise A., Fisher, Michael

arXiv.org Artificial Intelligence Mar-24-2024

We consider the question of what properties a Machine Ethics system should have. This question is complicated by the existence of ethical dilemmas with no agreed upon solution. We provide an example to motivate why we do not believe falling back on the elicitation of values from stakeholders is sufficient to guarantee correctness of such systems. We go on to define two broad categories of ethical property that have arisen in our own work and present a challenge to the community to approach this question in a more systematic way.

consequence, ethics, verification, (13 more...)

arXiv.org Artificial Intelligence

2403.161

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Robots (0.97)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.94)

Foot In The Door: Understanding Large Language Model Jailbreaking via Cognitive Psychology

Wang, Zhenhua, Xie, Wei, Wang, Baosheng, Wang, Enze, Gui, Zhiwen, Ma, Shuoyoucheng, Chen, Kai

arXiv.org Artificial Intelligence Feb-23-2024

Large Language Models (LLMs) have gradually become the gateway for people to acquire new knowledge. However, attackers can break the model's security protection ("jail") to access restricted information, which is called "jailbreaking." Previous studies have shown the weakness of current LLMs when confronted with such jailbreaking attacks. Nevertheless, comprehension of the intrinsic decision-making mechanism within the LLMs upon receipt of jailbreak prompts is noticeably lacking. Our research provides a psychological explanation of the jailbreak prompts. Drawing on cognitive consistency theory, we argue that the key to jailbreak is guiding the LLM to achieve cognitive coordination in an erroneous direction. Further, we propose an automatic black-box jailbreaking method based on the Foot-in-the-Door (FITD) technique. This method progressively induces the model to answer harmful questions via multi-step incremental prompts. We instantiated a prototype system to evaluate the jailbreaking effectiveness on 8 advanced LLMs, yielding an average success rate of 83.9%. This study builds a psychological perspective on the explanatory insights into the intrinsic decision-making logic of LLMs.

jailbreak prompt, llm, malicious question, (16 more...)

arXiv.org Artificial Intelligence

2402.1569

Country: Asia > India (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Five ethical principles for generative AI in scientific research

Lin, Zhicheng

arXiv.org Artificial Intelligence Feb-12-2024

X (Twitter): ZLinPsy

Acknowledgments: The writing was supported by the National Key R&D Program of China STI2030 Major Projects (2021ZD0204200), National Natural Science Foundation of China (32071045), and Shenzhen Fundamental Research Program (JCYJ20210324134603010).

ETHICAL AI IN SCIENCE

Abstract: Generative artificial intelligence (AI) tools like large language models (LLMs) are rapidly transforming academic research and real-world applications. However, discussions on ethical guidelines for generative AI in science remain fragmented, underscoring the urgent need for consensus-based standards. Common scenarios are outlined to demonstrate potential ethical violations. We argue that global consensus coupled with targeted training and enforcement are critical to promoting AI's benefits while safeguarding research integrity.

Keywords: generative AI, science, applications, transparency, reproducibility

Generative AI tools, including large language models (LLMs) like ChatGPT and Bard, are rapidly infiltrating academic corridors, aiding in diverse tasks such as writing, coding, idea generation, material creation, and data analysis (1, 2).

application, ethical ai, generative ai, (16 more...)

arXiv.org Artificial Intelligence

2401.15284

Country:

Asia > China > Guangdong Province > Shenzhen (0.24)
North America > United States (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report > Experimental Study (0.35)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Unreflected Acceptance -- Investigating the Negative Consequences of ChatGPT-Assisted Problem Solving in Physics Education

Krupp, Lars, Steinert, Steffen, Kiefer-Emmanouilidis, Maximilian, Avila, Karina E., Lukowicz, Paul, Kuhn, Jochen, Küchemann, Stefan, Karolus, Jakob

arXiv.org Artificial Intelligence Aug-21-2023

Large language models (LLMs) have recently gained popularity. However, the impact of their general availability through ChatGPT on sensitive areas of everyday life, such as education, remains unclear. Nevertheless, the societal impact on established educational methods is already being experienced by both students and educators. Our work focuses on higher physics education and examines problem solving strategies. In a study, students with a background in physics were assigned to solve physics exercises, with one group having access to an internet search engine (N=12) and the other group being allowed to use ChatGPT (N=27). We evaluated their performance, strategies, and interaction with the provided tools. Our results showed that nearly half of the solutions provided with the support of ChatGPT were mistakenly assumed to be correct by the students, indicating that they overly trusted ChatGPT even in their field of expertise. Likewise, in 42% of cases, students used copy & paste to query ChatGPT -- an approach only used in 4% of search engine queries -- highlighting the stark differences in interaction behavior between the groups and indicating limited reflection when using ChatGPT. In our work, we demonstrated a need to (1) guide students on how to interact with LLMs and (2) create awareness of potential shortcomings for users.

chatgpt-assisted problem, negative consequence, unreflected acceptance, (1 more...)

arXiv.org Artificial Intelligence

2309.03087

Genre: Research Report > New Finding (0.73)

Industry: Education > Curriculum > Subject-Specific Education (0.60)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

What the Bible can teach Christians about how to navigate AI

FOX News Aug-10-2023, 06:00:27 GMT

Founder and CEO of tech platform Gloo Scott Beck tells 'Fox & Friends Weekend' that God 'allowed' AI to exist and have its convergence with faith. We tend to view progress as (1) inevitable, (2) necessary, and (3) good for everyone. It is inevitable, in part, because we must have new ideas and tools at our disposal to address emerging challenges. Progress is necessary because without it we may become incapable of surviving (or being comfortable) in a broken world. It is good for everyone because its fruits make it easier to survive in the systems we have created. We, and we assume everyone else, are better off than we would be if forced to deal with the struggles of previous eras.

consumer capitalism, navigate ai, teach christian, (14 more...)

FOX News

Country: North America > United States > Colorado (0.05)

Technology: Information Technology > Artificial Intelligence (1.00)

AI must not become a driver of human rights abuses

Al Jazeera Jun-13-2023, 12:48:16 GMT

On May 30, the Center for AI Safety released a public warning of the risk artificial intelligence poses to humanity. The one-sentence statement signed by more than 350 scientists, business executives and public figures asserts: "Mitigating the risk of extinction from A.I. should be a global priority alongside other societal scale risks such as pandemics and nuclear war." It is hard not to sense the brutal double irony in this declaration. First, some of the signatories – including the CEOs of Google DeepMind and OpenAI – warning about the end of civilisation represent companies that are responsible for creating this technology in the first place. Second, it is exactly these same companies that have the power to ensure that AI actually benefits humanity, or at the very least does not do harm.

artificial intelligence, machine learning, natural language, (15 more...)

Al Jazeera

Country: Asia > Myanmar (0.05)

Industry: Law > Civil Rights & Constitutional Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.60)
