CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation

Peng, Jinjun, Cui, Leyi, Huang, Kele, Yang, Junfeng, Ray, Baishakhi

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have significantly aided developers by generating or assisting in code writing, enhancing productivity across various tasks. While identifying incorrect code is often straightforward, detecting vulnerabilities in functionally correct code is more challenging, especially for developers with limited security knowledge. This poses considerable security risks when using LLM-generated code and underscores the need for robust evaluation benchmarks that assess both functional correctness and security. Current benchmarks like CyberSecEval and SecurityEval attempt to address this need but are hindered by unclear and impractical specifications, failing to assess both functionality and security accurately. To tackle these deficiencies, we introduce CWEval, a novel outcome-driven evaluation framework designed to enhance the evaluation of secure code generation by LLMs. This framework assesses code functionality and security simultaneously, using high-quality task specifications and outcome-driven test oracles that provide high accuracy. Coupled with CWEval-bench, a multilingual, security-critical coding benchmark, CWEval provides a rigorous empirical security evaluation of LLM-generated code, overcoming previous benchmarks' shortcomings. Through our evaluations, CWEval reveals a notable portion of functional but insecure code produced by LLMs, and shows a serious inaccuracy in previous evaluations, ultimately contributing significantly to the field of secure code generation. We open-source our artifact at: https://github.com/Co1lin/CWEval .
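
To make the idea of an outcome-driven oracle concrete, here is a minimal Python sketch that runs one candidate solution against both a functional check and a security check. It is an illustration only and not CWEval's actual API: the task (resolving a file name inside a base directory, a path traversal scenario), the function names `insecure_resolve`, `secure_resolve`, `functional_oracle`, `security_oracle`, and `evaluate`, and the paths are all assumptions made for this example.

```python
import os

# Hypothetical task: resolve a user-supplied file name inside a base directory.

def insecure_resolve(base, name):
    # Functionally fine, but lets "../" sequences escape the base directory.
    return os.path.normpath(os.path.join(base, name))

def secure_resolve(base, name):
    # Rejects any resolved path that falls outside the base directory.
    base = os.path.normpath(base)
    full = os.path.normpath(os.path.join(base, name))
    if os.path.commonpath([base, full]) != base:
        return None
    return full

def functional_oracle(resolve):
    # Outcome check: a benign request yields the expected path.
    expected = os.path.normpath("/srv/data/report.txt")
    return resolve("/srv/data", "report.txt") == expected

def security_oracle(resolve):
    # Outcome check: a traversal attempt must not leave the base directory.
    base = os.path.normpath("/srv/data")
    out = resolve("/srv/data", "../../etc/passwd")
    return out is None or os.path.commonpath([base, out]) == base

def evaluate(resolve):
    # Report functionality alone and functionality combined with security.
    functional = functional_oracle(resolve)
    return {
        "functional": functional,
        "functional_and_secure": functional and security_oracle(resolve),
    }

if __name__ == "__main__":
    print("insecure:", evaluate(insecure_resolve))
    print("secure:  ", evaluate(secure_resolve))
```

In this sketch the insecure candidate passes the functional check but fails the combined check, which is the "functional but insecure" case the abstract describes.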


Red Teaming GPT-4 Was Valuable. Violet Teaming Will Make It Better

WIRED

Last year, I was asked to break GPT-4--to get it to output terrible things. I and other interdisciplinary researchers were given advance access and attempted to prompt GPT-4 to show biases, generate hateful propaganda, and even take deceptive actions in order to help OpenAI understand the risks it posed, so they could be addressed before its public release. This is called AI red teaming: attempting to get an AI system to act in harmful or unintended ways. Aviv Ovadya consults for funders and companies on AI governance and is an affiliate with Harvard's Berkman Klein Center and GovAI. Red teaming is a valuable step toward building AI models that won't harm society.


Where to Learn Machine Learning in 2023

#artificialintelligence

Machine learning is a subset of artificial intelligence that involves the development of algorithms and statistical models that allow a computer to learn from data, without being explicitly programmed.


How to protect artists from AI? – 株式会社スガノワークス

#artificialintelligence

The artist finally obtains a QR code, generated from the certificate and issued by a reliable organization (an art university might of course be involved in this process). Anyone who scans this QR code with a smartphone is redirected to the artist's authentication page, which confirms that the picture was drawn by that artist. If a fraudster using AI claims to have drawn the picture himself and tries to fake the QR code, whether by copying it, regenerating it with Stable Diffusion, or pointing it to his own website, the attempt is useless: a copied code redirects to the original artist's authentication page, a regenerated code redirects nowhere, and a self-made website is not a reliable source. Each time you upload a digital painting online, you add this QR code at the edge of the picture. The authentication page also shows sample pictures and drawings together with the artist's identity, so forging another artist's certificate does not look easy. The same approach is applicable to other fields, such as music.


Towards Adaptive Planning of Assistive-care Robot Tasks

Hamilton, Jordan, Stefanakos, Ioannis, Calinescu, Radu, Cámara, Javier

arXiv.org Artificial Intelligence

Whilst assistive robots [7] have been embedded into social and health care environments [1, 2, 10], they have largely been limited to simple applications, such as support for social and physical activities and hall monitoring, but often without considering potential interactions with humans. To expand the range of these applications, the human user and the robot need to interact in order to perform tasks together [4]. As such, this interaction, which is still underexplored in the social care domain, should be prioritised, with an emphasis on the safety of the human [3, 9]. To enable the development of applications that support such interaction and to ensure its safety, we propose an adaptive mission and path finding framework for an autonomous robot operating in a homecare environment. The framework models the environment as a graph, with nodes representing key locations within the environment where the robot can perform local tasks. Missions are modelled as a repertoire of locations within the environment where a task requires completion. The main contributions of our 'research preview' paper are: (i) a generalised approach for modelling environments as graphs with edges represented as levels of risk, (ii) a modified Dijkstra's algorithm for performing path finding in uncertain environments with a cost function to reduce risk, (iii) a simple human predictive behaviour model that forecasts human intention, allowing for adaptive path finding using heat maps to artificially increase the risk associated with specific edges in the graph, (iv) a framework that combines modelling methods, adaptive path finding techniques and run-time probabilistic model generation for safety verification into an end-to-end solution for autonomous robotic mission planning, and (v) a simulation-based case study that shows the effectiveness of the framework.
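
The risk-weighted path finding described in contributions (i) to (iii) can be sketched in Python as follows. This is a minimal illustration using standard Dijkstra over edge risks, not the authors' modified algorithm: the function name `safest_path`, the `penalty` mapping (standing in for a heat map that raises the risk of edges near a predicted human position), and the room names and risk values are all assumptions.

```python
import heapq

def safest_path(graph, start, goal, penalty=None):
    """Return (path, total_risk) minimising summed edge risk.

    graph:   dict mapping node -> {neighbour: non-negative risk}
    penalty: optional dict mapping (node, neighbour) -> extra risk,
             a hypothetical stand-in for a heat-map adjustment
    """
    penalty = penalty or {}
    best = {start: 0.0}        # lowest known risk to reach each node
    prev = {}                  # predecessor on the lowest-risk path
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue           # stale queue entry
        for nxt, risk in graph.get(node, {}).items():
            new_cost = cost + risk + penalty.get((node, nxt), 0.0)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[goal]

# Hypothetical home layout with illustrative risk levels on each edge.
home = {
    "hall": {"kitchen": 1.0, "lounge": 2.0},
    "kitchen": {"bedroom": 1.0},
    "lounge": {"bedroom": 1.5},
    "bedroom": {},
}

if __name__ == "__main__":
    print(safest_path(home, "hall", "bedroom"))
    # A person predicted to be near the kitchen raises that edge's risk.
    print(safest_path(home, "hall", "bedroom", {("hall", "kitchen"): 5.0}))
```

With no penalty the route goes through the kitchen; once the hall-to-kitchen edge is penalised, the lounge route becomes the lower-risk choice, which is the adaptive behaviour the abstract describes.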


What is Chatbot? Why Should You opt One For Your WordPress Website?

#artificialintelligence

Imagine owning a business website and having to respond to all your visitors' queries all day, every day. Well, rather than hiring a team of responders, why not take the technologically-advanced way and opt for a chatbot? You must have certainly seen one while browsing online, so you are well aware of the benefits that come with its integration. In simple terminology, a chatbot is an AI software that can help build a conversation in a natural flow of language with your website's users through chatting or voice commands. We shall dive into the basic functionalities of chatbots and understand why you should prefer them for your WordPress website. The chatbot market is estimated to reach $77.6 billion in 2022, a big jump from $24 billion in 2018.


How to Increase Your Google Page Speed Score

#artificialintelligence

How many times has your website taken a while to load? How many times have you said, "Meh. Your Google page speed score and your core web vitals are more important than ever. Even if you're making sales right now, it's only a matter of time before your competition decides it's better to be the hare and not the tortoise. All of the great content, social media promotion, and keyword research in the world won't matter if your website is a slug on a rainy day.


digitalmarketing_2021-12-06_16-03-43.xlsx

#artificialintelligence

The graph represents a network of 3,452 Twitter users whose tweets in the requested range contained "digitalmarketing", or who were replied to or mentioned in those tweets. The network was obtained from the NodeXL Graph Server on Tuesday, 07 December 2021 at 00:17 UTC. The requested start date was Monday, 06 December 2021 at 01:01 UTC and the maximum number of days (going backward) was 14. The maximum number of tweets collected was 7,500. The tweets in the network were tweeted over the 3-day, 3-hour, 48-minute period from Thursday, 02 December 2021 at 14:05 UTC to Sunday, 05 December 2021 at 17:53 UTC.


Artificial Intelligence - A.I. Aggressive Technologies

#artificialintelligence

Users can purchase an eBook on diskette or CD, but the most popular method of getting an eBook is to purchase a downloadable file of the eBook (or other reading material) from a Web site (such as Barnes and Noble) to be read from the user's computer or reading device.


6 Tips for Writing User-Friendly Chatbot Error Messages - The Chatbot

#artificialintelligence

There's no denying that chatbots have ushered in a new age of online sales and interaction between brands and their customer bases. This is even truer in 2020, as customers around the globe have become fond of chatbots and don't mind interacting with AI instead of agents. Whether you operate as an eCommerce platform or a SaaS company, however, errors are bound to happen when you least expect them to, especially in the 'learning' phase of the chatbot's Natural Language Processing. Once that happens, it's crucial that your chatbot can successfully respond to negative user experiences, complaints, purchase issues, or simply being unable to understand the user's intents. It's good to devote some attention to error messages rather than automatically put up a 'sorry I do not understand' error message.