Do DeepSeek's A.I. Advances Mean US Tech Controls Have Failed?
DeepSeek has said that its most recent model was trained on Nvidia H800s, an A.I. chip that Nvidia developed specifically for the Chinese market after export controls were first imposed, and one that caused a fair amount of drama in Washington. When the United States put restrictions on Nvidia's most advanced chips in 2022, Nvidia quickly adapted by creating slightly downgraded chips that fell just under the threshold the government had set. These chips were technically legal for Chinese companies to use, yet allowed them to achieve practically the same results. This angered Biden officials, who moved to restrict the new chips as well. But the government moved slowly, and it took about a year to ban the H800 and other downgraded chips.
- Asia > China (0.45)
- North America > United States (0.42)
Design choices made by LLM-based test generators prevent them from finding bugs
Mathews, Noble Saji, Nagappan, Meiyappan
There is an increasing amount of research and commercial tools for automated test case generation using Large Language Models (LLMs). This paper critically examines whether recent LLM-based test generation tools, such as Codium CoverAgent and CoverUp, can effectively find bugs or unintentionally validate faulty code. Considering bugs are only exposed by failing test cases, we explore the question: can these tools truly achieve the intended objectives of software testing when their test oracles are designed to pass? Using real human-written buggy code as input, we evaluate these tools, showing how LLM-generated tests can fail to detect bugs and, more alarmingly, how their design can worsen the situation by validating bugs in the generated test suite and rejecting bug-revealing tests. These findings raise important questions about the validity of the design behind LLM-based test generation tools and their impact on software quality and test suite reliability.
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
- Asia > Singapore (0.04)
ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models
Han, Hojae, Kim, Jaejin, Yoo, Jaeseok, Lee, Youngwon, Hwang, Seung-won
This paper aims to extend the code generation capability of large language models (LLMs) to automatically manage comprehensive software requirements from given textual descriptions. Such requirements include both functional requirements (i.e., achieving the expected behavior for inputs) and non-functional requirements (e.g., time/space performance, robustness, maintainability). However, textual descriptions can express requirements verbosely or may even omit some of them. We introduce ARCHCODE, a novel framework that leverages in-context learning to organize the requirements observed in descriptions and to extrapolate unexpressed requirements from them. ARCHCODE generates requirements from a given description, then conditions on them to produce code snippets and test cases. Each test case is tailored to one of the requirements, allowing code snippets to be ranked by how well their execution results comply with the requirements. On public benchmarks, ARCHCODE improves the satisfaction of functional requirements, significantly raising Pass@k scores. Furthermore, we introduce HumanEval-NFR, the first benchmark for evaluating LLMs on non-functional requirements in code generation, demonstrating ARCHCODE's superiority over baseline methods. The implementation of ARCHCODE and the HumanEval-NFR benchmark are both publicly accessible.
- North America > United States > Washington > King County > Seattle (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- (4 more...)
Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation
Guo, Jeff, Schwaller, Philippe
Generative molecular design for drug discovery has very recently achieved a wave of experimental validation, with language-based backbones being the most common architectures employed. The most important factor for downstream success is whether an in silico oracle is well correlated with the desired end-point. To this end, current methods use cheaper proxy oracles with higher throughput before evaluating the most promising subset with high-fidelity oracles. The ability to directly optimize high-fidelity oracles would greatly enhance generative design and be expected to improve hit rates. However, current models are not efficient enough to consider such a prospect, exemplifying the sample efficiency problem. In this work, we introduce Saturn, which leverages the Augmented Memory algorithm and demonstrates the first application of the Mamba architecture for generative molecular design. We elucidate how experience replay with data augmentation improves sample efficiency and how Mamba synergistically exploits this mechanism. Saturn outperforms 22 models on multi-parameter optimization tasks relevant to drug discovery and may possess sufficient sample efficiency to consider the prospect of directly optimizing high-fidelity oracles.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Asia > Middle East > Israel (0.04)
23andMe Failed to Detect Account Intrusions for Months
Police took a digital rendering of a suspect's face, generated using DNA evidence, and ran it through a facial recognition system in a troubling incident reported for the first time by WIRED this week. The tactic came to light in a trove of hacked police records published by the transparency collective Distributed Denial of Secrets. Meanwhile, information about United States intelligence agencies purchasing Americans' phone location data and internet metadata without a warrant was revealed this week only after US senator Ron Wyden blocked the appointment of a new NSA director until the information was made public. And a California teen who allegedly used the handle Torswats to carry out hundreds of swatting attacks across the US is being extradited to Florida to face felony charges. The infamous spyware developer NSO Group, creator of the Pegasus spyware, has been quietly planning a comeback, which involves investing millions of dollars lobbying in Washington while exploiting the Israel-Hamas war to stoke global security fears and position its products as a necessity.
- Asia > South Korea (0.31)
- North America > United States > California (0.26)
- Asia > Middle East > Israel (0.26)
- (3 more...)
Difference of Probability and Information Entropy for Skills Classification and Prediction in Student Learning
Ehimwenma, Kennedy Efosa, Sharji, Safiya Al, Raheem, Maruf
The probability of an event lies in the range [0, 1]. In a sample space S, the value of a probability determines whether an outcome is true or false. An event A that will never occur has probability Pr(A) = 0, and an event B that will certainly occur has probability Pr(B) = 1; the outcomes of both A and B are therefore known with certainty. Furthermore, the probabilities Pr(E1) + Pr(E2) + ... + Pr(En) of a finite set of mutually exclusive, exhaustive events in a sample space S sum to 1. Conversely, the difference between the probabilities of two certain events is 0. This paper first discusses Bayes' theorem, then the complement of a probability and the difference of probabilities for occurrences of learning events, before applying these to the prediction of learning objects in student learning. Given that probabilities sum to 1, this paper submits that the difference between argmax Pr(S) and the probability of student performance quantifies the weight of learning objects for students when making learning recommendations. Using a skill-set dataset, the computational procedure demonstrates: i) the probability of skill-set events that have occurred and would lead to higher-level learning; ii) the probability of events that have not occurred and require subject-matter relearning; iii) the accuracy of a decision tree in predicting student performance as class labels; and iv) the information entropy of the skill-set data and its implications for student cognitive performance and the recommendation of learning [1].
- Oceania > New Zealand > North Island > Waikato (0.04)
- North America > United States > California (0.04)
- Asia > Middle East > Oman > Muscat Governorate > Muscat (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area (1.00)
- Education (1.00)
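The complement and entropy quantities named in the abstract above can be illustrated with a minimal sketch. The skill names and values below are hypothetical placeholders, not taken from the paper's dataset, and the entropy here is the standard Shannon entropy of a binary outcome:

```python
import math

# Hypothetical skill-set outcomes for one student: 1 = skill demonstrated, 0 = not.
# (Illustrative data only; the paper's actual dataset is not reproduced here.)
skills = {"algebra": 1, "recursion": 0, "sql": 1, "proofs": 0}

# Probability that a randomly chosen skill event has occurred,
# and its complement: the events that require subject-matter relearning.
p_occurred = sum(skills.values()) / len(skills)
p_not_occurred = 1 - p_occurred

def entropy(p):
    """Shannon entropy (in bits) of a binary distribution with Pr = p."""
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

print(p_occurred)           # 0.5
print(p_not_occurred)       # 0.5
print(entropy(p_occurred))  # 1.0 bit: maximal uncertainty about performance
```

With half the skills demonstrated, the entropy is at its maximum of 1 bit, i.e., the skill-set data is least informative about the student's performance; as p approaches 0 or 1 the entropy falls toward 0 and a recommendation becomes more clear-cut.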
The US Has Failed to Pass AI Regulation. New York City Is Stepping Up
As the US federal government struggles to meaningfully regulate AI--or even function--New York City is stepping into the governance gap. The city introduced an AI Action Plan this week that mayor Eric Adams calls a first of its kind in the nation. The set of roughly 40 policy initiatives is designed to protect residents against harms like bias or discrimination from AI. It includes the development of standards for AI purchased by city agencies and new mechanisms to gauge the risk of AI used by city departments. New York's AI regulation could soon expand further.
- North America > United States > New York (0.94)
- North America > United States > District of Columbia > Washington (0.06)
- Law > Statutes (0.91)
- Government > Regional Government > North America Government > United States Government (0.33)
I Asked AI Chatbots to Help Me Shop. They All Failed
Like people in many fields, we here on the WIRED Gear desk are mildly concerned that ChatGPT is coming for our jobs. But we feel relatively safe because it's our job to test things, and AI can't really do that. A large language model can't pedal an ebike. A chatbot can't see the curves of a Dynamic Island. A cloud service can't tell you whether a grill cooked a burger evenly.
AI-Powered Hiring Tools Have Failed to Reduce Bias, New Study Claims
In recent years, there has been an increase in the use of AI tools advertised as a solution to the lack of diversity in the workforce. These tools range from chatbots to CV scrapers that aid companies in hiring employees. Users of such tools claim they eliminate gender and ethnic biases in hiring by using algorithms that analyze job applicants through their speech patterns, expressions, and other traits. However, researchers from Cambridge's Centre for Gender Studies contend in a recent report published in Philosophy and Technology that AI recruiting tools are superficial and amount to "automated pseudoscience." They call them a risky instance of "technosolutionism": using technology to address complex issues like discrimination without making the necessary investments or changes to organizational culture.
Three Edge Cases That AI Has Failed
Even if you haven't realized it, AI has changed every aspect of our lives. A maps app predicts the traffic and offers you the fastest route while you're trying to get to a meeting. The dress you want to buy appears in the ad box of a random website. Netflix recommends a show from your favourite genre. These are just a few of the endless examples of how AI is making our daily lives easier.