AITopics

2306.06528

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report (0.64)
Instructional Material (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningSep-29-2023

Statistical physics, Bayesian inference and neural information processing

Grant, Erin, Nestler, Sandra, Şimşek, Berfin, Solla, Sara

Lecture notes from the course given by Professor Sara A. Solla at the Les Houches summer school on "Statistical physics of Machine Learning". The notes discuss neural information processing through the lens of Statistical Physics. Contents include Bayesian inference and its connection to a Gibbs description of learning and generalization, Generalized Linear Models as a controlled alternative to backpropagation through time, and linear and non-linear techniques for dimensionality reduction.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

2309.17006

Country:

North America > United States > Utah (0.04)
North America > United States > Illinois > Cook County > Evanston (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Arbel, Julyan, Pitas, Konstantinos, Vladimirova, Mariia, Fortuin, Vincent

A Primer on Bayesian Neural Networks: Review and Debates

arXiv.org Machine LearningSep-28-2023

Neural networks have achieved remarkable performance across various problem domains, but their widespread applicability is hindered by inherent limitations such as overconfidence in predictions, lack of interpretability, and vulnerability to adversarial attacks. To address these challenges, Bayesian neural networks (BNNs) have emerged as a compelling extension of conventional neural networks, integrating uncertainty estimation into their predictive capabilities. This comprehensive primer presents a systematic introduction to the fundamental concepts of neural networks and Bayesian inference, elucidating their synergistic integration for the development of BNNs. The target audience comprises statisticians with a potential background in Bayesian methods but lacking deep learning expertise, as well as machine learners proficient in deep neural networks but with limited exposure to Bayesian statistics. We provide an overview of commonly employed priors, examining their impact on model behavior and performance. Additionally, we delve into the practical considerations associated with training and inference in BNNs. Furthermore, we explore advanced topics within the realm of BNN research, acknowledging the existence of ongoing debates and controversies. By offering insights into cutting-edge developments, this primer not only equips researchers and practitioners with a solid foundation in BNNs, but also illuminates the potential applications of this dynamic field. As a valuable resource, it fosters an understanding of BNNs and their promising prospects, facilitating further advancements in the pursuit of knowledge and innovation.

international conference, neural information processing system, neural network, (12 more...)

arXiv.org Machine Learning

2309.16314

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
(6 more...)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.65)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceSep-28-2023

RLLTE: Long-Term Evolution Project of Reinforcement Learning

Yuan, Mingqi, Zhang, Zequn, Xu, Yang, Luo, Shihao, Li, Bo, Jin, Xin, Zeng, Wenjun

We present RLLTE: a long-term evolution, extremely modular, and open-source framework for reinforcement learning (RL) research and application. Beyond delivering top-notch algorithm implementations, RLLTE also serves as a toolkit for developing algorithms. More specifically, RLLTE decouples the RL algorithms completely from the exploitation-exploration perspective, providing a large number of components to accelerate algorithm development and evolution. In particular, RLLTE is the first RL framework to build a complete and luxuriant ecosystem, which includes model training, evaluation, deployment, benchmark hub, and large language model (LLM)-empowered copilot. RLLTE is expected to set standards for RL engineering practice and be highly stimulative for industry and academia.

algorithm, env, rllte, (14 more...)

2309.16382

Country:

Europe > Portugal > Braga > Braga (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Shirafuji, Atsushi, Rahman, Md. Mostafizer, Amin, Md Faizul Ibne, Watanobe, Yutaka

Program Repair with Minimal Edits Using CodeT5

Programmers often struggle to identify and fix bugs in their programs. In recent years, many language models (LMs) have been proposed to fix erroneous programs and support error recovery. However, the LMs tend to generate solutions that differ from the original input programs. This leads to potential comprehension difficulties for users. In this paper, we propose an approach to suggest a correct program with minimal repair edits using CodeT5. We fine-tune a pre-trained CodeT5 on code pairs of wrong and correct programs and evaluate its performance with several baseline models. The experimental results show that the fine-tuned CodeT5 achieves a pass@100 of 91.95% and an average edit distance of the most similar correct program of 6.84, which indicates that at least one correct program can be suggested by generating 100 candidate programs. We demonstrate the effectiveness of LMs in suggesting program repair with minimal edits for solving introductory programming problems.

correct program, edit distance, program repair, (15 more...)

doi: 10.1109/iCAST57874.2023.10359288

2309.1476

Country:

Asia > Japan (0.05)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre:

Instructional Material (0.68)
Research Report > New Finding (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Jamieson, Peter, Bhunia, Suman, Rao, Dhananjai M.

With ChatGPT, do we have to rewrite our learning objectives -- CASE study in Cybersecurity

With the emergence of Artificial Intelligent chatbot tools such as ChatGPT and code writing AI tools such as GitHub Copilot, educators need to question what and how we should teach our courses and curricula in the future. In reality, automated tools may result in certain academic fields being deeply reduced in the number of employable people. In this work, we make a case study of cybersecurity undergrad education by using the lens of ``Understanding by Design'' (UbD). First, we provide a broad understanding of learning objectives (LOs) in cybersecurity from a computer science perspective. Next, we dig a little deeper into a curriculum with an undergraduate emphasis on cybersecurity and examine the major courses and their LOs for our cybersecurity program at Miami University. With these details, we perform a thought experiment on how attainable the LOs are with the above-described tools, asking the key question ``what needs to be enduring concepts?'' learned in this process. If an LO becomes something that the existence of automation tools might be able to do, we then ask ``what level is attainable for the LO that is not a simple query to the tools?''. With this exercise, we hope to establish an example of how to prompt ChatGPT to accelerate students in their achievements of LOs given the existence of these new AI tools, and our goal is to push all of us to leverage and teach these tools as powerful allies in our quest to improve human existence and knowledge.

bloom, chatgpt, curriculum, (16 more...)

2311.06261

Country:

North America > United States > Ohio (0.05)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

HPCR: Holistic Proxy-based Contrastive Replay for Online Continual Learning

Lin, Huiwei, Feng, Shanshan, Zhang, Baoquan, Li, Xutao, Ong, Yew-soon, Ye, Yunming

Online continual learning (OCL) aims to continuously learn new data from a single pass over the online data stream. It generally suffers from the catastrophic forgetting issue. Existing replay-based methods effectively alleviate this issue by replaying part of old data in a proxy-based or contrastive-based replay manner. In this paper, we conduct a comprehensive analysis of these two replay manners and find they can be complementary. Inspired by this finding, we propose a novel replay-based method called proxy-based contrastive replay (PCR), which replaces anchor-to-sample pairs with anchor-to-proxy pairs in the contrastive-based loss to alleviate the phenomenon of forgetting. Based on PCR, we further develop a more advanced method named holistic proxy-based contrastive replay (HPCR), which consists of three components. The contrastive component conditionally incorporates anchor-to-sample pairs to PCR, learning more fine-grained semantic information with a large training batch. The second is a temperature component that decouples the temperature coefficient into two parts based on their impacts on the gradient and sets different values for them to learn more novel knowledge. The third is a distillation component that constrains the learning process to keep more historical knowledge. Experiments on four datasets consistently demonstrate the superiority of HPCR over various state-of-the-art methods.

continual learning, gradient, learning, (16 more...)

2309.15038

Country:

Asia > China > Heilongjiang Province > Harbin (0.05)
Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > Singapore (0.04)
(3 more...)

Genre:

Research Report (1.00)
Instructional Material > Online (0.71)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks

Feng, Ryan, Hooda, Ashish, Mangaokar, Neal, Fawaz, Kassem, Jha, Somesh, Prakash, Atul

Recent work has proposed stateful defense models (SDMs) as a compelling strategy to defend against a black-box attacker who only has query access to the model, as is common for online machine learning platforms. Such stateful defenses aim to defend against black-box attacks by tracking the query history and detecting and rejecting queries that are "similar" and thus preventing black-box attacks from finding useful gradients and making progress towards finding adversarial attacks within a reasonable query budget. Recent SDMs (e.g., Blacklight and PIHA) have shown remarkable success in defending against state-of-the-art black-box attacks. In this paper, we show that SDMs are highly vulnerable to a new class of adaptive black-box attacks. We propose a novel adaptive black-box attack strategy called Oracle-guided Adaptive Rejection Sampling (OARS) that involves two stages: (1) use initial query patterns to infer key properties about an SDM's defense; and, (2) leverage those extracted properties to design subsequent query patterns to evade the SDM's defense while making progress towards finding adversarial inputs. OARS is broadly applicable as an enhancement to existing black-box attacks - we show how to apply the strategy to enhance six common black-box attacks to be more effective against current class of SDMs. For example, OARS-enhanced versions of black-box attacks improved attack success rate against recent stateful defenses from almost 0% to to almost 100% for multiple datasets within reasonable query budgets.

proposal distribution, query, sdm, (15 more...)

doi: 10.1145/3576915.3623116

2303.0628

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.05)
(4 more...)

Genre:

Research Report (1.00)
Instructional Material > Online (0.34)

Industry:

Transportation > Air (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

WIREDSep-25-2023, 21:07:09 GMT

FBI Agents Are Using Face Recognition Without Proper Training

The US Federal Bureau of Investigation (FBI) has done tens of thousands of face recognition searches using software from outside providers in recent years. Yet only 5 percent of the 200 agents with access to the technology have taken the bureau's three-day training course on how to use it, a report from the Government Accountability Office (GAO) this month reveals. The bureau has no policy for face recognition use in place to protect privacy, civil rights, or civil liberties. Lawmakers and others concerned about face recognition have said that adequate training on the technology and how to interpret its output is needed to reduce improper use or errors, although some experts say training can lull law enforcement and the public into thinking face recognition is low risk. Since the false arrest of Robert Williams near Detroit in 2020, multiple instances have surfaced in the US of arrests after a face recognition model wrongly identified a person.

artificial intelligence, face recognition, recognition, (8 more...)

WIRED

Country: North America > United States (1.00)

Genre: Instructional Material > Course Syllabus & Notes (0.57)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology: Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)

Lindsey, Mark, Shah, Ankit, Kubala, Francis, Stern, Richard M.

Online Active Learning For Sound Event Detection

arXiv.org Artificial IntelligenceSep-25-2023

Data collection and annotation is a laborious, time-consuming prerequisite for supervised machine learning tasks. Online Active Learning (OAL) is a paradigm that addresses this issue by simultaneously minimizing the amount of annotation required to train a classifier and adapting to changes in the data over the duration of the data collection process. Prior work has indicated that fluctuating class distributions and data drift are still common problems for OAL. This work presents new loss functions that address these challenges when OAL is applied to Sound Event Detection (SED). Experimental results from the SONYC dataset and two Voice-Type Discrimination (VTD) corpora indicate that OAL can reduce the time and effort required to train SED classifiers by a factor of 5 for SONYC, and that the new methods presented here successfully resolve issues present in existing OAL methods.

event detection, online active learning

2309.1446

Genre:

Instructional Material > Online (0.60)
Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)