Django: Detecting Trojans in Object Detection Models via Gaussian Focus Calibration

Neural Information Processing Systems

Object detection models are vulnerable to backdoor or trojan attacks, where an attacker can inject malicious triggers into the model, leading to altered behavior during inference. As a defense mechanism, trigger inversion leverages optimization to reverse-engineer triggers and identify compromised models. While existing trigger inversion methods assume that each instance from the support set is equally affected by the injected trigger, we observe that the poison effect can vary significantly across bounding boxes in object detection models due to its dense prediction nature, leading to an undesired optimization objective misalignment issue for existing trigger reverse-engineering methods. To address this challenge, we propose the first object detection backdoor detection framework Django (Detecting Trojans in Object Detection Models via Gaussian Focus Calibration). It leverages a dynamic Gaussian weighting scheme that prioritizes more vulnerable victim boxes and assigns appropriate coefficients to calibrate the optimization objective during trigger inversion. In addition, we combine Django with a novel label proposal pre-processing technique to enhance its efficiency. We evaluate Django on 3 object detection image datasets, 3 model architectures, and 2 types of attacks, with a total of 168 models. Our experimental results show that Django outperforms 6 state-of-the-art baselines, with up to 38% accuracy improvement and 10x reduced overhead.
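The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of Gaussian-weighted per-box losses. It assumes that a lower per-box loss marks a more vulnerable box; the function names, the choice of centering the Gaussian on the minimum loss, and the `sigma` parameter are all hypothetical and are not taken from the paper.

```python
import math


def gaussian_box_weights(box_losses, sigma=1.0):
    """Return normalized weights from a Gaussian centered on the lowest loss.

    Assumption (not from the paper): the box with the lowest loss is treated
    as the most vulnerable one and therefore receives the largest weight.
    """
    if not box_losses:
        return []
    center = min(box_losses)
    raw = [math.exp(-((loss - center) ** 2) / (2 * sigma ** 2)) for loss in box_losses]
    total = sum(raw)
    return [w / total for w in raw]


def calibrated_objective(box_losses, sigma=1.0):
    """Weighted sum of per-box losses, used in place of a uniform average."""
    weights = gaussian_box_weights(box_losses, sigma)
    return sum(w * loss for w, loss in zip(weights, box_losses))


# Hypothetical per-box losses for three victim boxes under a candidate trigger.
losses = [0.2, 0.5, 3.0]
weights = gaussian_box_weights(losses)
objective = calibrated_objective(losses)
```

With these illustrative numbers, the box with loss 3.0 contributes almost nothing to the objective, whereas a uniform average would let it dominate.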


R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents

Jain, Naman, Singh, Jaskirat, Shetty, Manish, Zheng, Liang, Sen, Koushik, Stoica, Ion

arXiv.org Artificial Intelligence

Improving open-source models on real-world SWE tasks (solving GitHub issues) faces two key challenges: 1) scalable curation of execution environments to train these models, and 2) optimal scaling of test-time compute. We introduce AgentGym, the largest procedurally-curated executable gym environment for training real-world SWE-agents, consisting of more than 8.7K tasks. AgentGym is powered by two main contributions: 1) SYNGEN: a synthetic data curation recipe that enables scalable curation of executable environments using test-generation and back-translation directly from commits, thereby reducing reliance on human-written issues or unit tests. We show that this enables more scalable training leading to pass@1 performance of 34.4% on the SWE-Bench Verified benchmark with our 32B model. 2) Hybrid Test-time Scaling: we provide an in-depth analysis of two test-time scaling axes, execution-based and execution-free verifiers, demonstrating that they exhibit complementary strengths and limitations. Test-based verifiers suffer from low distinguishability, while execution-free verifiers are biased and often rely on stylistic features. Surprisingly, we find that while each approach individually saturates around 42-43%, significantly higher gains can be obtained by leveraging their complementary strengths. Overall, our approach achieves 51% on the SWE-Bench Verified benchmark, reflecting a new state-of-the-art for open-weight SWE-agents and for the first time showing competitive performance with proprietary models such as o1, o1-preview and sonnet-3.5-v2 (with tools). We will open-source our environments, models, and agent trajectories.


Unveiling Pitfalls: Understanding Why AI-driven Code Agents Fail at GitHub Issue Resolution

Chen, Zhi, Ma, Wei, Jiang, Lingxiao

arXiv.org Artificial Intelligence

AI-driven software development has rapidly advanced with the emergence of software development agents that leverage large language models (LLMs) to tackle complex, repository-level software engineering tasks. These agents go beyond just generation of final code; they engage in multi-step reasoning, utilize various tools for code modification and debugging, and interact with execution environments to diagnose and iteratively resolve issues. However, most existing evaluations focus primarily on static analyses of final code outputs, yielding limited insights into the agents' dynamic problem-solving processes. To fill this gap, we conduct an in-depth empirical study on 3,977 solving-phase trajectories and 3,931 testing-phase logs from 8 top-ranked agents evaluated on 500 GitHub issues in the SWE-Bench benchmark. Our exploratory analysis shows that Python execution errors during the issue resolution phase correlate with lower resolution rates and increased reasoning overheads. We have identified the most prevalent errors -- such as ModuleNotFoundError and TypeError -- and highlighted particularly challenging errors like OSError and database-related issues (e.g., IntegrityError) that demand significantly more debugging effort. Furthermore, we have discovered 3 bugs in the SWE-Bench platform that affect benchmark fairness and accuracy; these issues have been reported to and confirmed by the maintainers. To promote transparency and foster future research, we publicly share our datasets and analysis scripts.
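As a rough illustration of the kind of tally described above, the sketch below counts Python exception types in a log. The regular expression, the `count_error_types` helper, and the sample log are assumptions made for this example; they are not the study's own analysis scripts.

```python
import re
from collections import Counter

# Matches an exception name ending in "Error" at the start of a line,
# optionally preceded by a dotted module path (e.g. django.db.utils.).
ERROR_PATTERN = re.compile(r"^(?:[\w.]+\.)?(\w+Error)\b", re.MULTILINE)


def count_error_types(log_text: str) -> Counter:
    """Count how often each exception type appears in a log."""
    return Counter(ERROR_PATTERN.findall(log_text))


# Hypothetical log excerpt, for illustration only.
sample_log = """\
Traceback (most recent call last):
  File "run.py", line 3, in <module>
ModuleNotFoundError: No module named 'astropy'
Traceback (most recent call last):
  File "test.py", line 9, in test_save
django.db.utils.IntegrityError: UNIQUE constraint failed
TypeError: unsupported operand type(s)
ModuleNotFoundError: No module named 'sklearn'
"""

counts = count_error_types(sample_log)
most_common_error = counts.most_common(1)[0][0]
```

The pattern only catches exception classes whose names end in `Error`, so a real analysis would likely need a broader rule.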


SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

Yang, John, Jimenez, Carlos E., Wettig, Alexander, Lieret, Kilian, Yao, Shunyu, Narasimhan, Karthik, Press, Ofir

arXiv.org Artificial Intelligence

Language model (LM) agents are increasingly being used to automate complicated tasks in digital environments. Just as humans benefit from powerful software applications, such as integrated development environments, for complex tasks like software engineering, we posit that LM agents represent a new category of end users with their own needs and abilities, and would benefit from specially-built interfaces to the software they use. We investigate how interface design affects the performance of language model agents. As a result of this exploration, we introduce SWE-agent: a system that facilitates LM agents to autonomously use computers to solve software engineering tasks. SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs. We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both with a pass@1 rate of 12.5% and 87.7%, respectively, far exceeding the previous state-of-the-art achieved with non-interactive LMs. Finally, we provide insight on how the design of the ACI can impact agents' behavior and performance.


Top 100 Python Interview Questions You Must Know

#artificialintelligence

In this Python Interview Questions tutorial, I will introduce you to the most frequently asked questions in Python interviews. It is a one-stop resource for boosting your interview preparation. We have 100 questions on Python programming basics to help readers at different expertise levels get the most out of our blog. What is the difference between lists and tuples in Python? What are the key features of Python? What type of language is Python? How is Python an interpreted language? How is memory managed in Python? What is a namespace in Python?
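For the first question in that list, a short sketch may help. It shows two standard differences between lists and tuples: mutability and hashability. The variable names are illustrative only.

```python
# Lists are mutable; tuples are immutable.
languages_list = ["Python", "Go"]
languages_tuple = ("Python", "Go")

languages_list.append("Rust")  # in-place modification works on a list

try:
    languages_tuple.append("Rust")  # tuples have no append method
    tuple_mutated = True
except AttributeError:
    tuple_mutated = False

# A tuple of hashable items can be a dict key; a list cannot.
releases = {("Python", 3): "major version"}
try:
    releases[["Python", 3]] = "unhashable"
    list_as_key = True
except TypeError:
    list_as_key = False
```

Because tuples cannot change after creation, they suit fixed records and dictionary keys, while lists suit collections that grow or shrink.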


The Data Science Pro Bootcamp 2022: 75 Projects In 75 Days

#artificialintelligence

In the last century, oil was considered 'black gold'. With the industrial revolution and the emergence of the automotive industry, oil became the main driving force of human civilization. However, with time, its value dwindled due to gradual exhaustion and the shift to alternative renewable sources of energy. In the 21st century, the new driving force behind industries is data. As a matter of fact, even the automobile industry is using data to impart autonomy and improve the safety of its vehicles.


Sr. Software Engineer (Python, Django) - Remote Tech Jobs

#artificialintelligence

We are seeking a Sr. Software Engineer (Python, Django) to join an innovative company bringing automation and optimization services to new heights. This company is applying cutting-edge advances in operations research and machine learning to solve real-world challenges that will transform navigation for the future. Based in the Greater Boston area, you will have the chance to solve complex problems and see your solutions come to life in different industries through the use of an ML microservice platform that utilizes Natural Language Processing, Deep Learning, and Computer Vision. We can offer our Sr.


Remote Django openings near you -Updated September 18, 2022 - Remote Tech Jobs

#artificialintelligence

We are seeking a Sr. Software Engineer (Python, Django) to join an innovative company bringing automation and optimization services to new heights. This company is applying cutting-edge advances in operations research and machine learning to solve real-world challenges that will transform navigation for the future. Based in the Greater Boston area, you will have the chance to solve complex problems and see your solutions come to life in different industries through the use of an ML microservice platform that utilizes Natural Language Processing, Deep Learning, and Computer Vision. We can offer our Sr. In Houston, a highly reputable nationwide healthcare company seeks a Software Engineer!


Machine Learning Book Classification

#artificialintelligence

This is a step-by-step course on how to build a book classifier using machine learning. It covers NumPy, pandas, Matplotlib, scikit-learn, and Django, and at the end the predictive model is deployed on Django. What most machine learning beginners do not know is how to deploy a model they have created. How do you put a trained model into an application? Training a model and getting 80%, 85%, or 90% accuracy does not matter on its own. As an Artificial Intelligence Engineer, you should be able to put a trained model into an application.