performance optimization
PerfBench: Can Agents Resolve Real-World Performance Bugs?
Garg, Spandan, Moghaddam, Roshanak Zilouchian, Sundaresan, Neel
Performance bugs are inefficiencies in software that waste computational resources without causing functional failures, making them particularly challenging to detect and fix. While recent advances in Software Engineering agents have shown promise in automated bug fixing, existing benchmarks primarily focus on functional correctness and fail to evaluate agents' abilities to identify and resolve non-functional issues like performance bugs. We introduce PerfBench, a benchmark comprising 81 real-world performance bug-fixing tasks from popular .NET repositories on GitHub. Unlike existing benchmarks that rely on pre-existing test suites, PerfBench features a novel evaluation harness that allows agents to generate their own performance benchmarks and validates fixes by comparing execution metrics collected for the developer fix and the agent fix. Each task in PerfBench is derived from actual developer fixes linked to performance-related issues, which are then verified by human experts, ensuring real-world relevance. Our evaluation reveals that current state-of-the-art coding agents struggle with performance optimization tasks, with the baseline OpenHands agent achieving only a ~3% success rate on our benchmark. We develop OpenHands-Perf-Agent, which incorporates performance-aware tooling and instructions and achieves a ~20% success rate on the benchmark. We show that by ensuring the agent has proper instructions to benchmark its changes and tooling for benchmark output processing, we can improve agent performance significantly, although room for improvement remains. PerfBench provides a challenging test set for furthering the capabilities of agents in fixing performance issues.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Washington > King County > Redmond (0.04)
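The PerfBench abstract above describes validating a fix by comparing execution metrics collected for the developer fix and the agent fix. A minimal sketch of that kind of comparison follows. The metric names, the `Metrics` class, the `agent_fix_passes` function, and the 10% tolerance are all hypothetical choices made for illustration; the abstract does not state the actual acceptance rule.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Hypothetical benchmark metrics; lower is better for both fields."""
    mean_time_ms: float
    allocated_kb: float


def agent_fix_passes(developer: Metrics, agent: Metrics,
                     tolerance: float = 0.10) -> bool:
    """Assumed rule: the agent fix passes if every metric is no worse than
    the developer fix by more than `tolerance` (a relative margin)."""
    for name in ("mean_time_ms", "allocated_kb"):
        dev_value = getattr(developer, name)
        agent_value = getattr(agent, name)
        if agent_value > dev_value * (1 + tolerance):
            return False
    return True


developer_fix = Metrics(mean_time_ms=100.0, allocated_kb=512.0)
print(agent_fix_passes(developer_fix, Metrics(105.0, 500.0)))  # within margin
print(agent_fix_passes(developer_fix, Metrics(150.0, 500.0)))  # too slow
```

A relative margin is used here because benchmark timings are noisy; a real harness would likely also need repeated runs and some statistical treatment.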
An advanced AI driven database system
Tedeschi, M., Rizwan, S., Shringi, C., Chandgir, V. Devram, Belich, S.
Contemporary database systems, while effective, suffer from severe issues related to complexity and usability, especially among individuals who lack technical expertise and are unfamiliar with query languages like Structured Query Language (SQL). This paper presents a new database system supported by Artificial Intelligence (AI), which is intended to improve the management of data using natural language processing (NLP)-based intuitive interfaces and the automatic creation of structured queries and semi-structured data formats like Yet Another Markup Language (YAML), JavaScript Object Notation (JSON), and application programming interface (API) documentation. The system is intended to strengthen the potential of databases through the integration of Large Language Models (LLMs) and advanced machine learning algorithms. The integration is intended to automate fundamental tasks such as data modeling, schema creation, query comprehension, and performance optimization. We present in this paper a system that aims to alleviate the main problems with current database technologies. It is meant to reduce the need for technical skills, manual tuning for better performance, and the potential for human error. The AI database employs generative schema inference and format selection to build its schema models and execution formats.
- North America > United States > New York (0.05)
- North America > United States > Florida > Hillsborough County > University (0.05)
- North America > United States > Massachusetts (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
EditLord: Learning Code Transformation Rules for Code Editing
Li, Weichen, Jan, Albert, Ray, Baishakhi, Yang, Junfeng, Mao, Chengzhi, Pei, Kexin
Code editing is a foundational task in software development, whose effectiveness depends on whether it introduces the desired code property changes without changing the original code's intended functionality. Existing approaches often formulate code editing as an implicit end-to-end task, omitting the fact that code-editing procedures inherently consist of discrete and explicit steps. Thus, they suffer from suboptimal performance and a lack of robustness and generalization. We introduce EditLord, a code editing framework that makes the code transformation steps explicit. Our key insight is to employ a language model (LM) as an inductive learner to extract code editing rules from the training code pairs as concise meta-rule sets. Such rule sets will be manifested for each training sample to augment them for finetuning or assist in prompting- and iterative-based code editing. EditLord outperforms the state-of-the-art by an average of 22.7% in editing performance and 58.1% in robustness while achieving 20.2% higher functional correctness across critical software engineering and security applications, LMs, and editing modes.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > Canada (0.04)
- Asia (0.04)
QiMeng: Fully Automated Hardware and Software Design for Processor Chip
Zhang, Rui, Wen, Yuanbo, Cheng, Shuyao, Huang, Di, Peng, Shaohui, Guo, Jiaming, Jin, Pengwei, Zhao, Jiacheng, Ma, Tianrui, Zhu, Yaoyu, Hao, Yifan, Zhao, Yongwei, Liang, Shengwen, Wang, Ying, Hu, Xing, Du, Zidong, Cui, Huimin, Li, Ling, Guo, Qi, Chen, Yunji
Processor chip design technology serves as a key frontier driving breakthroughs in computer science and related fields. With the rapid advancement of information technology, conventional design paradigms face three major challenges: the physical constraints of fabrication technologies, the escalating demands for design resources, and the increasing diversity of ecosystems. Automated processor chip design has emerged as a transformative solution to address these challenges. While recent breakthroughs in Artificial Intelligence (AI), particularly Large Language Model (LLM) techniques, have opened new possibilities for fully automated processor chip design, substantial challenges remain in establishing domain-specific LLMs for processor chip design. In this paper, we propose QiMeng, a novel system for fully automated hardware and software design of processor chips. QiMeng comprises three hierarchical layers. In the bottom layer, we construct a domain-specific Large Processor Chip Model (LPCM) that introduces novel designs in architecture, training, and inference to address key challenges such as the knowledge representation gap, data scarcity, correctness assurance, and the enormous solution space. In the middle layer, leveraging the LPCM's knowledge representation and inference capabilities, we develop the Hardware Design Agent and the Software Design Agent to automate the design of hardware and software for processor chips. Currently, several components of QiMeng have been completed and successfully applied in various top-layer applications, demonstrating significant advantages and providing a feasible solution for efficient, fully automated hardware/software design of processor chips. Future research will focus on integrating all components and performing iterative top-down and bottom-up design processes to establish a comprehensive QiMeng system.
- Asia > China (0.04)
- North America > United States > Illinois (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Semiconductors & Electronics (1.00)
- Information Technology (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Performance Optimization of Deep Learning Sparse Matrix Kernels on Intel Max Series GPU
Zubair, Mohammad, Bauinger, Christoph
In this paper, we focus on three sparse matrix operations that are relevant for machine learning applications, namely, the sparse-dense matrix multiplication (SPMM), the sampled dense-dense matrix multiplication (SDDMM), and the composition of the SDDMM with SPMM, also termed FusedMM. We develop optimized implementations for SPMM, SDDMM, and FusedMM operations utilizing Intel oneAPI's Explicit SIMD (ESIMD) SYCL extension API. In contrast to CUDA or SYCL, the ESIMD API enables the writing of explicitly vectorized kernel code. Sparse matrix algorithms implemented with the ESIMD API achieved performance close to the peak of the targeted Intel Data Center GPU. We compare our performance results to Intel's oneMKL library on Intel GPUs and to a recent CUDA implementation for the sparse matrix operations on NVIDIA's V100 GPU and demonstrate that our implementations for sparse matrix operations outperform both.
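To make the three operations named in the abstract above concrete, here is a reference sketch in plain Python. It is not the ESIMD implementation from the paper; it only illustrates what SPMM, SDDMM, and their composition compute. The coordinate-list (COO) sparse format, the function names, and the choice to reuse `y` as the dense operand in the fused step are assumptions made for illustration.

```python
def spmm(coo, dense, n_rows):
    """Sparse-dense product: `coo` is a list of (row, col, value) triples."""
    n_cols = len(dense[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i, j, v in coo:
        for c in range(n_cols):
            out[i][c] += v * dense[j][c]
    return out


def sddmm(coo, x, y):
    """Sampled dense-dense product: evaluate x[i] . y[j] only where the
    sparse matrix has an entry, and scale it by that entry's value."""
    return [(i, j, v * sum(a * b for a, b in zip(x[i], y[j])))
            for i, j, v in coo]


def fusedmm(coo, x, y):
    """Composition of the two: SDDMM result fed into SPMM (assumed form)."""
    return spmm(sddmm(coo, x, y), y, len(x))


sparse = [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 3.0)]
x = [[1.0, 0.0], [0.0, 1.0]]
y = [[1.0, 2.0], [3.0, 4.0]]
print(spmm(sparse, y, 2))     # [[7.0, 10.0], [9.0, 12.0]]
print(sddmm(sparse, x, y))    # [(0, 0, 1.0), (0, 1, 6.0), (1, 1, 12.0)]
print(fusedmm(sparse, x, y))  # [[19.0, 26.0], [36.0, 48.0]]
```

The sketch materializes the SDDMM result before the SPMM step for clarity; whether and how an optimized kernel avoids that intermediate is outside what the abstract states.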
Performance Optimization for Variable Bitwidth Federated Learning in Wireless Networks
Wang, Sihua, Chen, Mingzhe, Brinton, Christopher G., Yin, Changchuan, Saad, Walid, Cui, Shuguang
This paper considers improving wireless communication and computation efficiency in federated learning (FL) via model quantization. In the proposed bitwidth FL scheme, edge devices train and transmit quantized versions of their local FL model parameters to a coordinating server, which aggregates them into a quantized global model and synchronizes the devices. The goal is to jointly determine the bitwidths employed for local FL model quantization and the set of devices participating in FL training at each iteration. We pose this as an optimization problem that aims to minimize the training loss of quantized FL under a per-iteration device sampling budget and delay requirement. However, the formulated problem is difficult to solve without (i) a concrete understanding of how quantization impacts global ML performance and (ii) the ability of the server to construct estimates of this process efficiently. To address the first challenge, we analytically characterize how limited wireless resources and induced quantization errors affect the performance of the proposed FL method. Our results quantify how the improvement of FL training loss between two consecutive iterations depends on the device selection and quantization scheme as well as on several parameters inherent to the model being learned. Then, we show that the FL training process can be described as a Markov decision process and propose a model-based reinforcement learning (RL) method to optimize action selection over iterations. Compared to model-free RL, this model-based RL approach leverages the derived mathematical characterization of the FL training process to discover an effective device selection and quantization scheme without imposing additional device communication overhead. Simulation results show that the proposed FL algorithm can reduce the convergence time.
- North America > United States > California > San Francisco County > San Francisco (0.16)
- North America > United States > Georgia > Fulton County > Atlanta (0.05)
- Europe > United Kingdom > England (0.05)
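The variable bitwidth federated learning abstract above relies on devices sending quantized model parameters, with the bitwidth as a tunable choice. The sketch below shows one simple way a bitwidth can control quantization error. Uniform quantization with deterministic rounding is an assumption here, and the function name and sample values are hypothetical; the abstract does not specify the quantizer.

```python
def quantize(params, bitwidth):
    """Map each parameter to the nearest of 2**bitwidth evenly spaced
    levels between the smallest and largest parameter (assumed scheme)."""
    lo, hi = min(params), max(params)
    if hi == lo:
        return list(params)
    step = (hi - lo) / (2 ** bitwidth - 1)
    return [lo + round((p - lo) / step) * step for p in params]


local_model = [0.0, 0.3, 1.0]  # hypothetical local parameters
for bits in (1, 2, 8):
    q = quantize(local_model, bits)
    worst = max(abs(a - b) for a, b in zip(local_model, q))
    print(bits, q, worst)
```

More bits give a finer grid and a smaller worst-case error, at the cost of a larger payload per device, which is the trade-off the bitwidth selection in the abstract is concerned with.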
Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization
Trümper, Lukas, Ben-Nun, Tal, Schaad, Philipp, Calotoiu, Alexandru, Hoefler, Torsten
Performance optimization is an increasingly challenging but often repetitive task. While each platform has its quirks, the underlying code transformations rely on data movement and computational characteristics that recur across applications. This paper proposes to leverage those similarities by constructing an embedding space for subprograms. The continuous space captures both static and dynamic properties of loop nests via symbolic code analysis and performance profiling, respectively. Performance embeddings enable direct knowledge transfer of performance tuning between applications, which can result from autotuning or tailored improvements. We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils. Transfer tuning reduces the search complexity by up to four orders of magnitude and outperforms the MKL library in sparse-dense matrix multiplication. The results exhibit clear correspondences between program characteristics and optimizations, outperforming prior specialized state-of-the-art approaches and generalizing beyond their capabilities.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York > New York County > New York City (0.05)
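The performance embeddings abstract above proposes transferring tuning knowledge between similar subprograms in an embedding space. A minimal sketch of the lookup step follows, assuming embeddings are plain numeric vectors and similarity is Euclidean distance. The function name, the two-dimensional vectors, and the optimization labels are hypothetical; the paper's actual embedding construction and similarity measure are not given in the abstract.

```python
import math


def nearest_tuned(query, database):
    """Return the (embedding, optimization) entry closest to `query`.
    Euclidean distance is an assumed similarity measure."""
    return min(database, key=lambda entry: math.dist(query, entry[0]))


# Hypothetical embeddings of already-tuned loop nests and what worked for them.
tuned = [
    ((0.0, 0.0), "tile"),
    ((1.0, 1.0), "vectorize"),
]
embedding, optimization = nearest_tuned((0.9, 0.8), tuned)
print(optimization)  # vectorize
```

The retrieved optimization would then be a starting candidate for the new loop nest rather than a guaranteed improvement; it still has to be applied and measured.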
Performance Evaluation, Optimization and Dynamic Decision in Blockchain Systems: A Recent Overview
Li, Quan-Lin, Chang, Yan-Xia, Wang, Qing
With the rapid development of blockchain technology and its integration into various application areas, performance evaluation, performance optimization, and dynamic decision in blockchain systems are playing an increasingly important role in developing new blockchain technology. This paper provides a recent systematic overview of this class of research, with particular attention to the development of mathematical modeling and basic theory of blockchain systems. Important examples include (a) performance evaluation: Markov processes, queuing theory, Markov reward processes, random walks, fluid and diffusion approximations, and martingale theory; (b) performance optimization: linear programming, nonlinear programming, integer programming, and multi-objective programming; (c) optimal control and dynamic decision: Markov decision processes and stochastic optimal control; and (d) artificial intelligence: machine learning, deep reinforcement learning, and federated learning. So far, little research has focused on these research lines. We believe that the basic theory with mathematical methods, algorithms and simulations of blockchain systems discussed in this paper will strongly support future development and continuous innovation of blockchain technology.
- Asia > China > Beijing > Beijing (0.04)
- Oceania > Australia (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Research Report (1.00)
- Overview (1.00)
- Information Technology > Security & Privacy (1.00)
- Banking & Finance > Trading (1.00)
- Information Technology > Services > e-Commerce Services (0.93)
- Information Technology > e-Commerce > Financial Technology (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Risk Data Analyst
Signifyd leads the world in bringing the insights, innovation and compassion required to foster fearless commerce in a time of increasing digital threats. Working with some of the industry's most recognizable retailers and brands, we are focused on using technology to enhance customer lifetime value and protect enterprises from fraud so they can focus on growing their business. We process billions in ecommerce transactions annually through our Commerce Network of thousands of merchants selling in more than 100 countries. We focus every day on harnessing machine learning and artificial intelligence in more powerful ways to maximize our customers' revenue and their security. None of that happens without the right people.
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.40)
McLaren partners with AI specialist for performance optimization
McLaren Racing has announced a new partnership with AI cloud platform developer DataRobot, which offers a unified platform that reportedly allows organizations to unlock the full potential of AI. Under the partnership, DataRobot's AI cloud technology platform will be integrated into the McLaren Racing infrastructure, delivering AI-powered predictions and insights to maximize performance and optimize simulations. Zak Brown, CEO of McLaren Racing, commented, "DataRobot is a leader in its field, bringing its innovative technology and platform to top businesses around the globe. McLaren Racing continues to lead in innovation and technology, and partnerships with the likes of DataRobot allow us to progress, improve and support our team in our ongoing push for optimum performance. We are delighted to welcome DataRobot as they join our partner family for the Qatar Grand Prix this weekend."