
Data Structures and Algorithms


ProBench: Benchmarking Large Language Models in Competitive Programming

Yang, Lei, Jin, Renren, Shi, Ling, Peng, Jianxiang, Chen, Yue, Xiong, Deyi

arXiv.org Artificial Intelligence

With reasoning language models such as OpenAI-o3 and DeepSeek-R1 emerging, large language models (LLMs) have entered a new phase of development. However, existing benchmarks for coding evaluation are increasingly inadequate for assessing the capabilities of advanced LLMs in code reasoning. To bridge this gap in high-level code reasoning assessment, we propose ProBench, a benchmark for LLMs in competitive programming inspired by the International Collegiate Programming Contest. ProBench collects a comprehensive set of competitive programming problems from the Codeforces, Luogu, and Nowcoder platforms from July to December 2024, obtaining real test results through online submissions to ensure the fairness and accuracy of the evaluation. We establish a unified problem attribute system, including difficulty grading and algorithm tagging. With the carefully collected and annotated data in ProBench, we systematically assess 9 of the latest LLMs in competitive programming across multiple dimensions, including thought-chain analysis, error type diagnosis, and reasoning depth evaluation. Experimental results show that QwQ-32B-Preview achieves the best score of 20.93, followed by DeepSeek-V3 with a score of 16.38, suggesting that models trained on specialized reasoning tasks significantly outperform general-purpose models (even those larger than the reasoning-oriented models) in programming. Further analysis also reveals key areas for improving programming capability, e.g., algorithm adaptability and reasoning sufficiency, providing important insights for the future development of reasoning models.


Artificial-Intelligence Generated Code Considered Harmful: A Road Map for Secure and High-Quality Code Generation

Chong, Chun Jie, Yao, Zhihao, Neamtiu, Iulian

arXiv.org Artificial Intelligence

Generating code via an LLM (rather than writing it from scratch) has exploded in popularity. However, the security implications of LLM-generated code are still largely unknown. We performed a study that compared the security and quality of human-written code with that of LLM-generated code across a wide range of programming tasks, including data structures, algorithms, cryptographic routines, and LeetCode questions. To assess code security we used unit testing, fuzzing, and static analysis. For code quality, we focused on complexity and size. We found that LLMs can generate incorrect code that fails to implement the required functionality, especially for more complicated tasks, and such errors can be subtle. For example, for the cryptographic algorithm SHA1, the LLM generated an incorrect implementation that nevertheless compiles. Even in cases where the functionality was correct, we found that LLM-generated code is less secure, primarily due to the lack of defensive programming constructs, which invites a host of security issues such as buffer overflows or integer overflows. Fuzzing revealed that LLM-generated code is more prone to hangs and crashes than human-written code. Quality-wise, we found that the LLM generates bare-bones code that lacks defensive programming constructs and is typically more complex (per line of code) than human-written code. Next, we constructed a feedback loop that asked the LLM to regenerate the code and eliminate the issues found (e.g., malloc overflow, array index out of bounds, null dereferences). We found that the LLM fails to eliminate such issues consistently: while it succeeded in some cases, we found instances where the regenerated, supposedly more secure code contained new issues; we also found that, upon prompting, the LLM can introduce issues into files that were issue-free before prompting.


Generative AI and CS Education

Communications of the ACM

I have spent most of my career working on computer science (CS) education, whether teaching undergraduate CS or managing technical education for software engineers at Google. In the early 1990s, when Pascal was the language of choice, I began teaching CS1 and CS2 at Stanford. Over the next few years, I saw the transition from Pascal to C to object-oriented programming. I also saw the pace at which we had to consistently update our course materials and projects, whether in the introductory courses or in later electives such as graphics or compilers. Languages, software frameworks, libraries, APIs, and so forth change rapidly.


How to use Binary Search Trees part 1 (Data Structures and Algorithms)

#artificialintelligence

Abstract: This paper presents a parallel solution based on the coarse-grained multicomputer (CGM) model, using the four-splitting technique to solve the optimal binary search tree problem. The well-known sequential algorithm of Knuth solves this problem in O(n²) time and space, where n is the number of keys used to build the optimal binary search tree. To parallelize this algorithm on the CGM model, the irregular partitioning technique, which consists of subdividing the dependency graph into subgraphs (or blocks) of variable size, has been proposed to tackle the trade-off between minimizing the number of communication rounds and balancing the processor load. This technique, however, induces high processor latency (which accounts for most of the global communication time) because varying the blocks' sizes does not enable processors to start evaluating some blocks as soon as the data they need are available. The four-splitting technique proposed in this paper addresses this shortcoming by evaluating a block as a sequence of computation and communication steps over four subblocks.
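The sequential dynamic program of Knuth that the abstract builds on can be sketched as follows. This is an illustrative single-processor version (not the paper's CGM parallelization); the input `p`, a list of per-key access frequencies, and the function name are assumptions for the example:

```python
def optimal_bst_cost(p):
    """Knuth's O(n^2) dynamic program for the optimal BST cost.

    p[i] is the access frequency of key i (keys assumed in sorted order).
    Knuth's speedup exploits the monotonicity of optimal roots,
    root[i][j-1] <= root[i][j] <= root[i+1][j], to restrict the root
    search range and cut the naive O(n^3) DP down to O(n^2).
    """
    n = len(p)
    # prefix sums so that w(i, j) = sum(p[i..j]) is O(1)
    pre = [0] * (n + 1)
    for i in range(n):
        pre[i + 1] = pre[i] + p[i]

    def w(i, j):
        return pre[j + 1] - pre[i]

    cost = [[0] * n for _ in range(n)]
    root = [[0] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = p[i]  # a single key costs its own frequency
        root[i][i] = i

    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best, best_r = float("inf"), i
            # Knuth's bound: only try roots in [root[i][j-1], root[i+1][j]]
            for r in range(root[i][j - 1], root[i + 1][j] + 1):
                left = cost[i][r - 1] if r > i else 0
                right = cost[r + 1][j] if r < j else 0
                c = left + right + w(i, j)
                if c < best:
                    best, best_r = c, r
            cost[i][j] = best
            root[i][j] = best_r
    return cost[0][n - 1]
```

The `cost` table entries form exactly the dependency graph whose block-wise evaluation the paper parallelizes: `cost[i][j]` depends on the subranges `cost[i][r-1]` and `cost[r+1][j]`.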


[100%OFF] Master Coding Interview: Data Structures + Algorithms

#artificialintelligence

I'm sure you'll love this course, so we're offering a full 30-day money-back guarantee in case you're not convinced yet! Enroll today and see you inside the course. Let's make your dreams come true!


How to Learn Python (Step-by-Step Guide) in 2022 [Updated]

#artificialintelligence

Python is a versatile programming language loved by developers, data scientists, and software engineers. Python is recommended for both beginners and advanced developers, as it is easy to learn and has clean syntax. The language is used to build web and mobile applications large and small, as it offers many useful libraries, frameworks, and modules. If you are not yet convinced about Python, let's look at the major benefits it offers. This article gives you a complete brief on how to learn Python.


Beginners Learning Path for Machine Learning

#artificialintelligence

Have you made up your mind to learn machine learning but are so confused that you don't know where to start? I faced the same confusion: what would be a good start? Should I learn Python, or go for R? Mathematics was always a scary part for me, and I was always worried about where I should learn math. I was also worried about how I could build a strong basis for machine learning. Anyway, you should be congratulated that you have at least made up your mind.


Data Structures and Algorithms in Python

#artificialintelligence

This course, Data Structures and Algorithms in Python, explains various data structures with coding examples, giving a detailed explanation of the code side by side with concept building. Linked lists, binary search trees, and stacks are explained in detail, with the concepts made easy to understand. Selection sort and insertion sort are also part of this course. This course is for students who want a good understanding of data structures and algorithms and want to understand the code. By taking this course, students will be able to use these skills to write and understand data structures in other languages as well, because the concepts built in this course are generic to data structures and algorithms.
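As a taste of the kind of material such a course covers, here is a minimal insertion sort sketch in Python. This is an illustrative example, not code taken from the course itself:

```python
def insertion_sort(items):
    """Sort a list in place by growing a sorted prefix one element at a time.

    Each pass takes the next element and shifts larger prefix elements
    one slot to the right until its position is found: O(n^2) in the
    worst case, but close to O(n) on nearly sorted input.
    """
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        # shift elements of the sorted prefix that are greater than key
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key
    return items
```

For example, `insertion_sort([5, 2, 4, 6, 1, 3])` returns `[1, 2, 3, 4, 5, 6]`.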