AITopics

2510.04349

Country:

Europe > Netherlands (0.15)
Europe > Serbia (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Software > Programming Languages (0.93)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

arXiv.org Artificial Intelligence May-29-2024

Kotlin ML Pack: Technical Report

Titov, Sergey, Evtikhiev, Mikhail, Shapkin, Anton, Smirnov, Oleg, Boytsov, Sergei, Boytsov, Sergei, Karaeva, Dariia, Sheptyakov, Maksim, Arkhipov, Mikhail, Bryksin, Timofey, Bogomolov, Egor

In this technical report, we present three novel datasets of Kotlin code: KStack, KStack-clean, and KExercises. We also describe the results of fine-tuning CodeLlama and DeepSeek models on this data. Additionally, we present a version of the HumanEval benchmark rewritten by human experts into Kotlin - both the solutions and the tests. Our results demonstrate that small, high-quality datasets (KStack-clean and KExercises) can significantly improve model performance on code generation tasks, achieving up to a 16-point increase in pass rate on the HumanEval benchmark. Lastly, we discuss potential future work in the field of improving language modeling for Kotlin, including the use of static analysis tools in the learning process and the introduction of more intricate and realistic benchmarks.
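
To make the "16-point increase in pass rate" concrete, here is a minimal sketch of how a pass rate and a gain in percentage points could be computed. The function name `pass_rate` and the per-task outcomes are hypothetical illustrations, not data or code from the report.

```python
from typing import Mapping


def pass_rate(results: Mapping[str, bool]) -> float:
    """Return the percentage of tasks whose generated solution passed all tests."""
    if not results:
        raise ValueError("results must contain at least one task")
    return 100.0 * sum(results.values()) / len(results)


# Hypothetical per-task outcomes for a base model and a fine-tuned model.
base = {"task_0": True, "task_1": False, "task_2": False, "task_3": False}
tuned = {"task_0": True, "task_1": True, "task_2": False, "task_3": False}

# The gain is a difference of two percentages, i.e. percentage points.
gain = pass_rate(tuned) - pass_rate(base)
print(f"base={pass_rate(base):.1f} tuned={pass_rate(tuned):.1f} gain={gain:.1f}")
```

A gain reported in points is the absolute difference between two pass rates, not a relative improvement.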

dataset, kotlin, pass rate, (17 more...)

2405.1925

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Middle East > Cyprus > Pafos > Paphos (0.04)
Europe > Middle East > Cyprus > Limassol > Limassol (0.04)
(2 more...)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Le, Triet H. M., Babar, M. Ali, Thai, Tung Hoang

Software Vulnerability Prediction in Low-Resource Languages: An Empirical Study of CodeBERT and ChatGPT

arXiv.org Artificial Intelligence Apr-25-2024

Background: Software Vulnerability (SV) prediction in emerging languages is increasingly important to ensure software security in modern systems. However, these languages usually have limited SV data for developing high-performing prediction models. Aims: We conduct an empirical study to evaluate the impact of SV data scarcity in emerging languages on the state-of-the-art SV prediction model and investigate potential solutions to enhance the performance. Method: We train and test the state-of-the-art model based on CodeBERT with and without data sampling techniques for function-level and line-level SV prediction in three low-resource languages - Kotlin, Swift, and Rust. We also assess the effectiveness of ChatGPT for low-resource SV prediction given its recent success in other domains. Results: Compared to the original work in C/C++ with large data, CodeBERT's performance of function-level and line-level SV prediction significantly declines in low-resource languages, signifying the negative impact of data scarcity. Regarding remediation, data sampling techniques fail to improve CodeBERT; whereas, ChatGPT showcases promising results, substantially enhancing predictive performance by up to 34.4% for the function level and up to 53.5% for the line level. Conclusion: We have highlighted the challenge and made the first promising step for low-resource SV prediction, paving the way for future research in this direction.
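
The abstract does not name the data sampling techniques that were tried. As one common example of the idea, the sketch below shows plain random oversampling of a minority class; the function `random_oversample`, the toy Kotlin snippets, and the labels are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class has equal size."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw extra copies (with replacement) only for under-represented classes.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical toy data: label 1 marks a vulnerable function, label 0 a safe one.
functions = ["fun a()", "fun b()", "fun c()", "fun d()", "fun e()"]
labels = [0, 0, 0, 0, 1]
new_x, new_y = random_oversample(functions, labels)
print(Counter(new_y))
```

Oversampling only rebalances the class counts; it adds no new vulnerable examples, which is one plausible reason such techniques may not help when data is scarce.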

codebert, prediction, sv prediction, (11 more...)

2404.1711

Country:

Europe > Italy (0.05)
Oceania > Australia > South Australia > Adelaide (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (0.69)
Research Report > Promising Solution (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Devaraj, Pranav Nataraj, P, Rakesh Teja V, Gangrade, Aaryav, R, Manoj Kumar

Development of a Legal Document AI-Chatbot

arXiv.org Artificial Intelligence Nov-21-2023

With the exponential growth of digital data and the increasing complexity of legal documentation, there is a pressing need for efficient and intelligent tools to streamline the handling of legal documents. Recent developments in the AI field, especially in chatbots, offer a compelling solution to this problem. An insight into the process of creating a Legal Documentation AI Chatbot with as many relevant features as possible within the given time frame is presented. The development of each component of the chatbot is presented in detail, and each component's workings and functionality are discussed, starting from the build of the Android app and the Langchain query processing code through to the integration of both via a Flask backend and REST API methods.

application, chatbot, query, (13 more...)

2311.12719

Country: Asia > India > Tamil Nadu > Chennai (0.06)

Genre: Research Report (0.64)

Industry: Law (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

arXiv.org Artificial Intelligence Mar-28-2023

Multi-lingual Evaluation of Code Generation Models

Athiwaratkun, Ben, Gouda, Sanjay Krishna, Wang, Zijian, Li, Xiaopeng, Tian, Yuchen, Tan, Ming, Ahmad, Wasi Uddin, Wang, Shiqi, Sun, Qing, Shang, Mingyue, Gonugondla, Sujan Kumar, Ding, Hantian, Kumar, Varun, Fulton, Nathan, Farahani, Arash, Jain, Siddhartha, Giaquinto, Robert, Qian, Haifeng, Ramanathan, Murali Krishna, Nallapati, Ramesh, Ray, Baishakhi, Bhatia, Parminder, Sengupta, Sudipta, Roth, Dan, Xiang, Bing

We present new benchmarks for evaluating code generation models: MBXP, Multilingual HumanEval, and MathQA-X. These datasets cover over 10 programming languages and are generated using a scalable conversion framework that transpiles prompts and test cases from the original Python datasets into the corresponding data in the target language. Using these benchmarks, we are able to assess the performance of code generation models in a multi-lingual fashion, and we discover the generalization ability of language models on out-of-domain languages, the advantages of multi-lingual models over mono-lingual ones, the ability of few-shot prompting to teach the model new languages, and zero-shot translation abilities even in mono-lingual settings. Furthermore, we use our code generation model to perform large-scale bootstrapping to obtain synthetic canonical solutions in several languages, which can be used for other code-related evaluations such as code insertion, robustness, or summarization tasks. Overall, our benchmarks represent a significant step towards a deeper understanding of language models' code generation abilities. We publicly release our code and datasets at https://github.com/amazon-research/mxeval.
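
The idea of transpiling a Python test case into a target language can be illustrated with a toy converter. This is only a sketch under stated assumptions, not the conversion framework from the paper: the helpers `to_kotlin` and `convert_assert` are hypothetical, Kotlin is chosen arbitrarily as the target, and only equality asserts over calls, integers, booleans, plain strings, and lists are handled.

```python
import ast


def to_kotlin(node: ast.expr) -> str:
    """Render a small subset of Python expressions as Kotlin source text."""
    if isinstance(node, ast.Constant):
        value = node.value
        if isinstance(value, bool):  # check bool before int: bool is a subclass of int
            return "true" if value else "false"
        if isinstance(value, int):
            return str(value)
        if isinstance(value, str):
            escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("$", "\\$")
            return '"' + escaped + '"'
    if isinstance(node, ast.List):
        return "listOf(" + ", ".join(to_kotlin(e) for e in node.elts) + ")"
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and not node.keywords:
        return node.func.id + "(" + ", ".join(to_kotlin(a) for a in node.args) + ")"
    if isinstance(node, ast.Compare) and len(node.ops) == 1 and isinstance(node.ops[0], ast.Eq):
        return to_kotlin(node.left) + " == " + to_kotlin(node.comparators[0])
    # Anything outside the supported subset is rejected rather than guessed.
    raise NotImplementedError(ast.dump(node))


def convert_assert(python_test: str) -> str:
    """Turn a single Python `assert a == b` statement into a Kotlin `check(...)` call."""
    stmt = ast.parse(python_test).body[0]
    if not isinstance(stmt, ast.Assert):
        raise ValueError("expected a single assert statement")
    return "check(" + to_kotlin(stmt.test) + ")"


print(convert_assert("assert add([1, 2], 3) == [1, 2, 3]"))
```

Working on the parsed syntax tree rather than on raw text keeps the mapping of literals (for example Python lists to `listOf(...)`) explicit; a real framework would also need to handle naming conventions, types, and many more expression forms.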

large language model, machine learning, natural language, (21 more...)

2210.14868

Country:

North America > Dominican Republic (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

#artificialintelligence Apr-20-2022, 15:40:23 GMT

FinTech Futures Jobs: Three UK tech jobs with great benefits

Doing great, fulfilling work is important – after all, our jobs take up a large part of our time each week, so it matters that we're engaged in what we're doing. The research backs it up: 70% of employees say that their sense of purpose is defined by their work. We think a few great benefits help sweeten the deal even more, so we've found some roles below where the work is interesting and fulfilling – and you'll get some sweet wellness, flexible working, holiday and extra benefits on top too. For even more open roles, you can head on over to our Job Board. About the company: Experian is a global information services company that provides data and analytical tools used to manage credit risk to clients around the world.

fintech futures job portal, job board, stand-out benefit, (10 more...)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > e-Commerce > Financial Technology (0.45)
Information Technology > Artificial Intelligence > Machine Learning (0.31)

#artificialintelligence Jan-26-2022, 09:33:49 GMT

Top 10 Programming Languages Recruiters are Looking For in 2022

Post pandemic, AI has become one of the top agendas for businesses as it offers enhanced customer experience, resilience, and reliability. With the advancements in machine learning, data analytics, and conversational AI, companies are finding it feasible and affordable to deploy AI tools that allow them to solve problems and increase efficiency. Here are the 10 most popular programming languages among job seekers. Python can be regarded as the future of programming languages. As per the latest statistics, Python is the main coding language for around 80% of developers.

application, programming language, programming language recruiter, (11 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence (1.00)

#artificialintelligence Jul-26-2021, 15:52:01 GMT

7 Best programming languages for beginners to learn in 2021

The world is expanding digitally and with every aspect of our lives becoming digital, the demand for computer experts is skyrocketing each day. Therefore, having knowledge of programming languages has become crucial for every IT professional. In fact, programming languages sit at the epicentre of this ever-growing field of Computer Science. If you are a beginner in programming, learning a new language or a new framework is essential. As a fresher in the field of programming language, make sure that you remain steady in both learning and coding.

beginner, javascript, programming language, (14 more...)

Industry:

Information Technology (0.73)
Leisure & Entertainment > Games (0.30)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence (1.00)

#artificialintelligence Mar-31-2021, 23:00:23 GMT

5 Software Development Trends To Embrace in 2021

In many ways, 2020 feels like a lost year. Remote work and no travel have taken a toll on the best of us. The pandemic has forced a lot of businesses to have an online presence in one way or another. Software development services have never been more important to businesses. This is why it is so important to be in the loop of the current trends taking place in the industry.

application, platform, software development trend, (7 more...)

Country: Europe (0.05)

Industry:

Information Technology (0.72)
Health & Medicine > Therapeutic Area (0.37)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Software Engineering (0.91)
Information Technology > Software > Programming Languages (0.35)
Information Technology > Communications > Social Media (0.31)

Engadget Feb-17-2021, 15:55:28 GMT

Learn how to code in 2021 with training on the 12 most popular programming languages

The more dependent we become on apps, the more demand there'll be for skilled programmers. It just so happens that learning how to code is easier than ever in 2021. In fact, we've rounded up 12 amazing deals on courses and training programs that will teach you the skills you need to start creating your own software, and they're on sale for a limited time! Go, or GoLang, is Google's open-source programming language that's designed to simplify many programming tasks. This course is perfect for beginners, as Go is one of the fastest-growing languages in the industry thanks to its ease of use and familiar syntax.

bundle, certification bundle, programming language, (13 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.96)

Industry: Education (1.00)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)