GiFT: Gibbs Fine-Tuning for Code Generation

arXiv.org Artificial Intelligence

Training Large Language Models (LLMs) with synthetic data is a prevalent practice in code generation. A key approach is self-training, where LLMs are iteratively trained on self-generated correct code snippets. In this case, the self-generated codes are drawn from a conditional distribution, conditioned on a specific seed description. However, the seed description is not the only valid representation that aligns with its intended meaning. With all valid descriptions and codes forming a joint space, codes drawn from the conditional distribution would lead to an underrepresentation of the full description-code space. As such, we propose Gibbs Fine-Tuning (GiFT), a novel self-training method inspired by Gibbs sampling. GiFT allows self-generated data to be drawn from the marginal distribution of the joint space, thereby mitigating the biases inherent in conditional sampling. We provide a theoretical analysis demonstrating the potential benefits of fine-tuning LLMs with code derived from the marginal distribution. Furthermore, we propose a perplexity-based code selection method to mitigate the imbalanced long-tail distribution of the self-generated codes. Empirical evaluation of two LLMs across four datasets demonstrates that GiFT achieves superior performance, particularly on more challenging benchmarks.
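The difference between conditional and marginal sampling described above can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a small, hand-made joint table over hypothetical descriptions (`d1`, `d2`) and codes (`c1`, `c2`, `c3`), and it replaces the two LLM calls (code from description, description from code) with table lookups. All names and weights are illustrative assumptions.

```python
import random
from collections import Counter

# Hypothetical joint weights over (description, code) pairs.
# These numbers are made up for illustration only.
JOINT = {
    ("d1", "c1"): 0.30,
    ("d1", "c2"): 0.10,
    ("d2", "c2"): 0.20,
    ("d2", "c3"): 0.40,
}


def sample_code_given_description(desc, rng):
    """Draw a code from p(code | description) using the toy joint table."""
    pairs = [(c, w) for (d, c), w in JOINT.items() if d == desc]
    codes, weights = zip(*pairs)
    return rng.choices(codes, weights=weights, k=1)[0]


def sample_description_given_code(code, rng):
    """Draw a description from p(description | code) using the toy joint table."""
    pairs = [(d, w) for (d, c), w in JOINT.items() if c == code]
    descs, weights = zip(*pairs)
    return rng.choices(descs, weights=weights, k=1)[0]


def gibbs_codes(seed_desc, n_steps, rng):
    """Alternate the two conditionals, starting from a seed description,
    and collect the code drawn at every step."""
    desc = seed_desc
    codes = []
    for _ in range(n_steps):
        code = sample_code_given_description(desc, rng)
        codes.append(code)
        desc = sample_description_given_code(code, rng)
    return codes


def marginal_code_distribution():
    """Exact marginal p(code), obtained by summing the joint over descriptions."""
    marginal = Counter()
    for (_, code), weight in JOINT.items():
        marginal[code] += weight
    return dict(marginal)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = gibbs_codes("d1", 50_000, rng)
    counts = Counter(samples)
    for code, p in sorted(marginal_code_distribution().items()):
        print(code, "empirical:", counts[code] / len(samples), "marginal:", p)
```

In this toy table, sampling only from the conditional on the seed `d1` can never produce `c3`, whereas the alternating chain also reaches codes that are tied to the other description, and its long-run code frequencies approach the marginal.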


Course Machine Intelligence - an Introductory

#artificialintelligence

This course focuses on the theoretical aspects of data science and machine learning. It helps students quickly gain a thorough overview of the algorithmic techniques used across various domains and applications. External links are provided to reinforce the concepts covered, along with accessible explanations of popular and influential research papers that are driving the field forward.


Assessing the Future of Artificial Intelligence

#artificialintelligence

And how can businesses, as well as members of the public, best keep themselves informed about the extent to which advances in AI may affect the economy and our society? A recent consultation by the UK House of Lords Select Committee on Artificial Intelligence has called for evidence on the economic, ethical and social implications of advances in artificial intelligence [call for evidence PDF]. The consultation poses a range of questions in particular topic areas, such as the impact of AI on society and the public perception of it, as well as ethical considerations and the role of the government in responding to AI's development and use. For example, one question, targeted at experts in the field, asks "What is the current state of artificial intelligence and what factors have contributed to this?" Another, which could be answered by a much wider audience, seeks to explore the extent to which "efforts [should] be made to improve the public's understanding of, and engagement with, artificial intelligence" and how those efforts should be pursued.