 language program


LangProBe: a Language Programs Benchmark

arXiv.org Artificial Intelligence

Composing language models (LMs) into multi-step language programs and automatically optimizing their modular prompts is now a mainstream paradigm for building AI systems, but the tradeoffs in this space have scarcely been studied. We introduce LangProBe, the first large-scale benchmark for evaluating the architectures and optimization strategies for language programs, with over 2000 combinations of tasks, architectures, optimizers, and choices of LMs. Using LangProBe, we are the first to study the impact of program architectures and optimizers (and their compositions together and with different models) on tradeoffs of quality and cost. We find that optimized language programs offer strong cost-quality Pareto improvement over raw calls to models, but simultaneously demonstrate that human judgment (or empirical decisions) about which compositions to pursue is still necessary for best performance. We will open source the code and evaluation data for LangProBe.


AI: The pattern is not in the data, it's in the machine

#artificialintelligence

A neural network transforms input, the circles on the left, to output, on the right. How that happens is a transformation of weights, center, which we often confuse for patterns in the data itself. It's a commonplace of artificial intelligence to say that machine learning, which depends on vast amounts of data, functions by finding patterns in data. The phrase, "finding patterns in data," in fact, has been a staple phrase of things such as data mining and knowledge discovery for years now, and it has been assumed that machine learning, and its deep learning variant especially, are just continuing the tradition of finding such patterns. AI programs do, indeed, result in patterns, but, just as "The fault, dear Brutus, lies not in our stars but in ourselves," the fact of those patterns is not something in the data, it is what the AI program makes of the data.


Watch out, GPT-3, here comes AI21's 'Jurassic' language model

#artificialintelligence

Such is just one of the attributes of Jurassic, a computer program introduced Wednesday by Tel Aviv-based artificial intelligence startup AI21 Labs. GPT-3, of course, is the language program from the San Francisco-based startup OpenAI that rocked the world in 2020 by generating sentences and whole articles that seemed quite human-like. GPT-3 also shocked the world by being kept inside a fairly restrictive beta testing arrangement by OpenAI. AI21 is promising to go OpenAI not one better, but two better, with what it claims are superior benchmark results on a test known as "few shot learning," and a more open program for beta testers. On the latter score, AI21 is making development use of the program available as an "open beta," it said, where anyone can sign up to use the program and there is "no wait list."