AITopics | language model work

language model work

Does In-IDE Calibration of Large Language Models work at Scale?

Koohestani, Roham, Sergeyuk, Agnia, Gros, David, Spiess, Claudio, Titov, Sergey, Devanbu, Prem, Izadi, Maliheh

arXiv.org Artificial Intelligence | Oct-28-2025

The introduction of large language models into integrated development environments (IDEs) is revolutionizing software engineering, yet it poses challenges to the usefulness and reliability of Artificial Intelligence-generated code. Post-hoc calibration of internal model confidences aims to align probabilities with an acceptability measure. Prior work suggests calibration can improve alignment, but at-scale evidence is limited. In this work, we investigate the feasibility of applying calibration of code models to an in-IDE context. We study two aspects of the problem: (1) the technical method for implementing confidence calibration and improving the reliability of code generation models, and (2) the human-centered design principles for effectively communicating reliability signals to developers. First, we develop a scalable and flexible calibration framework which can be used to obtain calibration weights for open-source models using any dataset, and evaluate whether calibrators improve the alignment between model confidence and developer acceptance behavior. Through a large-scale analysis of over 24 million real-world developer interactions across multiple programming languages, we find that a general, post-hoc calibration model based on Platt scaling does not, on average, improve the reliability of model confidence signals. We also find that while dynamically personalizing calibration to individual users can be effective, its effectiveness is highly dependent on the volume of user interaction data. Second, we conduct a multi-phase design study with 3 expert designers and 153 professional developers, combining scenario-based design, semi-structured interviews, and survey validation, revealing a clear preference for presenting reliability signals via non-numerical, color-coded indicators within the in-editor code generation workflow.
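As a rough illustration of the post-hoc calibration the abstract refers to, the sketch below fits a Platt-style calibrator: two parameters `a` and `b` such that `sigmoid(a * logit(confidence) + b)` approximates the probability that a suggestion is accepted. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the choice to apply the sigmoid to the logit of the raw confidence, the plain gradient-descent fit, and the synthetic data. The data are generated so that the raw confidences are deliberately miscalibrated, so the improvement it prints says nothing about the paper's real-world finding.

```python
import math
import random


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(p: float, eps: float = 1e-6) -> float:
    # Inverse of the sigmoid, clipped to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def fit_platt(confidences, accepted, lr=0.1, steps=500):
    """Fit a, b minimizing log loss of sigmoid(a * logit(conf) + b)."""
    a, b = 1.0, 0.0  # start from the identity mapping (no calibration)
    xs = [logit(c) for c in confidences]
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, accepted):
            err = sigmoid(a * x + b) - y  # d(log loss)/d(pre-sigmoid score)
            grad_a += err * x
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(conf: float, a: float, b: float) -> float:
    # Map a raw confidence to a calibrated acceptance probability.
    return sigmoid(a * logit(conf) + b)


def log_loss(probs, labels, eps=1e-12):
    # Mean negative log-likelihood of binary labels under probs.
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Synthetic, hypothetical data: an overconfident model whose true
# acceptance probability is sigmoid(0.5 * logit(conf) - 0.5).
rng = random.Random(0)
raw_conf = [rng.uniform(0.05, 0.95) for _ in range(1000)]
accepted = [
    1 if rng.random() < sigmoid(0.5 * logit(c) - 0.5) else 0
    for c in raw_conf
]

a, b = fit_platt(raw_conf, accepted)
before = log_loss(raw_conf, accepted)
after = log_loss([calibrate(c, a, b) for c in raw_conf], accepted)
print(f"a={a:.3f} b={b:.3f} log loss before={before:.4f} after={after:.4f}")
```

A personalized variant, in the spirit of the per-user calibration the abstract mentions, would fit a separate `(a, b)` pair per user; with few interactions per user, such a fit would be noisy, which is consistent with the abstract's remark about dependence on data volume.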

large language model, machine learning, natural language, (17 more...)

arXiv:2510.22614

Country:

Europe (0.68)
North America > United States > California > Yolo County > Davis (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

We know remarkably little about how AI language models work

MIT Technology Review | Sep-5-2023, 10:03:05 GMT

A growing number of experts have called for these tests to be ditched, saying they boost AI hype and create "the illusion that [AI language models] have greater capabilities than what truly exists." What stood out to me in Will's story is that we know remarkably little about how AI language models work and why they generate the things they do. With these tests, we're trying to measure and glorify their "intelligence" based on their outputs, without fully understanding how they function under the hood. Our tendency to anthropomorphize makes this messy: "People have been giving human intelligence tests--IQ tests and so on--to machines since the very beginning of AI," says Melanie Mitchell, an artificial-intelligence researcher at the Santa Fe Institute in New Mexico. "The issue throughout has been what it means when you test a machine like this. It doesn't mean the same thing that it means for a human."

ai language model work, language model work, university, (2 more...)

Country:

North America > United States > New Mexico (0.27)
North America > United States > California > Los Angeles County > Los Angeles (0.19)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.19)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Creativity & Intelligence (0.59)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Meta AI Giving Away Its New Large Language Model

#artificialintelligence | May-12-2022, 00:23:58 GMT

AI researchers at Meta have created a massive new language model to rival OpenAI's GPT-3 and advance our understanding of large language models. Meta is giving the model away as part of its effort to democratize AI. Open Pretrained Transformer (OPT-175B) is a language model with 175 billion parameters trained on publicly available data sets. According to Meta, 992 Nvidia A100 GPUs, each with 80GB of onboard memory, were used over a training period of two months. To facilitate "community engagement", the release includes the pre-trained model, extensive notes about its development, a logbook detailing the training process, and the code needed to train and use the model.

language model, meta ai, opt-175b, (9 more...)

Industry: Information Technology (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Learning Data Science from Real-World Projects

#artificialintelligence | Jan-14-2022, 07:25:12 GMT

Mixed-integer programming saves the day. Taking a cue from consumer supply chains and the data-driven advances that have revolutionized them in recent decades, Gabe Verzino walks us through a scheduling program that would empower both patients and healthcare providers to use their time more efficiently. Bayes' Theorem might sound, well, theoretical. As Khuyen Tran shows in her recent tutorial (based on the traffic patterns of her own website), it can also be a powerful tool for detecting and analyzing change points in your data. The road to the perfect shot of espresso passes through a lot of data.

learning data science, patient and healthcare provider, real-world project, (12 more...)

Industry: Health & Medicine (0.62)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.42)

Generating Beatles' Lyrics with Machine Learning - Towards Data Science

#artificialintelligence | Jul-19-2019, 23:39:30 GMT

The Beatles were a huge cultural phenomenon. Their timeless music still resonates with people today, both young and old. In my humble opinion, they are the greatest band to have ever lived¹. Their songs are full of interesting lyrics and deep ideas: "When you've seen beyond yourself / Then you may find peace of mind is waiting there"². However, the thing that made the Beatles great was their versatility.

large language model, lyric, machine learning, (17 more...)

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Industry:

Leisure & Entertainment (0.69)
Media > Music (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.32)
