bigger model
2025: The Year of the AI App
What a great idea I had for the first Plaintext of 2025. After following the frantic competition between OpenAI, Google, Meta, and Anthropic to churn out brainier and deeper "frontier" foundation models, I settled on a thesis about what's ahead: In the new year, those mighty trailblazers will consume billions of dollars, countless gigawatts, and all the silicon Nvidia can muster in their pursuit of AGI. We'll be bombarded by press releases boasting advanced reasoning, more tokens, and maybe even assurances that their models won't make up crazy facts. But people are tired of hearing about how AI is transformational and seeing few transformations to their day-to-day existence. Getting an AI summary of Google search results or having Facebook ask if you want to pose a follow-up question on a post doesn't make you a traveler to the neo-human future.
Accurate estimation of feature importance faithfulness for tree models
Mateusz Gajewski, Adam Karczmarz, Mateusz Rapicki, Piotr Sankowski
One of the key challenges in deploying modern machine learning models in areas such as medical diagnosis lies in the ability to indicate why a certain prediction has been made. Such an indication may be of critical importance when a human decides whether the prediction can be relied on. This is one of the reasons various aspects of the explainability of machine learning models have been the subject of extensive research lately (see, e.g., [BH21]). For some basic types of models (e.g., single decision trees), the rationale behind a prediction is easy for a human to understand. However, the predictions of more complex models (e.g., those based on neural networks or decision tree ensembles, which offer much better accuracy) are much more difficult to interpret. Accurate and concise explanations understandable to humans might not always exist. In such cases, it is still beneficial to have methods that give a flavor of which factors might have influenced the prediction the most.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Poland > Masovia Province > Warsaw (0.04)
- Europe > Poland > Greater Poland Province > Poznań (0.04)
- (4 more...)
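The abstract above motivates feature-importance explanations without spelling out a method here; as a hedged illustration only (not the paper's algorithm), the sketch below computes two standard baselines for a tree ensemble with scikit-learn: impurity-based importances and permutation importances on held-out data.

```python
# Minimal sketch, assuming scikit-learn is available. This is NOT the method
# proposed in the paper above, just two common baselines for tree ensembles.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# A small medical-diagnosis-style dataset, in the spirit of the abstract.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# 1) Impurity-based importances: cheap, but biased toward high-cardinality features.
impurity = sorted(zip(X.columns, model.feature_importances_),
                  key=lambda kv: kv[1], reverse=True)

# 2) Permutation importances on held-out data: slower, usually more faithful.
perm = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
permuted = sorted(zip(X.columns, perm.importances_mean),
                  key=lambda kv: kv[1], reverse=True)

for name, score in impurity[:5]:
    print(f"impurity     {name:30s} {score:.3f}")
for name, score in permuted[:5]:
    print(f"permutation  {name:30s} {score:.3f}")
```

Comparing the two rankings is a quick, informal check of how stable the "most influential factors" are for a given model.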
In AI, is bigger always better?
Artificial-intelligence systems that can churn out fluent text, such as OpenAI's ChatGPT, are the newest darlings of the technology industry. But when faced with mathematical queries that require reasoning to answer, these large language models (LLMs) often stumble. "A line parallel to y = 4x + 6 passes through (5, 10). What is the y-coordinate of the point where this line crosses the y-axis?" Although LLMs can sometimes answer these types of question correctly, they more often get them wrong. In one early test of its reasoning abilities, ChatGPT scored just 26% when faced with a sample of questions from the 'MATH' data set of secondary-school-level mathematical problems [1]. This is to be expected: given input text, an LLM simply generates new text in accordance with statistical regularities in the words, symbols and sentences that make up the model's training data.
- North America > Canada > Quebec > Montreal (0.15)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Washington > King County > Redmond (0.04)
- (6 more...)
- Energy (1.00)
- Education > Educational Setting (0.54)
- Information Technology > Services (0.47)
- Government > Regional Government (0.46)
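For reference, the sample question quoted in the excerpt above takes only one reasoning step: parallel lines share the slope 4, so the line through (5, 10) has intercept 10 - 4*5 = -10. A tiny sketch of that step:

```python
# The one-step reasoning the excerpt says LLMs often miss.
slope = 4                      # parallel lines share the slope of y = 4x + 6
x0, y0 = 5, 10                 # the line passes through (5, 10)
intercept = y0 - slope * x0    # from y = slope * x + b  =>  b = y0 - slope * x0
print(intercept)               # -10: the line crosses the y-axis at (0, -10)
```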
Closer to AGI? – O'Reilly
DeepMind's new model, Gato, has sparked a debate on whether artificial general intelligence (AGI) is nearer, almost at hand, just a matter of scale. Gato is a model that can solve multiple unrelated problems: it can play a large number of different games, label images, chat, operate a robot, and more. Not so many years ago, one problem with AI was that AI systems were only good at one thing. After IBM's Deep Blue defeated Garry Kasparov in chess, it was easy to say "But the ability to play chess isn't really what we mean by intelligence." A model that plays chess can't also play space wars.
TinyML in a Nutshell
Most machine learning models are created to figure out that you want to see 50% memes and 50% cute cats. To do just that, they use huge clusters of computers with CPUs, GPUs, and even TPUs to deliver these outstanding state-of-the-art artificial intelligence recommendation technologies to you. As we all know, this and much more computational hardware is used for training; GPT-3, for example, cost millions of dollars in electricity alone to train. But most of the time, running inference (that is, making predictions) with these models is computationally expensive too, which is why these energy-costly operations happen mostly in data centers far away from your phone.
The Double Descent Hypothesis Explains How Bigger Models can Hurt Performance
🔆🔅 Go Big First, Then Compress
Conventional wisdom in machine learning (ML) tells us that bigger models are better. In the current ML ecosystem, dominated by supervised learning models, the mantra is to go big. Bigger deep learning models tend to outperform smaller versions in most deep learning scenarios. However, bigger models are also slow, expensive to run, and difficult to operate. Model compression is one of the techniques that helps address those limitations.
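The excerpt does not name a specific compression technique; as one hedged example (my own sketch, not the article's recipe), post-training dynamic quantization in PyTorch stores the weights of Linear layers as int8, shrinking a trained model while keeping its inference API:

```python
# Minimal sketch, assuming PyTorch is installed. Dynamic quantization is just
# one compression technique; pruning, distillation, and low-rank factorization
# are common alternatives.
import io

import torch
import torch.nn as nn

# Hypothetical stand-in for a "big" trained model.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))

# Post-training dynamic quantization: Linear weights are stored as int8 and
# dequantized on the fly, shrinking the model and often speeding up CPU inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def serialized_mb(m: nn.Module) -> float:
    # Serialize the state_dict to measure its size, as in PyTorch's tutorials.
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

print(f"fp32 model: {serialized_mb(model):.2f} MB")
print(f"int8 model: {serialized_mb(quantized):.2f} MB")

# The inference API is unchanged.
x = torch.randn(1, 1024)
print(quantized(x).shape)  # torch.Size([1, 10])
```

Which technique pays off depends on the deployment target; quantization is usually the cheapest to try because it needs no retraining.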
Why a major AI Revolution is coming, but it's not what you think -- AAAI 2020
You already know that Deep Learning is good at vision, translation, playing games, and other tasks. But Neural Networks don't "learn" the way humans do; they are just really good at fast pattern matching. Today's research mainly focuses on bigger models, larger datasets, and complicated loss functions. But the next revolution is likely going to be more fundamental. Let's take a look at two approaches: adding logic with Stacked Capsule Autoencoders, and Self-Supervised Learning at scale. This about sums up what most AI scientists already know: Deep Learning is really good at doing narrow, pattern-based tasks such as object or speech recognition.
Why Big Is Not Always Better In Machine Learning
Neural networks are trained to fit the data exactly. Such models would usually be considered over-fitted, and yet they manage to obtain high accuracy on test data. It is counter-intuitive -- but it works. This has raised many eyebrows, especially regarding the mathematical foundations of machine learning and their relevance to practitioners. To address these contradictions, researchers at OpenAI, in their recent work, take a hard look at the widely held belief that bigger is better. The paper attempts to reconcile the classical understanding and modern practice within a unified performance curve.
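The unified performance curve mentioned above is the double descent curve. A minimal toy sketch (my own setup with random ReLU features and minimum-norm least squares, not the experiment from the OpenAI paper) typically reproduces the spike in test error near the interpolation threshold, where the number of features roughly equals the number of training points:

```python
# Toy double-descent demo (hypothetical setup, no data from the paper).
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    return np.sin(2 * np.pi * x)

n_train, n_test = 20, 200
x_train = rng.uniform(-1, 1, n_train)
y_train = target(x_train) + 0.1 * rng.standard_normal(n_train)
x_test = np.linspace(-1, 1, n_test)
y_test = target(x_test)

def relu_features(x, W, b):
    # Random ReLU features: phi_j(x) = max(0, W_j * x + b_j) with fixed random W, b.
    return np.maximum(0.0, np.outer(x, W) + b)

for width in [2, 5, 10, 15, 20, 25, 40, 100, 400, 1000]:
    W = rng.standard_normal(width)
    b = rng.standard_normal(width)
    Phi_train = relu_features(x_train, W, b)
    Phi_test = relu_features(x_test, W, b)
    # lstsq returns the minimum-norm least-squares solution, the implicit bias
    # assumed in most double-descent analyses of over-parameterized models.
    coef, *_ = np.linalg.lstsq(Phi_train, y_train, rcond=None)
    test_mse = np.mean((Phi_test @ coef - y_test) ** 2)
    print(f"width={width:5d}  test MSE={test_mse:8.3f}")
```

On this toy setup the test error usually rises as the width approaches n_train, peaks near width = n_train, and then falls again for much wider models, which is the "bigger can hurt, then help" shape the excerpt refers to.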
Facebook's latest giant language AI hits computing wall at 500 Nvidia GPUs – ZDNet
Facebook's giant "XLM-R" neural network is engineered to work word problems across 100 different languages, including Swahili and Urdu, but it runs up against computing constraints even using 500 of Nvidia's world-class GPUs. With the trend toward bigger and bigger machine learning models, state-of-the-art artificial intelligence research continues to run up against the limits of conventional computing technology. Last week Facebook researchers published a report on their invention, XLM-R, a natural language model based on the wildly popular Transformer model from Google. XLM-R is engineered to be able to perform translations between one hundred different languages. It builds upon work that Conneau did earlier this year with Guillaume Lample at Facebook, the creation of the initial XLM.
- Information Technology > Services (0.65)
- Information Technology > Hardware (0.63)