The Conductor and the Engine: A Path Towards Co-Designed Reasoning
Wang, Yuanxin, Filipczuk, Pawel, Garg, Anisha, Dhada, Amaan, Hassanpour, Mohammad, Bick, David, Venkatesh, Ganesh
Modern LLM reasoning relies on extensive test-time computation, driven by internal model training and external agentic orchestration. However, this synergy is often inefficient, as model verbosity and poor instruction following lead to wasted compute. We analyze this capability-cost trade-off and introduce an optimized reasoning workflow (CePO) that empowers smaller open-source models to outperform models multiple times their size. We will open-source this workflow to enable further research. Our work demonstrates a clear path toward co-designing orchestration frameworks with the underlying model capabilities to unlock powerful reasoning in small-to-medium sized models.
- Workflow (0.69)
- Research Report (0.50)
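The abstract above describes a test-time orchestration loop: sample plans, execute them, and arbitrate among candidate solutions. A minimal sketch of such a plan-then-solve workflow, assuming only a hypothetical `llm(prompt) -> text` callable; the names and prompts are illustrative, not the actual workflow's API:

```python
def plan_and_solve(question: str, llm, n_plans: int = 3) -> str:
    """Plan-then-solve orchestration; `llm` is any prompt -> text callable."""
    # 1. Sample several candidate plans (more plans = more test-time compute).
    plans = [llm(f"Outline a step-by-step plan for: {question}")
             for _ in range(n_plans)]
    # 2. Execute each plan into a complete solution.
    solutions = [llm(f"Problem: {question}\nPlan: {p}\nSolve it.")
                 for p in plans]
    # 3. Let the model arbitrate among the candidates.
    joined = "\n---\n".join(solutions)
    return llm(f"Candidates:\n{joined}\nReturn the best final answer.")
```

Because the model is passed in as a callable, the same loop can drive any chat-completion backend, and the compute budget is a single knob (`n_plans`).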
Calibrated Reasoning: An Explanatory Verifier for Dynamic and Efficient Problem-Solving
Garg, Anisha, Tekin, Engin, More, Yash, Bick, David, Neema, Nishit, Venkatesh, Ganesh
Advanced test-time computing strategies are essential for scaling reasoning models, but their effectiveness is capped by the models' poor self-evaluation. We propose a pairwise Explanatory Verifier, trained via reinforcement learning (GRPO), that produces calibrated confidence scores and associated natural language reasoning for generated solutions. Our verifier improves the accuracy and efficiency of test-time strategies like best-of-n and self-reflection. Crucially, it excels at identifying challenging failure modes, such as when both candidate solutions are identically incorrect, succeeding where standard methods like majority voting fail.
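The verifier above scores pairs of candidate solutions. A minimal sketch of how a pairwise verifier could drive best-of-n selection, assuming a hypothetical `verifier(a, b)` that returns a calibrated probability that `a` is the better solution (the paper's verifier also emits a natural-language rationale, omitted here):

```python
def best_of_n(candidates, verifier):
    """Pick a winner via sequential pairwise comparisons.

    `verifier(a, b)` is assumed to return a probability in [0, 1]
    that solution `a` beats solution `b`.
    """
    best = candidates[0]
    for challenger in candidates[1:]:
        # If the verifier favors the challenger, it becomes the incumbent.
        if verifier(best, challenger) < 0.5:
            best = challenger
    return best
```

Unlike majority voting, this selection can still separate candidates when every answer differs, and the calibrated score makes it possible to flag cases where no candidate is trusted.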
Cerebras-GPT vs LLaMA AI Model Comparison
On March 28th, Cerebras released on HuggingFace a new Open Source model called "Cerebras-GPT", trained on The Pile dataset, with GPT-3-like performance. While Cerebras-GPT isn't as capable a model at performing tasks when compared directly to models like LLaMA, ChatGPT, or GPT-4, it has one important quality that sets it apart: it's been released under the Apache 2.0 license, a fully permissive Open Source license, and the weights are available for anybody to download and try out. This is different from other models like LLaMA: while their weights are freely available, their license restricts LLaMA's usage to "Non-Commercial" use cases like academic research or personal tinkering. That means if you'd like to check out LLaMA you'll have to get access to a powerful GPU to run it or use a volunteer-run service like KoboldAI. You can't just go to a website like you can with ChatGPT and expect to start feeding it prompts.
Andromeda - Cerebras
Andromeda is one of the largest AI supercomputers ever built. It delivers more than 1 Exaflop of AI compute and 120 Petaflops of dense compute. Andromeda is the only AI supercomputer to ever demonstrate near-perfect linear scaling on large language model workloads, and is extremely simple to use. Unlike any known GPU-based cluster, Andromeda delivers near-perfect scaling across GPT-class large language models, including GPT-3, GPT-J and GPT-NeoX. Near-perfect scaling means that as additional CS-2s are used, training time is reduced in near-perfect proportion.
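Near-perfect linear scaling has a simple quantitative meaning: with n CS-2 systems, training time should fall to roughly 1/n of the single-system time. A small sketch of the standard parallel-efficiency calculation; the numbers below are illustrative, not measured Andromeda figures:

```python
def scaling_efficiency(t1: float, tn: float, n: int) -> float:
    """Parallel efficiency: actual speedup (t1 / tn) over ideal speedup n.

    A value of 1.0 means perfectly linear scaling: n systems cut
    training time exactly n-fold.
    """
    return (t1 / tn) / n

# Hypothetical example: 160 hours on one system, 10.5 hours on 16.
eff = scaling_efficiency(160, 10.5, 16)  # ~0.95, i.e. near-linear
```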
This AI Supercomputer Has 13.5 Million Cores--and Was Built in Just Three Days
Artificial intelligence is on a tear. Machines can speak, write, play games, and generate original images, video, and music. But as AI's capabilities have grown, so too have its algorithms. A decade ago, machine learning algorithms relied on tens of millions of internal connections, or parameters. Today's algorithms regularly reach into the hundreds of billions and even trillions of parameters.
GPT-4 Rumors From Silicon Valley
GPT-4 is possibly the most anticipated AI model in history. In 2020, GPT-3 surprised everyone with a huge performance leap from GPT-2 and set unprecedented expectations for its successor. But for two years OpenAI has been super shy about GPT-4, letting out info in dribs and drabs and remaining silent for the most part. People have been talking for months. What I've heard from several sources: GPT-4 is almost ready and will be released (hopefully) sometime December-February.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.43)
Last Week in AI #174: Cerebras sets record for largest AI model on one device, open source large language model, robotaxis paralyzed, and more!
Cerebras Systems, with its latest WSE-2 chip, has set the record for the largest AI model ever trained on a single device. The chip, which has 850,000 cores and 2.6 trillion transistors, is much larger than the largest GPUs. It has 123x more cores, 1,000x more memory, and 12,000x more bandwidth than the largest GPU. This allowed Cerebras to train a 20 billion parameter neural network model on a single chip. Doing so with GPUs would require complex compute cluster engineering and management, which could be much more expensive and only doable at large tech companies.
- Information Technology (1.00)
US Hardware Startup, Cerebras, Sets Record For Largest AI Model Being Trained On One Device
When it comes to powerful chips, the US company Cerebras has you covered. They have trained their AI model on a single device powered by Wafer Scale Engine 2, which is considered the world's largest chip in terms of processing power. According to the AI startup, a single CS-2 system may cut the engineering time and work required to train natural language processing (NLP) models from months to minutes. NLP is the branch of AI that aims to make it possible for computers to analyze and comprehend human language from text or speech data. According to Cerebras, its most recent result eliminates one of the "most unpleasant elements" of training big NLP models: distributing the model across hundreds or thousands of different GPUs.
Training a 20-Billion Parameter AI Model on a Single Processor - EETimes
Cerebras has shown off the capabilities of its second-generation wafer-scale engine, announcing it has set the record for the largest AI model ever trained on a single device. For the first time, a natural language processing network with 20 billion parameters, GPT-NeoX 20B, was trained on a single device. A new type of neural network, the transformer, is taking over. Today, transformers are mainly used for natural language processing (NLP), where their attention mechanism can help spot the relationship between words in a sentence, but they are spreading to other AI applications, including vision. The bigger a transformer is, the more accurate it is.
Cerebras sets record for largest AI model on a single chip
In brief: US hardware startup Cerebras claims to have trained the largest AI model on a single device, powered by the world's largest chip, the plate-sized Wafer Scale Engine 2. "Using the Cerebras Software Platform (CSoft), our customers can easily train state-of-the-art GPT language models (such as GPT-3 and GPT-J) with up to 20 billion parameters on a single CS-2 system," the company claimed this week. "Running on a single CS-2, these models take minutes to set up and users can quickly move between models with just a few keystrokes." The CS-2 packs a whopping 850,000 cores, and has 40GB of on-chip memory capable of reaching 20 PB/sec memory bandwidth. The specs on other types of AI accelerators and GPUs pale in comparison, meaning machine learning engineers have to spread huge AI models with billions of parameters across many more servers.
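From the quoted specs, rough per-core figures follow directly. A small sketch of that arithmetic (decimal unit conversions assumed; the per-core numbers are derived here, not quoted by Cerebras):

```python
CORES = 850_000       # CS-2 core count (from the article)
MEM_GB = 40           # on-chip memory, GB
BW_PB_PER_S = 20      # memory bandwidth, PB/s

# GB -> KB: roughly 47 KB of on-chip memory per core.
mem_per_core_kb = MEM_GB * 1e6 / CORES
# PB/s -> GB/s: roughly 23.5 GB/s of memory bandwidth per core.
bw_per_core_gb_s = BW_PB_PER_S * 1e6 / CORES
```

Each core thus sees GPU-class memory bandwidth to its small local memory, which is what lets a 20-billion-parameter model train without sharding across servers.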