DeepSeek might not be such good news for energy after all
Add to that the fact that other tech firms, inspired by DeepSeek's approach, may now start building their own similar low-cost reasoning models, and the outlook for energy consumption is already looking a lot less rosy. The life cycle of any AI model has two phases: training and inference. Training is the often months-long process in which the model learns from data. The model is then ready for inference, which happens each time anyone in the world asks it something. Both usually take place in data centers, where they require lots of energy to run chips and cool servers. On the training side for its R1 model, DeepSeek's team improved what's called a "mixture of experts" technique, in which only a portion of a model's billions of parameters (the "knobs" a model uses to form better answers) are turned on at a given time during training.
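To make the "only a portion of the parameters is turned on" idea concrete, here is a minimal sketch of generic top-k expert routing, the textbook form of a mixture-of-experts layer. It is not DeepSeek's implementation, and it does not reflect the specific improvements the R1 team made. The names `moe_forward`, `top_k_route`, and `gate_weights`, the toy experts, and the choice of 2 active experts out of 8 are all assumptions made for illustration.

```python
import math
import random


def softmax(values):
    """Turn raw scores into weights that sum to 1."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(scores, k):
    """Return the indices of the k highest-scoring experts."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def moe_forward(x, experts, gate_weights, k=2):
    """Run input x through only the k experts the gate selects.

    Hypothetical toy layer: each expert is a plain Python function,
    and the gate is one weight vector per expert.
    """
    # Gate score for each expert: dot product of its gate vector with x.
    scores = [sum(w * xi for w, xi in zip(gw, x)) for gw in gate_weights]
    active = top_k_route(scores, k)
    weights = softmax([scores[i] for i in active])

    # Only the selected experts do any work; the rest stay idle.
    output = [0.0] * len(x)
    for weight, idx in zip(weights, active):
        y = experts[idx](x)
        output = [o + weight * yi for o, yi in zip(output, y)]
    return output, active


random.seed(0)
dim, num_experts = 4, 8
# Toy experts (assumption): expert i simply scales its input by i + 1.
experts = [lambda x, s=i + 1: [s * xi for xi in x] for i in range(num_experts)]
gate_weights = [
    [random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)
]

x = [0.5, -1.0, 2.0, 0.1]
output, active = moe_forward(x, experts, gate_weights, k=2)
print(f"active experts: {active} ({len(active)} of {num_experts})")
```

In this sketch, six of the eight experts never run for a given input, which is where the compute savings, and so the energy savings, would come from in a real system.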
Jan-31-2025, 21:13:38 GMT