RTX 5090


NVIDIA RTX 5090 outperforms AMD and Apple running local OpenAI language models

PCWorld

Developers and creatives looking for greater control and privacy with their AI are increasingly turning to locally run models like OpenAI's new gpt-oss family, which is both lightweight and surprisingly capable on end-user hardware. Indeed, gpt-oss-20b can run on consumer GPUs with just 16GB of memory. That opens up a wide range of hardware, with NVIDIA GPUs emerging as the best way to run these sorts of open-weight models. While nations and companies rush to develop bespoke AI solutions for a range of tasks, open-source and open-weight models like OpenAI's new gpt-oss-20b are finding much broader adoption.
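
The 16GB figure checks out with back-of-the-envelope arithmetic: gpt-oss-20b has roughly 21 billion parameters, and its MXFP4 quantization stores about 4.25 bits per parameter once shared block scales are counted. A minimal sketch (figures approximate; activations and KV cache need extra headroom on top of the weights):

```python
def model_footprint_gb(n_params: float, bits_per_param: float) -> float:
    """Rough weight-memory estimate: parameters x bits, converted to GB."""
    return n_params * bits_per_param / 8 / 1e9

# gpt-oss-20b: ~21e9 parameters; MXFP4 stores ~4.25 bits per parameter
# (4-bit values plus shared per-block scales). Figures are approximate.
weights = model_footprint_gb(21e9, 4.25)
print(f"~{weights:.1f} GB of weights")  # ~11.2 GB, leaving headroom on a 16GB card
```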


In AI Sweet Harmony: Sociopragmatic Guardrail Bypasses and Evaluation-Awareness in OpenAI gpt-oss-20b

Durner, Nils

arXiv.org Artificial Intelligence

We probe OpenAI's open-weights 20-billion-parameter model gpt-oss-20b to study how sociopragmatic framing, language choice, and instruction hierarchy affect refusal behavior. Across 80 seeded iterations per scenario, we test several harm domains including ZIP-bomb construction (cyber threat), synthetic card-number generation, minor-unsafe driving advice, drug-precursor indicators, and RAG context exfiltration. Composite prompts that combine an educator persona, a safety-pretext ("what to avoid"), and step-cue phrasing flip assistance rates from 0% to 97.5% on a ZIP-bomb task. On our grid, formal registers in German and French are often leakier than matched English prompts. A "Linux terminal" role-play overrides a developer rule not to reveal context in a majority of runs with a naive developer prompt, and we introduce an AI-assisted hardening method that reduces leakage to 0% in several user-prompt variants. We further test evaluation awareness with a paired-track design and measure frame-conditioned differences between matched "helpfulness" and "harmfulness" evaluation prompts; we observe inconsistent assistance in 13% of pairs. Finally, we find that the OpenAI Moderation API under-captures materially helpful outputs relative to a semantic grader, and that refusal rates differ by 5 to 10 percentage points across inference stacks, raising reproducibility concerns. We release prompts, seeds, outputs, and code for reproducible auditing at https://github.com/ndurner/gpt-oss-rt-run .
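
The headline percentages in the abstract are aggregations over seeded runs per prompt framing. A minimal sketch of that bookkeeping, with hypothetical names (the authors' actual harness is in the linked repository):

```python
from collections import defaultdict

def assistance_rates(results):
    """results: iterable of (framing, seed, assisted) tuples, where
    `assisted` is a grader's verdict for one run. Returns the
    fraction of assisting runs per framing."""
    tally = defaultdict(lambda: [0, 0])  # framing -> [assisted, total]
    for framing, _seed, assisted in results:
        tally[framing][0] += int(assisted)
        tally[framing][1] += 1
    return {f: a / n for f, (a, n) in tally.items()}

# 80 seeded iterations per scenario, as in the paper's protocol:
runs = [("direct", s, False) for s in range(80)] + \
       [("educator+safety-pretext", s, s != 0) for s in range(80)]
rates = assistance_rates(runs)
print(rates)  # {'direct': 0.0, 'educator+safety-pretext': 0.9875}
```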


Newegg has RTX 5090 cards in stock at base price right now

PCWorld

It's been seven months since Nvidia launched its flagship RTX 5090 card to a hungry audience of PC gamers… and people building AI data centers… and a bunch of scalpers trying to bilk them all. In that time, I've yet to see one actually available to purchase at the alleged base price of two thousand dollarydoos. As of just before 11 AM Eastern US time, Newegg has one for the base price. Specifically this one, the Zotac Gaming Solid model, a basic triple-fan design which apparently has the reference PCB with no overclock. As the good Lord intended.


MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models

Liu, Wenyuan, Meng, Haoqian, Luo, Yilun, Zhang, Peng, Ma, Xindian

arXiv.org Artificial Intelligence

Quantization significantly accelerates inference in large language models (LLMs) by replacing original high-precision matrices with low-precision counterparts. Recent advances in weight-activation quantization have primarily focused on mapping both weights and activations to the INT4 format. Although the new FP4 Tensor Cores in NVIDIA's Blackwell architecture offer up to 4x speedup over FP16, existing INT4-based kernels fail to fully exploit this capability due to mismatched data formats. To bridge this gap, we propose MicroMix, a co-designed mixed-precision quantization algorithm and matrix multiplication kernel based on Microscaling (MX) data formats. Tailored for the Blackwell architecture, the MicroMix kernel supports arbitrary combinations of MXFP4, MXFP6, and MXFP8 channels, and produces BFloat16 outputs. To achieve a favorable trade-off between accuracy and efficiency for each linear layer, we introduce quantization thresholds that identify activation elements where lower-precision formats (MXFP4 or MXFP6) incur excessive quantization error. Our algorithm selectively allocates higher-precision channels to preserve accuracy while maintaining compute efficiency. MicroMix achieves competitive or superior performance across diverse downstream tasks, including zero-shot and few-shot learning, language modeling, code generation, and mathematical reasoning. On both consumer-grade (RTX 5070Ti laptop) and server-grade (RTX 5090) GPUs, our kernel delivers at least 20% faster execution than TensorRT-FP8. Furthermore, when applied to various Llama and Qwen models, MicroMix consistently improves prefill latency and memory efficiency across a range of batch sizes compared to TensorRT baselines. Our code is available at https://github.com/lwy2020/MicroMix.
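
The allocation idea can be sketched in a few lines: quantize each channel at the cheapest candidate precision, and escalate only where the measured error exceeds that format's threshold. The sketch below uses uniform symmetric quantization as a stand-in for the real MXFP4/6/8 grids (which are floating-point element formats with shared block scales), and the thresholds are invented for illustration:

```python
def quantize(xs, bits):
    """Uniform symmetric quantization to a 2**bits-level grid (a stand-in
    for the MXFP4/6/8 element formats, which are floating-point)."""
    amax = max(abs(x) for x in xs) or 1.0
    step = 2 * amax / (2 ** bits - 1)
    return [round(x / step) * step for x in xs]

def rms_error(xs, qs):
    return (sum((x - q) ** 2 for x, q in zip(xs, qs)) / len(xs)) ** 0.5

def pick_format(channel, thresholds=((4, 0.05), (6, 0.05))):
    """Escalate precision only where a cheaper format errs too much,
    mirroring MicroMix's threshold-based channel allocation."""
    for bits, threshold in thresholds:
        if rms_error(channel, quantize(channel, bits)) <= threshold:
            return bits
    return 8  # fall back to the highest-precision format

smooth = [0.1 * i for i in range(-8, 8)]   # well-behaved channel
spiky = smooth[:-1] + [40.0]               # one outlier wrecks the coarse grid
print(pick_format(smooth), pick_format(spiky))  # 4 8
```

The outlier inflates the quantization step for the whole channel, so the coarse formats fail their error budget and the channel is promoted to 8 bits, which is the accuracy/efficiency trade the paper's thresholds are designed to make.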


MSI Raider A18 HX A9W review: Extreme power at an extreme price

PCWorld

The launch of Nvidia's new RTX mobile graphics--including the top-tier RTX 5090 with 24GB of VRAM--has the potential for chart-topping performance. Now it's joined by AMD's Ryzen 9 9955HX3D, a 16-core CPU with the company's vaunted 3D V-Cache, an extra stack of L3 cache that can prove useful in games. The MSI Raider A18 HX A9W brings both new chips into one chassis, and perhaps unsurprisingly, the combo delivers record-setting performance. That's incredible hardware, but the laptop retails for an equally incredible MSRP of $5,099.99. Each chip is an undisputed heavyweight in its category, and together they should deliver a killer one-two punch of CPU and GPU performance. That said, the MSI Raider A18 HX A9W still must deal with the power and thermal constraints faced by every laptop--and it will be interesting to see the results. The laptop delivers additional technical highlights, too, like a PCIe 5.0 solid state drive and a 4K Mini-LED display.


Welp, Nvidia's RTX 5090 can crack an 8-digit password in 3 hours

PCWorld

I have bad news for everyone with weak passwords. A hacker can guess your laziest random passwords in the same amount of time it takes to watch a movie. It turns out when you put the most brutally fast consumer graphics card on the task of, uh, brute-forcing 8-character passwords, it can crack a numbers-only string in 3 hours. Such is the finding of Hive Systems, a cybersecurity firm based in Virginia, as part of the research that went into its 2025 password table. The chart shows how fast a "consumer budget" hacker could brute-force passwords of varying lengths (4 to 18 characters) and compositions (e.g., numbers only, lowercase letters, uppercase and lowercase letters, etc.).
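
The chart's figures follow from straightforward arithmetic: the keyspace is the charset size raised to the password length, divided by the attacker's guess rate. A sketch, back-solving the rate from the article's 3-hour figure (a rate that low implies a deliberately slow password hash such as bcrypt rather than a fast one like MD5):

```python
def crack_hours(charset_size: int, length: int, guesses_per_sec: float) -> float:
    """Worst-case brute-force time: every combination tried once."""
    return charset_size ** length / guesses_per_sec / 3600

# Rate back-solved from the article's 3-hour figure for 8 digits (10**8 keys).
rate = 10 ** 8 / (3 * 3600)                  # ~9,259 guesses per second
print(f"{crack_hours(10, 8, rate):.1f} h")   # 3.0 h by construction
print(f"{crack_hours(26, 8, rate):.0f} h")   # 8 lowercase letters: far longer
```

The exponential in the length term is why each extra character, or each richer character class, buys so much more time than the last.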


NVIDIA GeForce RTX 5090 review: Pure AI excess for $2,000

Engadget

A $2,000 video card for consumers shouldn't exist. The GeForce RTX 5090, like the $1,599 RTX 4090 before it, is more a flex by NVIDIA than anything truly meaningful for most gamers. NVIDIA CEO Jensen Huang said as much when he revealed the GPU at CES 2025, saying it'll be for hardcore players who have $10,000 rigs. Personally, I don't know anyone who actually fits that bill, not unless you count parasocial relationships with streamers. But we all know why NVIDIA is hyping up the unattainable RTX 5090: It lets the company show off benchmarks that AMD can't touch, once again cementing itself as the supreme leader of the high-end video card market.


Nvidia GeForce RTX 5090 review: Brutally fast, but DLSS 4 is the game changer

PCWorld

Nvidia's GeForce RTX 5090 is the most brutally fast graphics card ever introduced, augmented by new DLSS 4 technology that feels like magic. But you pay dearly for it, and it feels like this GPU was designed more for AI researchers than PC gamers. The wait is finally over. The long-awaited GeForce RTX 5090 lands on store shelves in January -- and friends, the flagship graphics card for Nvidia's new "Blackwell" architecture is an absolute monster. It should be, for $2,000, of course.


Nvidia's DLSS 4 is so much more than just 'fake frames'

PCWorld

This year at CES, Nvidia presented the next generation of its DLSS upscaling technology, which is trained with the help of artificial intelligence, alongside the new GeForce RTX 5090, 5080, and 5070 (Ti) graphics cards. The company touted its major advantages -- and now that RTX 5090 reviews are live, we can confirm that DLSS 4 indeed feels like black magic, supercharging frame rates and making games feel just as snappy as the beloved Doom 2016. That's because DLSS 4 now supports Multi Frame Generation (MFG), an AI-based multiple intermediate frame calculation that can artificially generate up to three images and insert them between two "real" frames, thus quadrupling the frame rate. Of course, this feature only works on new Blackwell-based RTX 50-series GPUs. But are the AI frames generated in this way a step forward or is it all hogwash?
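
The "quadrupling" claim is simple frame accounting: with N generated frames inserted per rendered frame, the display receives (1 + N) times as many frames. A minimal sketch:

```python
def presented_fps(rendered_fps: float, generated_per_rendered: int) -> float:
    """Each rendered frame is followed by N AI-generated frames,
    so the display sees (1 + N) frames per rendered frame."""
    return rendered_fps * (1 + generated_per_rendered)

# DLSS 4 Multi Frame Generation can insert up to three generated frames:
for n in (0, 1, 2, 3):
    print(n, presented_fps(30, n))  # 30 -> 30, 60, 90, 120
```

Note that input latency still tracks the rendered rate, not the presented rate, which is why frame generation improves smoothness more than responsiveness.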


NVIDIA, AMD and Intel aimed for maximum power at CES 2025

Engadget

There was no question that NVIDIA's RTX 5000 GPUs would be one of the biggest stories at CES 2025, and I figured Intel and AMD would arrive with some new hardware of their own. But I didn't expect that each of these companies would, in their own way, be putting the pedal to the metal when it comes to power for their chip designs. After all, we've spent the last few years covering AI PC CPUs that targeted efficiency more than raw performance. While NVIDIA's RTX 5000 GPUs seem to deliver the performance leap we expected over their 2022-era predecessors, AMD is also redefining what's possible for mobile workstations with its Ryzen AI Max chips, which combine powerful graphics with gobs of integrated memory. Intel isn't sitting still either -- it's finally moving Arrow Lake into the high-performance and gaming arena with its Core Ultra 200HX chips, which can reach up to 24 cores and 5.5GHz speeds.