Blizzard Entertainment


Reconsidering LLM Uncertainty Estimation Methods in the Wild

Bakman, Yavuz, Yaldiz, Duygu Nur, Kang, Sungmin, Zhang, Tuo, Buyukates, Baturalp, Avestimehr, Salman, Karimireddy, Sai Praneeth

arXiv.org Artificial Intelligence

Large Language Model (LLM) Uncertainty Estimation (UE) methods have become a crucial tool for detecting hallucinations in recent years. While numerous UE methods have been proposed, most existing studies evaluate them in isolated short-form QA settings using threshold-independent metrics such as AUROC or PRR. However, real-world deployment of UE methods introduces several challenges. In this work, we systematically examine four key aspects of deploying UE methods in practical settings. Specifically, we assess (1) the sensitivity of UE methods to decision threshold selection, (2) their robustness to query transformations such as typos, adversarial prompts, and prior chat history, (3) their applicability to long-form generation, and (4) strategies for handling multiple UE scores for a single query. Our evaluations on 19 UE methods reveal that most of them are highly sensitive to threshold selection when there is a distribution shift in the calibration dataset. While these methods generally exhibit robustness against previous chat history and typos, they are significantly vulnerable to adversarial prompts. Additionally, while existing UE methods can be adapted for long-form generation through various strategies, there remains considerable room for improvement. Lastly, ensembling multiple UE scores at test time provides a notable performance boost, which highlights its potential as a practical improvement strategy. Code is available at: https://github.com/duygunuryldz/uncertainty_in_the_wild.
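As a rough illustration of point (4) in the abstract above, the sketch below combines several uncertainty scores for the same queries and measures the result with AUROC. It is not the paper's method: the min-max normalization, the plain averaging, and the names `auroc`, `min_max`, and `ensemble_mean` are all assumptions chosen for brevity, and the scores in the example are made up.

```python
from typing import List, Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Chance that a random positive outscores a random negative (ties count half).

    Assumed convention: label 1 = hallucinated answer, higher score = more uncertain.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def min_max(scores: Sequence[float]) -> List[float]:
    """Rescale one method's scores to [0, 1] so methods become comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ensemble_mean(score_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average the normalized scores of several methods, query by query."""
    normalized = [min_max(s) for s in score_lists]
    return [sum(col) / len(col) for col in zip(*normalized)]


if __name__ == "__main__":
    # Made-up scores from two hypothetical UE methods on six queries.
    labels = [1, 0, 1, 0, 1, 0]
    method_a = [0.9, 0.2, 0.4, 0.5, 0.8, 0.1]
    method_b = [2.0, 3.5, 4.0, 1.0, 3.0, 0.5]
    combined = ensemble_mean([method_a, method_b])
    for name, s in [("a", method_a), ("b", method_b), ("ensemble", combined)]:
        print(name, round(auroc(s, labels), 3))
```

AUROC is threshold-independent, so this sketch says nothing about the threshold sensitivity the abstract describes. In a deployment, the normalization bounds would also have to come from a calibration set rather than from the test scores themselves.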


What Went Wrong at Blizzard Entertainment

The Atlantic - Technology

Over the past three years, as I worked on a book about the history of the video-game company Blizzard Entertainment, a disconcerting question kept popping into my head: Why does success seem so awful? Even typing that out feels almost anti-American, anathema to the ethos of hard work and ambition that has propelled so many of the great minds and ideas that have changed the world. But Blizzard makes a good case for the modest achievement over the astronomical. Founded in Irvine, California, by two UCLA students named Allen Adham and Mike Morhaime, the company quickly became well respected and popular thanks to a series of breakout franchises such as StarCraft and Diablo. But everything changed in 2004 with the launch of World of Warcraft (or WoW), which became an online-gaming juggernaut that made billions of dollars.


Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents

Li, Zelong, Hua, Wenyue, Wang, Hao, Zhu, He, Zhang, Yongfeng

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) enable AI Agents to automatically generate and execute multi-step plans to solve complex tasks. However, since an LLM's content generation process is hardly controllable, current LLM-based agents frequently generate invalid or non-executable plans, which jeopardizes the performance of the generated plans and corrupts users' trust in LLM-based agents. In response, this paper proposes a novel "Formal-LLM" framework for LLM-based agents by integrating the expressiveness of natural language and the precision of formal language. Specifically, the framework allows human users to express their requirements or constraints for the planning process as an automaton. A stack-based LLM plan generation process is then conducted under the supervision of the automaton to ensure that the generated plan satisfies the constraints, making the planning process controllable. We conduct experiments on both benchmark tasks and practical real-life tasks, and our framework achieves over 50% overall performance increase, which validates the feasibility and effectiveness of employing Formal-LLM to guide the plan generation of agents, preventing the agents from generating invalid and unsuccessful plans. Further, more controllable LLM-based agents can facilitate the broader utilization of LLMs in application scenarios where high validity of planning is essential. The work is open-sourced at https://github.com/agiresearch/Formal-LLM.
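The idea of generating a plan under the supervision of an automaton can be sketched in a few lines. This is a simplified illustration, not the Formal-LLM implementation: it uses a plain finite automaton with backtracking rather than the stack-based process the abstract describes, and `rank_tools` is a hypothetical stand-in for the LLM that orders the candidate tools at each step. The tool and state names are invented.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# (current state, tool name) -> next state; all names here are illustrative.
Transitions = Dict[Tuple[str, str], str]


def constrained_plan(
    start: str,
    accepting: Set[str],
    transitions: Transitions,
    rank_tools: Callable[[List[str], List[str]], List[str]],
    max_steps: int = 6,
) -> Optional[List[str]]:
    """Depth-first search for a plan the automaton accepts.

    rank_tools(plan_so_far, allowed_tools) plays the role of the LLM: it
    returns the allowed tools in its preferred order. Choices that lead to a
    dead end are undone and the next preference is tried.
    """

    def search(state: str, plan: List[str]) -> Optional[List[str]]:
        if state in accepting:
            return plan
        if len(plan) >= max_steps:
            return None
        allowed = [tool for (s, tool) in transitions if s == state]
        for tool in rank_tools(plan, allowed):
            if (state, tool) not in transitions:
                continue  # the automaton vetoes anything it does not allow
            result = search(transitions[(state, tool)], plan + [tool])
            if result is not None:
                return result
        return None

    return search(start, [])


if __name__ == "__main__":
    transitions = {
        ("text", "text_to_image"): "image",
        ("text", "translate"): "text",
        ("image", "caption"): "done",
    }
    # Stub ranker: alphabetical order instead of a real model's preference.
    plan = constrained_plan("text", {"done"}, transitions, lambda plan, allowed: sorted(allowed))
    print(plan)
```

Because every step is drawn from the automaton's transitions, any plan that is returned is valid by construction. The step limit keeps loops such as repeated `translate` calls from running forever.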


'Hell Welcomes All'

The Atlantic - Technology

When I listen to the voice recording I made at the Irvine, California, headquarters of the video-game company Blizzard Entertainment this past January, I hear a noise that many gamers find blissful: the sound of utter mayhem. Playing a prerelease version of Diablo IV, the latest installment in a 26-year-old adventure series about battling the forces of hell, I faced swarms of demons that yowled and belched. I jabbed buttons arrhythmically--click … click … clickclickclick--while trying to stifle curses and whimpers. But the strangest sounds came from the two Diablo IV designers who sat alongside me. As I dueled with an angry sea witch, Joseph Piepiora, an associate game director, gently noted that I was low on healing potions.


Blizzard To Pull Popular Games From China After License Spat

International Business Times

US gaming giant Blizzard Entertainment will suspend most of its services in China from January, the company said Thursday, after it failed to reach a licensing deal with local firm NetEase. Producer of some of the best-known titles in video gaming, including "World of Warcraft" and "Overwatch", Blizzard has operated since 2008 in China -- the world's biggest gaming market. But the firm said it had failed to reach an agreement with Chinese publisher NetEase over an extension to their 14-year partnership. "We will suspend new sales in the coming days and Chinese players will be receiving details of how this will work soon," Blizzard Entertainment, a subsidiary of California-based Activision Blizzard, said in a statement. Microsoft in January offered to buy Activision Blizzard for $69 billion, but the deal has yet to be finalised as anti-trust authorities examine it.


Making Diablo II Was Pure Hell

WIRED

David L. Craddock is the author of more than a dozen books about video games, including Break Out, about the history of Apple II games, and Rocket Jump, about the history of first-person shooters. "I tend to write a lot about games made in the '80s, '90s, and early '00s," Craddock says in Episode 481 of the Geek's Guide to the Galaxy podcast. "I love to write about creative people who had big ideas but very, very tight restrictions, and I think that from that comes some of the most enduring products--most enduring experiences--ever made." One of Craddock's most recent books is Stay Awhile and Listen: Book II, about the making of Blizzard's classic action RPG Diablo II. Craddock says this volume was a much bigger undertaking than Stay Awhile and Listen: Book I, about the original Diablo. "There was just so much more to juggle in terms of timeline, in terms of game," he says.


Activision Replaces Blizzard Head as It Grapples With Gender-Bias Lawsuit

WSJ.com: WSJD - Technology

Activision Blizzard Inc. said an executive named in a gender-bias lawsuit filed against the company last month by California regulators is leaving the videogame company. J. Allen Brack is immediately stepping down from his role as president of Blizzard Entertainment, the unit behind hit franchises such as World of Warcraft and Overwatch, the company said Tuesday. Two company veterans, Jen Oneal and Mike Ybarra, were named co-leaders of the unit, which it acquired in 2008. "It became clear to J. Allen Brack and Activision Blizzard leadership that Blizzard Entertainment needs a new direction and leadership given the critical work ahead in terms of workplace culture, game development, and innovation," the company said in a statement. Mr. Brack didn't immediately respond to a request for comment.


The Sexual Harassment Case That's Blown the Lid Off of Video Games' "Frat Boy" Work Culture

Slate

The company behind some of the biggest video games in the world is facing intense scrutiny after California regulators filed a lawsuit on July 20 alleging that it has fostered an intensely sexist workplace culture. The state's Department of Fair Employment and Housing is suing Activision Blizzard, the publisher of Call of Duty and Warcraft, following a two-year investigation in which it allegedly discovered evidence that women at the company perpetually face professional and personal discrimination. The disturbing examples span everything from pay imbalances and a glass ceiling to a drunken office culture wherein rape jokes and unwanted advances go unpunished. The company quickly denied the allegations in the lawsuit, but the scandal is snowballing. Both current and former executives have reacted with horror at the investigation, and a growing number of Activision Blizzard employees have shared their own troubling experiences working at the publisher--experiences that echo similar stories of discrimination at other major video game companies.


Overwatch Is Decreasing Toxicity In Chat With Machine Learning

#artificialintelligence

Blizzard Entertainment, the developer behind Overwatch, has implemented a machine learning system in the game that it says is helping to dismantle the toxicity of chat in Overwatch. Unfortunately, in any game with a chat function, there will be those players who do not regard the feelings of others, have poor sportsmanship, or are downright offensive and not afraid to show it. A game as popular as Overwatch, with over 40 million players, is bound to have problems in the chat as people compete with others from all over the world. Overwatch is competitive by nature, pitting teams of six players against each other in a multiplayer first-person shooter. This year's Overwatch League wrapped up recently, but due to COVID-19 restrictions audiences could only watch from their homes online.


BlizzCon Returns Next Year As Online Event

International Business Times

Blizzard Entertainment made good on its promise to host an online version of BlizzCon after the company was forced to cancel its in-person convention due to the pandemic caused by COVID-19. BlizzCon 2020 was supposed to take place later this year, but the worldwide crisis has pushed organizers to look for ways to still hold the annual event while observing guidelines set by health agencies and the government to curb the spread of the virus. "We're talking about how we might be able to channel the BlizzCon spirit and connect with you in some way online, far less impacted by the state of health and safety protocols for mass in-person gatherings," said BlizzCon Executive Producer Saralyn Smith in a May 26, 2020 blog post. Blizzard Entertainment's eSports World Championship competitions for "Hearthstone," "Heroes of the Storm," "World of Warcraft" and "StarCraft 2" will be held over an entire week later this year. Windows 10 has come on board as an official BlizzCon 2015 sponsor, with the opening ceremony streaming on Xbox One for the first time.