 Generative AI


Rethinking Test-time Likelihood: The Likelihood Path Principle and Its Application to OOD Detection

arXiv.org Machine Learning

While likelihood is attractive in theory, its estimates by deep generative models (DGMs) are often broken in practice and perform poorly for out-of-distribution (OOD) detection. Various recent works have started to consider alternative scores and achieved better performance. However, such recipes do not come with provable guarantees, nor is it clear that their choices extract sufficient information. We attempt to change this by conducting a case study on variational autoencoders (VAEs). First, we introduce the likelihood path (LPath) principle, generalizing the likelihood principle. This narrows the search for informative summary statistics down to the minimal sufficient statistics of VAEs' conditional likelihoods. Second, introducing new theoretical tools such as nearly essential support, essential distance and co-Lipschitzness, we obtain non-asymptotic provable OOD detection guarantees for a certain distillation of the minimal sufficient statistics. To the best of our knowledge, this is the first provable unsupervised OOD method that delivers excellent empirical results, better than any other VAE-based technique. Independent and identically distributed (IID) samples at training and test time are key to much of machine learning (ML)'s success. For example, this assumption allowed modern neural nets to be validated experimentally before tight learning-theoretic bounds were established. However, as ML systems are deployed in the real world, out-of-distribution (OOD) data are a priori unknown and pose serious threats. This is particularly so in the most general setting, where labels are absent and test inputs arrive in a streaming fashion. While attractive in theory, naive approaches, such as using the likelihood of deep generative models (DGMs), have proved ineffective, often assigning high likelihood to OOD data (Nalisnick et al., 2018). Even with access to a perfect density, likelihood alone is still insufficient to detect OOD data when the IID and OOD distributions overlap (Le Lan & Dinh, 2021; Zhang et al., 2021). In response to likelihood's weakness, most works have focused on either improving density models (Havtorn et al., 2021; Kirichenko et al., 2020) or taking some form of likelihood ratio with a baseline model chosen using prior knowledge about image data (Ren et al., 2019; Serrà et al., 2019; Xiao et al., 2020). Recent theoretical works (Behrmann et al., 2021; Dai et al.) show that perfect density estimation may be infeasible for many DGMs.


Hewlett Packard Enterprise to Buy Juniper Networks for $14 Billion

WSJ.com: WSJD - Technology

Hewlett Packard Enterprise has agreed to buy Juniper Networks, a $14 billion deal that would merge two legacy network-operations companies as they seek to capitalize on the rise of generative artificial intelligence. The Wall Street Journal reported on Monday that the companies were nearing a deal.


Microsoft's OpenAI Investment Could Face EU Probe

WSJ.com: WSJD - Technology

The European Union is considering whether to launch a review of Microsoft's investment in ChatGPT maker OpenAI under the bloc's merger regulations, a month after the U.K. said it was also weighing whether the tech partnership could have an impact on competition. The European Commission, the EU's executive arm, made the disclosure on Tuesday as it sought input from interested parties on the level of competition in virtual worlds and generative artificial intelligence, and feedback on what competition law can do to keep these new markets competitive.


Microsoft's investment in OpenAI may face EU scrutiny, officials say

The Guardian

Microsoft's multibillion-dollar investment in the ChatGPT developer OpenAI could face a merger investigation in the European Union, officials have said. Microsoft is the largest minority investor in OpenAI Global LLC, a "capped profit" subsidiary company that is controlled by OpenAI Inc, the non-profit majority owner of the organisation. Its investment, given in the form of cloud-computing credits as well as cash, officially gives it no control of the company itself, but the possibility of a maximum of a 100-times return on its capital. The European Commission said on Tuesday it was "checking whether Microsoft's investment in OpenAI might be reviewable under the EU merger regulation". OpenAI's unusual corporate structure was thrust into the limelight last year, when its chief executive, Sam Altman, was ousted and then reappointed in a bitter struggle with the non-profit's board.


Microsoft's OpenAI Ties Face Potential E.U. Merger Probe

TIME - Tech

Microsoft Corp.'s $13 billion investment into OpenAI Inc. risks a full-blown investigation by European Union deals watchdogs, after a mutiny at the ChatGPT creator laid bare deep ties between the two companies. The European Commission said on Tuesday that it's examining whether Microsoft's involvement should be vetted under the bloc's merger rules, paving the way for a formal probe and even a potential unwinding if it's found to hamper fair competition. The EU move, part of a broader look at artificial intelligence, follows a similar step by the UK's Competition and Markets Authority. "Virtual worlds and generative AI are rapidly developing," said Margrethe Vestager, the EU's antitrust commissioner. "It is fundamental that these new markets stay competitive, and that nothing stands in the way of businesses growing and providing the best and most innovative products to consumers."


She helped OpenAI win over world leaders. Can she keep the peace?

Washington Post - Technology News

Amid the growing clamor in Congress to regulate AI, the company is bringing in reinforcements. After years of outreach to lawmakers, OpenAI in fall 2023 disclosed its first in-house lobbyist, and reported that it is working with global law firm DLA Piper, according to federal disclosures. OpenAI to date has not advocated for or against any specific bill, Makanju says, but she anticipates that will change in 2024, especially with the Schumer effort that is underway. Makanju's team is also growing around the world, with more than 20 people in the United Kingdom, Germany, Japan and Brazil.


OpenAI admits it's impossible to train generative AI without copyrighted materials

Engadget

And based on what OpenAI told the House of Lords Communications and Digital Select Committee, we might see more lawsuits against the companies in the future. It added that "[l]imiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today's citizens." In a new post on its blog made in response to The New York Times' lawsuit, it said the use of publicly available internet materials to train AI falls under the fair use doctrine. It admitted, however, that there is "still work to be done to support and empower creators." The company talked about the ways it's allowing publishers to block the GPTBot web crawler from accessing their websites. It also said that it's developing additional mechanisms allowing rightsholders to opt out of training and that it's engaging with them to find mutually beneficial agreements.


What to expect from the coming year in AI

MIT Technology Review

I also had plenty of time to reflect on the past year. There are so many more of you reading The Algorithm than when we first started this newsletter, and for that I am eternally grateful. Thank you for joining me on this wild AI ride. So what can we expect in 2024? All signs point to there being immense pressure on AI companies to show that generative AI can make money and that Silicon Valley can produce the "killer app" for AI. Big Tech, generative AI's biggest cheerleaders, is betting big on customized chatbots, which will allow anyone to become a generative-AI app engineer, with no coding skills needed.


Duolingo cuts contractors by 10% amid AI content shift

The Japan Times

Duolingo, the maker of language-learning software, is cutting some contractors as the app uses generative artificial intelligence to create more content. About 10% of contractors were "offboarded," a company spokesperson said Monday. "We just no longer need as many people to do the type of work some of these contractors were doing. Part of that could be attributed to AI," the spokesperson said. CEO Luis von Ahn said during an August earnings call that the company is using generative AI to "speed up" scripts for the app "and to more efficiently scale our course content."


In the race for AI supremacy, China and the US are travelling on entirely different tracks | Manya Koetse

The Guardian

Of the many events that stand out as noteworthy in online discussions across Chinese social media in 2023, it's perhaps the rise of ChatGPT that will prove to be the most significant. Although the chatbot made by the US-based OpenAI was officially launched in late 2022, it took until 2023 for its unprecedented growth to raise eyebrows in China, where the government has set the goal of becoming the global AI leader by 2030. Over the past decade, the focus on AI in Chinese society and digital culture has grown. Since the Covid-19 outbreak, AI implementations in schools, office buildings and factories have rolled out in fast forward. AI facial recognition is employed in everything from public security to payment technology; smart glasses and helmets make it easier for many workers to perform their tasks; and intelligent robots have become a common sight in China's service industry, in malls, restaurants, and banks. There seemed little doubt over who would win the tech race between the eagle and the dragon; but then came ChatGPT.