Enron
Nvidia insists it isn't Enron, but its AI deals are testing investor faith
Nvidia's chief executive, Jensen Huang, has been on an energetic world tour as the company's share price has soared. The chipmaker's sprawling partnerships are driving extraordinary growth but also bank its future on the AI boom paying off quickly. Nvidia is, in crucial ways, nothing like Enron - the Houston energy giant that imploded through multibillion-dollar accounting fraud in 2001. Nor is it similar to companies such as Lucent or Worldcom that folded during the dotcom bubble. But the fact that it needs to reiterate this to its investors is less than ideal. Now worth more than $4tn (£3tn), Nvidia makes the specialised technology that powers the world's AI surge: silicon chips and software packages that train and host systems such as ChatGPT.
- Asia > South Korea (0.49)
- Europe > Ukraine (0.06)
- Asia > Middle East > Saudi Arabia (0.05)
- (5 more...)
- Information Technology > Hardware (1.00)
- Government > Regional Government (0.97)
- Information Technology > Communications > Social Media (0.72)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.42)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.42)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.42)
Empirical Privacy Variance
Hu, Yuzheng, Wu, Fan, Xian, Ruicheng, Liu, Yuhang, Zakynthinou, Lydia, Kamath, Pritish, Zhang, Chiyuan, Forsyth, David
We propose the notion of empirical privacy variance and study it in the context of differentially private fine-tuning of language models. Specifically, we show that models calibrated to the same $(\varepsilon, \delta)$-DP guarantee using DP-SGD with different hyperparameter configurations can exhibit significant variations in empirical privacy, which we quantify through the lens of memorization. We investigate the generality of this phenomenon across multiple dimensions and discuss why it is surprising and relevant. Through regression analysis, we examine how individual and composite hyperparameters influence empirical privacy. The results reveal a no-free-lunch trade-off: existing practices of hyperparameter tuning in DP-SGD, which focus on optimizing utility under a fixed privacy budget, often come at the expense of empirical privacy. To address this, we propose refined heuristics for hyperparameter selection that explicitly account for empirical privacy, showing that they are both precise and practically useful. Finally, we take preliminary steps to understand empirical privacy variance. We propose two hypotheses, identify limitations in existing techniques like privacy auditing, and outline open questions for future research.
- Europe > Austria > Vienna (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.88)
- Information Technology > Security & Privacy (1.00)
- Government (0.67)
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Jang, Joel, Yoon, Dongkeun, Yang, Sohee, Cha, Sungmin, Lee, Moontae, Logeswaran, Lajanugen, Seo, Minjoon
Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities. Previous work addressing privacy issues for language models has mostly focused on data preprocessing and differential privacy methods, both requiring re-training the underlying LM. We propose knowledge unlearning as an alternative method to reduce privacy risks for LMs post hoc. We show that simply performing gradient ascent on target token sequences is effective at forgetting them with little to no degradation of general language modeling performances for larger LMs; it sometimes even substantially improves the underlying LM with just a few iterations. We also find that sequential unlearning is better than trying to unlearn all the data at once and that unlearning is highly dependent on which kind of data (domain) is forgotten. By showing comparisons with a previous data preprocessing method and a decoding method known to mitigate privacy risks for LMs, we show that unlearning can give a stronger empirical privacy guarantee in scenarios where the data vulnerable to extraction attacks are known a priori while being much more efficient and robust. We release the code and dataset needed to replicate our results at https://github.com/joeljang/knowledge-unlearning.
- Europe > United Kingdom (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia (0.05)
- (11 more...)
- Media (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- (2 more...)
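The gradient-ascent recipe in the abstract above can be illustrated on a toy model. The sketch below uses a hypothetical bigram softmax language model in NumPy — not the authors' setup — to show the core mechanic: ascending the negative log-likelihood of a target token sequence raises that sequence's loss, i.e. forgets it.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def seq_nll(W, seq):
    """Average negative log-likelihood of a token sequence under the bigram model."""
    return float(np.mean([-np.log(softmax(W[a])[b]) for a, b in zip(seq, seq[1:])]))

def step(W, seq, lr=0.5, ascent=False):
    """One gradient step on the sequence's NLL; ascent=True performs unlearning."""
    sign = 1.0 if ascent else -1.0
    for a, b in zip(seq, seq[1:]):
        g = softmax(W[a])
        g[b] -= 1.0            # gradient of this bigram's NLL wrt the logits W[a]
        W[a] += sign * lr * g  # descent memorizes, ascent forgets

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(10, 10))   # logits of a 10-token bigram LM
private = [1, 2, 3, 4]                      # "private" sequence to memorize
for _ in range(200):                        # pretraining memorizes the sequence
    step(W, private)
before = seq_nll(W, private)
for _ in range(25):                         # gradient ascent = unlearning
    step(W, private, ascent=True)
after = seq_nll(W, private)
print(before, after)                        # loss on the target sequence rises
```

The paper's contribution is showing this works on large pretrained LMs with little collateral damage to general language modeling; this toy only demonstrates the sign flip at the heart of the method.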
Email Insights from Data Science -- Part 2
A detailed method for extracting sentiment and alignment information from corporate email content. (Part 3 of this series shows a method of unsupervised-to-supervised feature extraction.) In Part 1 of this series I demonstrated a method for extracting email contents from a proprietary repository in preparation for analysis and further data exploration. In this part I will focus on analysis and rating of the extracted information to determine its usability for building a supervised modeling dataset. The data we retrieved from the Enron repository is still in its raw, but mostly clean and filtered, form. This means the dataset is unstructured and not yet focused on the tasks we are solving for. Since our goals are to classify the email contents to determine overall company sentiment (negative/positive) and alignment with company objectives, I'll need to transform the unstructured texts into a supervised dataset that we will use to train a recurrent network.
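The raw-to-supervised transformation the author describes — turning unlabeled email text into (text, label) training rows — can be sketched with a simple weak-labeling pass. The lexicon, scoring rule, and drop-neutral threshold below are illustrative assumptions, not the article's actual rating method:

```python
# Hypothetical sentiment lexicon; a stand-in for a real rating step.
POSITIVE = {"great", "thanks", "good", "progress", "win"}
NEGATIVE = {"problem", "delay", "fail", "concern", "loss"}

def weak_label(email_text):
    """Score raw email text and emit a (text, label) training row, or None."""
    tokens = email_text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score == 0:
        return None                      # ambiguous: drop from the supervised set
    return (email_text, "positive" if score > 0 else "negative")

emails = [
    "Great progress on the deal, thanks team",
    "Another delay and a serious concern about the loss",
    "Meeting moved to 3pm",
]
dataset = [row for row in map(weak_label, emails) if row is not None]
print(dataset)  # two labeled rows; the neutral email is filtered out
```

Rows that survive this filter could then be fed to the recurrent network mentioned above; in practice the rating stage would be far richer than a word-count lexicon.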
Could your computer please be more polite? Thank you
In a tense time when a pandemic rages, politicians wrangle for votes and protesters demand racial justice, a little politeness and courtesy go a long way. Now researchers at Carnegie Mellon University have developed an automated method for making communications more polite. Specifically, the method takes nonpolite directives or requests--those that use either impolite or neutral language--and restructures them or adds words to make them more well-mannered. "Send me the data," for instance, might become "Could you please send me the data?" The researchers will present their study on politeness transfer at the Association for Computational Linguistics annual meeting, which will be held virtually beginning July 5.
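The input/output behaviour described above — restructuring a bare directive into a polite request — can be mimicked by a toy rule-based rewriter. The CMU system learns such edits from data, so this is only an illustration of the interface, not their method:

```python
def politen(request):
    """Toy rewrite of a bare directive into a polite request.
    (Illustrative only; the CMU politeness-transfer model is learned.)"""
    body = request.rstrip(".!")          # drop terminal punctuation
    body = body[0].lower() + body[1:]    # the verb no longer starts the sentence
    return f"Could you please {body}?"

print(politen("Send me the data."))      # -> Could you please send me the data?
```

A learned model generalizes far beyond this template (choosing among hedges, softeners, and restructurings), which is why the researchers frame it as a style-transfer task rather than a rule set.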
The Perfect Data Set: Why the Enron E-mails Are Useful, Even 10 Years Later
Former Enron executive Vincent Kaminski is a modest, semi-retired business school professor from Houston who recently wrote a 960-page book explaining the fundamentals of energy markets. His most lasting legacy, however, may involve thousands of e-mails he wrote more than a decade ago at the energy-services company. Kaminski, a former managing director for research who warned repeatedly about concerning practices he saw at Enron, is among more than 150 senior executives whose e-mail boxes were dumped onto the Internet by the Federal Energy Regulatory Commission (FERC) on March 26, 2003. In the name of serving the public's interest during its investigation of Enron, the federal agency made the controversial decision to post online more than 1.6 million e-mails that Enron executives sent and received from 2000 through 2002. FERC eventually culled the trove to remove the most sensitive and personal data, after receiving complaints.
- Government > Regional Government > North America Government > United States Government (1.00)
- Energy > Power Industry (1.00)
What Enron's emails tell us about artificial intelligence - Technical.ly Brooklyn
Did you know that many of the artificially intelligent things we use in our everyday lives "learned" how to "think", to varying degrees, by studying the emails of some of the most craven and degraded capitalists in our deeply weird corporate history? Brooklyn's Sam Lavigne and Tega Brain have a new piece of internet art out called The Good Life (Enron Simulator). We first told you about it back in August, right after it won a Rhizome Net Art Microgrant. You input your email address into a very Windows 95-looking website and the site sends you each of the 500,000 publicly available emails from the Enron archives in the order they were sent. You can choose to receive these emails over the course of seven days, 30 days, one year or seven years. Depending on your choice, you'll receive somewhere between 100,000 and 196 emails per day.
- Energy > Power Industry (0.92)
- Government > Regional Government > North America Government > United States Government (0.34)
Adaptive Spam Detection Inspired by a Cross-Regulation Model of Immune Dynamics: A Study of Concept Drift
Abi-Haidar, Alaa, Rocha, Luis M.
This paper proposes a novel solution to spam detection inspired by a model of the adaptive immune system known as the cross-regulation model. We report on the testing of a preliminary algorithm on six e-mail corpora. We also compare our results, statically and dynamically, with those obtained by the Naive Bayes classifier and another binary classification method we developed previously for biomedical text-mining applications. We show that the cross-regulation model is competitive against those and is thus promising as a bio-inspired algorithm for spam detection in particular, and binary classification in general.
- Europe > Portugal (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Hudson County > Secaucus (0.04)
- (4 more...)
- Research Report > New Finding (0.49)
- Research Report > Promising Solution (0.34)
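One of the paper's comparison baselines, the Naive Bayes classifier, can be sketched in a few lines. The toy corpus, tokenization, and add-one smoothing below are illustrative assumptions, not the authors' experimental setup:

```python
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (tokens, label). Returns word counts, class counts, vocab."""
    counts = {"spam": Counter(), "ham": Counter()}
    n = Counter()
    for tokens, label in docs:
        counts[label].update(tokens)
        n[label] += 1
    vocab = set(counts["spam"]) | set(counts["ham"])
    return counts, n, vocab

def classify(tokens, counts, n, vocab):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    best, best_lp = None, -math.inf
    for label in ("spam", "ham"):
        lp = math.log(n[label] / sum(n.values()))        # class log-prior
        total = sum(counts[label].values())
        for t in tokens:                                  # per-word likelihoods
            lp += math.log((counts[label][t] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [("buy cheap pills now".split(), "spam"),
        ("cheap offer win money".split(), "spam"),
        ("meeting agenda for monday".split(), "ham"),
        ("lunch on monday with the team".split(), "ham")]
model = train_nb(docs)
print(classify("cheap pills offer".split(), *model))   # -> spam
print(classify("monday meeting".split(), *model))      # -> ham
```

Static comparison would score both methods on a fixed corpus; the paper's dynamic comparison additionally tracks accuracy as the e-mail distribution drifts over time, which is where the immune-inspired model's adaptivity is meant to pay off.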