Danbury
Long-form factuality in large language models
Wei, Jerry, Yang, Chengrun, Song, Xinying, Lu, Yifeng, Hu, Nathan, Huang, Jie, Tran, Dustin, Peng, Daiyi, Liu, Ruibo, Huang, Da, Du, Cosmo, Le, Quoc V.
Large language models (LLMs) often generate content that contains factual errors when responding to fact-seeking prompts on open-ended topics. To benchmark a model's long-form factuality in open domains, we first use GPT-4 to generate LongFact, a prompt set comprising thousands of questions spanning 38 topics. We then propose that LLM agents can be used as automated evaluators for long-form factuality through a method which we call Search-Augmented Factuality Evaluator (SAFE). SAFE utilizes an LLM to break down a long-form response into a set of individual facts and to evaluate the accuracy of each fact using a multi-step reasoning process comprising sending search queries to Google Search and determining whether a fact is supported by the search results. Furthermore, we propose extending F1 score as an aggregated metric for long-form factuality. To do so, we balance the percentage of supported facts in a response (precision) with the percentage of provided facts relative to a hyperparameter representing a user's preferred response length (recall). Empirically, we demonstrate that LLM agents can outperform crowdsourced human annotators: on a set of ~16k individual facts, SAFE agrees with crowdsourced human annotators 72% of the time, and on a random subset of 100 disagreement cases, SAFE wins 76% of the time. At the same time, SAFE is more than 20 times cheaper than human annotators. We also benchmark thirteen language models on LongFact across four model families (Gemini, GPT, Claude, and PaLM-2), finding that larger language models generally achieve better long-form factuality. LongFact, SAFE, and all experimental code are available at https://github.com/google-deepmind/long-form-factuality.
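The extended F1 metric described in the abstract can be sketched as follows. The abstract does not give the exact formula, so this is an illustration under stated assumptions: precision is the share of supported facts among all rated facts, recall is the number of supported facts divided by the preferred length K and capped at 1, and the two are combined with the usual harmonic mean. The function name `f1_at_k` and its parameters are hypothetical.

```python
def f1_at_k(supported: int, not_supported: int, k: int) -> float:
    """Sketch of an F1-style score for long-form factuality.

    Assumptions (not stated in the abstract):
      - precision = supported / (supported + not_supported)
      - recall    = min(supported / k, 1), where k is the user's
                    preferred number of supported facts
      - a response with no supported facts scores 0
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if supported < 0 or not_supported < 0:
        raise ValueError("fact counts must be non-negative")
    if supported == 0:
        return 0.0
    precision = supported / (supported + not_supported)
    recall = min(supported / k, 1.0)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a response with 32 supported facts, no unsupported facts, and K = 64 has perfect precision but only half the desired recall, so it scores below a response that supplies all 64 supported facts.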
Forecasting Post-Wildfire Vegetation Recovery in California using a Convolutional Long Short-Term Memory Tensor Regression Network
The study of post-wildfire plant regrowth is essential for developing successful ecosystem recovery strategies. Prior research mainly examines key ecological and biogeographical factors influencing post-fire succession. This research proposes a novel approach for predicting and analyzing post-fire plant recovery. We develop a Convolutional Long Short-Term Memory Tensor Regression (ConvLSTMTR) network that predicts future Normalized Difference Vegetation Index (NDVI) based on short-term plant growth data after fire containment. The model is trained and tested on 104 major California wildfires occurring between 2013 and 2020, each with burn areas exceeding 3000 acres. The integration of ConvLSTM with tensor regression enables the calculation of an overall logistic growth rate k using predicted NDVI. Overall, our k-value predictions demonstrate impressive performance, with 50% of predictions exhibiting an absolute error of 0.12 or less, and 75% having an error of 0.24 or less. Finally, we employ Uniform Manifold Approximation and Projection (UMAP) and KNN clustering to identify recovery trends, offering insights into regions with varying rates of recovery. This study pioneers the combined use of tensor regression and ConvLSTM, and introduces the application of UMAP for clustering similar wildfires. This advances predictive ecological modeling and could inform future post-fire vegetation management strategies.
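To make the notion of a logistic growth rate k concrete, the sketch below estimates k from an NDVI time series. It is not the ConvLSTMTR method from the abstract. It assumes NDVI follows a logistic curve NDVI(t) = c / (1 + a·exp(-k·t)) with a known plateau c, and recovers k by a simple linearised least-squares fit. The function names, the parameterisation, and the known-plateau assumption are all hypothetical.

```python
import math


def logistic(t: float, capacity: float, a: float, k: float) -> float:
    """Logistic curve: capacity / (1 + a * exp(-k * t))."""
    return capacity / (1.0 + a * math.exp(-k * t))


def estimate_growth_rate(times, ndvi, capacity: float) -> float:
    """Estimate k from observations, assuming a known plateau `capacity`.

    Uses the linearisation ln(capacity / y - 1) = ln(a) - k * t and an
    ordinary least-squares slope. Every NDVI value must lie strictly
    between 0 and `capacity` for the logarithm to be defined.
    """
    if len(times) != len(ndvi) or len(times) < 2:
        raise ValueError("need at least two paired observations")
    if any(not (0.0 < y < capacity) for y in ndvi):
        raise ValueError("NDVI values must lie strictly in (0, capacity)")
    z = [math.log(capacity / y - 1.0) for y in ndvi]
    n = len(times)
    mean_t = sum(times) / n
    mean_z = sum(z) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0.0:
        raise ValueError("time points must not all be identical")
    sxz = sum((t - mean_t) * (zi - mean_z) for t, zi in zip(times, z))
    # The fitted slope equals -k under the linearisation above.
    return -sxz / sxx
```

On noise-free data generated by `logistic`, the estimate matches the generating k up to floating-point error. Real NDVI series are noisy and the plateau is usually unknown, so a nonlinear fit would be more appropriate in practice.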
#cx_2022-01-16_16-58-09.xlsx
The graph represents a network of 2,962 Twitter users whose tweets in the requested range contained "#cx", or who were replied to or mentioned in those tweets. The network was obtained from the NodeXL Graph Server on Monday, 17 January 2022 at 01:14 UTC. The requested start date was Sunday, 16 January 2022 at 01:01 UTC and the maximum number of days (going backward) was 14. The maximum number of tweets collected was 7,500. The tweets in the network were tweeted over the 3-day, 7-hour, 28-minute period from Wednesday, 12 January 2022 at 17:28 UTC to Sunday, 16 January 2022 at 00:56 UTC.
Differentially Private M-band Wavelet-Based Mechanisms in Machine Learning Environments
In the post-industrial world, data science and analytics have gained paramount importance regarding digital data privacy. Improper methods of establishing privacy for accessible datasets can compromise large amounts of user data even if the adversary has a small amount of preliminary knowledge of a user. Many researchers have been developing high-level privacy-preserving mechanisms that also retain the statistical integrity of the data to apply to machine learning. Recent developments of differential privacy, such as the Laplace and Privelet mechanisms, drastically decrease the probability that an adversary can distinguish the elements in a data set and thus extract user information. In this paper, we develop three privacy-preserving mechanisms with the discrete M-band wavelet transform that embed noise into data. The first two methods (LS and LS+) add noise through a Laplace-Sigmoid distribution that multiplies Laplace-distributed values with the sigmoid function, and the third method utilizes pseudo-quantum steganography to embed noise into the data. We then show that our mechanisms successfully retain both differential privacy and learnability through statistical analysis in various machine learning environments.
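For orientation, the baseline Laplace mechanism mentioned in the abstract can be sketched as below. This is the standard textbook mechanism, not the paper's LS, LS+, or steganographic methods, whose details the abstract does not give. The function names are hypothetical, and the noise scale sensitivity/epsilon is the usual choice for a numeric query.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_value: float, sensitivity: float,
                      epsilon: float, rng: random.Random = None) -> float:
    """Release `true_value` with Laplace noise of scale sensitivity/epsilon.

    `sensitivity` is the most the query result can change when one
    record is added or removed; a smaller `epsilon` means more noise
    and stronger privacy.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = rng or random.Random()
    return true_value + laplace_noise(sensitivity / epsilon, rng)
```

For example, a counting query has sensitivity 1, so `laplace_mechanism(count, 1.0, 0.5)` adds noise with scale 2. Repeated releases of the same value consume additional privacy budget.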
Media's Data-Driven Future
"Today is the slowest rate of technological change you will ever experience in your lifetime," wrote Shelly Palmer in his e-book Data-Driven Thinking (Digital Living Press, 2016). As one of the world's premier voices on the accelerating pace of digital technology, he is increasingly preoccupied with helping companies and individuals prepare for the dramatic changes he sees coming, particularly in entertainment and media. Palmer started his career at age 12 as a musician, playing the clarinet, saxophone, and flute in the 1970s in venues around New York. He was also an early experimenter with analog and digital synthesizers. He holds patents for two major interactive television technologies, one of which -- a method for syncing broadcast TV with server-based text, known as enhanced television -- was adopted by Monday Night Football and Who Wants to Be a Millionaire? His background also includes writing the theme music for Spin City and Live with Regis and Kathie Lee, and conducting the London Symphony Orchestra. Currently, he is Fox 5 New York's on-air tech and digital media expert and the proprietor of a popular and prescient email newsletter that covers the impact of technology on media and daily life, with a special focus on smart cars and smart homes. For the past decade, as a venture capitalist and CEO of his own consulting firm and marketing agency, the Palmer Group, Palmer has focused his attention on the evolution of advertising, marketing, and related businesses, along with leading-edge technologies such as smart home systems and data analytics. We recently talked with Palmer in New York. Conscious of the intertwined trajectories of trends in technology and media, we sought to explore how artificial intelligence (AI) and the churn in business models could affect advertising, media, and related fields over the next few years.
George Devol, Developer of Robot Arm, Dies at 99
George C. Devol, a largely self-taught inventor who drew from science fiction to help develop Unimate, the revolutionary mechanical arm that became a prototype for robots now widely used on automobile assembly lines and in other industries, died on Thursday at his home in Wilton, Conn. In the early 1950s, before the advent of industrial robotics, Mr. Devol (pronounced de-VAHL) built on his own work in electrical engineering and machine controls to design a mechanical arm that could be programmed to repeat precise tasks, like grasping and lifting. He applied for a patent in 1954 and explained the concept to a fellow engineer, Joseph F. Engelberger, at a cocktail party where they discussed their favorite science fiction writers. Mr. Engelberger listened with interest and immediately seized on the significance of the new technology. Mr. Devol named the concept Universal Automation -- later shortened to Unimation -- and received a patent in 1961.
Assistant Vice President Analytics, Machine Learning Jobs in Danbury, CT - Genpact
Genpact believes that the next phase of its growth will continue to be driven by Analytics and is seeking an Assistant Vice President (AVP) – Machine Learning. This is a pre-sales analytics Subject Matter Expert (SME) and platform solution role across all vertical and horizontal business groups within Genpact. This person will lead the bid/win process and pre-sales technical support for information science engagements specific to machine learning and other technology tools and techniques to maximize automation and process improvement. This person will liaise with service lines and delivery to ensure proper solution design and fit. This person will be the SME for Machine Learning to the salesforce.
Menzerath-Altmann Law for Syntactic Structures in Ukrainian
Buk, Solomija, Rovenchak, Andrij
In its general form, the dependence can be stated as follows: the longer the construct, the shorter its constituents. This observation was later put into mathematical form by Gabriel Altmann [1]. It is now known as the Menzerath-Altmann law and is considered one of the general linguistic laws, with evidence reaching far beyond the linguistic domain itself [2]. The relationship has been studied at various levels of language units, such as syllable-word, morpheme-word, etc. While the word-sentence pair seems to be the most straightforward generalization at the syntactic level, it appears that an intermediate unit must in fact be introduced into this scheme [3, p. 283]. Usually, this intermediate unit is taken to be the phrase or the clause, which are direct constituents of the sentence [4]. We would like to note, however, that the notion of clause is not well elaborated in Eastern European linguistic traditions [5], including Ukrainian (cf.