Large Language Model
Extracting Structured Seed-Mediated Gold Nanorod Growth Procedures from Literature with GPT-3
Walker, Nicholas, Dagdelen, John, Cruse, Kevin, Lee, Sanghoon, Gleason, Samuel, Dunn, Alexander, Ceder, Gerbrand, Alivisatos, A. Paul, Persson, Kristin A., Jain, Anubhav
Although gold nanorods have been the subject of much research, the pathways for controlling their shape, and thereby their optical properties, remain largely heuristically understood. Although it is apparent that the simultaneous presence of and interaction between various reagents during synthesis control these properties, computational and experimental approaches for exploring the synthesis space can be either intractable or too time-consuming in practice. This motivates an alternative approach that leverages the wealth of synthesis information already embedded in the scientific literature by developing tools to extract relevant structured data in an automated, high-throughput manner. To that end, we present an approach using the GPT-3 language model to extract structured multi-step seed-mediated growth procedures and outcomes for gold nanorods from unstructured scientific text. GPT-3 prompt completions are fine-tuned to predict synthesis templates in the form of JSON documents from unstructured text input with an overall accuracy of 86%. The performance is notable, considering that the model performs entity recognition and relation extraction simultaneously. We present a dataset of 11,644 entities extracted from 1,137 papers, resulting in 268 papers with at least one complete seed-mediated gold nanorod growth procedure and outcome, for a total of 332 complete procedures.
A Portrait of Emotion: Empowering Self-Expression through AI-Generated Art
Lee, Yoon Kyung, Park, Yong-Ha, Hahn, Sowon
We investigated the potential and limitations of generative artificial intelligence (AI) in reflecting the authors' cognitive processes through creative expression. The focus is on the AI-generated artwork's ability to understand human intent (alignment) and visually represent emotions based on criteria such as creativity, aesthetic, novelty, amusement, and depth. Results show a preference for images based on the descriptions of the authors' emotions over the main events. We also found that images that overrepresent specific elements or stereotypes negatively impact AI alignment. Our findings suggest that AI could facilitate creativity and the self-expression of emotions. Our research framework with generative AIs can help design AI-based interventions in related fields (e.g., mental health education, therapy, and counseling).
Neuro-symbolic Zero-Shot Code Cloning with Cross-Language Intermediate Representation
Hasija, Krishnam, Pradhan, Shrishti, Patwardhan, Manasi, Medicherla, Raveendra Kumar, Vig, Lovekesh, Naik, Ravindra
In this paper, we define a neuro-symbolic approach to the task of finding semantically similar clones for code written in the legacy programming language COBOL, without training data. We define a meta-model that is instantiated to obtain an Intermediate Representation (IR), in the form of Abstract Syntax Trees (ASTs), that is common to code in C and COBOL. We linearize the IRs using Structure Based Traversal (SBT) to create sequential inputs. We further fine-tune UniXcoder, the best-performing model for zero-shot cross-programming language code search, for the Code Cloning task with the SBT IRs of C code pairs available in the CodeNet dataset. This allows us to learn latent representations for the IRs of the C code, which are transferable to the IRs of the COBOL code. With this fine-tuned UniXcoder, we get a performance improvement of 12.85 MAP@2 over the pre-trained UniXcoder model, in a zero-shot setting, on the COBOL test split synthesized from the CodeNet dataset. This demonstrates the efficacy of our meta-model based approach in facilitating cross-programming language transfer.
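The SBT linearization step mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly described form of Structure Based Traversal, in which each subtree is emitted as "( label ... ) label". The Node class and the node labels below are hypothetical and are not the paper's meta-model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A generic tree node; the labels used below are hypothetical IR labels."""
    label: str
    children: List["Node"] = field(default_factory=list)


def sbt(node: Node) -> List[str]:
    """Linearize a tree: open bracket and label, children in order, close bracket and label."""
    tokens = ["(", node.label]
    for child in node.children:
        tokens.extend(sbt(child))
    tokens.extend([")", node.label])
    return tokens


# Hypothetical language-neutral IR for an assignment such as `x = y + 1`.
tree = Node("Assign", [
    Node("Var:x"),
    Node("Add", [Node("Var:y"), Node("Const:1")]),
])

sequence = " ".join(sbt(tree))
print(sequence)
```

Because the closing bracket repeats the node label, the tree structure can be recovered unambiguously from the flat token sequence, which is what makes it usable as input to a sequence model.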
Attention Scheme Inspired Softmax Regression
Deng, Yichuan, Li, Zhihang, Song, Zhao
Large language models (LLMs) have brought transformative changes to human society. One of the key computations in LLMs is the softmax unit. This operation is important in LLMs because it allows the model to generate a distribution over possible next words or phrases, given a sequence of input words. This distribution is then used to select the most likely next word or phrase, based on the probabilities assigned by the model. The softmax unit plays a crucial role in training LLMs, as it allows the model to learn from the data by adjusting the weights and biases of the neural network. In convex optimization, for example when using the central path method to solve linear programs, the softmax function has been used as a crucial tool for controlling the progress and stability of the potential function [Cohen, Lee and Song STOC 2019, Brand SODA 2020]. In this work, inspired by the softmax unit, we define a softmax regression problem. Formally speaking, given a matrix $A \in \mathbb{R}^{n \times d}$ and a vector $b \in \mathbb{R}^n$, the goal is to use a greedy-type algorithm to solve \begin{align*} \min_{x} \| \langle \exp(Ax), {\bf 1}_n \rangle^{-1} \exp(Ax) - b \|_2^2. \end{align*} In a certain sense, our provable convergence result provides theoretical support for why a greedy algorithm can be used to train the softmax function in practice.
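To make the objective concrete, the following Python sketch evaluates the softmax regression loss and minimizes it on a tiny instance. It uses plain gradient descent as a stand-in, not the greedy-type algorithm analyzed in the paper. The matrix A, the vector b, the step size, and the iteration count are illustrative assumptions.

```python
import math


def softmax_of_Ax(A, x):
    """Return <exp(Ax), 1_n>^{-1} exp(Ax), i.e. the softmax of the vector Ax."""
    z = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    m = max(z)  # subtracting the max leaves the ratio unchanged and avoids overflow
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def loss(A, x, b):
    """Squared l2 distance between softmax(Ax) and the target b."""
    p = softmax_of_Ax(A, x)
    return sum((pi - bi) ** 2 for pi, bi in zip(p, b))


def gradient(A, x, b):
    """Analytic gradient of the loss with respect to x."""
    p = softmax_of_Ax(A, x)
    r = [pi - bi for pi, bi in zip(p, b)]
    pr = sum(pi * ri for pi, ri in zip(p, r))
    # dL/dz_i = 2 * p_i * (r_i - <p, r>); then apply the chain rule through z = Ax.
    g_z = [2.0 * pi * (ri - pr) for pi, ri in zip(p, r)]
    return [sum(A[i][j] * g_z[i] for i in range(len(A))) for j in range(len(x))]


def gradient_descent(A, b, steps=2000, lr=0.5):
    """Plain gradient descent from x = 0 (an assumption, not the paper's method)."""
    x = [0.0] * len(A[0])
    for _ in range(steps):
        g = gradient(A, x, b)
        x = [xj - lr * gj for xj, gj in zip(x, g)]
    return x


# Illustrative instance with n = 3, d = 2; b is a probability vector reachable by softmax(Ax).
A = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
b = [0.5, 0.3, 0.2]

x_hat = gradient_descent(A, b)
print(loss(A, [0.0, 0.0], b), loss(A, x_hat, b))
```

In this instance the minimizer can be checked by hand: softmax(Ax) equals b when exp(x_1) = 0.5/0.2 and exp(x_2) = 0.3/0.2, so x_hat should approach (ln 2.5, ln 1.5).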
Is a prompt and a few samples all you need? Using GPT-4 for data augmentation in low-resource classification tasks
Møller, Anders Giovanni, Dalsgaard, Jacob Aarup, Pera, Arianna, Aiello, Luca Maria
Obtaining and annotating data can be expensive and time-consuming, especially in complex, low-resource domains. We use GPT-4 and ChatGPT to augment small labeled datasets with synthetic data via simple prompts, in three different classification tasks with varying complexity. For each task, we randomly select a base sample of 500 texts to generate 5,000 new synthetic samples. We explore two augmentation strategies: one that preserves original label distribution and another that balances the distribution. Using a progressively larger training sample size, we train and evaluate a 110M parameter multilingual language model on the real and synthetic data separately. We also test GPT-4 and ChatGPT in a zero-shot setting on the test sets. We observe that GPT-4 and ChatGPT have strong zero-shot performance across all tasks. We find that data augmented with synthetic samples yields a good downstream performance, and particularly aids in low-resource settings, such as in identifying rare classes. Human-annotated data exhibits a strong predictive power, overtaking synthetic data in two out of the three tasks. This finding highlights the need for more complex prompts for synthetic datasets to consistently surpass human-generated ones.
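The difference between the two augmentation strategies can be sketched as a per-class quota computation for the 5,000 synthetic samples. The abstract does not say how the authors allocated samples across classes, so the rounding scheme (largest remainder), the function names, and the label counts below are assumptions made for illustration.

```python
from collections import Counter


def proportional_quota(labels, total):
    """Split `total` synthetic samples so that the base label distribution is preserved."""
    counts = Counter(labels)
    n = len(labels)
    exact = {c: total * k / n for c, k in counts.items()}
    quota = {c: int(v) for c, v in exact.items()}
    leftover = total - sum(quota.values())
    # Largest-remainder rounding: hand the leftover samples to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda c: exact[c] - quota[c], reverse=True)
    for c in by_remainder[:leftover]:
        quota[c] += 1
    return quota


def balanced_quota(labels, total):
    """Split `total` synthetic samples as evenly as possible across the classes."""
    classes = sorted(set(labels))
    base, extra = divmod(total, len(classes))
    return {c: base + (1 if i < extra else 0) for i, c in enumerate(classes)}


# Hypothetical base sample of 500 texts with one rare class.
base_labels = ["neg"] * 400 + ["pos"] * 90 + ["rare"] * 10

preserved = proportional_quota(base_labels, 5000)
balanced = balanced_quota(base_labels, 5000)
print(preserved)
print(balanced)
```

Under the balanced strategy the rare class receives roughly a third of the synthetic data instead of 2%, which is consistent with the abstract's observation that synthetic data particularly helps in identifying rare classes.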
The Roles of Symbols in Neural-based AI: They are Not What You Think!
Silver, Daniel L., Mitchell, Tom M.
We propose that symbols are first and foremost external communication tools used between intelligent agents that allow knowledge to be transferred in a more efficient and effective manner than having to experience the world directly. But, they are also used internally within an agent through a form of self-communication to help formulate, describe and justify subsymbolic patterns of neural activity that truly implement thinking. Symbols, and our languages that make use of them, not only allow us to explain our thinking to others and ourselves, but also provide beneficial constraints (inductive bias) on learning about the world. In this paper we present relevant insights from neuroscience and cognitive science, about how the human brain represents symbols and the concepts they refer to, and how today's artificial neural networks can do the same. We then present a novel neuro-symbolic hypothesis and a plausible architecture for intelligent agents that combines subsymbolic representations for symbols and concepts for learning and reasoning. Our hypothesis and associated architecture imply that symbols will remain critical to the future of intelligent systems NOT because they are the fundamental building blocks of thought, but because they are characterizations of subsymbolic processes that constitute thought.
Multidimensional Evaluation for Text Style Transfer Using ChatGPT
Lai, Huiyuan, Toral, Antonio, Nissim, Malvina
We investigate the potential of ChatGPT as a multidimensional evaluator for the task of Text Style Transfer, alongside, and in comparison to, existing automatic metrics as well as human judgments. We focus on a zero-shot setting, i.e. prompting ChatGPT with specific task instructions, and test its performance on three commonly used dimensions of text style transfer evaluation: style strength, content preservation, and fluency. We perform a comprehensive correlation analysis for two transfer directions (and overall) at different levels. Compared to existing automatic metrics, ChatGPT achieves competitive correlations with human judgments. These preliminary results are expected to provide a first glimpse into the role of large language models in the multidimensional evaluation of stylized text generation.
Microsoft shares up 8.3% as AI features give a boost to sales
Microsoft Corp beat Wall Street's quarterly revenue and profit estimates on Tuesday, driven by growth in its cloud computing and Office productivity software businesses, and the company said artificial intelligence products were stimulating sales. The company forecast that revenue in its main segments for the current quarter would match or top Wall Street targets. Shares gained 8.3% in after-market trading following a report by the Redmond, Washington-based technology company that profits were $2.45 a share in the fiscal third quarter, beating Wall Street estimates of $2.23, according to data from Refinitiv and up 10% from the same quarter last year. In regular trading, fears about earnings had sent Microsoft down 2.2%, making it the biggest drag on the S&P 500 on Tuesday ahead of its report. Revenue rose 7% to $52.9bn in the quarter ended March, inching past the average analyst estimate of $51.02bn, according to Refinitiv.
U.S. Law-Enforcement Agencies Seek to Combat AI Bias
U.S. law-enforcement officials said Tuesday they are resolved to combat discrimination and bias arising from the use of artificial intelligence in areas such as lending, housing and hiring, as growing adoption of automated systems such as ChatGPT gains attention from Washington. "There is not an exemption in our nation's civil-rights laws for new technologies," said Rohit Chopra, director of the Consumer Financial Protection Bureau, on a call with reporters. "Companies must take responsibility for the use of these tools."
OpenAI improves ChatGPT privacy with new data controls
The company announced today that the AI chatbot's users can now turn off their chat histories, preventing their input from being used for training data. The controls, which roll out "starting today," can be found under ChatGPT user settings under a new section labeled Data Controls. After toggling the switch off for "Chat History & Training," you'll no longer see your recent chats in the sidebar. Even with the history and training turned off, OpenAI says it will still store your chats for 30 days. It does this to prevent abuse, with the company saying it will only review them if it needs to monitor them.