 Large Language Model


The technology behind ChatGPT is about to get even more powerful

#artificialintelligence

Nearly four months after OpenAI stunned the tech industry with ChatGPT, the company is releasing its next-generation version of the technology that powers the viral chatbot tool. In a blog post on Tuesday, OpenAI unveiled GPT-4, which the company says is capable of performing well on a range of standardized tests and is also less likely to "go off the guardrails" with its responses, as some users have previously experienced. OpenAI said the updated technology passed a simulated law school bar exam with a score around the top 10% of test takers; by contrast, the prior version, GPT-3.5, scored around the bottom 10%. GPT-4 can also read, analyze or generate up to 25,000 words of text, and write code in all major programming languages, according to the company. OpenAI described the update as the "latest milestone" for the company.


STEM versus STEAM According to ChatGPT - fullSTEAMahead365

#artificialintelligence



AI's ascendance seems unfazed by SVB mess

#artificialintelligence

This installment will be brief because I have to finish prepping for today's TechCrunch Live with Arianna Huffington of Thrive Global and Mamoon Hamid of Kleiner Perkins. Since we scheduled the conversation, a few things have happened, so I need to retool my notes and questions. The Exchange explores startups, markets and money. Read it every morning on TechCrunch or get The Exchange newsletter every Saturday. The tech-narrative whiplash is actually what I want to talk about this morning.


Report: Microsoft cut a key AI ethics team

#artificialintelligence

An entire team responsible for making sure that Microsoft's AI products are shipped with safeguards to mitigate social harms was cut during the company's most recent layoff of 10,000 employees, Platformer reported. Former employees said that the ethics and society team was a critical part of Microsoft's strategy to reduce risks associated with using OpenAI technology in Microsoft products. Before it was killed off, the team developed an entire "responsible innovation toolkit" to help Microsoft engineers forecast what harms could be caused by AI, and then to diminish those harms. Platformer's report came just before OpenAI released possibly its most powerful AI model yet, GPT-4, which is already helping to power Bing search, Reuters reported. In a statement provided to Ars, Microsoft said that it remains "committed to developing AI products and experiences safely and responsibly, and does so by investing in people, processes, and partnerships that prioritize this."


PwC's 4,000 legal staffers get AI assistant as law chatbots gain steam

#artificialintelligence

PwC said it partnered with AI startup Harvey for an initial 12-month contract, which the accounting and consulting firm said will help lawyers with contract analysis, regulatory compliance work, due diligence and other legal advisory and consulting services. PwC said it will also determine ways for tax professionals to use the technology. It said its access to Harvey's technology is exclusive among the Big Four professional services firms. Harvey is built on technology from OpenAI, the Microsoft Corp-backed startup that on Tuesday released an upgraded version of its AI sensation ChatGPT. Harvey received a $5 million investment last year in a funding round led by the OpenAI Startup Fund.


Human-Guided Fair Classification for Natural Language Processing

arXiv.org Artificial Intelligence

Text classifiers have promising applications in high-stakes tasks such as resume screening and content moderation. These classifiers must be fair and avoid discriminatory decisions by being invariant to perturbations of sensitive attributes such as gender or ethnicity. However, there is a gap between human intuition about these perturbations and the formal similarity specifications capturing them. While existing research has started to address this gap, current methods are based on hardcoded word replacements, resulting in specifications with limited expressivity or ones that fail to fully align with human intuition (e.g., in cases of asymmetric counterfactuals). This work proposes novel methods for bridging this gap by discovering expressive and intuitive individual fairness specifications. We show how to leverage unsupervised style transfer and GPT-3's zero-shot capabilities to automatically generate expressive candidate pairs of semantically similar sentences that differ along sensitive attributes. We then validate the generated pairs via an extensive crowdsourcing study, which confirms that many of these pairs align with human intuition about fairness in the context of toxicity classification. Finally, we show how limited amounts of human feedback can be leveraged to learn a similarity specification that can be used to train downstream fairness-aware models.


Towards the Scalable Evaluation of Cooperativeness in Language Models

arXiv.org Artificial Intelligence

It is likely that AI systems driven by pre-trained language models (PLMs) will increasingly be used to assist humans in high-stakes interactions with other agents, such as negotiation or conflict resolution. Consistent with the goals of Cooperative AI (Dafoe et al., 2020), we wish to understand and shape the multi-agent behaviors of PLMs in a pro-social manner. An important first step is the evaluation of model behaviour across diverse cooperation problems. Since desired behaviour in an interaction depends upon precise game-theoretic structure, we focus on generating scenarios with particular structures with both crowdworkers and a language model. Our work proceeds as follows. First, we discuss key methodological issues in the generation of scenarios corresponding to particular game-theoretic structures. Second, we employ both crowdworkers and a language model to generate such scenarios. We find that the quality of generations tends to be mediocre in both cases. We additionally get both crowdworkers and a language model to judge whether given scenarios align with their intended game-theoretic structure, finding mixed results depending on the game. Third, we provide a dataset of scenarios based on our generated data. We provide both quantitative and qualitative evaluations of UnifiedQA and GPT-3 on this dataset. We find that instruct-tuned models tend to act in a way that could be perceived as cooperative when scaled up, while other models seem to have flat scaling trends.


Automatic Geo-alignment of Artwork in Children's Story Books

arXiv.org Artificial Intelligence

A study was conducted to show that AI software could be used to translate and generate illustrations without any human intervention. This was done with the purpose of showing and distributing it to the external customer, Pratham Books. The project aligns with the company's vision by leveraging the generalisation and scalability of Machine Learning algorithms, offering significant cost-efficiency gains to a wide range of literary audiences in varied geographical locations. A comparative study methodology was utilised to determine the best-performing of the three methods devised: Prompt Augmentation using Keywords, CLIP Embedding Mask, and Cross Attention Control with Editorial Prompts. A thorough evaluation process was completed using both quantitative and qualitative measures. Each method had its own strengths and weaknesses, but through the evaluation, method 1 was found to yield the best results. Promising future advancements may be made to further increase image quality by incorporating Large Language Models and personalised stylistic models. The presented approach can also be adapted to Video and 3D sculpture generation for novel illustrations in digital webbooks.


A Short Survey of Viewing Large Language Models in Legal Aspect

arXiv.org Artificial Intelligence

Large language models (LLMs) have transformed many fields, including natural language processing, computer vision, and reinforcement learning. These models have also made a significant impact in the field of law, where they are being increasingly utilized to automate various legal tasks, such as legal judgement prediction, legal document analysis, and legal document writing. However, the integration of LLMs into the legal field has also raised several legal problems, including privacy concerns, bias, and explainability. In this survey, we explore the integration of LLMs into the field of law. We discuss the various applications of LLMs in legal tasks, examine the legal challenges that arise from their use, and explore the data resources that can be used to specialize LLMs in the legal domain. Finally, we discuss several promising directions and conclude this paper. By doing so, we hope to provide an overview of the current state of LLMs in law and highlight the potential benefits and challenges of their integration.


HIVE: Harnessing Human Feedback for Instructional Visual Editing

arXiv.org Artificial Intelligence

Incorporating human feedback has been shown to be crucial for aligning text generated by large language models with human preferences. We hypothesize that state-of-the-art instructional image editing models, where outputs are generated based on an input image and an editing instruction, could similarly benefit from human feedback, as their outputs may not adhere to the correct instructions and preferences of users. In this paper, we present a novel framework to harness human feedback for instructional visual editing (HIVE). Specifically, we collect human feedback on the edited images and learn a reward function to capture the underlying user preferences. We then introduce scalable diffusion model fine-tuning methods that can incorporate human preferences based on the estimated reward. In addition, to mitigate the bias brought by the limitation of data, we contribute a new 1M training dataset, a 3.6K reward dataset for reward learning, and a 1K evaluation dataset to boost the performance of instructional image editing. We conduct extensive empirical experiments quantitatively and qualitatively, showing that HIVE is favored over previous state-of-the-art instructional image editing approaches by a large margin.