Large Language Model
New Trends in Machine Translation using Large Language Models: Case Examples with ChatGPT
Lyu, Chenyang, Xu, Jitao, Wang, Longyue
Machine Translation (MT) has made significant progress in recent years using deep learning, especially after the emergence of large language models (LLMs) such as GPT-3 and ChatGPT. This brings new challenges and opportunities for MT using LLMs. In this paper, we brainstorm some interesting directions for MT using LLMs, including stylized MT, interactive MT, and Translation Memory-based MT, as well as a new evaluation paradigm using LLMs. We also discuss the privacy concerns in MT using LLMs and a basic privacy-preserving method to mitigate such risks. To illustrate the potential of our proposed directions, we present several examples for the new directions mentioned above, demonstrating their feasibility and highlighting the opportunities and challenges for future research in MT using LLMs.
Automated Paper Screening for Clinical Reviews Using Large Language Models
Guo, Eddie, Gupta, Mehul, Deng, Jiawen, Park, Ye-Jean, Paget, Mike, Naugler, Christopher
Objective: To assess the performance of the OpenAI GPT API in accurately and efficiently identifying relevant titles and abstracts from real-world clinical review datasets and compare its performance against ground truth labelling by two independent human reviewers. Methods: We introduce a novel workflow using the OpenAI GPT API for screening titles and abstracts in clinical reviews. A Python script was created to make calls to the GPT API with the screening criteria in natural language and a corpus of title and abstract datasets that have been filtered by a minimum of two human reviewers. We compared the performance of our model against human-reviewed papers across six review papers, screening over 24,000 titles and abstracts. Results: Our results show an accuracy of 0.91, a sensitivity of excluded papers of 0.91, and a sensitivity of included papers of 0.76. On a randomly selected subset of papers, the GPT API demonstrated the ability to provide reasoning for its decisions and corrected its initial decision upon being asked to explain its reasoning for a subset of incorrect classifications. Conclusion: The GPT API has the potential to streamline the clinical review process, save valuable time and effort for researchers, and contribute to the overall quality of clinical reviews. By prioritizing the workflow and acting as an aid rather than a replacement for researchers and reviewers, the GPT API can enhance efficiency and lead to more accurate and reliable conclusions in medical research.
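The metrics reported in the abstract above can be illustrated with a short sketch. It is not the authors' script: the function name `screening_metrics`, the toy labels, and the reading of "sensitivity of excluded papers" as the share of human-excluded papers that the model also excluded are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def screening_metrics(human: Sequence[bool], model: Sequence[bool]) -> Dict[str, float]:
    """Compare model include/exclude decisions against human labels.

    True means "include", False means "exclude". (Hypothetical helper,
    not taken from the paper.)
    """
    if len(human) != len(model):
        raise ValueError("human and model label lists must have equal length")
    if not human:
        raise ValueError("at least one labelled paper is required")

    pairs = list(zip(human, model))
    tp = sum(1 for h, m in pairs if h and m)          # both include
    tn = sum(1 for h, m in pairs if not h and not m)  # both exclude
    fn = sum(1 for h, m in pairs if h and not m)      # model missed an include
    fp = sum(1 for h, m in pairs if not h and m)      # model kept an exclude

    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Assumed reading: share of human-included papers the model included.
        "sensitivity_included": tp / (tp + fn) if (tp + fn) else nan,
        # Assumed reading: share of human-excluded papers the model excluded.
        "sensitivity_excluded": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy labels for five papers (made up; not data from the study).
human_labels = [True, True, False, False, False]
model_labels = [True, False, False, False, True]
metrics = screening_metrics(human_labels, model_labels)
print(metrics)
```

Reporting the two sensitivities separately matters because included papers are usually a small minority in screening datasets, so a high overall accuracy can coexist with a noticeably lower sensitivity for included papers, as in the figures quoted above.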
How ChatGPT and Other LLMs Work--and Where They Could Go Next
AI-powered chatbots such as ChatGPT and Google Bard are certainly having a moment--the next generation of conversational software tools promises to do everything from taking over our web searches to producing an endless supply of creative literature to remembering all the world's knowledge so we don't have to. ChatGPT, Google Bard, and other bots like them are examples of large language models, or LLMs, and it's worth digging into how they work. It means you'll be able to make better use of them, and have a better appreciation of what they're good at (and what they really shouldn't be trusted with). Like a lot of artificial intelligence systems--like the ones designed to recognize your voice or generate cat pictures--LLMs are trained on huge amounts of data. The companies behind them have been rather circumspect when it comes to revealing where exactly that data comes from, but there are certain clues we can look at. For example, the research paper introducing the LaMDA (Language Model for Dialogue Applications) model, which Bard is built on, mentions Wikipedia, "public forums," and "code documents from sites related to programming like Q&A sites, tutorials, etc."
AI journalism is getting harder to tell from the old-fashioned, human-generated kind
Ian Tucker
A couple of weeks ago I tweeted a call-out for freelance journalists to pitch me feature ideas for the science and technology section of the Observer's New Review. Unsurprisingly, given headlines, fears and interest in LLM (large language model) chatbots such as ChatGPT, many of the suggestions that flooded in focused on artificial intelligence – including a pitch about how it is being employed to predict deforestation in the Amazon. One submission, however, from an engineering student who had posted a couple of articles on Medium, seemed to be riding the artificial intelligence wave with more chutzpah. He offered three feature ideas – pitches on innovative agriculture, data storage and the therapeutic potential of VR. While coherent, the pitches had a bland authority about them, a repetitive paragraph structure, and upbeat endings, which, if you've been toying with ChatGPT or reading about Google chatbot Bard's latest mishaps, are hints of chatbot-generated content.
G7 digital ministers agree to pursue responsible AI as ChatGPT booms
The ministers also agreed to further promote smooth and trustworthy cross-border data flows -- one of Japan's key goals for the two-day G7 tech meeting -- as more countries look to tighten regulations on the flow of data. How to apply rules to the use of generative AI tools is becoming a pressing issue for governments around the world in the wake of the public debut of OpenAI's ChatGPT last November. Since then, the chatbot app has demonstrated its high capacity to handle a variety of tasks, including finding and summarizing information, drafting documents and checking programming code.
Daiwa Securities takes lead in finance sector over ChatGPT use
Brokerage Daiwa Securities has taken the lead among major financial institutions in the country in adopting the ChatGPT chatbot to help its employees work more efficiently. Daiwa Securities began using ChatGPT, which it has described as having "immense potential," from Wednesday, with an eye to streamlining day-to-day tasks including information gathering in English. The firm also said it hopes to see a reduction in costs and time for preparing outsourcing tasks such as creating documents, leaving employees more time to craft business plans and complete other assignments.
SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support
Qiu, Huachuan, He, Hongliang, Zhang, Shuai, Li, Anqi, Lan, Zhenzhong
There has been an increasing research interest in developing specialized dialogue systems that can offer mental health support. However, gathering large-scale and real-life multi-turn conversations for mental health support poses challenges due to the sensitivity of personal information, as well as the time and cost involved. To address these issues, we introduce the SMILE approach, an inclusive language expansion technique that employs ChatGPT to extend public single-turn dialogues into multi-turn ones. Our research first presents a preliminary exploratory study that validates the effectiveness of the SMILE approach. Furthermore, we conduct a comprehensive and systematic contrastive analysis of datasets generated with and without the SMILE approach, demonstrating that the SMILE method results in a large-scale, diverse, and close-to-real-life multi-turn mental health support conversation corpus, including dialog topics, lexical and semantic features. Finally, we use the collected corpus (SMILECHAT) to develop a more effective dialogue system that offers emotional support and constructive suggestions in multi-turn conversations for mental health support.
Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery
Dash, Debadutta, Thapa, Rahul, Banda, Juan M., Swaminathan, Akshay, Cheatham, Morgan, Kashyap, Mehr, Kotecha, Nikesh, Chen, Jonathan H., Gombar, Saurabh, Downing, Lance, Pedreira, Rachel, Goh, Ethan, Arnaout, Angel, Morris, Garret Kenn, Magon, Honor, Lungren, Matthew P, Horvitz, Eric, Shah, Nigam H.
Despite growing interest in using large language models (LLMs) in healthcare, current explorations do not assess the real-world utility and safety of LLMs in clinical settings. Our objective was to determine whether two LLMs can serve information needs submitted by physicians as questions to an informatics consultation service in a safe and concordant manner. Sixty-six questions from an informatics consult service were submitted to GPT-3.5 and GPT-4 via simple prompts. Twelve physicians assessed the LLM responses' possibility of patient harm and concordance with existing reports from an informatics consultation service. Physician assessments were summarized based on majority vote. For no questions did a majority of physicians deem either LLM response as harmful. For GPT-3.5, responses to 8 questions were concordant with the informatics consult report, 20 were discordant, and 9 were unable to be assessed. There were 29 responses with no majority on "Agree", "Disagree", and "Unable to assess". For GPT-4, responses to 13 questions were concordant, 15 were discordant, and 3 were unable to be assessed. There were 35 responses with no majority. Responses from both LLMs were largely devoid of overt harm, but less than 20% of the responses agreed with an answer from an informatics consultation service, responses contained hallucinated references, and physicians were divided on what constitutes harm. These results suggest that while general purpose LLMs are able to provide safe and credible responses, they often do not meet the specific information need of a given question. A definitive evaluation of the usefulness of LLMs in healthcare settings will likely require additional research on prompt engineering, calibration, and custom-tailoring of general purpose models.
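The tallies in the abstract above can be checked with a few lines of arithmetic. The sketch below also shows one possible majority-vote rule; the abstract does not define the rule, so the strict "more than half" threshold, the helper name `majority_label`, and the dictionary keys are assumptions for illustration. Only the counts themselves come from the abstract.

```python
from collections import Counter
from typing import Optional, Sequence

TOTAL_QUESTIONS = 66  # number of questions stated in the abstract


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by more than half of the voters, else None.

    Hypothetical rule: the abstract says "majority vote" without
    specifying a threshold or tie handling.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Counts as reported in the abstract.
gpt35 = {"concordant": 8, "discordant": 20, "unable_to_assess": 9, "no_majority": 29}
gpt4 = {"concordant": 13, "discordant": 15, "unable_to_assess": 3, "no_majority": 35}

for name, tally in (("GPT-3.5", gpt35), ("GPT-4", gpt4)):
    # Each model's four categories should account for all 66 questions.
    assert sum(tally.values()) == TOTAL_QUESTIONS
    share = tally["concordant"] / TOTAL_QUESTIONS
    print(f"{name}: {share:.1%} of responses concordant")

# Example with twelve assessors: 7 of 12 is a majority, 6 of 12 is not.
print(majority_label(["Agree"] * 7 + ["Disagree"] * 5))
print(majority_label(["Agree"] * 6 + ["Disagree"] * 6))
```

Both concordance shares (8 of 66 and 13 of 66) fall below one fifth, which is consistent with the abstract's statement that less than 20% of responses agreed with the consultation service.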
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Casanova, Edresson, Weber, Julian, Shulby, Christopher, Junior, Arnaldo Candido, Gölge, Eren, Ponti, Moacir Antonelli
YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker TTS. Our method builds upon the VITS model and adds several novel modifications for zero-shot multi-speaker and multilingual training. We achieved state-of-the-art (SOTA) results in zero-shot multi-speaker TTS and results comparable to SOTA in zero-shot voice conversion on the VCTK dataset. Additionally, our approach achieves promising results in a target language with a single-speaker dataset, opening possibilities for zero-shot multi-speaker TTS and zero-shot voice conversion systems in low-resource languages. Finally, it is possible to fine-tune the YourTTS model with less than 1 minute of speech and achieve state-of-the-art results in voice similarity and with reasonable quality. This is important to allow synthesis for speakers with a very different voice or recording characteristics from those seen during training.
Chemists are teaching GPT-4 to do chemistry and control lab robots
Language models that power chatbots like ChatGPT can be used for automated chemistry, from synthesising chemicals and discovering drugs to designing, planning and carrying out scientific experiments. Large language models like GPT-4 have been trained on data from much of the internet and appear to be competent at answering problems from a wide range of disciplines, but they can struggle with tasks requiring more expert knowledge, such as chemistry. "They lack this chemical knowledge and they are not really good at representing …