AITopics

doi: 10.1145/3600211.3604712

2304.09991

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Texas (0.04)
(7 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.68)
Personal > Interview (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Government (0.68)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Daily Mail - Science & tech, Nov-29-2023, 09:36:30 GMT

AI popstar Anna Indiana is ridiculed for her first single - so, do YOU think it deserves the hate?

Critics might complain that modern pop music is soulless and artificial - but a new 'AI popstar' takes that to a whole new level. Anna Indiana, a self-described AI singer-songwriter, has been ridiculed after releasing her first single. In a video posted to YouTube, Anna performs a pop song to a backing track of piano, guitar, and drums. Introducing itself, the AI explains: 'Everything from the key, tempo, chord progression, melody notes, rhythm, lyrics, and my image and singing, is auto-generated using AI.' However, music fans have not reacted well to the release, calling it 'horrifying' and 'unnerving'.

anna indiana, commenter, indiana, (12 more...)

Country: North America > United States > Indiana (0.68)

Genre:

Questionnaire & Opinion Survey (0.40)
Personal > Interview (0.40)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Communications > Social Media (0.59)
Information Technology > Artificial Intelligence > Applied AI (0.36)

Nasr, Milad, Carlini, Nicholas, Hayase, Jonathan, Jagielski, Matthew, Cooper, A. Feder, Ippolito, Daphne, Choquette-Choo, Christopher A., Wallace, Eric, Tramèr, Florian, Lee, Katherine

Scalable Extraction of Training Data from (Production) Language Models

This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150x higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.

large language model, machine learning, natural language, (22 more...)

2311.17035

Country:

South America (1.00)
North America > United States > California (1.00)
Asia > Middle East (1.00)
(39 more...)

Genre:

Personal (1.00)
Research Report > New Finding (0.92)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Media > Television (1.00)
(26 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

RETSim: Resilient and Efficient Text Similarity

Zhang, Marina, Vallis, Owen, Bumin, Aysegul, Vakharia, Tanay, Bursztein, Elie

This paper introduces RETSim (Resilient and Efficient Text Similarity), a lightweight, multilingual deep learning model trained to produce robust metric embeddings for near-duplicate text retrieval, clustering, and dataset deduplication tasks. We demonstrate that RETSim is significantly more robust and accurate than MinHash and neural text embeddings, achieving new state-of-the-art performance on dataset deduplication, adversarial text retrieval benchmarks, and spam clustering tasks. We also introduce the W4NT3D benchmark (Wiki-40B 4dversarial Near-T3xt Dataset) for evaluating multilingual, near-duplicate text retrieval capabilities under adversarial settings. RETSim and the W4NT3D benchmark are open-sourced under the MIT License at https://github.com/google/unisim.

benchmark, dataset, retsim, (12 more...)

2311.17264

Country:

North America > United States > Montana > Gallatin County > Bozeman (0.04)
North America > United States > Idaho (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
(7 more...)

Genre:

Personal > Obituary (0.68)
Research Report (0.41)

Industry:

Information Technology > Security & Privacy (0.68)
Media (0.68)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models

Zhou, Jinfeng, Chen, Zhuang, Wan, Dazhen, Wen, Bosi, Song, Yi, Yu, Jifan, Huang, Yongkang, Peng, Libiao, Yang, Jiaming, Xiao, Xiyao, Sabour, Sahand, Zhang, Xiaohan, Hou, Wenjing, Zhang, Yijia, Dong, Yuxiao, Tang, Jie, Huang, Minlie

In this paper, we present CharacterGLM, a series of models built upon ChatGLM, with model sizes ranging from 6B to 66B parameters. Our CharacterGLM is designed for generating Character-based Dialogues (CharacterDial), which aims to equip a conversational AI system with character customization for satisfying people's inherent social desires and emotional needs. On top of CharacterGLM, we can customize various AI characters or social agents by configuring their attributes (identities, interests, viewpoints, experiences, achievements, social relationships, etc.) and behaviors (linguistic features, emotional expressions, interaction patterns, etc.). Our model outperforms most mainstream closed-source large language models, including the GPT series, especially in terms of consistency, human-likeness, and engagement according to manual evaluations. We will release our 6B version of CharacterGLM and a subset of training data to facilitate further research development in the direction of character-based dialogue generation.

ai character, characterglm, dialogue, (15 more...)

2311.16832

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(5 more...)

Genre:

Research Report (0.64)
Personal > Honors (0.46)

Industry:

Transportation > Ground > Road (0.46)
Transportation > Electric Vehicle (0.46)
Energy > Renewable (0.46)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Nishal, Sachita, Sinchai, Jasmine, Diakopoulos, Nicholas

Understanding Practices around Computational News Discovery Tools in the Domain of Science Journalism

Science and technology journalists today face challenges in finding newsworthy leads due to increased workloads, reduced resources, and expanding scientific publishing ecosystems. Given this context, we explore computational methods to aid these journalists' news discovery in terms of time-efficiency and agency. In particular, we prototyped three computational information subsidies into an interactive tool that we used as a probe to better understand how such a tool may offer utility or more broadly shape the practices of professional science journalists. Our findings highlight central considerations around science journalists' agency, context, and responsibilities that such tools can influence and could account for in design. Based on this, we suggest design opportunities for greater and longer-term user agency; incorporating contextual, personal and collaborative notions of newsworthiness; and leveraging flexible interfaces and generative models. Overall, our findings contribute a richer view of the sociotechnical system around computational news discovery tools, and suggest ways to improve such tools to better support the practices of science journalists.

journalist, news angle, participant, (12 more...)

2311.06864

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York > New York County > New York City (0.04)
(20 more...)

Genre:

Research Report > New Finding (1.00)
Personal > Interview (0.93)

Industry: Media > News (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(5 more...)

Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC with 2-to-8b DNN Acceleration and 30%-Boost Adaptive Body Biasing

Conti, Francesco, Paulin, Gianna, Garofalo, Angelo, Rossi, Davide, Di Mauro, Alfio, Rutishauser, Georg, Ottavi, Gianmarco, Eggimann, Manuel, Okuhara, Hayate, Benini, Luca

Emerging Artificial Intelligence-enabled Internet-of-Things (AI-IoT) Systems-on-Chip (SoCs) for augmented reality, personalized healthcare, and nano-robotics need to run many diverse tasks within a power envelope of a few tens of mW over a wide range of operating conditions: compute-intensive but strongly quantized Deep Neural Network (DNN) inference, as well as signal processing and control requiring high-precision floating-point. We present Marsellus, an all-digital heterogeneous SoC for AI-IoT end-nodes fabricated in GlobalFoundries 22nm FDX that combines 1) a general-purpose cluster of 16 RISC-V Digital Signal Processing (DSP) cores attuned for the execution of a diverse range of workloads exploiting 4-bit and 2-bit arithmetic extensions (XpulpNN), combined with fused MAC&LOAD operations and floating-point support; 2) a 2-8bit Reconfigurable Binary Engine (RBE) to accelerate 3x3 and 1x1 (pointwise) convolutions in DNNs; 3) a set of On-Chip Monitoring (OCM) blocks connected to an Adaptive Body Biasing (ABB) generator and a hardware control loop, enabling on-the-fly adaptation of transistor threshold voltages. Marsellus achieves up to 180 Gop/s or 3.32 Top/s/W on 2-bit precision arithmetic in software, and up to 637 Gop/s or 12.4 Top/s/W on hardware-accelerated DNN layers.

marsellus, efficiency, operation, (16 more...)

doi: 10.1109/JSSC.2023.3318301

2305.08415

Country:

Europe > Switzerland > Zürich > Zürich (0.15)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.05)
Asia > Singapore > Central Region > Singapore (0.04)
(6 more...)

Genre:

Personal (0.46)
Research Report (0.40)

Industry:

Semiconductors & Electronics (1.00)
Information Technology > Smart Houses & Appliances (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

arXiv.org Artificial Intelligence, Nov-27-2023

Tokenized Model: A Blockchain-Empowered Decentralized Model Ownership Verification Platform

Li, Yihao, Lai, Yanyi, Liao, Tianchi, Chen, Chuan, Zheng, Zibin

With the development of practical deep learning models such as generative AI, their excellent performance has created huge economic value. For instance, ChatGPT attracted more than 100 million users in three months. Because model training requires large amounts of data and computing power, a well-performing deep learning model represents a huge investment of effort and cost. Model owners face various model attacks as well as unauthorized use and abuse over the network that threaten their interests, so in addition to legal and other administrative measures, it is equally important to protect a model's copyright through technical means. Using model watermarking technology, we point out the possibility of building a unified platform for model ownership verification. Given the application history of blockchain in copyright verification and the drawbacks of a centralized third party, this paper considers combining model watermarking technology and blockchain to build a unified model copyright protection platform. Our new solution, called Tokenized Model, protects a model's copyright through a reliable ownership record and verification mechanism. It also promotes the financial value of a model by constructing the model's transaction process and the contribution shares of a model. In a typical case study, we also evaluate performance under common scenarios to verify the effectiveness of the platform.

blockchain, verification, watermark, (14 more...)

2312.00048

Country:

Asia > China > Guangdong Province > Guangzhou (0.05)
Asia > China > Hong Kong (0.05)

Genre:

Research Report (0.50)
Personal (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Nov-27-2023

Releasing the CRaQAn (Coreference Resolution in Question-Answering): An open-source dataset and dataset creation methodology using instruction-following models

Grzywinski, Rob, D'Arcy, Joshua, Naidoff, Rob, Shukla, Ashish, Browne, Alex, Gibbons, Ren, Bent, Brinnae

Instruction-following language models demand robust methodologies for information retrieval to augment instructions for question-answering applications. A primary challenge is the resolution of coreferences in the context of chunking strategies for long documents. A critical barrier to experimenting with coreference handling is the lack of open-source datasets, specifically for question-answering tasks that require coreference resolution. In this work we present our Coreference Resolution in Question-Answering (CRaQAn) dataset, an open-source dataset that caters to the nuanced information retrieval requirements of coreference resolution in question-answering tasks by providing over 250 question-answer pairs containing coreferences. To build this dataset, we developed a novel approach for creating high-quality datasets using an instruction-following model (GPT-4) and a Recursive Criticism and Improvement Loop.

coreference resolution, dataset, reviewer, (11 more...)

2311.16338

Country:

North America > United States > California (0.04)
South America (0.04)
Atlantic Ocean > North Atlantic Ocean (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)

Genre:

Research Report (0.84)
Personal > Honors (0.68)

Industry:

Health & Medicine > Government Relations & Public Policy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law (0.93)
Health & Medicine > Health Care Providers & Services > Reimbursement (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

FOX News, Nov-26-2023, 10:00:05 GMT

'Love Actually' 20th anniversary: Keira Knightley, Hugh Grant and Colin Firth then and now

"Love Actually" is celebrating its 20th anniversary. The classic romantic comedy follows nine love stories, including tales about a woman whose husband is cheating on her, a man who's in love with his best friend's wife, and a little boy trying to profess his love to his crush. In its initial run, the movie made nearly $250 million at the global box office, and has since become a must-watch movie during the holiday season. Here is what the film's cast has been up to since its November 2003 release date.

nomination, sequel, universal picture getty image, (14 more...)

Country:

Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.05)
Africa > Rwanda (0.05)

Genre: Personal > Obituary (0.73)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence (0.48)