 Generative AI


New York Times Sues Microsoft and OpenAI, Alleging Copyright Infringement

WSJ.com: WSJD - Technology

In a complaint filed Wednesday, the Times said the technology companies exploited its content without permission to create their AI products, including OpenAI's humanlike chatbot ChatGPT and Microsoft's Copilot. The tools were trained on millions of pieces of Times content, the suit said, and draw on that material to serve up answers to users' prompts.


The Morning After: Microsoft's big bet on AI in 2023

Engadget

Microsoft, a notoriously conservative and slow-moving giant, is bringing artificial intelligence right into the heart of Windows. But after investing a total of $13 billion in ChatGPT-maker OpenAI (and acquiring a 49 percent stake in the process), will AI actually make its products better? Bing Chat officially kicked off its year of AI, while Copilot, assisting with its AI smarts, subsequently launched on Edge, Microsoft 365 products like Word and PowerPoint and eventually Windows 11. While the AI interactions aren't perfect, the one constant around AI is that everything is changing incredibly quickly. Microsoft has already announced Copilot will be upgraded with the more powerful GPT-4 Turbo and DALL-E 3 models.


Apple's iPhone designer is leaving to work with Jony Ive and Sam Altman on AI hardware

Engadget

Apple's designer exodus continues as product design chief Tang Tan is leaving the company and joining Jony Ive's design firm LoveFrom, according to Bloomberg's Mark Gurman. There, he'll reportedly work on a new artificial intelligence hardware project backed by OpenAI's Sam Altman with the aim of creating devices deploying the latest deep learning technology. Tan was in charge of design for Apple's main products including the iPhone, Watch and AirPods, so his departure leaves a sizable hole. As part of LoveFrom, Tan will act as hardware design lead for the new AI project, with Altman providing the software running underneath. All products are supposedly in the early concept phases, with a focus on devices for the home.


Exploiting the capacity of deep networks only at training stage for nonlinear black-box system identification

arXiv.org Artificial Intelligence

To benefit from the modeling capacity of deep models in system identification without worrying about inference time, this study presents a novel training strategy that uses deep models only at the training stage. For this purpose, two separate models with different structures and goals are employed. The first is a deep generative model that aims to model the distribution of the system output(s), called the teacher model, and the second is a shallow basis function model, named the student model, which is fed the system input(s) to predict the system output(s). That means these isolated paths must reach the same ultimate target. As deep models show great performance in modeling highly nonlinear systems, aligning the representation spaces learned by these two models lets the student model inherit the approximation power of the teacher model. The proposed objective function consists of the objectives of the student and teacher models added to a distance penalty between the learned latent representations. The simulation results on three nonlinear benchmarks show performance comparable to that of the deep architectures examined on the same benchmarks. Algorithmic transparency and structure efficiency are also achieved as byproducts.
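The combined objective described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not give the individual loss terms, so mean squared error stands in for the student objective, the teacher objective, and the latent distance penalty, and the names `combined_objective` and `penalty_weight` are hypothetical.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_objective(
    y_true: Sequence[float],
    y_student: Sequence[float],
    y_teacher: Sequence[float],
    z_student: Sequence[float],
    z_teacher: Sequence[float],
    penalty_weight: float = 1.0,
) -> float:
    """Student loss + teacher loss + weighted latent distance.

    Assumption: every term is an MSE; the paper may use different
    objectives (for example a likelihood-based teacher loss).
    """
    student_loss = mse(y_student, y_true)        # student predicts output from input
    teacher_loss = mse(y_teacher, y_true)        # stand-in for the generative teacher objective
    latent_distance = mse(z_student, z_teacher)  # penalty that aligns the two latent spaces
    return student_loss + teacher_loss + penalty_weight * latent_distance
```

Only the student's path would be needed at inference time under this reading; the teacher and the penalty term matter during training alone.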


StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data

arXiv.org Artificial Intelligence

The remarkable multimodal capabilities demonstrated by OpenAI's GPT-4 have sparked significant interest in the development of multimodal Large Language Models (LLMs). A primary research objective of such models is to align visual and textual modalities effectively while comprehending human instructions. Current methodologies often rely on annotations derived from benchmark datasets to construct image-dialogue datasets for training purposes, akin to instruction tuning in LLMs. However, these datasets often exhibit domain bias, potentially constraining the generative capabilities of the models. In an effort to mitigate these limitations, we propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning. This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models to yield a diverse and controllable dataset with varied image content. Additionally, datasets can be arbitrarily scaled. This not only provides greater flexibility compared to existing methodologies but also significantly enhances several model capabilities. Our research includes comprehensive experiments conducted on various datasets. The results emphasize substantial enhancements in more than ten commonly assessed capabilities. Additionally, our model achieves state-of-the-art results across multiple widely recognized multimodal benchmarks.


A Pathway Towards Responsible AI Generated Content

arXiv.org Artificial Intelligence

AI Generated Content (AIGC) has received tremendous attention within the past few years, with content generated in the format of image, text, audio, video, etc. Meanwhile, AIGC has become a double-edged sword and recently received much criticism regarding its responsible usage. In this article, we focus on 8 main concerns that may hinder the healthy development and deployment of AIGC in practice, including risks from (1) privacy; (2) bias, toxicity, misinformation; (3) intellectual property (IP); (4) robustness; (5) open source and explanation; (6) technology abuse; (7) consent, credit, and compensation; (8) environment. Additionally, we provide insights into the promising directions for tackling these risks while constructing generative models, enabling AIGC to be used more responsibly to truly benefit society.


Microsoft's Copilot AI assistant arrives on Android

Engadget

Microsoft's Copilot tool, the company's AI chatbot that can do everything from help you write code to draft a marketing email, has made its way onto Android mobile devices. Copilot, which is powered by OpenAI's latest models GPT-4 and DALL-E 3, can also be used to generate images from simple text descriptions and requests. The app is available on the Google Play Store, is free to download and does not require a Microsoft account to sign in. The rollout of a mobile version of Microsoft's Copilot (formerly Bing Chat) was quiet -- with little buzz and no formal announcements, unlike what we saw with the release of Bing Chat on mobile devices. The new Copilot app was released earlier this month and was initially spotted by Neowin when X users noticed it in the Play Store.


Microsoft bet big on AI in 2023, but its AI future is still unclear

Engadget

And in my testing, it also crashes more often than you'd think, which requires a "reboot" of your session (but at least it doesn't flash a blue screen like Windows). In an effort to temper our expectations, Microsoft has a helpful note emblazoned atop Bing's AI chat: "Bing is powered by AI, so surprises and mistakes are possible. Please share feedback so we can improve!" Microsoft appears to show a bit of humility here by acknowledging that its AI chat isn't perfect, and it's trying to earn some brownie points by saying it's listening to your feedback. Mostly, though, that warning serves as a way out for Microsoft.


How to Use OpenAI's ChatGPT to Create Your Own Custom GPT

WIRED

I was never afraid to train an AI chatbot on my writing, because OpenAI had already broken the seal. CEO Sam Altman announced the "GPT" feature at OpenAI's first developer day in November, prior to the company's five days of leadership chaos. Before the release of custom GPTs, ChatGPT with web browsing was already able to plunder my writing for answers to questions about everything, from using better prompts to understanding niche creepypastas. Why not wrestle around with the chatbot and see if it can mimic me tout à fait? Together, let's see how far we can trek into the uncanny valley with AI and learn how to make one of these so-called GPTs using OpenAI's tools.


One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications

arXiv.org Artificial Intelligence

The prevalent use of commercial and open-source diffusion models (DMs) for text-to-image generation prompts risk mitigation to prevent undesired behaviors. Existing concept erasing methods in academia are all based on full parameter or specification-based fine-tuning, from which we observe the following issues: 1) Generation alteration towards erosion: Parameter drift during target elimination causes alterations and potential deformations across all generations, even eroding other concepts to varying degrees, which is more evident when multiple concepts are erased; 2) Transfer inability & deployment inefficiency: Previous model-specific erasure impedes the flexible combination of concepts and the training-free transfer towards other models, resulting in linear cost growth as the deployment scenarios increase. To achieve non-invasive, precise, customizable, and transferable elimination, we ground our erasing framework on one-dimensional adapters to erase multiple concepts from most DMs at once across versatile erasing applications. The concept-SemiPermeable structure is injected as a Membrane (SPM) into any DM to learn targeted erasing, and meanwhile the alteration and erosion phenomenon is effectively mitigated via a novel Latent Anchoring fine-tuning strategy. Once obtained, SPMs can be flexibly combined and plugged into other DMs without specific re-tuning, enabling timely and efficient adaptation to diverse scenarios. During generation, our Facilitated Transport mechanism dynamically regulates the permeability of each SPM to respond to different input prompts, further minimizing the impact on other concepts. Quantitative and qualitative results across ~40 concepts, 7 DMs and 4 erasing applications have demonstrated the superior erasing of SPM. Our code and pre-tuned SPMs will be available on the project page https://lyumengyao.github.io/projects/spm.
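One way to picture a one-dimensional adapter with an adjustable permeability is the rough sketch below. It is an assumption-laden illustration, not the authors' code: the abstract does not specify the adapter's form, so a generic rank-one additive update to a frozen weight matrix is used here, and the scalar `gate` is a hypothetical stand-in for the prompt-dependent permeability. The function name `apply_rank_one_adapter` is also hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def apply_rank_one_adapter(
    weight: Matrix,
    up: Sequence[float],
    down: Sequence[float],
    gate: float,
) -> Matrix:
    """Return weight + gate * outer(up, down) without modifying `weight`.

    Assumptions: `up` has one entry per row of `weight`, `down` has one
    entry per column, and `gate` in [0, 1] scales how strongly the
    adapter acts (0 leaves the base model untouched).
    """
    if len(up) != len(weight) or any(len(row) != len(down) for row in weight):
        raise ValueError("adapter vectors do not match the weight shape")
    return [
        [w + gate * u * d for w, d in zip(row, down)]
        for row, u in zip(weight, up)
    ]


# Usage: the base weight stays frozen; only the two vectors would be trained.
base = [[1.0, 0.0], [0.0, 1.0]]
erased = apply_rank_one_adapter(base, up=[1.0, 2.0], down=[3.0, 4.0], gate=1.0)
untouched = apply_rank_one_adapter(base, up=[1.0, 2.0], down=[3.0, 4.0], gate=0.0)
```

Because the update is additive and the base weights are never overwritten, several such adapters could in principle be summed or moved to another model with matching layer shapes, which is the kind of composability the abstract describes.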