"I made this (sort of)": Negotiating authorship, confronting fraudulence, and exploring new musical spaces with prompt-based AI music generation

Sturm, Bob L. T.

arXiv.org Artificial Intelligence

I reflect on my experience creating two music albums centered on state-of-the-art prompt-based AI music generation platforms. The first album explicitly poses the question: What happens when I collide my junk mail with these platforms? The second album is a direct response to the first, and toys with the inability of state-of-the-art prompt-based AI music generation platforms to generate music that is not "practiced", "polished", and "produced". I seed a large language model (LLM) with information about these albums and have it interview me, which results in the exploration of several deeper questions: To what extent am I the author? Where am I in the resulting music? How is my musical identity changing as I am faced with machines that are in some ways far more talented than I? What new musical spaces does my work open, for me or anyone/thing else? I conclude by reflecting on my reflections, as well as LLM-mediated self-reflection as method.


On a measure of intelligence

Gurevich, Yuri

arXiv.org Artificial Intelligence

The measure of intelligence is the ability to change. Abstract: The Fall 2024 Logic in Computer Science column of the Bulletin of EATCS is a little discussion on intelligence, measuring intelligence, and related issues, provoked by a fascinating must-read article "On the measure of intelligence" by François Chollet. The discussion includes a modicum of critique of the article. Q: Is it about psychology? Chollet is a prominent figure in AI. Q: We spoke about AI last spring. But you didn't seem to be interested in AI before that. A: This is largely correct, though I read Norbert Wiener's "Cybernetics" [18], when it was translated to Russian in 1968, and was taken with it. For a while I tried to follow cybernetics developments, at least in the USSR.


A Surprisingly Efficient Representation for Multi-Finger Grasping

Yan, Hengxu, Fang, Hao-Shu, Lu, Cewu

arXiv.org Artificial Intelligence

The problem of grasping objects using a multi-finger hand has received significant attention in recent years. However, it remains challenging to handle a large number of unfamiliar objects in real and cluttered environments. In this work, we propose a representation that can be effectively mapped to the multi-finger grasp space. Based on this representation, we develop a simple decision model that generates accurate grasp quality scores for different multi-finger grasp poses using only hundreds to thousands of training samples. We demonstrate that our representation performs well on a real robot and achieves a success rate of 78.64% after training with only 500 real-world grasp attempts and 87% with 4500 grasp attempts. Additionally, we achieve a success rate of 84.51% in a dynamic human-robot handover scenario using a multi-finger hand.


Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent

Jucys, Karolis, Adamopoulos, George, Hamidi, Mehrab, Milani, Stephanie, Samsami, Mohammad Reza, Zholus, Artem, Joseph, Sonia, Richards, Blake, Rish, Irina, Şimşek, Özgür

arXiv.org Artificial Intelligence

Understanding the mechanisms behind decisions taken by large foundation models in sequential decision making tasks is critical to ensuring that such systems operate transparently and safely. In this work, we perform exploratory analysis on the Video PreTraining (VPT) Minecraft playing agent, one of the largest open-source vision-based agents. We aim to illuminate its reasoning mechanisms by applying various interpretability techniques. First, we analyze the attention mechanism while the agent solves its training task - crafting a diamond pickaxe. The agent pays attention to the last four frames and several key-frames further back in its six-second memory. This is a possible mechanism for maintaining coherence in a task that takes 3-10 minutes, despite the short memory span. Secondly, we perform various interventions, which help us uncover a worrying case of goal misgeneralization: VPT mistakenly identifies a villager wearing brown clothes as a tree trunk when the villager is positioned stationary under green tree leaves, and punches it to death.


Development of a Novel Impedance-Controlled Quasi-Direct-Drive Robot Hand

Best, Jay

arXiv.org Artificial Intelligence

Master of Science in Mechanical Engineering, Stony Brook University, 2023. Most robotic hands and grippers rely on actuators with large gearboxes and force sensors for controlling gripping force. However, this might not be ideal for tasks which require the robot to interact with an unstructured and/or unknown environment. We propose a novel quasi-direct-drive two-fingered robotic hand with variable impedance control in the joint space and Cartesian space. The hand has a total of four degrees of freedom, a backdrivable gear train, and four brushless direct current (BLDC) motors. Field-Oriented Control (FOC) with current sensing is used to control motor torques. Variable impedance control allows the hand to perform dexterous manipulation tasks while being safe during human-robot interaction. The quasi-direct-drive actuators enable the fingers to handle contact with the environment without the need for complicated tactile or force sensors. A mostly 3D-printed assembly makes this a low-cost research platform built with affordable off-the-shelf components. The hand demonstrates grasping with force-closure and form-closure, stable grasps in response to disturbances, tasks exploiting contact with the environment, simple in-hand manipulation, and a light touch for handling fragile objects.
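To make the idea of variable impedance control in joint space concrete, here is a minimal sketch of the textbook spring-damper law, tau = K (q_des - q) + D (dq_des - dq). It is not taken from the thesis: the class name, the gain values, and the single-joint simplification are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class JointImpedance:
    """Hypothetical single-joint impedance controller (illustrative only)."""
    stiffness: float  # K in N*m/rad; assumed value, not from the thesis
    damping: float    # D in N*m*s/rad; assumed value, not from the thesis

    def torque(self, q_des, q, dq_des=0.0, dq=0.0):
        # Textbook spring-damper law: tau = K*(q_des - q) + D*(dq_des - dq)
        return self.stiffness * (q_des - q) + self.damping * (dq_des - dq)


# "Variable" impedance here simply means choosing different gains per task.
soft = JointImpedance(stiffness=0.5, damping=0.02)   # light touch
stiff = JointImpedance(stiffness=5.0, damping=0.05)  # firm grasp

# Same 0.1 rad position error with the joint at rest.
tau_soft = soft.torque(q_des=0.1, q=0.0)
tau_stiff = stiff.torque(q_des=0.1, q=0.0)

# Damping term alone: joint moving at 1 rad/s with no position error.
tau_damp = soft.torque(q_des=0.0, q=0.0, dq_des=0.0, dq=1.0)
```

With low gains the commanded torque for a given position error stays small, which is one way to get a gentle response on contact; raising the gains makes the same error produce a firmer response.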


Generating symbolic music using diffusion models

Atassi, Lilac

arXiv.org Artificial Intelligence

Denoising diffusion probabilistic models have emerged as simple yet very powerful generative models. Unlike other generative models, diffusion models do not suffer from mode collapse or require a discriminator to generate high-quality samples. In this paper, a diffusion model that uses a binomial prior distribution to generate piano rolls is proposed. The paper also proposes an efficient method to train the model and generate samples. The generated music has coherence at time scales up to the length of the training piano roll segments. The paper demonstrates how this model is conditioned on the input and can be used to harmonize a given melody, complete an incomplete piano roll, or generate a variation of a given piece. The code is publicly shared to encourage the use and development of the method by the community.
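As a rough illustration of what diffusing a binary piano roll toward a binomial prior can look like, the sketch below applies the standard binomial forward kernel from the diffusion literature, in which each cell becomes 1 with probability x(1 - beta) + 0.5 beta. Whether the paper uses exactly this kernel is an assumption; the function names, the toy piano roll, and the noise schedule are hypothetical.

```python
import random


def binomial_forward_step(roll, beta, rng):
    """One forward noising step on a binary piano roll (list of 0/1 rows).

    Each cell x becomes 1 with probability x*(1 - beta) + 0.5*beta, so
    beta = 0 leaves the roll unchanged and beta = 1 yields fair coin flips.
    """
    return [
        [1 if rng.random() < x * (1.0 - beta) + 0.5 * beta else 0 for x in row]
        for row in roll
    ]


def diffuse(roll, betas, rng):
    """Apply the forward step once per entry of the (assumed) schedule."""
    for beta in betas:
        roll = binomial_forward_step(roll, beta, rng)
    return roll


# Toy piano roll: 3 pitches x 8 time steps (hypothetical data).
toy_roll = [
    [1, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

rng = random.Random(0)
unchanged = diffuse(toy_roll, betas=[0.0, 0.0], rng=rng)
noised = diffuse(toy_roll, betas=[0.1, 0.3, 0.6, 1.0], rng=rng)
```

A generative model of this kind would be trained to reverse such steps, starting from pure coin-flip noise; that reverse model is not sketched here.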


Understand an AI Algorithm That Can See

#artificialintelligence

This can be through the Face ID of your phone, the last Google search you did, or the movie that you chose to watch last night. AI is a huge trend currently. This is why I decided to understand how it works. And I don't just mean reading about it. I decided to program an AI algorithm (for some context, I barely know how to code).


How Does DALL·E mini Work?

#artificialintelligence

I explain Artificial Intelligence terms and news to non-experts. DALL·E mini is amazing -- and YOU can use it! I'm sure you've seen pictures like those in your Twitter feed in the past few days. If you wondered what they were, they are images generated by an AI called DALL·E mini. If you've never seen those, you need to watch this video because you are missing out.


Google's New AI Creates Summaries of Your Documents in Google Docs

#artificialintelligence

Google recently announced a new machine learning model that automatically generates summaries, released in Google Docs, where you can already use it. The model will try to understand the whole document and generate a short summary of the piece--something some movie professionals clearly still can't do. The model needs to do two things to achieve that, which you will learn in the video below! Read the full article: https://www.louisbouchard.ai/google-docs-summary/


Here's How You Can Learn AI And Machine Learning For Free

#artificialintelligence

AI refers to computational tools that can perform certain tasks in place of human intelligence. Technology is advancing at a breakneck speed, much like the exponential growth experienced by database technology in the late twentieth century. As a result, databases have evolved into the core infrastructure for enterprise-level software. In a similar fashion, most of the new value added in software over the coming decades is expected to come from AI, at least in part. AI is used everywhere, from modern smartphones to advanced quantum computers. Hence, in this day and age, it's crucial that you understand the basic, if not the advanced, concepts involved in artificial intelligence.