Near-Vana: A 'New' Kurt Cobain Track Appears Courtesy Of Artificial Intelligence


Arriving a symbolic and symmetric 27 years after he died at the age of 27, a "new" Nirvana song has been released. What makes "Drowned In The Sun" very different to "You Know You're Right" – the last track Nirvana recorded in 1994 but which was not released until 2002 – is that Kurt Cobain did not write it and no members of Nirvana played on it. The track in question was created using artificial intelligence (AI) software that analyzed a number of Nirvana tracks in order to mimic their writing, recording and lyrical styles – drawing on vocals by Eric Hogan, lead singer in Nevermind, a Nirvana tribute act. Such digital necromancy comes with a whole host of moral, ethical and musical concerns, but in this case it is part of the Lost Tapes Of The 27 Club project raising awareness of mental health issues in music. The 27 Club refers to that mythologized grouping of musicians who all died at the age of 27.

A Survey on Personality-Aware Recommendation Systems Artificial Intelligence

With the emergence of personality computing as a new research field related to artificial intelligence and personality psychology, we have witnessed an unprecedented proliferation of personality-aware recommendation systems. Unlike conventional recommendation systems, these new systems solve traditional problems such as the cold start and data sparsity problems. This survey aims to study and systematically classify personality-aware recommendation systems. To the best of our knowledge, this survey is the first that focuses on personality-aware recommendation systems. We explore the different design choices of personality-aware recommendation systems, by comparing their personality modeling methods, as well as their recommendation techniques. Furthermore, we present the commonly used datasets and point out some of the challenges of personality-aware recommendation systems.

Knowledge Generation -- Variational Bayes on Knowledge Graphs Artificial Intelligence

This thesis is a proof of concept for the potential of Variational Auto-Encoders (VAEs) for representation learning of real-world Knowledge Graphs (KG). Inspired by successful approaches to the generation of molecular graphs, we evaluate the capabilities of our model, the Relational Graph Variational Auto-Encoder (RGVAE). The impact of the modular hyperparameter choices, encoding through graph convolutions, graph matching and latent space prior, is compared. The RGVAE is first evaluated on link prediction. The mean reciprocal rank (MRR) scores on the two datasets FB15K-237 and WN18RR are compared to the embedding-based model DistMult. A variational DistMult and an RGVAE without latent space prior constraint are implemented as control models. The results show that, among the different settings, the RGVAE with relaxed latent space scores highest on both datasets, yet does not outperform DistMult. Further, we investigate the latent space in a twofold experiment: first, linear interpolation between the latent representation of two triples, then the exploration of each latent dimension in a $95\%$ confidence interval. Both interpolations show that the RGVAE learns to reconstruct the adjacency matrix but fails to disentangle. For the last experiment we introduce a new validation method for the FB15K-237 data set. The relation type constraints of generated triples are filtered and matched with entity types. The observed rate of valid generated triples is insignificantly higher than the random threshold. All generated and valid triples are unseen. A comparison between different latent space priors, using the $\delta$-VAE method, reveals a decoder collapse. Finally, we analyze the limiting factors of our approach compared to molecule generation and propose solutions for the decoder collapse and successful representation learning of multi-relational KGs.
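As a rough illustration of the link-prediction evaluation mentioned in the abstract above, the sketch below computes a mean reciprocal rank and a DistMult-style triple score. It is not the thesis code: the function names, the toy embeddings and the tie-handling rule (only strictly higher scores push the target down) are assumptions made for this example.

```python
from typing import Sequence


def distmult_score(subject: Sequence[float],
                   relation: Sequence[float],
                   obj: Sequence[float]) -> float:
    """DistMult-style score: sum over dimensions of e_s * w_r * e_o."""
    return sum(s * r * o for s, r, o in zip(subject, relation, obj))


def rank_of_target(scores: Sequence[float], target_index: int) -> int:
    """1-based rank of the target candidate; a higher score is better.

    Assumption: candidates with a strictly higher score rank above the target.
    """
    target = scores[target_index]
    return 1 + sum(1 for s in scores if s > target)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test triples (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy usage: three test triples whose correct entity was ranked 1st, 2nd and 4th.
mrr = mean_reciprocal_rank([1, 2, 4])
```

An MRR of 1.0 would mean the correct entity is always ranked first; lower values indicate that the correct entity tends to appear further down the candidate list.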

Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval Artificial Intelligence

Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from these other application areas. A common form of IR involves ranking of documents -- or short passages -- in response to keyword-based queries. Effective IR systems must deal with the query-document vocabulary mismatch problem, by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms -- such as a person's name or a product model number -- not seen during training, and should avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, the retrieval involves extremely large collections -- such as the document index of a commercial Web search engine -- containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as the inverted index, to efficiently retrieve from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives, besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks.
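The inverted index mentioned in the abstract above can be pictured with a minimal sketch. This is a generic textbook-style illustration, not the thesis's method: the whitespace tokenizer, the function names and the AND-only query semantics are assumptions chosen to keep the example short.

```python
from collections import defaultdict
from typing import Dict, Set


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercased term to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # assumption: whitespace tokenization
            index[term].add(doc_id)
    return dict(index)


def retrieve(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of documents containing every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Toy collection with hypothetical documents.
docs = {
    1: "neural ranking models",
    2: "inverted index for ranking",
    3: "neural index",
}
index = build_inverted_index(docs)
matches = retrieve(index, "neural ranking")
```

Because only the posting lists of the query terms are touched, lookup cost depends on those lists rather than on the size of the whole collection, which is why such structures matter at Web scale.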

The Morning After: SpaceX lines up a big test for the Starship


The vehicle Elon Musk sees as the key to fast travel around the Earth and multiplanetary living has only taken short hops so far, but its next trip will reach 50,000 feet. The plan is to test out its aerodynamic capabilities and attempt a landing flip maneuver. SpaceX's stream begins at 7 AM, but stay tuned for more information on exactly when the test will go down if you want to watch live -- this could be historic.

After multiple delays, CD Projekt Red's highly anticipated RPG (based on the table-top game of the same name) arrives on PC and consoles, and Jessica Conditt has spent about 20 hours in the world of Night City. The game is too deep for that to give a comprehensive view of what it contains (she took six hours to get beyond the prologue and meet Keanu Reeves) but more than enough to see if its 80s-tinged vision of the future holds up.

How to Revolutionize the Artificial Intelligence Importance for Innovations


Artificial intelligence is having a tremendous effect across industries, markets, and services. The creative industry and the art world, however, have not yet been able to use this technology fully. Two experts have devised a platform to change that. Using the latest technology, they allow creators, expert film designers, and video creators from the movie and music industries to use artificial intelligence algorithms in their work. The platform is Runway, which integrates machine learning and the benefits of artificial intelligence for innovation.

Managing Marketing: The Psychology Of Brand Language Using Artificial Intelligence


Managing Marketing is a weekly podcast hosted by TrinityP3. Each one is a conversation with a marketing thought leader, professional, practitioner or expert on the issues and topics of interest to marketers and business leaders everywhere. In this special series, TrinityP3's Anton Buchner discusses the rise of Artificial Intelligence and the impact it is having on marketing. Alastair Herbert is the founder of the research consultancy Linguabrand. He shares his wisdom having developed a deep-listening robot (Bob) that analyses visual and verbal language. Alastair introduces you to how Bob listens and analyses the psychology of language that humans potentially miss in data analysis and research groups. Bob can uncover insights to help brands shift the conversation away from sounding generic, to position themselves more persuasively. Follow Managing Marketing on Soundcloud, TuneIn, Stitcher, Spotify and Apple Podcasts.

Welcome to Managing Marketing, a weekly podcast where we sit down and talk with thought leaders and experts on the issues and opportunities in the marketing and business world. And it's quite warm here, so windows are open, so if you hear barking dogs, police cars, or squawking birds, you all know the reason why. It's nothing to do with COVID, it's actually just to do with enjoying summer. Now I'm really excited to have a chat with you today. As in most communications, I think most people realise that the vast majority of it is actually subconscious. And hopefully, by the end of this session, your listeners will have a much better understanding of how communications work. I'm sure they'll be excited. Before we jump in, I met you relatively recently through a colleague, Jeremy Taylor-Riley. He's now a business colleague of yours, I believe. Well, we actually go back to school days together. And what was great is that we – I think this was back when dinosaurs ruled the earth.

Graph-based Topic Extraction from Vector Embeddings of Text Documents: Application to a Corpus of News Articles Artificial Intelligence

Production of news content is growing at an astonishing rate. To help manage and monitor the sheer amount of text, there is an increasing need to develop efficient methods that can provide insights into emerging content areas, and stratify unstructured corpora of text into 'topics' that stem intrinsically from content similarity. Here we present an unsupervised framework that brings together powerful vector embeddings from natural language processing with tools from multiscale graph partitioning that can reveal natural partitions at different resolutions without making a priori assumptions about the number of clusters in the corpus. We show the advantages of graph-based clustering through end-to-end comparisons with other popular clustering and topic modelling methods, and also evaluate different text vector embeddings, from classic Bag-of-Words to Doc2Vec to the recent transformer-based model BERT. This comparative work is showcased through an analysis of a corpus of US news coverage during the presidential election year of 2016.
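To make the idea of turning document embeddings into a graph more concrete, here is a small sketch that links each document to its most similar neighbours by cosine similarity. It only illustrates a possible first step before any graph partitioning; the k-nearest-neighbour construction, the function names and the toy vectors are assumptions for this example and are not taken from the paper.

```python
import math
from typing import Dict, Sequence, Set


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_similarity_graph(embeddings: Sequence[Sequence[float]],
                         k: int) -> Dict[int, Set[int]]:
    """Undirected graph linking each document to its k most similar documents."""
    n = len(embeddings)
    graph: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine_similarity(embeddings[i], embeddings[j]),
            reverse=True,
        )
        for j in others[:k]:
            graph[i].add(j)
            graph[j].add(i)  # keep the graph symmetric
    return graph


# Hypothetical 2-d "document embeddings": two documents near each axis.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_similarity_graph(vectors, k=1)
```

On this toy input the graph falls into two disconnected pairs, which is the kind of structure a community-detection or graph-partitioning step could then pick up as separate topics.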

Language Models are Open Knowledge Graphs Artificial Intelligence

This paper shows how to construct knowledge graphs (KGs) from pre-trained language models (e.g., BERT, GPT-2/3), without human supervision. Popular KGs (e.g., Wikidata, NELL) are built in either a supervised or semi-supervised manner, requiring humans to create knowledge. Recent deep language models automatically acquire knowledge from large-scale corpora via pre-training. The stored knowledge has enabled the language models to improve downstream NLP tasks, e.g., answering questions, and writing code and articles. In this paper, we propose an unsupervised method to cast the knowledge contained within language models into KGs. We show that KGs are constructed with a single forward pass of the pre-trained language models (without fine-tuning) over the corpora. We demonstrate the quality of the constructed KGs by comparing to two KGs (Wikidata, TAC KBP) created by humans. Our KGs also provide open factual knowledge that is new to the existing KGs. Our code and KGs will be made publicly available.

FSD50K: an Open Dataset of Human-Labeled Sound Events Machine Learning

Most existing datasets for sound event recognition (SER) are relatively small and/or domain-specific, with the exception of AudioSet, based on a massive amount of audio tracks from YouTube videos and encompassing over 500 classes of everyday sounds. However, AudioSet is not an open dataset: its release consists of pre-computed audio features (instead of waveforms), which limits the adoption of some SER methods. Downloading the original audio tracks is also problematic due to constituent YouTube videos gradually disappearing and usage rights issues, which casts doubts over the suitability of this resource for systems' benchmarking. To provide an alternative benchmark dataset and thus foster SER research, we introduce FSD50K, an open dataset containing over 51k audio clips totalling over 100h of audio manually labeled using 200 classes drawn from the AudioSet Ontology. The audio clips are licensed under Creative Commons licenses, making the dataset freely distributable (including waveforms). We provide a detailed description of the FSD50K creation process, tailored to the particularities of Freesound data, including challenges encountered and solutions adopted. We include a comprehensive dataset characterization along with discussion of limitations and key factors to allow its audio-informed usage. Finally, we conduct sound event classification experiments to provide baseline systems as well as insight on the main factors to consider when splitting Freesound audio data for SER. Our goal is to develop a dataset to be widely adopted by the community as a new open benchmark for SER research.