A Primer on Large Language Models and their Limitations

Johnson, Sandra, Hyland-Wood, David

arXiv.org Artificial Intelligence

The world of artificial intelligence (AI) is increasingly penetrating all aspects of our personal and professional lives. This proliferation of AI tools and applications is being met with a mixture of excitement, scepticism and even dread [78]: excitement at the seemingly endless potential of AI applications such as LLMs, especially when they are integrated "within broader systems" [13]; scepticism as the realisation dawns that LLMs are in fact fallible, as evidenced by hallucinations, and hence not a silver bullet that can solve all problems [19, 21]; and dread among those who believe that LLMs and AI have the potential to detrimentally impact our lives and make people redundant [78]. The ability of some LLMs to pass Theory of Mind (ToM) [64][32] and Turing Tests [7][42] lends support to the Computational Theory of Mind (CTM), the view that cognition may be substrate independent. These findings challenge biological essentialism and open new avenues for creating sophisticated AI systems capable of human-like reasoning and interaction.


Predicting the activity of chemical compounds based on machine learning approaches

Tu, Do Hoang, Van Lang, Tran, Xuyen, Pham Cong, Long, Le Mau

arXiv.org Artificial Intelligence

ABSTRACT -- Exploring methods and techniques of machine learning (ML) to address specific challenges in various fields is essential. In this work, we tackle a problem in the domain of cheminformatics: providing a suitable solution to aid in predicting the activity of a chemical compound to the best extent possible. To address the problem at hand, this study conducts experiments on 100 different combinations of existing techniques. These solutions are then selected based on a set of criteria that includes the G-mean, F1-score, and AUC metrics. The results have been tested on a dataset of about 10,000 chemical compounds from PubChem that have been classified according to their activity.

I. INTRODUCTION -- Datasets used in biological experiments for measuring the activity of various compounds against different biological targets, often used in screening, usually show a significant imbalance between active and inactive compounds, with inactive data points far outnumbering active ones. Training therefore requires suitable machine learning models. Additionally, preprocessing before training is a crucial issue. The following issues are approached to address the problem of predicting the activity of chemical compounds using chemistry-related datasets: investigating the dependency of attributes or features in the dataset to potentially reduce the number of features. This can be done using methods such as the ANOVA F-test to assess the dependency of each feature on the target variable, or by using correlation coefficients.
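The ANOVA F-test feature-reduction step described above can be sketched with scikit-learn (an illustrative setup, not the authors' exact pipeline; the synthetic descriptor matrix, the imbalance ratio, and `k=5` are all assumptions):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)

# Synthetic stand-in for a compound/descriptor matrix:
# 200 compounds, 20 molecular descriptors, imbalanced labels
# (far more inactive (0) than active (1), as in screening data).
X = rng.normal(size=(200, 20))
y = (rng.random(200) < 0.1).astype(int)

# Make the first two descriptors genuinely informative about activity.
X[:, 0] += 3 * y
X[:, 1] -= 2 * y

# ANOVA F-test: score each feature's dependence on the target,
# then keep only the 5 highest-scoring descriptors.
selector = SelectKBest(score_func=f_classif, k=5)
X_reduced = selector.fit_transform(X, y)

print(X_reduced.shape)  # (200, 5)
```

On real screening data, the scores from `selector.scores_` can also be inspected directly to decide how many descriptors are worth keeping.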


Monadic Deep Learning

Yang, Bo, Marisa, Zhihao Zhang Kirisame, Shi, Kai

arXiv.org Artificial Intelligence

The Java and Scala community has built a very successful big data ecosystem. However, most of the neural networks running on it are modeled in dynamically typed programming languages. These dynamically typed deep learning frameworks treat neural networks as differentiable expressions that contain many trainable variables, and perform automatic differentiation on those expressions when training them. Until 2019, none of the learning frameworks in statically typed languages provided the expressive power of traditional frameworks; their users could not use custom algorithms without writing plenty of boilerplate code for hard-coded back-propagation. We solved this problem in DeepLearning.scala 2. Our contributions are: 1. We discovered a novel approach to perform reverse-mode automatic differentiation for statically typed functions that contain multiple trainable variables, and that can interoperate freely with the metalanguage. 2. We designed a set of monads and monad transformers, which allow users to create monadic expressions that represent dynamic neural networks. 3. Along with these monads, we provide some applicative functors to perform multiple calculations in parallel. With these features, users of DeepLearning.scala were able to create complex neural networks in an intuitive and concise way, while still maintaining type safety.
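DeepLearning.scala's contribution is the monadic, statically typed packaging; the underlying reverse-mode automatic differentiation it performs can be sketched as a small toy in Python (purely illustrative, not the library's API):

```python
# Toy reverse-mode automatic differentiation: each operation records
# its inputs together with the local gradient with respect to each,
# and backward() propagates gradients through that recorded graph.
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (input Var, local gradient)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

# For z = x*y + x: dz/dx = y + 1, dz/dy = x
x, y = Var(2.0), Var(3.0)
z = x * y + x
z.backward()
print(z.value, x.grad, y.grad)  # 8.0 4.0 2.0
```

A production system would traverse the graph iteratively in topological order rather than recursing; the paper's monads serve a similar role, sequencing these forward and backward passes in a type-safe way.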


#DeepLearning. Just published a new blog post on deep…

#artificialintelligence

Deep learning is a subset of machine learning that is based on artificial neural networks. It has revolutionized the field of artificial intelligence and is being used in a variety of applications, including image and speech recognition, natural language processing, and autonomous vehicles. In this blog post, we will provide an overview of what deep learning is, how it works, and its applications.


How to Build a Speech-to-Text System using ChatGPT and Python - Pyresearch - Medium

#artificialintelligence

Check out our latest tutorial on how to build a speech-to-text system using ChatGPT and Python! Learn how to leverage the power of natural language processing and deep learning to convert audio to text with impressive accuracy. Please share your feedback on the video in the comments, like and share it, and subscribe to the channel for more educational videos.


Use OpenAI with Google Spreadsheets

#artificialintelligence

This article explains how you can integrate OpenAI GPT-3 with Google Spreadsheets, allowing you to complete spreadsheet tasks with the help of AI. Tip: make sure to subscribe to the Gist above, since all future revisions and improvements will be made to that file; you can then refer back to it later and update your functions. Note: when the functions in that Gist are revised, it is also the place to pick up the new code.


Memory Complexity with Transformers - KDnuggets

#artificialintelligence

The key innovation in Transformers is the self-attention mechanism, which computes similarity scores for all pairs of positions in an input sequence. It can be evaluated in parallel for each token, avoiding the sequential dependency of recurrent neural networks and enabling Transformers to vastly outperform previous sequence models such as LSTMs. A limitation of existing Transformer models and their derivatives, however, is that full self-attention has computational and memory requirements that are quadratic in the input sequence length: if you try to run a large Transformer on a long sequence, you simply run out of memory. What can be a solution to this problem? There are plenty of deep explanations elsewhere, so here I'd like to share some example questions in an interview setting, along with some tips for readers' reference.
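The quadratic memory cost is easy to see by materialising the attention score matrix directly (a minimal single-head sketch with no batching; the model width `d_model=64` and float64 scores are assumptions for illustration):

```python
import numpy as np

def attention_scores(n_tokens, d_model=64, seed=0):
    """Materialise the full self-attention score matrix for a sequence.

    The (n_tokens, n_tokens) score matrix is what makes vanilla
    attention quadratic in memory: doubling the sequence length
    quadruples the matrix.
    """
    rng = np.random.default_rng(seed)
    q = rng.normal(size=(n_tokens, d_model))  # queries
    k = rng.normal(size=(n_tokens, d_model))  # keys
    return q @ k.T / np.sqrt(d_model)         # shape: (n_tokens, n_tokens)

for n in (1_000, 2_000, 4_000):
    mb = attention_scores(n).nbytes / 1e6
    print(f"{n:>5} tokens -> {mb:8.1f} MB of scores")
```

At 1,000 tokens the float64 score matrix is 8 MB; at 4,000 tokens it is already 128 MB per head, which is why long-sequence variants (sparse, windowed, or low-rank attention) avoid materialising it at all.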


How to Build Good AI Solutions When Data Is Scarce

#artificialintelligence

Conventional wisdom holds that you need large volumes of labeled training data to unlock value from powerful AI models. For the consumer internet companies where many of today's AI models originated, this hasn't been difficult to obtain. But for companies in other sectors -- such as industrial companies, manufacturers, health care organizations, and educational institutions -- curating labeled data in sufficient volume can be significantly more challenging. Over the past few years, AI practitioners and researchers have developed several techniques to significantly reduce the volume of labeled data needed to build accurate AI models. Using these approaches, it's often possible to build a good AI model with a fraction of the labeled data that might otherwise be needed.
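One of the label-efficiency techniques alluded to above is self-training: fit a model on the few labels you have, pseudo-label the unlabeled points it is confident about, and refit. A minimal sketch with scikit-learn, assuming a synthetic dataset and a 5% labeling rate chosen purely for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Synthetic stand-in for a dataset where labels are expensive.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Pretend only ~5% of the data is labeled; scikit-learn's convention
# is to mark unlabeled samples with -1.
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) > 0.05] = -1

# Self-training: iteratively pseudo-label confident unlabeled points
# and refit the base classifier on the growing labeled set.
model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y_partial)

score = model.score(X, y)  # evaluated against the full ground truth
print(f"accuracy with ~5% labels: {score:.2f}")
```

Other approaches in the same spirit, such as transfer learning from pretrained models and data augmentation, attack the same bottleneck from different angles.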


"UnConference"🎙 with Jeremy Howard

#artificialintelligence

Today's post is slightly off-track. I was invited to the first Fast.AI unconference in Brisbane, Queensland this week. It was an honor to be part of the community, and I'm having a blast meeting so many brilliant AI researchers from around the globe! In short, UnConferences are "unconventional conferences": anyone can propose an agenda and organize a session on any topic they want.