 Large Language Model


Impact of Tokenization on Language Models: An Analysis for Turkish

arXiv.org Artificial Intelligence

Tokenization is an important text preprocessing step that prepares input tokens for deep language models. WordPiece and BPE are the de facto methods employed by important models such as BERT and GPT. However, the impact of tokenization can be different for morphologically rich languages, such as Turkic languages, where many words can be generated by adding prefixes and suffixes. We compare five tokenizers at different granularity levels, i.e., their outputs range from the smallest pieces of characters to the surface form of words, including a Morphological-level tokenizer. We train these tokenizers and pretrain medium-sized language models using the RoBERTa pretraining procedure on the Turkish split of the OSCAR corpus. We then fine-tune our models on six downstream tasks. Our experiments, supported by statistical tests, reveal that the Morphological-level tokenizer is competitive with the de facto tokenizers. Furthermore, we find that increasing the vocabulary size improves the performance of the Morphological and Word-level tokenizers more than that of the de facto tokenizers. The ratio of the number of vocabulary parameters to the total number of model parameters can be empirically chosen as 20% for the de facto tokenizers and 40% for the other tokenizers to obtain a reasonable trade-off between model size and performance.
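The vocabulary-to-model parameter ratio mentioned in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes that the vocabulary parameters are simply the token embedding matrix (vocabulary size times hidden size) and that all remaining parameters are lumped into one number. The function names and the figures in the usage note are hypothetical.

```python
def vocab_param_ratio(vocab_size: int, hidden_size: int, other_params: int) -> float:
    """Share of model parameters that sit in the token embedding matrix.

    Assumption: vocabulary parameters = vocab_size * hidden_size, and
    other_params counts every remaining parameter in the model.
    """
    vocab_params = vocab_size * hidden_size
    return vocab_params / (vocab_params + other_params)


def vocab_size_for_ratio(target_ratio: float, hidden_size: int, other_params: int) -> int:
    """Vocabulary size that yields roughly the target ratio.

    Solves ratio = V*h / (V*h + other) for V, under the same assumption.
    """
    if not 0.0 < target_ratio < 1.0:
        raise ValueError("target_ratio must be strictly between 0 and 1")
    return round(target_ratio * other_params / (hidden_size * (1.0 - target_ratio)))
```

For example, with a made-up hidden size of 100 and 400,000 other parameters, `vocab_size_for_ratio(0.2, 100, 400_000)` gives a vocabulary of 1,000 tokens, and `vocab_param_ratio(1000, 100, 400_000)` confirms a ratio of 0.2.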


AI models to detect how you're feeling in sales calls

#artificialintelligence

In brief: AI software is being offered to sales teams to analyze whether potential customers appear interested during virtual meetings. Sentiment analysis is often used in machine-learning research to detect emotions in text or video, and the technology is now being applied to help salespeople see how prospective clients are feeling during sales pitches in order to improve results, Protocol reported this month. The COVID-19 pandemic has moved a lot of meetings online as employees work from home. "It's very hard to build rapport in a relationship in that type of environment," said Tim Harris, director of product marketing at Uniphore, a software company specializing in conversational analytics. The hope is that sellers may be able to use AI technology to automatically tell when they're boring clients and immediately change tactics, such as being more empathetic, to keep them interested. In addition, reactions to individual products could be included, so that vendors are aware of what Harris calls the "emotional state of a deal."


Artificial intelligence is mastering language. Should we trust what it says?

#artificialintelligence

But while the fluency of GPT-3 has impressed many observers, the large language model approach has also attracted significant criticism over the past few years. Some skeptics argue that the software is only capable of blind imitation: it mimics the grammatical patterns of human language but is unable to generate its own ideas or make complex decisions, a fundamental limitation that would prevent the LLM approach from maturing into anything resembling human intelligence. For these critics, GPT-3 is the latest shiny object in a long history of AI hype, directing research money and attention to what will ultimately prove to be a dead end and preventing other promising approaches from maturing. Other critics believe programs like GPT-3 will forever be compromised by the biases, propaganda, and misinformation in the data they have been trained on, meaning that using them for anything more than parlor tricks will always be irresponsible. Wherever you land in this debate, the pace of recent improvement in large language models makes it hard to imagine that they will not be deployed commercially in the coming years.


Top 10 GPT-3 Powered Applications to Know in 2022

#artificialintelligence

Applications powered by OpenAI's GPT-3 have flourished in the global tech market in recent years. GPT-3, short for Generative Pre-trained Transformer 3, is an autoregressive language model that uses deep learning to produce human-like text. Thousands of companies have started using the model to handle their workloads more efficiently. OpenAI trained the model, which has 175 billion parameters, on a massive corpus of text, making it one of the largest language models at the time of its release. Many GPT-3 powered applications are now available online.


DeepMind's Clever Idea to Master Asymmetric Games

#artificialintelligence

Originally published on Towards AI. The method expands the concept of a Nash equilibrium by decomposing an asymmetric game into multiple symmetric games.


What's the whole buzz about Salesforce's CodeGen?

#artificialintelligence

Towards the end of last month, Salesforce released a large-scale language model called CodeGen, which many in the AI community anticipated to be a "GitHub Copilot killer." We won't know whether that is possible until the reviews for CodeGen come in, but it is safe to say that we are in the middle of a paradigm shift in how we interact with our computers. If it was hard to imagine that a user could develop an app just by telling the machine in simple language what the app does, that is exactly what CodeGen does. CodeGen is a leap from code generators like GitHub Copilot because using it is even simpler than writing instructions: it is simply talking. GitHub Copilot, on the other hand, was launched as a tool that autocompletes snippets of code.


LiT: Zero-Shot Transfer with Locked-image Tuning

#artificialintelligence

Below you can choose an image from a selection and then write free-form text prompts that are matched to the image. Once you hit return on your keyboard or press the "compute" button, a text encoder implemented in TensorFlow.js will compute embeddings for the provided text on your local device, and the similarity of these text embeddings to the image embedding will be displayed.
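The matching step described above can be sketched in a few lines. This is a minimal illustration, not the demo's actual code: it assumes the similarity measure is cosine similarity, and the embeddings in the usage note are made-up toy vectors rather than outputs of the LiT encoders. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def rank_prompts(image_embedding, text_embeddings):
    """Score each prompt embedding against the image embedding.

    text_embeddings maps a prompt string to its embedding vector.
    Returns (prompt, score) pairs sorted from best to worst match.
    """
    scores = {
        prompt: cosine_similarity(image_embedding, embedding)
        for prompt, embedding in text_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_prompts([1.0, 0.0], {"a cat": [0.0, 1.0], "a dog": [2.0, 0.0]})` would place "a dog" first, because its toy vector points in the same direction as the image vector.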


"OpenAI Codex": Artificial intelligence that automatically generates program code. Will programmers become unnecessary?

#artificialintelligence

Artificial intelligence can write programs for you automatically! Is it outdated for humans to study programming languages and write code? This article explains OpenAI Codex.


Learning by Forgetting: DeepMind and the Jennifer Aniston Neuron

#artificialintelligence

DeepMind's research shows how to understand the role of individual neurons in a neural network.


Competitive programming with AlphaCode

#artificialintelligence

Creating solutions to unforeseen problems is second nature in human intelligence – a result of critical thinking informed by experience. The machine learning community has made tremendous progress in generating and understanding textual data, but advances in problem solving remain limited to relatively simple maths and programming problems, or else retrieving and copying existing solutions. As part of DeepMind's mission to solve intelligence, we created a system called AlphaCode that writes computer programs at a competitive level. AlphaCode achieved an estimated rank within the top 54% of participants in programming competitions by solving new problems that require a combination of critical thinking, logic, algorithms, coding, and natural language understanding. In our preprint, we detail AlphaCode, which uses transformer-based language models to generate code at an unprecedented scale, and then smartly filters to a small set of promising programs.