AITopics | Large Language Model


Impact of Tokenization on Language Models: An Analysis for Turkish

Toraman, Cagri, Yilmaz, Eyup Halit, Şahinuç, Furkan, Ozcelik, Oguzhan

arXiv.org Artificial Intelligence, Apr-19-2022

Tokenization is an important text preprocessing step that prepares input tokens for deep language models. WordPiece and BPE are the de facto methods employed by prominent models such as BERT and GPT. However, the impact of tokenization can differ for morphologically rich languages, such as Turkic languages, where many words can be generated by adding prefixes and suffixes. We compare five tokenizers at different granularity levels, i.e., their outputs range from the smallest pieces (characters) to the surface form of words, and include a Morphological-level tokenizer. We train these tokenizers and pretrain medium-sized language models using the RoBERTa pretraining procedure on the Turkish split of the OSCAR corpus. We then fine-tune our models on six downstream tasks. Our experiments, supported by statistical tests, reveal that the Morphological-level tokenizer performs competitively with de facto tokenizers. Furthermore, we find that increasing the vocabulary size improves the performance of Morphological and Word-level tokenizers more than that of de facto tokenizers. The ratio of the number of vocabulary parameters to the total number of model parameters can be empirically chosen as 20% for de facto tokenizers and 40% for other tokenizers to obtain a reasonable trade-off between model size and performance.
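The 20% and 40% ratios in the abstract can be turned into a rough vocabulary-size estimate. The sketch below is illustrative only and rests on an assumption not stated in the abstract: that "vocabulary parameters" means the token embedding matrix, i.e. vocabulary size times hidden size. The function names and the example figures (a hidden size of 512 and 40 million non-vocabulary parameters) are hypothetical.

```python
def vocab_param_ratio(vocab_size: int, hidden_size: int, other_params: int) -> float:
    """Share of model parameters held by the token embedding matrix.

    Assumption: vocabulary parameters = vocab_size * hidden_size.
    """
    vocab_params = vocab_size * hidden_size
    return vocab_params / (vocab_params + other_params)


def vocab_size_for_ratio(target_ratio: float, hidden_size: int, other_params: int) -> int:
    """Vocabulary size whose embedding matrix gives roughly `target_ratio`.

    Solves V*h / (V*h + N) = r for V, giving V = r*N / (h*(1 - r)).
    """
    if not 0.0 < target_ratio < 1.0:
        raise ValueError("target_ratio must lie strictly between 0 and 1")
    return round(target_ratio * other_params / (hidden_size * (1.0 - target_ratio)))


if __name__ == "__main__":
    # Hypothetical model: hidden size 512, 40M parameters outside the embeddings.
    for label, ratio in (("de facto (BPE/WordPiece)", 0.20), ("morphological/word", 0.40)):
        size = vocab_size_for_ratio(ratio, hidden_size=512, other_params=40_000_000)
        print(label, size, round(vocab_param_ratio(size, 512, 40_000_000), 3))
```

Under this assumption, the larger target ratio for Morphological and Word-level tokenizers translates directly into a larger vocabulary for the same model body.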

large language model, machine learning, natural language, (17 more...)


doi: 10.1145/3578707

arXiv: 2204.08832

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(14 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > Film (0.34)
Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


AI models to detect how you're feeling in sales calls

#artificialintelligence, Apr-18-2022, 12:35:07 GMT

In brief: AI software is being offered to sales teams to analyze whether potential customers appear interested during virtual meetings. Sentiment analysis is often used in machine-learning research to detect emotions in text or video, and the technology is now being applied to help salespeople gauge how prospective clients are feeling during sales pitches in order to improve results, Protocol reported this month. The COVID-19 pandemic has moved many meetings online as employees work from home. "It's very hard to build rapport in a relationship in that type of environment," said Tim Harris, director of product marketing at Uniphore, a software company specializing in conversational analytics. The hope is that sellers can use AI technology to tell automatically when they are boring clients and immediately change tactics, such as being more empathetic, to keep them interested. In addition, reactions to individual products could be included, so that vendors are aware of what Harris calls the "emotional state of a deal."

naiac, sale call, vehicle, (11 more...)


Country:

North America > United States > California > San Francisco County > San Francisco (0.06)
Asia > China (0.05)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (0.77)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)


Artificial intelligence is mastering language. Should we trust what it says?

#artificialintelligence, Apr-18-2022, 05:20:06 GMT

But even as the fluency of GPT-3 has impressed many observers, the large language model approach has also attracted significant criticism over the past few years. Some skeptics argue that the software is capable only of blind mimicry: it imitates the grammatical patterns of human language but is unable to generate its own ideas or make complex decisions, a fundamental limitation that would prevent the LLM approach from maturing into anything resembling human intelligence. For these critics, GPT-3 is the latest shiny object in a long history of AI hype, channeling research money and attention into what will ultimately prove to be a dead end and keeping other promising approaches from maturing. Other critics believe programs like GPT-3 will forever be compromised by the biases, propaganda, and misinformation in the data they have been trained on, meaning that using them for anything more than parlor tricks will always be irresponsible. Wherever you land in this debate, the pace of recent improvement in large language models makes it hard to imagine that they will not be deployed commercially in the coming years.

artificial intelligence, humanity, intelligence, (15 more...)


Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > California (0.06)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)

Industry: Media (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)


Top 10 GPT-3 Powered Applications to Know in 2022

#artificialintelligence, Apr-18-2022, 04:26:10 GMT

Applications powered by OpenAI's GPT-3 have flourished in the global tech market in recent years. GPT-3, short for Generative Pre-trained Transformer 3, is an autoregressive language model that uses deep learning to produce human-like text. Thousands of companies have started using this OpenAI model to handle their workloads more efficiently. OpenAI trained the model, which has 175 billion parameters, on a massive corpus of text, making it one of the largest language models at the time of its release. Many GPT-3 powered applications are now available on the internet, offering more efficient services.

ai model, application, gpt-3, (6 more...)


Industry: Information Technology (0.32)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.72)


DeepMind's Clever Idea to Master Asymmetric Games

#artificialintelligence, Apr-15-2022, 17:17:43 GMT

Originally published on Towards AI. The method expands the concept of a Nash equilibrium by decomposing an asymmetric game into multiple symmetric games.

clever idea, deepmind, master asymmetric game, (2 more...)


Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)


What's the whole buzz about Salesforce's CodeGen

#artificialintelligence, Apr-15-2022, 02:28:45 GMT

Towards the end of last month, Salesforce released a large-scale language model called CodeGen, which many in the AI community anticipated to be a "GitHub Copilot killer." While we won't know whether that is possible until reviews of CodeGen come in, it is safe to say that we are in the middle of a paradigm shift in how we interact with our computers. If it was hard to imagine that a user could develop an app just by telling the machine in simple language what the app does, that is exactly what CodeGen does. CodeGen is a leap from code generators like GitHub Copilot because using it is even simpler than writing instructions: it is simply talking. GitHub Copilot, on the other hand, was launched as a tool that autocompletes snippets of code.

codegen, programming language, salesforce, (6 more...)


Industry: Information Technology > Software (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.99)


LiT: Zero-Shot Transfer with Locked-image Tuning

#artificialintelligence, Apr-14-2022, 20:45:54 GMT

Below you can choose an image from a selection and then write free-form text prompts that are matched to the image. Once you hit return on your keyboard or press the "compute" button, a text encoder implemented in TensorFlow.js will compute embeddings for the provided text on your local device, and the similarity of these text embeddings to the image embedding will be displayed.
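The matching step described above, comparing text embeddings with an image embedding, can be sketched in a few lines. This is a minimal illustration, not the demo's TensorFlow.js code: it assumes cosine similarity as the comparison measure, and the embedding vectors and the function names `cosine_similarity` and `rank_prompts` are hypothetical stand-ins for what a real encoder would produce.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_a * norm_b)


def rank_prompts(image_embedding: list[float],
                 text_embeddings: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Score every prompt against the image and sort best match first."""
    scores = {prompt: cosine_similarity(image_embedding, emb)
              for prompt, emb in text_embeddings.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy 3-dimensional embeddings; a real encoder outputs far larger vectors.
    image = [0.9, 0.1, 0.0]
    prompts = {
        "a photo of a dog": [1.0, 0.0, 0.0],
        "a photo of a cat": [0.0, 1.0, 0.0],
        "a diagram": [0.0, 0.0, 1.0],
    }
    for prompt, score in rank_prompts(image, prompts):
        print(f"{score:.3f}  {prompt}")
```

Because only the text side is recomputed for each new prompt, the image embedding can be computed once and reused across prompts.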

compute, locked-image tuning, zero-shot transfer, (1 more...)


Technology:

Information Technology > Artificial Intelligence > Vision (0.40)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)


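"OpenAI Codex": AI that automatically generates program code. Will programmers become unnecessary?

#artificialintelligence, Apr-13-2022, 22:20:03 GMT

Artificial intelligence writes programs for you automatically! Is it outdated for humans to study programming languages and write code? An explanation of OpenAI Codex.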

openai


Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.88)


Learning by Forgetting: DeepMind and the Jennifer Aniston Neuron

#artificialintelligence, Apr-13-2022, 13:05:34 GMT

DeepMind’s research shows how to understand the role of individual neurons in a neural network.

forgetting, jennifer aniston neuron, learning, (1 more...)


Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Competitive programming with AlphaCode

#artificialintelligence, Apr-13-2022, 13:05:31 GMT

Creating solutions to unforeseen problems is second nature in human intelligence – a result of critical thinking informed by experience. The machine learning community has made tremendous progress in generating and understanding textual data, but advances in problem solving remain limited to relatively simple maths and programming problems, or else retrieving and copying existing solutions. As part of DeepMind's mission to solve intelligence, we created a system called AlphaCode that writes computer programs at a competitive level. AlphaCode achieved an estimated rank within the top 54% of participants in programming competitions by solving new problems that require a combination of critical thinking, logic, algorithms, coding, and natural language understanding. In our preprint, we detail AlphaCode, which uses transformer-based language models to generate code at an unprecedented scale, and then smartly filters to a small set of promising programs.
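The generate-then-filter idea can be illustrated with a toy sketch. It is not AlphaCode's implementation: it assumes, for illustration, that filtering means running each candidate program on the example inputs given with a problem and keeping only those that reproduce the expected outputs. The function name `filter_candidates`, the `solve` entry point, and the hand-written candidates (standing in for model samples) are all hypothetical.

```python
def filter_candidates(candidates: list[str],
                      examples: list[tuple[tuple, object]],
                      entry_point: str = "solve") -> list[str]:
    """Keep only candidate programs that pass every example test.

    WARNING: exec() runs arbitrary code; a real system would sandbox this.
    """
    survivors = []
    for source in candidates:
        namespace: dict = {}
        try:
            exec(source, namespace)  # define the candidate's functions
            solve = namespace[entry_point]
            if all(solve(*args) == expected for args, expected in examples):
                survivors.append(source)
        except Exception:
            # Syntax errors, missing entry points, and runtime failures
            # all count as a failed candidate.
            continue
    return survivors


if __name__ == "__main__":
    # Hand-written stand-ins for sampled programs; the task is "add two numbers".
    samples = [
        "def solve(a, b):\n    return a * b",   # wrong logic
        "def solve(a, b) return a + b",         # does not parse
        "def solve(a, b):\n    return a + b",   # correct
    ]
    tests = [((2, 3), 5), ((1, 1), 2)]
    for program in filter_candidates(samples, tests):
        print(program)
```

Filtering of this kind only removes candidates that fail the known examples; a surviving program may still be wrong on hidden test cases.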

alphacode, competition, competitive programming, (16 more...)


Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
