AITopics

Country:

North America > United States > California (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.05)

Industry:

Information Technology > Services (0.53)
Media > News (0.38)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.31)

#artificialintelligence Mar-24-2020, 09:44:42 GMT

Google open-sources framework that reduces AI training costs by up to 80%

Google researchers recently published a paper describing a framework -- SEED RL -- that scales AI model training to thousands of machines. They say that it could facilitate training at millions of frames per second on a machine while reducing costs by up to 80%, potentially leveling the playing field for startups that couldn't previously compete with large AI labs. Training sophisticated machine learning models in the cloud remains prohibitively expensive. According to a recent Synced report, the University of Washington's Grover, which is tailored for both the generation and detection of fake news, cost $25,000 to train over the course of two weeks. OpenAI racked up $256 per hour to train its GPT-2 language model, and Google spent an estimated $6,912 training BERT, a bidirectional transformer model that redefined the state of the art for 11 natural language processing tasks.

ai training cost, google open-source framework, seed rl, (4 more...)

AI-Alerts: 2020 > 2020-03 > AAAI AI-Alert for Mar 24, 2020 (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.74)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.58)

#artificialintelligence Mar-23-2020, 11:15:07 GMT

Hands-On Guide to OpenAI Gym Custom Environments - Analytics India Magazine

OpenAI Gym is a well-known toolkit for developing and comparing reinforcement learning (RL) agents. It makes no assumptions about the structure of the agent and works with any numerical computation library, such as TensorFlow or PyTorch. Gym also provides various types of environments. In this hands-on guide, we will develop a tic-tac-toe environment from scratch using OpenAI Gym. To start with, let's create the desired folder structure with all the required files.
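As a rough illustration of the kind of tic-tac-toe environment the guide sets out to build, here is a minimal, dependency-free sketch. It only mirrors the `reset`/`step` shape of a Gym environment; it does not import Gym, and the class name `TicTacToeEnv`, the board encoding, and the reward values are all assumptions made for this example, not the guide's actual code.

```python
class TicTacToeEnv:
    """Sketch of a tic-tac-toe environment with a Gym-like reset/step interface."""

    # Index triples that form a winning line on the flattened 3x3 board.
    WIN_LINES = [
        (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
        (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
        (0, 4, 8), (2, 4, 6),             # diagonals
    ]

    def __init__(self):
        self.reset()

    def reset(self):
        # Assumed encoding: 0 = empty, 1 = player X, 2 = player O.
        self.board = [0] * 9
        self.current_player = 1
        self.done = False
        return tuple(self.board)

    def step(self, action):
        """Place the current player's mark on cell `action` (0-8).

        Returns (observation, reward, done, info).
        """
        if self.done:
            raise RuntimeError("episode finished; call reset()")
        if self.board[action] != 0:
            # Assumed rule: an illegal move ends the episode with a penalty.
            self.done = True
            return tuple(self.board), -1.0, True, {"illegal": True}

        self.board[action] = self.current_player
        player = self.current_player
        if any(all(self.board[i] == player for i in line) for line in self.WIN_LINES):
            self.done = True
            return tuple(self.board), 1.0, True, {"winner": player}
        if 0 not in self.board:
            self.done = True
            return tuple(self.board), 0.0, True, {"winner": None}

        # Switch between player 1 and player 2.
        self.current_player = 3 - player
        return tuple(self.board), 0.0, False, {}


env = TicTacToeEnv()
env.reset()
# X takes cells 0, 1, 2 (the top row); O takes cells 3 and 4 in between.
for move in [0, 3, 1, 4, 2]:
    obs, reward, done, info = env.step(move)
```

In an actual Gym project the class would typically subclass `gym.Env` and declare `action_space` and `observation_space`; those parts are left out here to keep the sketch self-contained.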

analytic india magazine, hand-on guide, openai gym custom environment, (7 more...)

Country: Asia > India (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

#artificialintelligence Mar-23-2020, 04:46:11 GMT

How to generate text: using different decoding methods for language generation with Transformers

In recent years, there has been increasing interest in open-ended language generation thanks to the rise of large transformer-based language models trained on millions of webpages, such as OpenAI's famous GPT-2 model. The results on conditioned open-ended language generation are impressive. Besides the improved transformer architecture and massive unsupervised training data, better decoding methods have also played an important role. This blog post gives a brief overview of different decoding strategies and, more importantly, shows how you can implement them with very little effort using the popular transformers library. All of the following functionalities can be used for auto-regressive language generation.
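The excerpt does not name the decoding strategies it covers, so the sketch below picks three common ones as an assumption: greedy selection, top-k sampling, and top-p (nucleus) sampling. It operates on a toy next-token distribution in plain Python rather than calling the transformers library; the function names and the probability values are hypothetical.

```python
import random


def greedy(probs):
    """Pick the single most probable next token."""
    return max(probs, key=probs.get)


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    kept, cumulative = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}


def sample(probs, rng):
    """Draw one token according to the given distribution."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Toy distribution over the next word (made-up numbers for illustration).
next_token_probs = {"nice": 0.5, "dog": 0.3, "car": 0.15, "the": 0.05}
rng = random.Random(0)

greedy_choice = greedy(next_token_probs)
top2 = top_k_filter(next_token_probs, k=2)
nucleus = top_p_filter(next_token_probs, p=0.9)
sampled = sample(top2, rng)
```

Greedy selection is deterministic, while the two filters trade some probability mass for diversity: top-k fixes the number of candidates, whereas top-p lets the candidate set grow or shrink with the shape of the distribution.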

language generation, probability, word sequence, (15 more...)

Country: North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence Mar-21-2020, 14:40:01 GMT

What happens when a machine can write as well as an academic? University Affairs

Recently one morning, I asked my computer a relatively simple question: can artificial intelligence (AI) write? We're not too certain on what artificial intelligence will be able to write, but there are some scenarios in which computers could be responsible for a huge number of word documents … The biggest potential scenarios would involve machines analyzing what has already been written and determining what pieces need to be edited to make the content seem fresh. The above sentences were composed by a machine in a matter of seconds. The tool used is a freely accessible interface based on the GPT-2 text generator released by OpenAI – a company founded by technology industry leaders, including Elon Musk and Sam Altman. Only a limited version of the tool was made available, as it was dubbed "too dangerous" by the company to release fully into the world.

artificial intelligence, higher education, student, (14 more...)

Genre: Personal > Opinion (0.40)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Neural Information Processing Systems Mar-20-2020, 13:30:57 GMT

Transductive Zero-Shot Learning with Visual Structure Constraint

Wan, Ziyu, Chen, Dongdong, Li, Yan, Yan, Xingguang, Zhang, Junge, Yu, Yizhou, Liao, Jing

To recognize objects of the unseen classes, most existing Zero-Shot Learning (ZSL) methods first learn a compatible projection function between the common semantic space and the visual space based on the data of source seen classes, then directly apply it to the target unseen classes. However, in real scenarios, the data distribution between the source and target domain might not match well, thus causing the well-known domain shift problem. Based on the observation that visual features of test instances can be separated into different clusters, we propose a new visual structure constraint on class centers for transductive ZSL, to improve the generality of the projection function (i.e., alleviate the above domain shift problem). Specifically, three different strategies (symmetric Chamfer distance, bipartite matching distance, and Wasserstein distance) are adopted to align the projected unseen semantic centers and visual cluster centers of test instances. We also propose a new training strategy to handle the real cases where many unrelated images exist in the test dataset, which is not considered in previous methods.
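To make the first of the three alignment strategies concrete, here is a minimal sketch of a symmetric Chamfer distance between two sets of centers. The abstract does not give the exact formulation, so the use of squared Euclidean distance and unnormalized sums is an assumption, and the variable names and sample points are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def symmetric_chamfer(set_a, set_b):
    """Sum of nearest-neighbor distances from A to B plus from B to A."""
    a_to_b = sum(min(sq_dist(a, b) for b in set_b) for a in set_a)
    b_to_a = sum(min(sq_dist(a, b) for a in set_a) for b in set_b)
    return a_to_b + b_to_a


# Hypothetical 2-D centers: projected semantic centers vs. visual cluster centers.
projected_centers = [(0.0, 0.0), (1.0, 1.0)]
cluster_centers = [(0.0, 0.1), (1.0, 0.9)]

distance = symmetric_chamfer(projected_centers, cluster_centers)
self_distance = symmetric_chamfer(cluster_centers, cluster_centers)
```

Unlike bipartite matching, this measure does not force a one-to-one assignment: several points in one set may share the same nearest neighbor in the other.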

domain shift problem, transductive zero-shot learning, visual structure constraint, (2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.65)

#artificialintelligence Mar-20-2020, 00:57:43 GMT

6 Pretrained Models to Master Text Classification

Though ERNIE 1.0 (released in March 2019) has been a popular model for text classification, it was ERNIE 2.0 that became the talk of the town in the latter half of 2019. Developed by tech giant Baidu, ERNIE outperformed Google XLNet and BERT on the GLUE benchmark for English. ERNIE stands for Enhanced Representation through kNowledge IntEgration, and ERNIE 2.0 is an upgraded version of ERNIE 1.0. ERNIE 1.0 was pathbreaking in its own way – it was one of the first models to leverage Knowledge Graphs. This incorporation further enhanced training the model for advanced tasks like Relation Classification and Named Entity Recognition (NER). Like its predecessor, ERNIE 2.0 brings another innovation to the table in the form of Continual Incremental Multi-task Learning.

pretrained model, text classification, transformer, (11 more...)

Country: North America > United States > New York (0.06)

Genre: Instructional Material (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.76)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)

Neural Information Processing Systems Mar-19-2020, 02:47:36 GMT

Semantic-Guided Multi-Attention Localization for Zero-Shot Learning

Zhu, Yizhe, Xie, Jianwen, Tang, Zhiqiang, Peng, Xi, Elgammal, Ahmed

Zero-shot learning extends the conventional object classification to the unseen class recognition by introducing semantic representations of classes. Existing approaches predominantly focus on learning the proper mapping function for visual-semantic embedding, while neglecting the effect of learning discriminative visual features. In this paper, we study the significance of the discriminative region localization. We propose a semantic-guided multi-attention localization model, which automatically discovers the most discriminative parts of objects for zero-shot learning without any human annotations. Our model jointly learns cooperative global and local features from the whole object as well as the detected parts to categorize objects based on semantic descriptions.

semantic-guided multi-attention localization, zero-shot learning

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)

Neural Information Processing Systems Mar-19-2020, 00:31:04 GMT

Zero-shot Knowledge Transfer via Adversarial Belief Matching

Micaelli, Paul, Storkey, Amos J.

Performing knowledge transfer from a large teacher network to a smaller student is a popular task in modern deep learning applications. However, due to growing dataset sizes and stricter privacy regulations, it is increasingly common not to have access to the data that was used to train the teacher. We propose a novel method which trains a student to match the predictions of its teacher without using any data or metadata. We achieve this by training an adversarial generator to search for images on which the student poorly matches the teacher, and then using them to train the student. Our resulting student closely approximates its teacher for simple datasets like SVHN, and on CIFAR10 we improve on the state-of-the-art for few-shot distillation (with 100 images per class), despite using no data.
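The abstract does not say how "poorly matches" is measured. One plausible choice, assumed here purely for illustration, is the KL divergence between the teacher's and the student's predicted class distributions: a generator would seek inputs that make this value large, and the student would then be trained to make it small. The logits below are made up, and no networks are involved.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical teacher output on one generated image.
teacher_logits = [2.0, 0.5, -1.0]

# A student that agrees with the teacher vs. one that ranks classes in reverse.
agree = kl_divergence(softmax(teacher_logits), softmax([2.0, 0.5, -1.0]))
disagree = kl_divergence(softmax(teacher_logits), softmax([-1.0, 0.5, 2.0]))
```

Under this assumed measure, images that yield a high value are the informative ones, since they expose where the student's beliefs still differ from the teacher's.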

adversarial belief matching, student, zero-shot knowledge transfer

Genre: Research Report (0.43)

Industry: Information Technology > Security & Privacy (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.45)

Neural Information Processing Systems Mar-19-2020, 00:03:22 GMT

Visualizing and Measuring the Geometry of BERT

Reif, Emily, Yuan, Ann, Wattenberg, Martin, Viegas, Fernanda B., Coenen, Andy, Pearce, Adam, Kim, Been

Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT. At a high level, linguistic features seem to be represented in separate semantic and syntactic subspaces.

geometry, representation, visualizing and measuring, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)