Assessment & Standards


Help! My Husband Doesn't Want Anyone to Know That My IQ Is Higher Than His.

Slate

Dear Prudence is online weekly to chat live with readers. Here's an edited transcript of this week's chat. I know it's a petty, marriage-killing thing to dwell on … but I'm smarter than my husband. I know this because he insisted we both get IQ tests. It turns out I qualify for Mensa and he just does not. Except now he's telling our friends his fairly impressive IQ and when they ask about me, he says: "Oh well, it doesn't really matter. What's important is how you use what God gave you."


Wild cockatoos excel in intelligence tests, countering theory living with humans makes birds smarter

Daily Mail - Science & tech

A long-held theory that animals raised in captivity perform better in cognitive testing may need to be rethought. A new study organized by the University of Veterinary Medicine in Vienna found evidence that wild animals perform just as well at intelligence tests as their lab-raised counterparts. To test the theory, researchers compared two groups of Goffin's cockatoos, a species often found in the tropical jungles of Singapore, Indonesia, and Puerto Rico. The team compared a lab-raised 'colony' of 11 cockatoos at their lab in Vienna to eight wild cockatoos recently taken into captivity at a field laboratory in Indonesia. The researchers compared the performance of both groups in a series of simple problem-solving tests and found the wild cockatoos were just as clever as the lab-raised ones.


Grading on a curve? Why AI systems test brilliantly but stumble in real life - ScienceBlog.com

#artificialintelligence

The headline in early 2018 was a shocker: "Robots are better at reading than humans." Two artificial intelligence systems, one from Microsoft and the other from Alibaba, had scored slightly higher than humans on Stanford's widely used test of reading comprehension. The test scores were real, but the conclusion was wrong. As Robin Jia and Percy Liang of Stanford showed a few months later, the "robots" were only better than humans at taking that specific test, because they had trained themselves on readings that were similar to those on the test.


AI Paper Recommendations from Experts

#artificialintelligence

After the 'top AI books' reading list was so well received, we reached out to some of our community to find out which papers they believe everyone should have read! All of the below papers are free to access and cover a range of topics from Hypergradients to modeling yield response for CNNs. Each expert also included a reason as to why the paper was picked as well as a short bio. We spoke to Jeff back in January and at that time he couldn't pick just one paper as a must-read, so we let him pick two. This paper unpacks two key talking points: the limitations of sparse training data, and whether recurrent networks can support meta-learning in a fully supervised context.


Challenge Closed-book Science Exam: A Meta-learning Based Question Answering System

arXiv.org Artificial Intelligence

Prior work on standardized science exams requires support from a large text corpus, such as a targeted science corpus from Wikipedia or SimpleWikipedia. However, retrieving knowledge from a large corpus is time-consuming, and questions embedded in complex semantic representations may interfere with retrieval. Inspired by the dual process theory in cognitive science, we propose a MetaQA framework, where system 1 is an intuitive meta-classifier and system 2 is a reasoning module. Specifically, our method is based on meta-learning and the large language model BERT, and can efficiently solve science problems by learning from related example questions without relying on external knowledge bases. We evaluate our method on the AI2 Reasoning Challenge (ARC), and the experimental results show that the meta-classifier yields considerable classification performance on emerging question types. The information provided by the meta-classifier significantly improves the accuracy of the reasoning module from 46.6% to 64.2%, which gives it a competitive advantage over retrieval-based QA methods.


Knowledge Elicitation using Deep Metric Learning and Psychometric Testing

arXiv.org Artificial Intelligence

Knowledge present in a domain is well expressed as relationships between corresponding concepts. For example, in zoology, animal species form complex hierarchies; in genomics, the different (parts of) molecules are organized in groups and subgroups based on their functions; plants, molecules, and astronomical objects all form complex taxonomies. Nevertheless, when applying supervised machine learning (ML) in such domains, we commonly reduce the complex and rich knowledge to a fixed set of labels, and induce a model that shows good generalization performance with respect to these labels. The main reason for such a reductionist approach is the difficulty in eliciting the domain knowledge from the experts. Developing a label structure with sufficient fidelity and providing comprehensive multi-label annotation can be exceedingly labor-intensive in many real-world applications. In this paper, we provide a method for efficient hierarchical knowledge elicitation (HKE) from experts working with high-dimensional data such as images or videos. Our method is based on psychometric testing and active deep metric learning. The developed models embed the high-dimensional data in a metric space where distances are semantically meaningful, and the data can be organized in a hierarchical structure. We provide empirical evidence, through a series of experiments on a synthetically generated dataset of simple shapes and on the CIFAR-10 and Fashion-MNIST benchmarks, that our method is indeed successful in uncovering hierarchical structures.


From Machine Reading Comprehension to Dialogue State Tracking: Bridging the Gap

arXiv.org Artificial Intelligence

Dialogue state tracking (DST) is at the heart of task-oriented dialogue systems. However, the scarcity of labeled data is an obstacle to building accurate and robust state tracking systems that work across a variety of domains. Existing approaches generally require some dialogue data with state information, and their ability to generalize to unknown domains is limited. In this paper, we propose using machine reading comprehension (RC) in state tracking from two perspectives: model architectures and datasets. We divide the slot types in dialogue state into categorical or extractive to borrow the advantages from both multiple-choice and span-based reading comprehension models. Our method achieves near the current state-of-the-art in joint goal accuracy on MultiWOZ 2.1 given full training data. More importantly, by leveraging machine reading comprehension datasets, our method outperforms the existing approaches by a large margin in few-shot scenarios when the availability of in-domain data is limited. Lastly, even without any state tracking data, i.e., in the zero-shot scenario, our proposed approach achieves greater than 90% average slot accuracy in 12 out of 30 slots in MultiWOZ 2.1.
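The two metrics named in this abstract, joint goal accuracy and per-slot accuracy, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the slot names, and the convention of treating a missing slot as "none" are all assumptions made for illustration.

```python
from typing import Dict, List

# A dialogue state maps slot names to values, e.g. {"hotel-area": "north"}.
State = Dict[str, str]


def joint_goal_accuracy(predictions: List[State], references: List[State]) -> float:
    """Fraction of turns whose predicted state matches the reference on every slot."""
    if not references:
        return 0.0
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return correct / len(references)


def slot_accuracy(predictions: List[State], references: List[State], slot: str) -> float:
    """Fraction of turns where a single slot is predicted correctly.

    Assumption: a slot absent from a state is treated as the value "none".
    """
    if not references:
        return 0.0
    correct = sum(
        pred.get(slot, "none") == ref.get(slot, "none")
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


# Hypothetical two-turn example: the second turn gets one slot wrong.
references = [
    {"hotel-area": "north", "hotel-stars": "4"},
    {"hotel-area": "south", "hotel-stars": "3"},
]
predictions = [
    {"hotel-area": "north", "hotel-stars": "4"},
    {"hotel-area": "south", "hotel-stars": "5"},
]

jga = joint_goal_accuracy(predictions, references)
area_acc = slot_accuracy(predictions, references, "hotel-area")
stars_acc = slot_accuracy(predictions, references, "hotel-stars")
```

Joint goal accuracy is the stricter of the two, since a single wrong slot makes the whole turn count as incorrect, which is why a system can report high per-slot accuracy and a much lower joint score.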


Variational Question-Answer Pair Generation for Machine Reading Comprehension

arXiv.org Artificial Intelligence

We present a deep generative model of question-answer (QA) pairs for machine reading comprehension. We introduce two independent latent random variables into our model in order to diversify answers and questions separately. We also study the effect of explicitly controlling the KL term in the variational lower bound in order to avoid the "posterior collapse" issue, where the model ignores latent variables and generates QA pairs that are almost the same. Our experiments on SQuAD v1.1 showed that variational methods can aid QA pair modeling capacity, and that the controlled KL term can significantly improve diversity while generating high-quality questions and answers comparable to those of the existing systems.
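As a rough illustration of what "controlling the KL term" can mean, one common formulation from the variational autoencoder literature is sketched below. It is not necessarily the authors' exact objective: the single latent variable $z$, the data $x$ (a QA pair), the context $c$ (a passage), the weight $\beta$, and the target value $C$ are all notation assumed here.

```latex
\mathcal{L}(\theta, \phi)
  = \mathbb{E}_{q_\phi(z \mid x, c)}\bigl[\log p_\theta(x \mid z, c)\bigr]
  - \beta \,\Bigl|\, \mathrm{KL}\bigl(q_\phi(z \mid x, c) \,\|\, p_\theta(z \mid c)\bigr) - C \,\Bigr|,
  \qquad \beta > 0,\; C \ge 0 .
```

With $\beta = 1$ and $C = 0$ this reduces to the standard conditional variational lower bound, because the KL divergence is non-negative. Posterior collapse corresponds to the KL term shrinking to zero; choosing $C > 0$ penalizes that outcome and pushes the KL term toward $C$ instead. A model with two latent variables, as described in the abstract, would carry one such term per variable.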


R3: A Reading Comprehension Benchmark Requiring Reasoning Processes

arXiv.org Artificial Intelligence

Existing question answering systems can only predict answers without explicit reasoning processes, which hinders their explainability and leads us to overestimate their ability to understand and reason over natural language. In this work, we propose a novel task of reading comprehension, in which a model is required to provide final answers and reasoning processes. To this end, we introduce a formalism for reasoning over unstructured text, namely Text Reasoning Meaning Representation (TRMR). TRMR consists of three phrases and is expressive enough to characterize the reasoning process required to answer reading comprehension questions. We develop an annotation platform to facilitate TRMR's annotation, and release the R3 dataset, a Reading comprehension benchmark Requiring Reasoning processes. R3 contains over 60K question-answer pairs and their TRMRs. Our dataset is available at: http://anonymous.


GenNet : Reading Comprehension with Multiple Choice Questions using Generation and Selection model

arXiv.org Artificial Intelligence

Multiple-choice machine reading comprehension is a difficult task, as it requires machines to select the correct option from a set of candidate options using the given passage and question. The Reading Comprehension with Multiple Choice Questions task requires a human (or machine) to read a given passage-question pair and select the best option from n given options. There are two different ways to select the correct answer from the given passage: either by selecting the best-matching answer or by eliminating the worst-matching answer. Here we propose GenNet, a neural network-based model. The model first generates an answer to the question from the passage and then matches the generated answer against the given options; the best-matching option is selected as the answer. For answer generation we use the S-net model (Tan et al., 2017) trained on SQuAD, and to evaluate our model we use the large-scale RACE dataset (ReAding Comprehension Dataset From Examinations) (Lai et al., 2017).
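The selection step described in this abstract, matching a generated answer against the candidate options, can be sketched as follows. This is only an illustration: GenNet is described as a neural model, whereas the sketch swaps in a simple token-overlap F1 score as the matcher, and the function names and example strings are hypothetical.

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_f1(candidate: str, reference: str) -> float:
    """Token-level F1 between a candidate option and the generated answer."""
    cand, ref = tokenize(candidate), tokenize(reference)
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    common = sum((Counter(cand) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision = common / len(cand)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)


def select_option(generated_answer: str, options: List[str]) -> int:
    """Return the index of the option that best matches the generated answer.

    Assumes `options` is non-empty; ties go to the earliest option.
    """
    scores = [overlap_f1(option, generated_answer) for option in options]
    return max(range(len(options)), key=scores.__getitem__)


# Hypothetical example: the generated answer stands in for the generator's output.
generated = "in a field laboratory in Indonesia"
options = ["in Vienna", "in Indonesia", "in Singapore"]
best = select_option(generated, options)
```

A lexical matcher like this misses paraphrases that share no tokens with the generated answer, which is one reason a learned matching model would be preferred in practice.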