
Collaborating Authors

koller


SCOPE: Sign Language Contextual Processing with Embedding from LLMs

Liu, Yuqi, Zhang, Wenqian, Ren, Sihan, Huang, Chengyu, Yu, Jingyi, Xu, Lan

arXiv.org Artificial Intelligence

Sign languages, used by around 70 million Deaf individuals globally, are visual languages that convey visual and contextual information. Current methods in vision-based sign language recognition (SLR) and translation (SLT) struggle with dialogue scenes due to limited dataset diversity and the neglect of contextually relevant information. To address these challenges, we introduce SCOPE (Sign language Contextual Processing with Embedding from LLMs), a novel context-aware vision-based SLR and SLT framework. For SLR, we utilize dialogue contexts through a multi-modal encoder to enhance gloss-level recognition. For subsequent SLT, we further fine-tune a Large Language Model (LLM) by incorporating prior conversational context. We also contribute a new sign language dataset that contains 72 hours of Chinese sign language videos in contextual dialogues across various scenarios. Experimental results demonstrate that our SCOPE framework achieves state-of-the-art performance on multiple datasets, including Phoenix-2014T, CSL-Daily, and our SCOPE dataset. Moreover, surveys conducted with participants from the Deaf community further validate the robustness and effectiveness of our approach in real-world applications. Both our dataset and code will be open-sourced to facilitate further research.
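Purely as illustration, the translation step described above (conditioning an LLM on prior conversational context plus recognized glosses) could assemble its input roughly as follows. All names and the prompt layout here are hypothetical; the abstract does not specify SCOPE's actual input format.

```python
def build_slt_prompt(dialogue_context, glosses, max_context_turns=3):
    """Pair recognized gloss tokens with recent dialogue turns to form
    a context-aware LLM input (hypothetical format, for illustration)."""
    recent = dialogue_context[-max_context_turns:]
    context_block = "\n".join(f"{speaker}: {utterance}"
                              for speaker, utterance in recent)
    gloss_block = " ".join(glosses)
    return (
        "Dialogue so far:\n"
        f"{context_block}\n"
        f"Recognized glosses: {gloss_block}\n"
        "Translation:"
    )

prompt = build_slt_prompt(
    [("A", "Where are you going?"), ("B", "To the market.")],
    ["BUY", "VEGETABLE", "WANT"],
)
print(prompt)
```

The point of the sketch is only that the gloss sequence is not translated in isolation: earlier turns of the conversation travel with it into the LLM.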


Sentence-Level Sign Language Recognition Framework

Akandeh, Atra

arXiv.org Artificial Intelligence

We present two solutions to sentence-level SLR. Sentence-level SLR requires mapping videos of sign language sentences to sequences of gloss labels. Connectionist Temporal Classification (CTC) is used as the classification layer of both models; CTC avoids pre-segmenting the sentences into individual words. The first model is an LRCN-based model, and the second is a Multi-Cue Network. LRCN is a model in which a CNN is applied to each frame as a feature extractor before the frames are fed into an LSTM. In the first approach, no prior knowledge is leveraged: raw frames are fed into an 18-layer LRCN with a CTC layer on top. In the second approach, three main characteristics associated with each sign (hand shape, hand position, and hand movement) are extracted using MediaPipe. The 2D hand-shape landmarks are used to build a skeleton of the hands, which is fed to a Conv-LSTM model; hand locations and hand positions, expressed as distances relative to the head, are fed to separate LSTMs. All three sources of information are then integrated in a Multi-Cue network with a CTC classification layer. We evaluated the proposed models on RWTH-PHOENIX-Weather. After an extensive search over model hyper-parameters such as the number of feature maps, input size, batch size, sequence length, LSTM memory cells, regularization, and dropout, we achieved a word error rate (WER) of 35.
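The role CTC plays here, collapsing an unsegmented per-frame label sequence into a gloss sequence without word boundaries, can be illustrated with the standard greedy CTC decode: take the most likely label per frame, merge consecutive repeats, then drop blanks. A minimal sketch, assuming frame-level label indices are already available (in the models above they would come from the network's per-frame softmax):

```python
BLANK = 0  # conventional CTC blank index

def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Collapse per-frame label indices into an output sequence:
    merge consecutive duplicates, then remove blank tokens."""
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded

# Frames predicting "_ 5 5 _ 7 7 _" collapse to the gloss indices [5, 7].
print(ctc_greedy_decode([0, 5, 5, 0, 7, 7, 0]))  # -> [5, 7]
```

Note that a blank between two identical labels keeps them distinct (`[3, 0, 3]` decodes to `[3, 3]`), which is exactly what lets CTC represent repeated glosses without pre-segmentation.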


Language Models Understand Us, Poorly

Moore, Jared

arXiv.org Artificial Intelligence

Some claim language models understand us. Others won't hear it. To clarify, I investigate three views of human language understanding: as-mapping, as-reliability and as-representation. I argue that while behavioral reliability is necessary for understanding, internal representations are sufficient; they climb the right hill. I review state-of-the-art language and multi-modal models: they are pragmatically challenged by under-specification of form. I question the Scaling Paradigm: limits on resources may prohibit scaled-up models from approaching understanding. Last, I describe how as-representation advances a science of understanding. We need work which probes model internals, adds more of human language, and measures what models can learn.


Meaning without reference in large language models

Piantadosi, Steven T., Hill, Felix

arXiv.org Artificial Intelligence

The widespread success of large language models (LLMs) has been met with skepticism that they possess anything like human concepts or meanings. Contrary to claims that LLMs possess no meaning whatsoever, we argue that they likely capture important aspects of meaning, and moreover work in a way that approximates a compelling account of human cognition in which meaning arises from conceptual role. Because conceptual role is defined by the relationships between internal representational states, meaning cannot be determined from a model's architecture, training data, or objective function, but only by examination of how its internal states relate to each other. This approach may clarify why and how LLMs are so successful and suggest how they can be made more human-like.


Insitro raises $400M for machine learning-powered drug discovery efforts

#artificialintelligence

The artificial intelligence-powered drug discovery company insitro has secured a mammoth funding round of $400 million, to carry forward its development efforts and Big Pharma partnerships with the likes of Gilead and Bristol Myers Squibb. The series C financing was led by the Canada Pension Plan Investment Board, also known as CPP Investments, with additional backing from Andreessen Horowitz, Casdin Capital, ARCH Venture Partners, Foresite Capital, GV, Third Rock Ventures, Two Sigma Ventures, HOF Capital and Alexandria Venture Investments, as well as accounts managed by BlackRock and T. Rowe Price Associates. Temasek and Softbank Investment Advisors also joined the round, in addition to an undisclosed global investment group and a U.S. payer-provider health system. "For insitro, 2020 was a year of incredible growth and progress toward our founding vision of bringing the predictive powers of machine learning to drug discovery," said founder and CEO Daphne Koller. "We built out and demonstrated the capabilities of our target discovery platform in our Gilead collaboration in NASH, receiving the first of our operational milestone payments, and put in place an outstanding collaboration with Bristol Myers Squibb in ALS; we also took a big step forward towards moving from targets to medicines through the acquisition of Haystack Sciences, a high throughput chemistry platform that enables [machine learning]-driven molecular design; and we recruited Dr. Roger Perlmutter to our board to help guide our drug discovery efforts," Koller said.


8 Leading Women In The Field Of AI

#artificialintelligence

These eight women are at the forefront of the field of artificial intelligence today. It is a simple truth: the field of artificial intelligence is far too male-dominated. According to a 2018 study from Wired and Element AI, just 12% of AI researchers globally are female. Artificial intelligence will reshape every corner of our lives in the coming years, from healthcare to finance, from education to government. It is therefore troubling that those building this technology do not fully represent the society they are poised to transform.


Bristol Myers' Richard Hargreaves pays $70M to launch a neurodegeneration alliance with a star player in the machine learning world

#artificialintelligence

Bristol Myers Squibb is turning to one of the star upstarts in the machine learning world to go back to the drawing board and come up with the disease models needed to find drugs that can work against two of the toughest targets in the neuro world. Daphne Koller's well-funded insitro is getting $70 million in cash and near-term milestones to use their machine learning platform to create induced pluripotent stem cell-derived disease models for ALS and frontotemporal dementia. Then they'll use those insights to start building new drugs for those two ailments; a complex, ground-up approach that has already won a close alliance with Gilead. Success would trigger up to $2 billion in milestones, running a gamut of research and commercial goals. "We believe that machine learning and data generated by novel experimental platforms offer the opportunity to rethink how we discover and design novel medicines," said Richard Hargreaves, the chief of the neuro group at Bristol Myers, who made the leap from Celgene.


How Machine Learning is Transforming Drug Discovery

#artificialintelligence

In a world where a drug takes years and billions of dollars to develop, just one in 20 candidates makes it to market. Daphne Koller is betting artificial intelligence can change that dynamic. Twenty years ago, when she first started using artificial intelligence to venture into medicine and biology, Koller was stymied by a lack of data. There wasn't enough of it and what there was, was often not well suited to the problems she wanted to solve. Fast-forward 20 years, however, and both the quantity and quality of data, and the tools for studying biology, have advanced so dramatically that the adjunct professor of computer science at Stanford founded a company, insitro, that uses machine learning (a subspecialty of artificial intelligence) to explore the causes and potential treatments for some very serious diseases.


Combining AI and biology could solve drug discovery's biggest problems

#artificialintelligence

Daphne Koller is best known as the cofounder of Coursera, the online learning platform that launched in 2012. But before her work on Coursera, she was doing something much different. In 2000, Koller started working on applying machine learning to biomedical data sets to understand gene activity across cancer types. She put that work on hold to nurture Coursera, which took many more years than she initially thought it would. She didn't return to biology until 2016, when she joined Alphabet's life science research and development arm Calico.


Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping

Bargiacchi, Eugenio, Verstraeten, Timothy, Roijers, Diederik M., Nowé, Ann

arXiv.org Artificial Intelligence

We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, for efficient learning in multi-agent Markov decision processes. The algorithm allows for sample-efficient learning on large problems by exploiting a factorization to approximate the value function. Our approach only requires knowledge about the structure of the problem in the form of a dynamic decision network. Using this information, our method learns a model of the environment and performs temporal difference updates that affect multiple joint states and actions at once. Batch updates are additionally performed, which efficiently back-propagate knowledge throughout the factored Q-function. Our method outperforms the state-of-the-art sparse cooperative Q-learning algorithm, both on the well-known SysAdmin benchmark and on randomized environments.
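The prioritized-sweeping idea the algorithm builds on can be sketched in its classic single-agent, tabular form: after each real transition, perform model-based backups in order of expected impact, pushing predecessors of any state whose value changed onto a priority queue. A minimal sketch of that classic version follows; the paper's cooperative, factored multi-agent variant is considerably more involved and is not reproduced here.

```python
import heapq
from collections import defaultdict

def prioritized_sweeping_update(Q, model, predecessors, s, a, r, s2,
                                alpha=0.5, gamma=0.9, theta=1e-3, n_sweeps=5):
    """Record one real transition (s, a, r, s2), then run up to n_sweeps
    model-based Q backups, highest-priority state-action pairs first."""
    model[(s, a)] = (r, s2)
    predecessors[s2].add((s, a))
    pq = []
    delta = abs(r + gamma * max(Q[s2].values()) - Q[s][a])
    if delta > theta:
        heapq.heappush(pq, (-delta, s, a))  # max-heap via negated priority
    for _ in range(n_sweeps):
        if not pq:
            break
        _, s_, a_ = heapq.heappop(pq)
        r_, s2_ = model[(s_, a_)]
        Q[s_][a_] += alpha * (r_ + gamma * max(Q[s2_].values()) - Q[s_][a_])
        # Re-prioritize predecessors whose backup target just moved.
        for (sp, ap) in predecessors[s_]:
            rp, _ = model[(sp, ap)]
            d = abs(rp + gamma * max(Q[s_].values()) - Q[sp][ap])
            if d > theta:
                heapq.heappush(pq, (-d, sp, ap))

# Tiny usage example with two actions and string state names.
ACTIONS = [0, 1]
Q = defaultdict(lambda: {act: 0.0 for act in ACTIONS})
model, predecessors = {}, defaultdict(set)
prioritized_sweeping_update(Q, model, predecessors, s="s0", a=0, r=1.0, s2="s1")
```

The queue is what gives the method its sample efficiency: updates are spent where the value function is known to have changed, rather than swept uniformly over the state space.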