cornish
How I Built ASR for Endangered Languages with a Spoken Dictionary
Bartley, Christopher, Ragni, Anton
Nearly half of the world's languages are endangered. Speech technologies such as Automatic Speech Recognition (ASR) are central to revival efforts, yet most languages remain unsupported because standard pipelines expect utterance-level supervised data. Speech data often exist for endangered languages but rarely match these formats. Manx Gaelic ($\sim$2,200 speakers), for example, has had transcribed speech since 1948, yet remains unsupported by modern systems. In this paper, we explore how little data, and in what form, is needed to build ASR for critically endangered languages. We show that a short-form pronunciation resource is a viable alternative, and that 40 minutes of such data produces usable ASR for Manx ($<$50\% WER). We replicate our approach, applying it to Cornish ($\sim$600 speakers), another critically endangered language. Results show that the barrier to entry, in quantity and form, is far lower than previously thought, giving hope to endangered language communities that cannot afford to meet the requirements arbitrarily imposed upon them.
- Instructional Material (0.69)
- Research Report > New Finding (0.48)
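The abstract above reports results as word error rate (WER). As a reading aid, here is a minimal sketch of how WER is conventionally computed: the word-level edit distance between a reference transcript and the ASR hypothesis, divided by the number of reference words. The function name `word_error_rate` and the English example sentences are illustrative assumptions, not taken from the paper, whose scoring setup may differ (for example in text normalisation).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one deleted word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"WER = {wer:.3f}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.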
SymDiff: Equivariant Diffusion via Stochastic Symmetrisation
Zhang, Leo, Ashouritaklimi, Kianoosh, Teh, Yee Whye, Cornish, Rob
We propose SymDiff, a novel method for constructing equivariant diffusion models using the recently introduced framework of stochastic symmetrisation. SymDiff resembles a learned data augmentation that is deployed at sampling time, and is lightweight, computationally efficient, and easy to implement on top of arbitrary off-the-shelf models. Notably, in contrast to previous work, SymDiff typically does not require any neural network components that are intrinsically equivariant, avoiding the need for complex parameterizations and the use of higher-order geometric features. Instead, our method can leverage highly scalable modern architectures as drop-in replacements for these more constrained alternatives. We show that this additional flexibility yields significant empirical benefit on $\mathrm{E}(3)$-equivariant molecular generation. To the best of our knowledge, this is the first application of symmetrisation to generative modelling, suggesting its potential in this domain more generally.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Can companies police the biases found in artificial intelligence?
Artificial intelligence has seeped into almost every corner of our lives, including how people are hired for work. AI is used to screen and evaluate applicants, but there's a problem with that. Research has shown that AI can produce biased results, especially against women and minorities. That's something that Kenneth Chenault, chairman and managing director at the venture capital firm General Catalyst, is trying to address with his Data and Trust Alliance. Chenault is the co-chair of the organization.
ICE Turned To DMV Driver's License Databases For Help With Facial Recognition
Now we're going to look more broadly at what's been revealed today about ICE turning to DMV offices for help with facial recognition - that is, using driver's license photographs and algorithms to identify people suspected of being in the country illegally. Now, this collaboration was unearthed by a team at Georgetown University, and here to brief us is NPR's Aarti Shahani. CORNISH: I understand that in the past, ICE has gone to DMV offices and just asked for records on immigrants. We just heard about the case in Vermont that alleges that much. What exactly is new here?
- North America > United States > Vermont (0.28)
- North America > United States > Utah > Washington County > Washington (0.05)
- North America > United States > California > San Francisco County > San Francisco (0.05)
- Government > Immigration & Customs (0.95)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.73)
- Transportation > Ground > Road (0.62)
Microsoft President Brad Smith Discusses The Ethics Of Artificial Intelligence
Just because we can use it, should we? That's the question more and more people are asking about face recognition technology, software that's already in our phones and our social media feeds and many security systems. San Francisco leaders have voted to ban the police from using it, and even some in the tech industry say there should be limits. BRAD SMITH: It's the kind of technology that can do a lot of good for a lot of people, but it can be misused. It can be used in ways that lead to discrimination and bias.
- North America > United States > California > San Francisco County > San Francisco (0.25)
- Asia > China (0.16)
- Law (1.00)
- Information Technology (0.89)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.69)
- Government > Military (0.49)
Want To Know How Far Artificial Intelligence Has Come? Just Look At CAPTCHA
We're going to look now at the state of artificial intelligence this month in All Tech Considered. You've probably seen that statement online alongside a prompt that says something like, type the letters you see, or, click on all the stoplights. Do it right, and you get to go on to the next page. These games are developed by Google. Researcher Jason Polakis of the University of Illinois at Chicago has proven that, in fact, robots are pretty good at CAPTCHAs.
Scholars Delve Deeper Into The Ethics Of Artificial Intelligence
In 1941, science-fiction writer Isaac Asimov stated "The Three Laws of Robotics" in his short story "Runaround." Law One: A robot may not injure a human being or, through inaction, allow a human being to come to harm. Law Two: A robot must obey orders given it by human beings except where such orders would conflict with the First Law. Law Three: A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws. These laws come from the world of science fiction, but the real world is catching up.
- Law (1.00)
- Transportation > Passenger (0.53)
- Transportation > Ground > Road (0.34)
On the Opportunities and Pitfalls of Nesting Monte Carlo Estimators
Rainforth, Tom, Cornish, Robert, Yang, Hongseok, Warrington, Andrew, Wood, Frank
We present a formalization of nested Monte Carlo (NMC) estimation, whereby terms in an outer estimator themselves involve calculation of separate, nested, Monte Carlo (MC) estimators. We demonstrate that, under mild conditions, NMC can provide consistent estimates of nested expectations, including cases involving arbitrary levels of nesting; establish corresponding rates of convergence; and provide empirical evidence that these rates are observed in practice. We further establish a number of pitfalls that can arise from naïve nesting of MC estimators, provide guidelines about how these can be avoided, and lay out novel methods for reformulating certain classes of nested expectation problems into single expectations, leading to improved convergence rates. Finally, we use one of these reformulations to derive a new estimator for use in discrete Bayesian experimental design problems which has a better convergence rate than existing methods. Our results have implications for a wide range of fields from probabilistic programming to deep generative models and serve both as an invitation for further inquiry and a caveat against careless use.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
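The nested Monte Carlo estimation described in the abstract above can be illustrated with a toy problem. The sketch below is not the authors' code; the function name `nested_mc` and the choice of test problem are assumptions made for illustration. It estimates E_y[f(E_z[g(y, z)])] by drawing outer samples of y and, for each one, averaging an inner batch of samples of z before applying the nonlinear function f.

```python
import random


def nested_mc(f, g, sample_y, sample_z, n_outer, m_inner, rng):
    """Naive nested Monte Carlo estimate of E_y[ f( E_z[ g(y, z) ] ) ]."""
    total = 0.0
    for _ in range(n_outer):
        y = sample_y(rng)
        # Inner estimator: average g(y, z) over m_inner fresh draws of z.
        inner = sum(g(y, sample_z(rng)) for _ in range(m_inner)) / m_inner
        # Outer estimator: apply the (possibly nonlinear) f to the inner mean.
        total += f(inner)
    return total / n_outer


# Toy problem (an assumption for illustration): y, z ~ N(0, 1) independently,
# g(y, z) = y + z and f(u) = u**2, so the exact answer is E[y**2] = 1.
rng = random.Random(0)
estimate = nested_mc(
    f=lambda u: u * u,
    g=lambda y, z: y + z,
    sample_y=lambda r: r.gauss(0.0, 1.0),
    sample_z=lambda r: r.gauss(0.0, 1.0),
    n_outer=5000,
    m_inner=20,
    rng=rng,
)
print(estimate)
```

In this toy problem the estimator has expectation 1 + 1/M for M inner samples, since the variance of the inner average is 1/M and f is quadratic. The bias therefore shrinks only as the inner sample size grows, however many outer samples are drawn, which is one concrete instance of the care with naive nesting that the abstract calls for.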
Rise Of Artificial Intelligence Met With Mixed Reaction At SXSW
We head to Austin now for the annual South by Southwest Conference in this week's All Tech Considered. CORNISH: Now, South by Southwest is known for the music, but running alongside the shows are panels that bring leaders across industries together to discuss what's cutting edge. And one emerging technology being talked about a lot is artificial intelligence. For more on that, NPR's Laura Sydell joins us from Austin. CORNISH: To begin, obviously, people are talking about AI across the tech industry.
- North America > United States (0.37)
- Africa > Rwanda (0.05)