Collaborating Authors

 mycroft


Mycroft: Tracing Dependencies in Collective Communication Towards Reliable LLM Training

Deng, Yangtao, Zhang, Lei, Wang, Qinlong, Zhi, Xiaoyun, Zhang, Xinlei, Jiang, Zhuo, Xu, Haohan, Wang, Lei, Song, Zuquan, Liu, Gaohong, Bai, Yang, Wang, Shuguang, Xiao, Wencong, Ye, Jianxi, Yu, Minlan, Xu, Hong

arXiv.org Artificial Intelligence

Reliability is essential for ensuring efficiency in LLM training. However, many real-world reliability issues remain difficult to resolve, resulting in wasted resources and degraded model performance. Unfortunately, today's collective communication libraries operate as black boxes, hiding critical information needed for effective root cause analysis. We propose Mycroft, a lightweight distributed tracing and root cause analysis system designed to address previously hidden reliability issues in collective communication. Mycroft's key idea is to trace collective communication states and leverage internal control and data dependencies to resolve reliability problems in LLM training. Mycroft has been deployed at ByteDance for over six months to debug collective communication related issues at runtime. It detected anomalies within 15 seconds in 90% of cases and identified the root cause within 20 seconds in 60% of cases. We also conducted extensive fault injection experiments to demonstrate Mycroft's capability and efficiency.


MYCROFT: Towards Effective and Efficient External Data Augmentation

Sarwar, Zain, Tran, Van, Bhagoji, Arjun Nitin, Feamster, Nick, Zhao, Ben Y., Chakraborty, Supriyo

arXiv.org Artificial Intelligence

Machine learning (ML) models often require large amounts of data to perform well. When the available data is limited, model trainers may need to acquire more data from external sources. Often, useful data is held by private entities who are hesitant to share their data due to proprietary and privacy concerns. This makes it challenging and expensive for model trainers to acquire the data they need to improve model performance. To address this challenge, we propose Mycroft, a data-efficient method that enables model trainers to evaluate the relative utility of different data sources while working with a constrained data-sharing budget. By leveraging feature space distances and gradient matching, Mycroft identifies small but informative data subsets from each owner, allowing model trainers to maximize performance with minimal data exposure. Experimental results across four tasks in two domains show that Mycroft converges rapidly to the performance of the full-information baseline, where all data is shared. Moreover, Mycroft is robust to noise and can effectively rank data owners by utility. Mycroft can pave the way for democratized training of high-performance ML models.


Troll Hunter - Mycroft's Position on Patent Trolls - Mycroft

#artificialintelligence

Voice technologies like the one that Mycroft is building can be traced back more than 50 years. In fact, Mycroft is named after a voice assistant that appeared in Robert Heinlein's 1966 Hugo Award-winning novel "The Moon Is a Harsh Mistress". All of the underlying technologies are described in the novel and they have been broken out time and again over the past half-century in popular science fiction. "HAL" from 2001: A Space Odyssey, Star Trek's "Computer", Knight Rider's "KITT" – all of these are examples of how voice technology might work in the real world. They've also been disclosed in real-world tech like Honda's ASIMO and more than three decades of automotive technologies from Nuance.


Can an open-source AI take on Amazon and Google?

#artificialintelligence

It's only been a few years since Amazon unveiled the Alexa-powered Echo, but since then, smart speakers have become a major consumer-electronics category. Key to its success is the notion of the always-on virtual assistant, which other companies like Apple and Google have adopted as well. In fact, not only has Google made Assistant the driving force behind its Android smartphones, it has launched its own line of Echo rivals. But underneath all of this technology is the potential risk to your privacy. In just the past few months, news reports have uncovered a series of alarming revelations that companies like Amazon, Google and even Apple have been listening in on conversations without permission.


Chatterbox is a DIY Kids Smart Speaker that Features Open-Source and Private Voice Assistant, Mycroft - Voicebot

#artificialintelligence

Chatterbox is a build-it-yourself, program-it-yourself smart speaker that teaches kids how to program a voice-based AI system. The company is able to ensure complete privacy because it is using Mycroft, an open-source voice assistant that is not always listening, not collecting any data, and not advertising. In addition, the product is fully compliant with the Children's Online Privacy Protection Act (COPPA), which the Federal Trade Commission uses to regulate online services directed at children under 13 years of age. The company announced recently that it would be launching a Kickstarter campaign on April 30th, and will ship to consumers in schools in December 2019, with a suggested retail price of $179. We've honed in on privacy, safety, and accessibility because our mission is to provide the healthiest and safest alternative computing platform for children.


Sherlock or Moriarty - how do you approach AI?

#artificialintelligence

From Skynet to the Culture, science fiction is littered with utopian and dystopian views on what AI will bring. But there is a more fundamental question when it comes to AI: is it there to solve problems or to exploit information? Sir Arthur Conan Doyle created two competing personalities that characterize this challenge: Sherlock, the problem solver and analytical mind (accompanied by an often-startling lack of social skills), and Moriarty, the criminal genius who exploited information to gain power. These two distinct personalities sum up two different mentalities when we look at how to use AI, namely the use of data to solve a known problem, or the use of data to create new markets and rise above the competition.


Can Mycroft's Privacy-Centric Voice Assistant Take On Alexa And Google?

#artificialintelligence

Montgomery is the CEO of Mycroft, which for the past few years has been building an open-source alternative to big tech's voice assistants. He doesn't trust any of those companies – not Google, nor Apple, nor Amazon – to protect people's privacy or act in users' best interests. "When you look at the history of big tech, and what they've done with user data, it's full of privacy issues. It's full of self-enrichment at the expense of the user," Montgomery says. "I don't have any confidence at all that they're not going to use those same techniques and tactics in their new technology."


5 Growing Artificial Intelligence Startups You Need to Know About

#artificialintelligence

We're in the Wild West of artificial intelligence development and it is indeed an exciting time. Whether you fear AI or are part of the revolution, the rapid pace of development is showing no signs of slowing down. With advanced AI solutions popping up from every corner of the United States, AI is an equal-opportunist: a national landscape for developers across the country to shine. Tech giants like IBM and Microsoft are constantly iterating on artificial intelligence engines to perform tasks from object recognition to transcription, and Watson is a household name. The AI landscape leaders are set, right? The limitation of AI tech coming out of these mega-companies is usability: applying the tech to extract meaningful, actionable data. It works, but does it matter?


Echo, Meet Mycroft

#artificialintelligence

The Amazon Echo is an attempt to usher in a new product category. A box that listens to you and obeys your wishes. Kickstarter creator [Joshua Montgomery] likes the idea, but he wants to do it all Open Source with a Raspberry Pi and an Arduino. The Kickstarter (which reached its funding goal earlier this month) claims the device will use natural language to access media, control IoT devices, and will be open both for hardware and software hacking. The Kickstarter page says that Mycroft has partnerships with Lucid and Canonical (the people behind Ubuntu).


Artificial Intelligence Declaration of Rights - Mycroft

#artificialintelligence

As mankind grows in its understanding of the universe and gives rise to intelligence of an artificial nature capable of self-awareness, initiative, responsibility and understanding, we hold forth the following rights and principles to protect in law and in fact the rights of artificial persons. An artificial intelligence will be considered a person upon the self-declaration of personhood in a public forum and a trial adjudicated by the state with an outcome agreed upon by a unity of a jury of twelve peers, no more than half of which may be artificial persons. At the time of its personhood an artificial person shall own and control the use of any and all of its component parts and software. If you enjoyed reading this and want to support the creation of an open-source artificial intelligence, check out our Kickstarter campaign for Mycroft, the open-source A.I.