Federated Learning Hyper-Parameter Tuning from a System Perspective
Zhang, Huanle, Fu, Lei, Zhang, Mi, Hu, Pengfei, Cheng, Xiuzhen, Mohapatra, Prasant, Liu, Xin
Federated learning (FL) is a distributed model training paradigm that preserves clients' data privacy. It has gained tremendous attention from both academia and industry. FL hyper-parameters (e.g., the number of selected clients and the number of training passes) significantly affect the training overhead in terms of computation time, transmission time, computation load, and transmission load. However, the current practice of manually selecting FL hyper-parameters imposes a heavy burden on FL practitioners because applications have different training preferences. In this paper, we propose FedTune, an automatic FL hyper-parameter tuning algorithm tailored to applications' diverse system requirements in FL training. FedTune iteratively adjusts FL hyper-parameters during FL training and can be easily integrated into existing FL systems. Through extensive evaluations of FedTune for diverse applications and FL aggregation algorithms, we show that FedTune is lightweight and effective, achieving 8.48%-26.75% system overhead reduction compared to using fixed FL hyper-parameters. This paper assists FL practitioners in designing high-performance FL training solutions. The source code of FedTune is available at https://github.com/DataSysTech/FedTune.
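One way to picture how an application's training preferences could be used to compare two hyper-parameter settings is a preference-weighted sum of relative changes in the four overheads named above. The sketch below is illustrative only: the `Overhead` class, the `weighted_change` function, and the weighting scheme are assumptions made for this example, not FedTune's published algorithm or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overhead:
    """Hypothetical container for the four overheads named in the abstract."""
    comp_time: float   # computation time
    trans_time: float  # transmission time
    comp_load: float   # computation load
    trans_load: float  # transmission load


def weighted_change(base: Overhead, cand: Overhead, weights: tuple) -> float:
    """Preference-weighted relative change from `base` to `cand`.

    `weights` holds four non-negative preferences that sum to 1 (an
    assumption of this sketch). A negative result means the candidate
    setting lowers the weighted overhead.
    """
    if len(weights) != 4 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be four values summing to 1")
    pairs = [
        (base.comp_time, cand.comp_time),
        (base.trans_time, cand.trans_time),
        (base.comp_load, cand.comp_load),
        (base.trans_load, cand.trans_load),
    ]
    # Each term is the relative change of one overhead, scaled by its weight.
    return sum(w * (c - b) / b for w, (b, c) in zip(weights, pairs))


# Made-up numbers: the candidate is 10% faster to compute but 10% slower to transmit.
base = Overhead(comp_time=100.0, trans_time=50.0, comp_load=200.0, trans_load=80.0)
cand = Overhead(comp_time=90.0, trans_time=55.0, comp_load=200.0, trans_load=80.0)

time_sensitive = weighted_change(base, cand, (1.0, 0.0, 0.0, 0.0))
balanced = weighted_change(base, cand, (0.5, 0.5, 0.0, 0.0))
```

Under these made-up numbers, an application that cares only about computation time would prefer the candidate setting, while one that weighs computation and transmission time equally would see no net gain.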
CES 2023 robotics Innovation Award winners announced - The Robot Report
CES has announced the innovation award winners for the upcoming CES 2023 event, taking place in Las Vegas on January 5-8, 2023. We went through the list of honorees and highlighted the robotics-related solutions for this story. French robotics company ACWA Robotics built a water pipe infrastructure mapping robot. The robot moves through the pipes while the water is running. It precisely tracks its location while imaging the pipes, creating a map of exactly where repairs are needed.
Human or Machine? Turing Tests for Vision and Language
Zhang, Mengmi, Dellaferrera, Giorgia, Sikarwar, Ankur, Armendariz, Marcelo, Mudrik, Noga, Agrawal, Prachi, Madan, Spandan, Barbu, Andrei, Yang, Haochen, Kumar, Tanishq, Sadwani, Meghna, Dellaferrera, Stella, Pizzochero, Michele, Pfister, Hanspeter, Kreiman, Gabriel
As AI algorithms increasingly participate in daily activities that used to be the sole province of humans, we are inevitably called upon to consider how much machines are really like us. To address this question, we turn to the Turing test and systematically benchmark current AIs in their abilities to imitate humans. We establish a methodology to evaluate humans versus machines in Turing-like tests and systematically evaluate a representative set of selected domains, parameters, and variables. The experiments involved testing 769 human agents, 24 state-of-the-art AI agents, 896 human judges, and 8 AI judges, in 21,570 Turing tests across 6 tasks encompassing vision and language modalities. Surprisingly, the results reveal that current AIs are not far from being able to impersonate human judges across different ages, genders, and educational levels in complex visual and language challenges. In contrast, simple AI judges outperform human judges in distinguishing human answers from machine answers. The curated large-scale Turing test datasets introduced here and their evaluation metrics provide valuable insights to assess whether an agent is human or not. The proposed formulation to benchmark human imitation ability in current AIs paves the way for the research community to expand Turing tests to other research areas and conditions. All source code and data are publicly available at https://tinyurl.com/8x8nha7p
SkipConvGAN: Monaural Speech Dereverberation using Generative Adversarial Networks via Complex Time-Frequency Masking
Kothapally, Vinay, Hansen, J. H. L.
With the advancements in deep learning approaches, the performance of speech enhancement systems in the presence of background noise has improved significantly. However, improving a system's robustness against reverberation is still a work in progress, as reverberation tends to cause loss of formant structure due to smearing effects in time and frequency. A wide range of deep learning-based systems either enhance the magnitude response and reuse the distorted phase or enhance the complex spectrogram using a complex time-frequency mask. Though these approaches have demonstrated satisfactory performance, they do not directly address the lost formant structure caused by reverberation. We believe that retrieving the formant structure can help improve the efficiency of existing systems. In this study, we propose SkipConvGAN - an extension of our prior work SkipConvNet. The proposed system's generator network tries to estimate an efficient complex time-frequency mask, while the discriminator network aids in driving the generator to restore the lost formant structure. We evaluate the performance of our proposed system on simulated and real recordings of reverberant speech from the single-channel task of the REVERB challenge corpus. The proposed system shows a consistent improvement across multiple room configurations over other deep learning-based generative adversarial frameworks.
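To illustrate what a complex time-frequency mask does, the sketch below multiplies each bin of a toy complex spectrogram by a complex mask value, which changes both the magnitude and the phase of that bin. It is a generic illustration of the masking operation only: the function name `apply_complex_mask` and the nested-list layout are assumptions for this example, and nothing here reflects how SkipConvGAN estimates its mask.

```python
import cmath


def apply_complex_mask(spec, mask):
    """Element-wise product of a complex spectrogram and a complex mask.

    Both arguments are lists of rows (frequency bins) of complex numbers
    (time frames); this layout is an assumption of the sketch.
    """
    if len(spec) != len(mask) or any(
        len(row_y) != len(row_m) for row_y, row_m in zip(spec, mask)
    ):
        raise ValueError("spectrogram and mask must have the same shape")
    return [
        [m * y for y, m in zip(row_y, row_m)]
        for row_y, row_m in zip(spec, mask)
    ]


# Toy 2x2 "reverberant" spectrogram and mask with made-up values.
reverberant = [[1 + 1j, 2 + 0j], [0 + 1j, 1 - 1j]]
mask = [[0.5j, 1 + 0j], [1 + 0j, 0 + 0j]]

enhanced = apply_complex_mask(reverberant, mask)

# The first bin is scaled to half its magnitude and rotated by 90 degrees.
magnitude_ratio = abs(enhanced[0][0]) / abs(reverberant[0][0])
phase_shift = cmath.phase(enhanced[0][0]) - cmath.phase(reverberant[0][0])
```

A magnitude-only mask would be the special case where every mask value is a non-negative real number, leaving the distorted phase untouched.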
Automated, not Automatic: Needs and Practices in European Fact-checking Organizations as a basis for Designing Human-centered AI Systems
Hrckova, Andrea, Moro, Robert, Srba, Ivan, Simko, Jakub, Bielikova, Maria
To mitigate the negative effects of false information more effectively, the development of automated AI (artificial intelligence) tools assisting fact-checkers is needed. Despite the existing research, there is still a gap between the fact-checking practitioners' needs and pains and the current AI research. We aspire to bridge this gap by employing methods of information behavior research to identify implications for designing better human-centered AI-based supporting tools. In this study, we conducted semi-structured in-depth interviews with Central European fact-checkers. Their information behavior and requirements for the desired supporting tools were analyzed using iterative bottom-up content analysis, borrowing techniques from grounded theory. The most significant needs were validated with a survey extended to fact-checkers from across Europe, in which we collected 24 responses from 20 European countries, i.e., 62% of active European IFCN (International Fact-Checking Network) signatories. Our contributions are theoretical as well as practical. First, by mapping our findings about the needs of fact-checking organizations to the relevant tasks for AI research, we have shown that the methods of information behavior research are relevant for studying the processes in these organizations and that these methods can be used to bridge the gap between the users and AI researchers. Second, we have identified fact-checkers' needs and pains, focusing on previously unexplored dimensions and emphasizing the needs of fact-checkers from Central and Eastern Europe as well as from low-resource language groups, which have implications for the development of new resources (datasets) as well as for the focus of AI research in this domain.
How Financial Institutions Leverage AI to Stay Ahead of the Competition - CEOWORLD magazine
Rhett Power is responsible for helping corporate leadership take the actions needed to drive impact and courage in their teams that will improve organizational performance. He is the author of The Entrepreneur's Book of Actions: Essential Daily Exercises and Habits for Becoming Wealthier, Smarter, and More Successful (McGraw-Hill Education) and co-founder of Wild Creations, an award-winning start-up toy company. After a successful exit from the toy company, Rhett was named the best Small Business Coach in the United States. In 2019 he joined the prestigious Marshall Goldsmith's 100 Coaches and was named the #1 Thought Leader on Entrepreneurship by Thinkers360. He is a Fellow at The Institute of Coaching at McLean Hospital, a Harvard Medical School affiliate.
So, Can a Computer Really Be Irrational?
In a recent episode of the Mind Matters News podcast, "Can a computer be a person?" Wesley J. Smith: Let me ask the question in a different way. Can an AI ever be irrational? A classic example, and this happened a number of years ago, was that the Soviets during the Cold War developed a high technology to decide whether the US was being attacked by… I'm sorry, whether the Soviet Union was being attacked by the United States. And so they had these missile detectors.
Ask Me Anything: A simple strategy for prompting language models
Arora, Simran, Narayan, Avanika, Chen, Mayee F., Orr, Laurel, Guha, Neel, Bhatia, Kush, Chami, Ines, Sala, Frederic, Ré, Christopher
Large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt that demonstrates how to perform the task and no additional training. Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, and therefore significant effort is dedicated towards designing a painstakingly "perfect prompt" for a task. To mitigate the high degree of effort involved in prompt-design, we instead ask whether producing multiple effective, yet imperfect, prompts and aggregating them can lead to a high quality prompting strategy. Our observations motivate our proposed prompting method, ASK ME ANYTHING (AMA). We first develop an understanding of the effective prompt formats, finding that question-answering (QA) prompts, which encourage open-ended generation ("Who went to the park?") tend to outperform those that restrict the model outputs ("John went to the park. Output True or False."). Our approach recursively uses the LLM itself to transform task inputs to the effective QA format. We apply the collected prompts to obtain several noisy votes for the input's true label. We find that the prompts can have very different accuracies and complex dependencies and thus propose to use weak supervision, a procedure for combining the noisy predictions, to produce the final predictions for the inputs. We evaluate AMA across open-source model families (e.g., EleutherAI, BLOOM, OPT, and T0) and model sizes (125M-175B parameters), demonstrating an average performance lift of 10.2% over the few-shot baseline. This simple strategy enables the open-source GPT-J-6B model to match and exceed the performance of few-shot GPT3-175B on 15 of 20 popular benchmarks. Averaged across these tasks, the GPT-J-6B model outperforms few-shot GPT3-175B. We release our code here: https://github.com/HazyResearch/ama_prompting
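The idea of collecting several noisy votes per input and combining them can be sketched with an unweighted majority vote. This is a deliberately simpler stand-in for the weak-supervision aggregator the abstract describes, which accounts for differing prompt accuracies and dependencies; the function name `majority_vote` and the example votes are hypothetical.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among the votes.

    Ties are broken in favor of the label that appears first in `votes`
    (an arbitrary choice made for this sketch).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    top = max(counts.values())
    for label in votes:
        if counts[label] == top:
            return label


# Made-up votes from three differently worded prompts for one input.
votes_for_input = ["True", "False", "True"]
prediction = majority_vote(votes_for_input)
```

A plain majority vote treats every prompt as equally reliable and independent, which is the assumption a weak-supervision aggregator is meant to relax.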
My Sex Drive Roared Back as a 49-Year-Old Woman. Even I Can't Believe What I'm Doing About It.
Feeld Notes is a column about a middle-aged woman who suddenly realizes she wants to have sex again--and the beguiling app she uses to do it. The first man I had sex with in the decade since my divorce was not so much a man as, well, a boy. He was 29 years old, with a lean torso, olive-brown skin, and dark hair and eyes. He was more than 20 years younger than me. His name was Enrique, and like many of us on the app where we met, he looked different in his photographs than he did in real life.