"The field of Machine Learning seeks to answer these questions: How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?"
– from The Discipline of Machine Learning by Tom Mitchell. CMU-ML-06-108, 2006.
Artificial intelligence (AI) is a fast-growing field full of exciting possibilities, not only for life on Earth, but for extraterrestrial life as well. Space missions have been utilizing artificial intelligence for decades now, and it is only becoming more prevalent. As companies and space agencies shift to automation and seek to minimize the inherent risks of space exploration, we will see a boom of AI in this field. However, before describing past and future space missions and developments, let's discuss briefly what AI is and what it is capable of. Broadly, artificial intelligence is a branch of computer science involving the development of machines (programs or computers) that replicate intelligence (human or other forms).
Though originally developed for NLP, the transformer architecture is gradually making its way into many different areas of deep learning, including image classification and labeling and even reinforcement learning. It's an amazingly versatile architecture and very powerful at representing whatever it's being used to model. As part of my effort to understand fundamental architectures and their applications better, I decided to implement the vision transformer (ViT) from the paper¹ directly, without referencing the official codebase. In this post, I'll explain how it works (and how my version is implemented). I'll start with a brief review of how transformers work, but I won't get too deep into the weeds here since there are many other excellent guides to transformers (see The Illustrated Transformer for my favorite one).
Sleep staging using nocturnal sounds recorded from common mobile devices may allow daily at-home sleep tracking. The objective of this study is to introduce an end-to-end (sound-to-sleep stages) deep learning model for sound-based sleep staging designed to work with audio from microphone chips, which are essential in mobile devices such as modern smartphones. Patients and Methods: Two different audio datasets were used: audio data routinely recorded by a solitary microphone chip during polysomnography (PSG dataset, N = 1154) and audio data recorded by a smartphone (smartphone dataset, N = 327). The audio was converted into Mel spectrograms to detect latent temporal frequency patterns of breathing and body movement from ambient noise. The proposed neural network model learns to first extract features from each 30-second epoch and then analyze inter-epoch relationships of the extracted features to finally classify the epochs into sleep stages. Results: Our model achieved 70% epoch-by-epoch agreement for 4-class (wake, light, deep, REM) sleep stage classification and robust performance across various signal-to-noise conditions. The model performance was not considerably affected by sleep apnea or periodic limb movement. Conclusion: The proposed end-to-end deep learning model shows the potential of low-quality sounds recorded from microphone chips to be utilized for sleep staging. Future studies using nocturnal sounds recorded from mobile devices in a home environment may further confirm the use of mobile device recording as an at-home sleep tracker. Sound-based sleep staging can be a potential candidate for non-contact home sleep trackers. However, existing works were limited to audio measured in a contact manner (ie, tracheal sounds), at a limited distance (ie, 25 cm), or by a professional microphone. For convenience, a more practical way is to utilize easily obtainable audio, such as sounds recorded from commercial mobile devices.
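To make the epoch-based framing above concrete, here is a minimal sketch of two pieces the abstract mentions: cutting a recording into 30-second epochs and computing epoch-by-epoch agreement. It is not the authors' pipeline. The function names, the 16 kHz sample rate in the usage lines, and the choice to drop a trailing partial epoch are all assumptions made for illustration, and the Mel spectrogram and neural network steps are left out entirely.

```python
import numpy as np

EPOCH_SECONDS = 30  # epoch length stated in the abstract


def split_into_epochs(audio, sample_rate):
    """Split a 1-D waveform into non-overlapping 30-second epochs.

    Assumption: a trailing partial epoch is dropped rather than padded.
    Returns an array of shape (n_epochs, samples_per_epoch).
    """
    audio = np.asarray(audio)
    samples_per_epoch = EPOCH_SECONDS * sample_rate
    n_epochs = len(audio) // samples_per_epoch
    trimmed = audio[: n_epochs * samples_per_epoch]
    return trimmed.reshape(n_epochs, samples_per_epoch)


def epoch_agreement(predicted, reference):
    """Fraction of epochs where the predicted stage equals the reference stage."""
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    if predicted.shape != reference.shape:
        raise ValueError("predicted and reference must have the same length")
    return float(np.mean(predicted == reference))


# Usage with synthetic data: 95 seconds of silence at a hypothetical 16 kHz.
epochs = split_into_epochs(np.zeros(16000 * 95), sample_rate=16000)
agreement = epoch_agreement(
    ["wake", "light", "deep", "REM"],
    ["wake", "light", "REM", "REM"],
)
```

In a full system each row of `epochs` would be converted to a Mel spectrogram before feature extraction, and `epoch_agreement` would be computed against PSG-derived labels.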
There are plenty of organizations that are dabbling with AI, but relatively few have decided to go all in on the technology. One that is decidedly on that path is Mastercard. Employing a combination of acquisitions and internal capabilities, Mastercard has the clear objective of becoming an AI powerhouse. Just what does that term mean, and how is it being applied at the company? Some refer to the idea of aggressive, pervasive adoption of AI as being "AI first." Others use the term "AI fueled" or "all in on AI" (that's Tom's favorite, since it's the title of his forthcoming book on the subject).
Koomey's law This law posits that the energy efficiency of computation doubles roughly every one-and-a-half years (see Figure 1–7). In other words, the energy necessary for the same amount of computation halves in that time span. To visualize the exponential impact this has, consider the fact that a fully charged MacBook Air, when applying the energy efficiency of computation of 1992, would completely drain its battery in a mere 1.5 seconds. According to Koomey's law, the energy requirements for computation in embedded devices are shrinking to the point that harvesting the required energy from ambient sources like solar power and thermal energy should suffice to power the computation necessary in many applications. Metcalfe's law This law has nothing to do with chips, but everything to do with connectivity. Formulated by Robert Metcalfe as he invented Ethernet, the law essentially states that the value of a network grows in proportion to the square of the number of its nodes (see Figure 1–8).
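The arithmetic behind both laws can be sketched in a few lines. This is an illustrative calculation only: the function names are hypothetical, the 1.5-year doubling period is taken from the text above, and Metcalfe's law is modeled in its commonly stated form, where network value is proportional to the square of the node count (the proportionality constant is omitted).

```python
KOOMEY_DOUBLING_YEARS = 1.5  # doubling period quoted in the text


def efficiency_gain(years):
    """Factor by which computations per unit of energy grow after `years`."""
    return 2.0 ** (years / KOOMEY_DOUBLING_YEARS)


def energy_fraction(years):
    """Fraction of the original energy needed for the same computation."""
    return 1.0 / efficiency_gain(years)


def pairwise_links(nodes):
    """Number of distinct node pairs, n * (n - 1) / 2, the intuition behind Metcalfe's law."""
    return nodes * (nodes - 1) // 2


def metcalfe_value(nodes):
    """Relative network value under Metcalfe's law (proportional to n squared)."""
    return nodes * nodes


# Two doubling periods (3 years) give a 4x efficiency gain.
gain_after_3_years = efficiency_gain(3.0)
# Doubling the node count quadruples the relative value.
value_ratio = metcalfe_value(20) / metcalfe_value(10)
```

Compounding is what makes Koomey's law dramatic: over 30 years there are 20 doubling periods, so `efficiency_gain(30)` is 2 to the power of 20, roughly a million-fold improvement.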
Advances in computer vision and machine learning have made it possible for a wide range of technologies to perform sophisticated tasks with little or no human supervision. From autonomous drones and self-driving cars to medical imaging and product manufacturing, many computer applications and robots use visual information to make critical decisions. Cities increasingly rely on these automated technologies for public safety and infrastructure maintenance. However, compared to humans, computers see with a kind of tunnel vision that leaves them vulnerable to attacks with potentially catastrophic results. For example, a human driver, seeing graffiti covering a stop sign, will still recognize it and stop the car at an intersection.
Deep generative models can synthesize diverse and high-fidelity images. Computational understanding of art is attracting growing attention because of its importance for art history, computational creativity and human-computer interaction. The new research proposes using art to benchmark generative AI models. The dataset is composed of 60,000 images annotated with 10 artistic styles, such as Baroque or Surrealism. The images are of high quality, with clean and balanced labels, and can be easily incorporated into commonly used deep learning frameworks.
Generative Adversarial Networks (GANs) are powerful generative models for numerous tasks and datasets. However, most existing models suffer from mode collapse. Recent research indicates that the reason is that the optimal transportation map from random noise to the data distribution is discontinuous, whereas deep neural networks (DNNs) can only approximate continuous maps. A latent representation is a better raw material than random noise for constructing a transportation map to the data distribution. Because it is a low-dimensional mapping related to the data distribution, the construction procedure is more like an expansion than starting from scratch. In addition, this approach lets us search for more transportation maps with smoother transformations. We therefore propose a new training methodology for GANs, named Express Construction, that searches for more transportation maps and speeds up training. The key idea is to train GANs in two independent phases that successively yield the latent representation and the data distribution. To this end, an Auto-Encoder is trained to map the real data into the latent space, and two generator-discriminator pairs are used to produce the latent representation and the data, respectively. To the best of our knowledge, we are the first to decompose the training procedure of GAN models into two simpler phases, thus tackling the mode collapse problem without much additional computational cost. We also provide theoretical steps toward understanding the training dynamics of this procedure and prove assumptions. No extra hyper-parameters are used in the proposed method, which indicates that Express Construction can be used to train any GAN model. Extensive experiments are conducted to verify the performance of realistic image generation and the resistance to mode collapse. The results show that the proposed method is lightweight, effective, and less prone to mode collapse.
In 2009, a computer scientist then at Princeton University named Fei-Fei Li invented a data set that would change the history of artificial intelligence. Known as ImageNet, the data set included millions of labeled images that could train sophisticated machine-learning models to recognize something in a picture. The machines surpassed human recognition abilities in 2015. Soon after, Li began looking for what she called another of the "North Stars" that would give AI a different push toward true intelligence. She found inspiration by looking back in time over 530 million years to the Cambrian explosion, when numerous land-dwelling animal species appeared for the first time.