They sold their likeness to AI companies -- and regretted it

The Japan Times

South Korean actor Simon Lee was stunned when he saw his likeness -- at times as a gynecologist or a surgeon -- being used to promote questionable health cures on TikTok and Instagram. He is one of scores of people who have licensed their image to artificial intelligence marketing companies, only to get the unpleasant surprise of seeing themselves featured in deepfakes, dubious adverts or even political propaganda. "If it was a nice advertisement, it would've been fine to me. But obviously it is such a scam," he said, adding that the terms of his contract prevented him from getting the videos removed.


An AI Image Generator's Exposed Database Reveals What People Really Used It For

WIRED

Tens of thousands of explicit AI-generated images, including AI-generated child sexual abuse material, were left open and accessible to anyone on the internet, according to new research seen by WIRED. An open database belonging to an AI image-generation firm contained more than 95,000 records, including some prompt data and images of celebrities such as Ariana Grande, the Kardashians, and Beyoncé de-aged to look like children. The exposed database, which was discovered by security researcher Jeremiah Fowler, who shared details of the leak with WIRED, is linked to South Korea–based website GenNomis. The website and its parent company, AI-Nomis, hosted a number of image generation and chatbot tools for people to use. More than 45 GB of data, mostly made up of AI images, was left in the open.


A Self-Supervised Learning of a Foundation Model for Analog Layout Design Automation

arXiv.org Artificial Intelligence

We propose a UNet-based foundation model and its self-supervised learning method to address two key challenges: 1) the lack of qualified annotated analog layout data, and 2) the excessive variety of analog layout design tasks. For self-supervised learning, we propose random patch sampling and random masking techniques to automatically obtain sufficient training data from a small unannotated layout dataset. The obtained data are heavily augmented, less biased, equally sized, and cover a wide variety of qualified layout patterns. By pre-training with these data, the proposed foundation model learns implicit general knowledge of layout patterns, so it can be fine-tuned for various downstream layout tasks with small task-specific datasets. Fine-tuning provides an efficient and consolidated methodology for diverse downstream tasks, reducing the enormous human effort of developing a separate model per task. In experiments, the foundation model was pre-trained using 324,000 samples obtained from 6 silicon-proven, manually designed analog circuits, then fine-tuned for five example downstream tasks: generating contacts, vias, dummy fingers, N-wells, and metal routings. The fine-tuned models successfully performed these tasks on more than one thousand unseen layout inputs, generating DRC/LVS-clean layouts for 96.6% of samples. Compared with training the model from scratch for the metal routing task, fine-tuning required only 1/8 of the data to achieve the same Dice score of 0.95. With the same data, fine-tuning achieved a 90% lower validation loss and a 40% higher benchmark score than training from scratch.
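As a rough illustration of how random patch sampling and random masking could turn a small unannotated layout set into self-supervised training pairs, here is a minimal sketch. The patch size, block size, and mask ratio are illustrative assumptions rather than the paper's settings, and the layout is assumed to be a 2D array of layer-encoded pixels.

```python
import numpy as np

def sample_random_patches(layout, patch_size=256, num_patches=32, rng=None):
    """Sample equally sized patches at random positions from one unannotated layout."""
    rng = rng if rng is not None else np.random.default_rng()
    h, w = layout.shape
    patches = []
    for _ in range(num_patches):
        y = int(rng.integers(0, h - patch_size + 1))
        x = int(rng.integers(0, w - patch_size + 1))
        patches.append(layout[y:y + patch_size, x:x + patch_size].copy())
    return np.stack(patches)

def random_mask(patch, mask_ratio=0.4, block=16, rng=None):
    """Zero out random square blocks of a patch; the original patch is the
    reconstruction target during self-supervised pre-training."""
    rng = rng if rng is not None else np.random.default_rng()
    masked = patch.copy()
    n_by, n_bx = patch.shape[0] // block, patch.shape[1] // block
    n_masked = int(mask_ratio * n_by * n_bx)
    idx = rng.choice(n_by * n_bx, size=n_masked, replace=False)
    for i in idx:
        by, bx = divmod(int(i), n_bx)
        masked[by * block:(by + 1) * block, bx * block:(bx + 1) * block] = 0
    return masked  # (masked, patch) forms one input/target training pair
```

The (masked, original) pairs would then be fed to the UNet so that it learns to reconstruct plausible layout patterns from partial context.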


MultiVENT: Multilingual Videos of Events with Aligned Natural Text

Neural Information Processing Systems

Everyday news coverage has shifted from traditional broadcasts towards a wide range of presentation formats such as first-hand, unedited video footage. Datasets that reflect the diverse array of multimodal, multilingual news sources available online could be used to teach models to benefit from this shift, but existing news video datasets focus on traditional news broadcasts produced for English-speaking audiences. We address this limitation by constructing MultiVENT, a dataset of multilingual, event-centric videos grounded in text documents across five target languages. MultiVENT includes both news broadcast videos and non-professional event footage, which we use to analyze the state of online news videos and how they can be leveraged to build robust, factually accurate models. Finally, we provide a model for complex, multilingual video retrieval to serve as a baseline for information retrieval using MultiVENT.
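The baseline retrieval model itself is not described in the abstract. Purely as a generic illustration of how multilingual video retrieval can be scored once text and video embeddings exist (this is not the paper's baseline), a dual-encoder ranking by cosine similarity looks roughly like this:

```python
import numpy as np

def retrieve_videos(query_embeddings, video_embeddings, top_k=5):
    """Rank videos for each text query by cosine similarity.

    Embeddings are assumed to come from any multilingual text encoder and any
    video encoder; shapes are (num_queries, dim) and (num_videos, dim).
    """
    q = query_embeddings / np.linalg.norm(query_embeddings, axis=-1, keepdims=True)
    v = video_embeddings / np.linalg.norm(video_embeddings, axis=-1, keepdims=True)
    scores = q @ v.T                         # (num_queries, num_videos)
    return np.argsort(-scores, axis=-1)[:, :top_k]
```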


Ex Uno Pluria: Insights on Ensembling in Low Precision Number Systems

Neural Information Processing Systems

While ensembling deep neural networks has shown promise in improving generalization performance, scaling current ensemble methods to large models remains challenging. Given that recent progress in deep learning is largely driven by scale, exemplified by the widespread adoption of large-scale neural network architectures, scalability emerges as an increasingly critical issue for machine learning algorithms in the era of large-scale models. In this work, we first showcase the potential of low precision ensembling, where ensemble members are derived from a single model within low precision number systems in a training-free manner. Our empirical analysis demonstrates the effectiveness of the proposed low precision ensembling method compared to existing ensemble approaches.
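The abstract does not spell out how the ensemble members are derived. One plausible, training-free construction is to produce several stochastic roundings of the same full-precision weights onto a low precision grid and average their predictions; the sketch below assumes that construction. Here `forward_fn` is a hypothetical stand-in for the model's forward pass, and the bit width and member count are illustrative.

```python
import numpy as np

def stochastic_round_to_fixed_point(weights, bits=4, rng=None):
    """Quantize weights to a low precision fixed-point grid with stochastic rounding.
    Each call with a different seed yields a distinct rounding of the same model."""
    rng = rng if rng is not None else np.random.default_rng()
    scale = (2 ** (bits - 1) - 1) / (np.max(np.abs(weights)) + 1e-12)
    scaled = weights * scale
    lower = np.floor(scaled)
    prob_up = scaled - lower                      # probability of rounding up
    rounded = lower + (rng.random(weights.shape) < prob_up)
    return rounded / scale

def low_precision_ensemble_predict(weights, forward_fn, x, num_members=8, bits=4, seed=0):
    """Training-free ensemble: average the predictions of several low-precision
    roundings derived from one full-precision model."""
    preds = []
    for m in range(num_members):
        w_q = stochastic_round_to_fixed_point(weights, bits, np.random.default_rng(seed + m))
        preds.append(forward_fn(w_q, x))
    return np.mean(preds, axis=0)
```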


Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach

Neural Information Processing Systems

With the continued advancement of Large Language Model (LLM) agents in reasoning, planning, and decision-making, benchmarks have become crucial for evaluating these skills. However, there is a notable gap in benchmarks for real-time strategic decision-making. StarCraft II (SC2), with its complex and dynamic nature, serves as an ideal setting for such evaluations. To this end, we have developed TextStarCraft II, a specialized environment for assessing LLMs in real-time strategic scenarios within SC2. Addressing the limitations of traditional Chain of Thought (CoT) methods, we introduce the Chain of Summarization (CoS) method, enhancing LLMs' capabilities in rapid and effective decision-making. Our key experiments included: 1. LLM Evaluation: Tested 10 LLMs in TextStarCraft II, most of them defeating the level-5 built-in AI, showcasing effective strategy skills.
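The abstract does not detail how Chain of Summarization is implemented. One way to read it is as a loop that compresses raw game observations into a running summary before asking the model for its next action; the sketch below follows that reading. The `llm` argument is a hypothetical text-in/text-out callable, and the prompt wording is invented for illustration.

```python
def chain_of_summarization_step(llm, running_summary, recent_observations):
    """One decision step: fold the latest raw observations into a compact
    running summary, then request an action from that summary alone."""
    summary_prompt = (
        "You are playing StarCraft II. Current summary of the game so far:\n"
        f"{running_summary}\n\n"
        "New observations since the last decision:\n"
        + "\n".join(recent_observations)
        + "\n\nUpdate the summary in a few sentences, keeping only strategically relevant facts."
    )
    new_summary = llm(summary_prompt)

    action_prompt = (
        "Game summary:\n"
        f"{new_summary}\n\n"
        "Choose the next macro action (e.g., build order step, expansion, attack timing) "
        "and answer with a single action."
    )
    action = llm(action_prompt)
    return new_summary, action
```

Working from a compressed summary rather than the full observation stream is what would keep each query short enough for near-real-time play.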


A tiny shapeshifting robot could be the next big thing in biomedicine

Mashable

Developed by a team of scientists at Seoul National University and Gachon University in South Korea, PB, or the Particle-armored liquid roBot, is designed to behave the way cells do, and imitate biological forms and functions. The morphing bot can ooze around tiny pillars, skim across water to reach a dry surface without bursting, merge with another PB, and swallow a glass bead, all without compromising structural integrity. The robot is still in the research stages, but the promising results so far raise hopes that PB could potentially help advance drug delivery and even tumor cell destruction in the future.


4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization

Neural Information Processing Systems

Novel view synthesis of dynamic scenes is becoming important in various applications, including augmented and virtual reality. We propose a novel 4D Gaussian Splatting (4DGS) algorithm for dynamic scenes from casually recorded monocular videos. To overcome the overfitting problem of existing work for these real-world videos, we introduce an uncertainty-aware regularization that identifies uncertain regions with few observations and selectively imposes additional priors based on diffusion models and depth smoothness on such regions. This approach improves both the performance of novel view synthesis and the quality of training image reconstruction. We also identify the initialization problem of 4DGS in fast-moving dynamic regions, where the Structure from Motion (SfM) algorithm fails to provide reliable 3D landmarks. To initialize Gaussian primitives in such regions, we present a dynamic region densification method using the estimated depth maps and scene flow. Our experiments show that the proposed method improves the performance of 4DGS reconstruction from a video captured by a handheld monocular camera and also exhibits promising results in few-shot static scene reconstruction.
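A minimal sketch of the selective-regularization idea: regions covered by few training observations are flagged as uncertain, and an extra depth-smoothness penalty is applied only there. The observation-count threshold and the loss form are illustrative assumptions, and the diffusion-model prior mentioned in the abstract is not shown.

```python
import numpy as np

def uncertainty_mask(observation_counts, threshold=3):
    """Flag pixels observed in fewer than `threshold` training frames as uncertain.
    `observation_counts` is a per-pixel count over the training views."""
    return observation_counts < threshold

def masked_depth_smoothness_loss(depth, mask):
    """Depth-smoothness prior applied only inside uncertain regions:
    penalize large depth gradients where the mask is True."""
    dz_dx = np.abs(np.diff(depth, axis=1))   # horizontal depth differences
    dz_dy = np.abs(np.diff(depth, axis=0))   # vertical depth differences
    m_x = mask[:, 1:] & mask[:, :-1]
    m_y = mask[1:, :] & mask[:-1, :]
    loss_x = (dz_dx * m_x).sum() / max(m_x.sum(), 1)
    loss_y = (dz_dy * m_y).sum() / max(m_y.sum(), 1)
    return loss_x + loss_y
```

The well-observed regions are left to the standard photometric objective, so the extra prior only constrains the parts of the scene the training video barely covers.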