Vision
Peek-a-boo, Big Tech sees you: Expert warns just 20 cloud images can make an AI deepfake video of your child
Texas high school student Elliston Berry joins 'Fox & Friends' to discuss the House's passage of a new bill that criminalizes the sharing of non-consensual intimate images, including content created with artificial intelligence.

Parents love capturing their kids' big moments, from first steps to birthday candles. But a new study out of the U.K. shows many of those treasured images may be scanned, analyzed and turned into data by cloud storage services, and nearly half of parents don't even realize it. A survey of 2,019 U.K. parents, conducted by Perspectus Global and commissioned by Swiss privacy tech company Proton, found that 48% of parents were unaware that providers like Google Photos, Apple iCloud, Amazon Photos and Dropbox can access and analyze the photos they upload.

First lady Melania Trump, joined by President Donald Trump, delivers remarks before President Trump signed the Take It Down Act into law in the Rose Garden of the White House May 19, 2025, in Washington, D.C. (Chip Somodevilla/Getty Images)

These companies use artificial intelligence to sort images into albums, recognize faces and locations, and suggest memories.
Sharing Key Semantics in Transformer Makes Efficient Image Restoration
Image Restoration (IR), a classic low-level vision task, has witnessed significant advancements through deep models that effectively capture global information. Notably, the emergence of Vision Transformers (ViTs) has further propelled these advancements. When computing attention, the self-attention mechanism, a cornerstone of ViTs, tends to encompass all global cues, even those from semantically unrelated objects or regions. This inclusivity introduces computational inefficiency, particularly noticeable at high input resolutions, since irrelevant information must still be processed. Additionally, for IR it is commonly observed that small segments of a degraded image, especially those closely aligned semantically, provide the most relevant information for restoration, as they contribute contextual cues crucial for accurate reconstruction. To address these challenges, in this paper we propose boosting IR's performance by sharing key semantics within the Transformer (i.e., SemanIR).
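To make the efficiency argument concrete, the sketch below restricts each query to its top-k most similar keys, a generic sparse-attention approximation of the key-sharing idea. It is not SemanIR's actual implementation; the function name and the fixed top_k parameter are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=64):
    """Attention in which each query attends only to its top_k most
    similar keys instead of all N tokens (illustrative sketch).

    q, k, v: (batch, tokens, dim) tensors.
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5            # (B, N, N)
    top_scores, top_idx = scores.topk(top_k, dim=-1)       # (B, N, top_k)
    attn = F.softmax(top_scores, dim=-1)                   # softmax over kept keys only
    # Gather the value vectors of the selected keys for every query.
    idx = top_idx.unsqueeze(-1).expand(-1, -1, -1, v.size(-1))
    v_all = v.unsqueeze(1).expand(-1, q.size(1), -1, -1)   # (B, N, N, dim) view
    v_sel = v_all.gather(2, idx)                           # (B, N, top_k, dim)
    return (attn.unsqueeze(-1) * v_sel).sum(dim=2)         # (B, N, dim)
```

The softmax and value aggregation touch only top_k entries per query, so the cost scales with N·top_k rather than N²; the published SemanIR design additionally shares the computed key relationships across Transformer layers, which this sketch omits.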
Learning to Orient Surfaces by Self-supervised Spherical CNNs, Federico Stella, Luciano Silva
Defining and reliably finding a canonical orientation for 3D surfaces is key to many Computer Vision and Robotics applications. This task is commonly addressed by handcrafted algorithms exploiting geometric cues deemed as distinctive and robust by the designer. Yet, one might conjecture that humans learn the notion of the inherent orientation of 3D objects from experience and that machines may do so alike. In this work, we show the feasibility of learning a robust canonical orientation for surfaces represented as point clouds. Based on the observation that the quintessential property of a canonical orientation is equivariance to 3D rotations, we propose to employ Spherical CNNs, a recently introduced machinery that can learn equivariant representations defined on the Special Orthogonal group SO(3). Specifically, spherical correlations compute feature maps whose elements define 3D rotations. Our method learns such feature maps from raw data by a self-supervised training procedure and robustly selects a rotation to transform the input point cloud into a learned canonical orientation. Thereby, we realize the first end-to-end learning approach to define and extract the canonical orientation of 3D shapes, which we aptly dub Compass. Experiments on several public datasets prove its effectiveness at orienting local surface patches as well as whole objects.
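As a minimal sketch of the final selection step, assume the network has already produced one score per candidate rotation on a discretized SO(3) grid; the names canonicalize, rotation_grid, and scores are illustrative assumptions, not Compass's actual API.

```python
import numpy as np

def canonicalize(points, rotation_grid, scores):
    """Rotate a point cloud into a learned canonical pose.

    points:        (N, 3) input point cloud.
    rotation_grid: (M, 3, 3) rotation matrices discretizing SO(3).
    scores:        (M,) network responses, one per candidate rotation.
    """
    R = rotation_grid[np.argmax(scores)]   # rotation with the peak response
    # Row-vector convention: points @ R applies R^T (= R^{-1} for a
    # rotation matrix) to each point, mapping the cloud into the
    # canonical frame.
    return points @ R
```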
Urgent warning to Americans over 'dangerous' technology quietly rolled out in 80 airports
Within seconds, you've been scanned, stored and tracked, all before even reaching airport security. Without ever handing over your ID, the Transportation Security Administration (TSA) already knows exactly who you are. This is happening at 84 airports across the US. And chances are, you didn't even notice. Marketed as a tool to enhance security, TSA's facial recognition system is drawing criticism for its potential to track Americans from the terminal entrance to their final destination.
It's now a federal crime to publish AI deepfake porn
The Take It Down Act, a controversial bipartisan bill recently hailed by First Lady Melania Trump as a tool to build a safer internet, is officially law, as President Donald Trump took to the White House Rose Garden today to put ink to legislative paper. It's the first high-profile tech legislation to pass under the new administration. "With the rise of AI image generation, countless women have been harassed with deepfakes and other explicit images distributed against their will. This is wrong, so horribly wrong, and it's a very abusive situation," said Trump at the time of signing. "This will be the first ever federal law to combat the distribution of explicit, imaginary, posted without subject's consent... We've all heard about deepfakes. I have them all the time, but nobody does anything. I ask Pam [Bondi], 'Can you help me, Pam?' She says, 'No, I'm too busy doing other things.' But a lot of people don't survive, that's true and so horrible... Today, we're making it totally illegal."
Deep love or deepfake: Dating in the time of AI
Beth Hyland thought she had met the love of her life on Tinder. In reality, the Michigan-based administrative assistant had been manipulated by an online scam artist who posed as a French man named "Richard," used deepfake video on Skype calls and posted photos of another man to pull off his con. Deepfakes, manipulated video or audio made using artificial intelligence to look and sound real, are often difficult to detect without specialized tools.
A Appendix
A.1 Conventional Test-Time Augmentation

Center-Crop is the standard test-time augmentation for most computer vision tasks [56, 29, 5, 7, 18, 26, 52]. Center-Crop first resizes an image to a fixed size and then crops the central area to produce a predefined input size. For ResNet-50 in the ImageNet experiment, we resize an image to 256 pixels and crop the central 224 pixels, in the same way as [18, 26, 52]. In the case of CIFAR, all images in the dataset are 32 by 32 pixels; we use the original images without any modification at test time.

Horizontal-Flip is an ensemble method using the original image and the horizontally flipped image.
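A minimal torchvision sketch of the two baselines described above (normalization omitted for brevity; the helper name is our own):

```python
import torch
from torchvision import transforms

# Center-Crop for ResNet-50 on ImageNet: resize the short side to 256,
# then crop the central 224x224 region.
center_crop = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

def horizontal_flip_ensemble(model, image_tensor):
    """Average the logits of an image and its horizontal mirror."""
    batch = torch.stack([image_tensor, torch.flip(image_tensor, dims=[-1])])
    with torch.no_grad():
        return model(batch).mean(dim=0)
```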
Learning Loss for Test-Time Augmentation
Data augmentation has been actively studied for building robust neural networks. Most recent data augmentation methods focus on augmenting datasets during the training phase. At test time, simple transformations are still widely used for test-time augmentation. This paper proposes a novel instance-level test-time augmentation that efficiently selects suitable transformations for a test input. Our proposed method involves an auxiliary module that predicts the loss of each possible transformation given the input. Then, the transformations having lower predicted losses are applied to the input.
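A minimal sketch of that selection step, assuming features for the input have already been extracted; the module layout, hidden width, and k are illustrative choices, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LossPredictor(nn.Module):
    """Auxiliary head that predicts the task loss the network would incur
    under each candidate test-time transformation (illustrative sizes)."""
    def __init__(self, feat_dim, n_transforms):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_transforms),
        )

    def forward(self, features):            # features: (B, feat_dim)
        return self.head(features)          # (B, n_transforms) predicted losses

def select_transforms(predicted_losses, candidates, k=2):
    """Keep, per input, the k candidate transformations with the lowest
    predicted loss."""
    idx = predicted_losses.topk(k, dim=-1, largest=False).indices
    return [[candidates[j] for j in row] for row in idx.tolist()]
```

Only the selected transformations are then applied to the input and their predictions aggregated, so the inference cost grows with k rather than with the full transformation pool.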