vision
MLP-Mixer: An all-MLP Architecture for Vision
Convolutional Neural Networks (CNNs) are the go-to model for computer vision. Recently, attention-based networks, such as the Vision Transformer, have also become popular. In this paper we show that while convolutions and attention are both sufficient for good performance, neither of them is necessary. We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs). MLP-Mixer contains two types of layers: one with MLPs applied independently to image patches (i.e.
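As a rough illustration of the first layer type named in the abstract (an MLP applied independently to each image patch), here is a minimal sketch in plain Python. It is not the authors' implementation: the helper names (make_mlp, per_patch_mlp), the layer sizes, the ReLU activation, and the random initialization are all assumptions made for the example. The second layer type is cut off in the teaser above and is not sketched here.

```python
import random


def linear(x, weights, bias):
    """Dense layer; weights holds one row of coefficients per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    # Assumption: ReLU is used here only to keep the sketch short.
    return [max(0.0, v) for v in x]


def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Build a hypothetical two-layer MLP with small random weights."""
    rng = random.Random(seed)

    def init(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": init(hidden_dim, in_dim), "b1": [0.0] * hidden_dim,
        "w2": init(out_dim, hidden_dim), "b2": [0.0] * out_dim,
    }


def mlp(x, params):
    hidden = relu(linear(x, params["w1"], params["b1"]))
    return linear(hidden, params["w2"], params["b2"])


def per_patch_mlp(patches, params):
    """Apply the same MLP (shared weights) to every patch vector independently."""
    return [mlp(patch, params) for patch in patches]
```

Because each patch is processed on its own with shared weights, reordering the patches simply reorders the outputs; no information flows between patches in this layer.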
Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning
In natural language processing, most models try to learn semantic representations merely from texts. The learned representations encode the "distributional semantics" but fail to connect to any knowledge about the physical world. In contrast, humans learn language by grounding concepts in perception and action and the brain encodes "grounded semantics" for cognition. Inspired by this notion and recent work in vision-language learning, we design a two-stream model for grounding language learning in vision. The model includes a VGG-based visual stream and a BERT-based language stream. The two streams merge into a joint representational space. Through cross-modal contrastive learning, the model first learns to align visual and language representations with the MS COCO dataset. The model further learns to retrieve visual objects with language queries through a cross-modal attention module and to infer the visual relations between the retrieved objects through a bilinear operator with the Visual Genome dataset. After training, the model's language stream is a stand-alone language model capable of embedding concepts in a visually grounded semantic space.
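The abstract does not say which contrastive objective is used, so the following is only a sketch of one common choice: a symmetric, InfoNCE-style loss over a batch of paired image and text embeddings. The function names, the temperature value, and the assumption that embeddings are non-zero vectors of equal length are all illustrative, not taken from the paper.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length (assumes v is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_entropy_row(logits, target):
    """Negative log-softmax of logits[target], computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[target]


def contrastive_loss(image_embs, text_embs, temperature=0.1):
    """Symmetric InfoNCE-style loss; pair k of images and texts is the positive."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # Cosine similarity of every image with every text, scaled by temperature.
    sim = [[dot(i, t) / temperature for t in txts] for i in imgs]
    image_to_text = sum(cross_entropy_row(sim[k], k) for k in range(n)) / n
    text_to_image = sum(
        cross_entropy_row([sim[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return 0.5 * (image_to_text + text_to_image)
```

The loss is small when each image embedding is closest to its own caption embedding and large when the pairs are mismatched, which is the alignment behaviour the abstract describes.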
ViSioNS: Visual Search in Natural Scenes Benchmark
Visual search is an essential part of almost any everyday human interaction with the visual environment. Nowadays, several algorithms are able to predict gaze positions during simple observation, but few models attempt to simulate human behavior during visual search in natural scenes. Furthermore, these models vary widely in their design and exhibit differences in the datasets and metrics with which they were evaluated. Thus, there is a need for a reference point, on which each model can be tested and from where potential improvements can be derived. In this study, we select publicly available state-of-the-art visual search models and datasets in natural scenes, and provide a common framework for their evaluation. To this end, we apply a unified format and criteria, bridging the gaps between them, and we estimate the models' efficiency and similarity with humans using a specific set of metrics. This integration has allowed us to enhance the Ideal Bayesian Searcher by combining it with a neural network-based visual search model, which enables it to generalize to other datasets. The present work sheds light on the limitations of current models and how integrating different approaches with unified criteria can lead to better algorithms. Moreover, it moves forward on bringing forth a solution for the urgent need for benchmarking data and metrics to support the development of more general human visual search computational models.
Watch out! Motion is Blurring the Vision of Your Deep Neural Networks
State-of-the-art deep neural networks (DNNs) are vulnerable to adversarial examples with additive random-like noise perturbations. While such examples are rarely found in the physical world, image blurring caused by object motion occurs commonly in practice, which makes it important to study, especially for widely adopted real-time image processing tasks (e.g., object detection, tracking). In this paper, we take the first step toward comprehensively investigating the potential hazards that motion-induced blur poses for DNNs. We propose a novel adversarial attack method that can generate visually natural motion-blurred adversarial examples, named motion-based adversarial blur attack (ABBA). To this end, we first formulate the kernel-prediction-based attack, in which an input image is convolved with kernels in a pixel-wise way and misclassification is achieved by tuning the kernel weights.
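To make "convolved with kernels in a pixel-wise way" concrete, here is a minimal sketch in which every pixel has its own small kernel. It covers only the filtering step, not the attack: how the kernel weights are tuned to cause misclassification is not shown. The function names, the grayscale list-of-lists image format, the edge clamping, and the absence of a kernel flip are assumptions made for the example.

```python
def apply_pixelwise_kernels(image, kernels):
    """Filter a grayscale image using a separate kernel for each pixel.

    image:   H x W list of floats.
    kernels: kernels[y][x] is a k x k list of weights (k odd) for pixel (y, x).
    Neighbours outside the image are clamped to the nearest edge pixel.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            kernel = kernels[y][x]
            r = len(kernel) // 2
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + r][dx + r] * image[yy][xx]
            out[y][x] = acc
    return out


def identity_kernel(k=3):
    """Kernel that leaves a pixel unchanged (weight 1 at the centre)."""
    kern = [[0.0] * k for _ in range(k)]
    kern[k // 2][k // 2] = 1.0
    return kern


def horizontal_blur_kernel(k=3):
    """Kernel that averages k horizontal neighbours, a crude motion blur."""
    kern = [[0.0] * k for _ in range(k)]
    kern[k // 2] = [1.0 / k] * k
    return kern
```

Giving every pixel the identity kernel reproduces the input, while assigning blur kernels to some or all pixels smears intensity along the chosen direction; an attack of the kind described would search over these per-pixel weights.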
Brain Gear Is the Hot New Wearable
Smartwatches are cool and all, but have you considered wearable neurotech? Ten years ago, a Fitbit was about as sophisticated a wearable as you could get. Then came the sleeker, more unassuming Oura ring. Now there's a new breed of wearables--built for your head. Instead of tracking your step count, heart rate, and skin temperature, these devices are designed to read your brain waves.
- Europe > United Kingdom (0.05)
- Asia > China (0.05)
- Oceania > Australia (0.05)
- Information Technology > Hardware (0.91)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)
Apple surprises fans with three brand NEW products - the iPad Pro, MacBook Pro and Vision Pro
It's barely been a month since Apple released its latest generation of iPhones, but the tech giant has already released three new products. In an unexpected launch, Apple has unveiled new models of the iPad Pro, MacBook Pro, and Vision Pro - which are all now available to pre-order. All of the new devices feature the M5 chip, Apple's latest and most powerful in-house processor.
- South America > Brazil (0.24)
- Europe > Ukraine > Vinnytsia Oblast > Vinnytsia (0.24)
- North America > Canada > Alberta (0.14)
- Media > Music (1.00)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Communications > Mobile (1.00)
- Information Technology > Artificial Intelligence (1.00)
Apple Just Upgraded the iPad Pro, MacBook Pro, and Vision Pro with Its New M5 Chip
The hardware largely remains the same, but performance gets a boost. Without much fanfare, Apple has unveiled three new flagship products today via a press release--no special event, no pre-recorded show. That might be because the new iPad Pro, MacBook Pro, and Vision Pro don't change the mold--they're identical to their predecessors--but internally, they're debuting Apple's highly anticipated M5 chip.
- Asia > Middle East > Palestine > Gaza Strip > Gaza Governorate > Gaza (0.05)
- North America > United States > California (0.05)
- Europe > Slovakia (0.05)
- Europe > Czechia (0.05)
- Information Technology (0.70)
- Transportation > Ground > Road (0.70)
- Information Technology > Communications > Mobile (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.48)
Jony Ive Says He Wants His OpenAI Devices to 'Make Us Happy'
"I don't think we have an easy relationship with our technology at the moment," the former Apple designer said at OpenAI's developer conference in San Francisco on Monday. At OpenAI's developer conference in San Francisco on Monday, CEO Sam Altman and ex-Apple designer Jony Ive spoke in vague terms about the "family of devices" the pair are currently working to develop. "As great as phones and computers are, there's something new to do," Altman said on stage with Ive. The duo confirmed that OpenAI is working on more than one hardware product but finer details, ranging from use cases to specifications, remain under wraps. "Figuring out new computing form factors is hard," said Altman in a media briefing earlier in the day. "I think we have a chance to do something amazing, but it will take a while." Ive said that his team has generated "15 to 20 really compelling product" ideas on the journey to find the right kind of ...
- North America > United States > California > San Francisco County > San Francisco (0.47)
- Europe > Slovakia (0.05)
- Europe > Czechia (0.05)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
The Vision Pro Was An Expensive Misstep. Now Apple Has to Catch Up With Smart Glasses
Having reportedly shelved work on a cheaper Vision Pro, Apple is apparently pivoting its focus to smart glasses--and hoping it's not too late. If Apple wants to match the grip Meta has on the smart glasses market, it just might have to simplify its face computers. According to an internal announcement reported in Bloomberg by serial Apple leaker Mark Gurman, Apple has deprioritized efforts to make a lighter, more affordable version of its Vision Pro headset in favor of focusing on AI-enabled smart glasses. Apple now seems to be aiming to launch a pair of Meta-style smart glasses in 2027, with another pair featuring a display on the lens aimed for release in 2028--if not before.
- North America > United States > California > San Francisco County > San Francisco (0.05)
- Europe > Slovakia (0.05)
- Europe > Czechia (0.05)
This Startup Wants to Put Its Brain-Computer Interface in the Apple Vision Pro
California-based Cognixion is launching a clinical trial to give paralyzed patients with speech disorders the ability to communicate without an invasive brain implant. The trials will be conducted with a modified version of the Apple Vision Pro headset. Startup Cognixion announced today that it is launching a clinical trial of its wearable brain-computer interface technology integrated with the Apple Vision Pro to help paralyzed people with speech disorders communicate with their thoughts. Cognixion is one of several companies, including Elon Musk's Neuralink, that are developing a brain-computer interface, or BCI, a system that captures brain signals and translates them into commands to control external devices. While Neuralink and others are working on implants that are surgically placed in the head, Cognixion's technology is noninvasive.
- North America > United States > California > Santa Barbara County > Santa Barbara (0.05)
- North America > United States > California > San Francisco County > San Francisco (0.05)
- Europe > Slovakia (0.05)
- Europe > Czechia (0.05)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)