Problem-Specific Architectures


SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series

arXiv.org Artificial Intelligence

Transformers have widely adopted attention networks for sequence mixing and MLPs for channel mixing, playing a pivotal role in achieving breakthroughs across domains. However, recent literature highlights issues with attention networks, including low inductive bias and quadratic complexity with respect to input sequence length. State Space Models (SSMs) such as S4 and others (HiPPO, Global Convolutions, Liquid S4, LRU, Mega, and Mamba) have emerged to address these issues and handle longer sequences. Mamba, while being the state-of-the-art SSM, has a stability issue when scaled to large networks for computer vision datasets. We propose SiMBA, a new architecture that introduces Einstein FFT (EinFFT) for channel modeling by specific eigenvalue computations and uses the Mamba block for sequence modeling. Extensive performance studies across image and time-series benchmarks demonstrate that SiMBA outperforms existing SSMs, bridging the performance gap with state-of-the-art transformers. Notably, SiMBA establishes itself as the new state-of-the-art SSM on ImageNet, on transfer learning benchmarks such as Stanford Car and Flower, on task learning benchmarks, and on seven time series benchmark datasets.


[2301.12780] Equivariant Architectures for Learning in Deep Weight Spaces

#artificialintelligence

Designing machine learning architectures for processing neural networks in their raw weight matrix form is a newly introduced research direction. Unfortunately, the unique symmetry structure of deep weight spaces makes this design very challenging. If successful, such architectures would be capable of performing a wide range of intriguing tasks, from adapting a pre-trained network to a new domain to editing objects represented as functions (INRs or NeRFs). As a first step towards this goal, we present here a novel network architecture for learning in deep weight spaces. It takes as input a concatenation of weights and biases of a pre-trained MLP and processes it using a composition of layers that are equivariant to the natural permutation symmetry of the MLP's weights: Changing the order of neurons in intermediate layers of the MLP does not affect the function it represents. We provide a full characterization of all affine equivariant and invariant layers for these symmetries and show how these layers can be implemented using three basic operations: pooling, broadcasting, and fully connected layers applied to the input in an appropriate manner. We demonstrate the effectiveness of our architecture and its advantages over natural baselines in a variety of learning tasks.
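The permutation symmetry described in this abstract can be checked numerically. The following is a minimal sketch, not the paper's architecture: the helper names `mlp_forward` and `permute_hidden`, the layer sizes, and the ReLU activation are all assumptions chosen for illustration. It reorders the neurons of one hidden layer (rows of that layer's weights and biases, columns of the next layer's weights) and confirms that the MLP's output is unchanged.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass of a plain ReLU MLP; weights[i] has shape (out_i, in_i)."""
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = W @ h + b
        if i < len(weights) - 1:  # no activation after the last layer
            h = np.maximum(h, 0.0)
    return h

def permute_hidden(weights, biases, layer, perm):
    """Reorder the neurons produced by `layer` (0-based) according to `perm`."""
    weights = [W.copy() for W in weights]
    biases = [b.copy() for b in biases]
    weights[layer] = weights[layer][perm, :]          # rows produce the hidden units
    biases[layer] = biases[layer][perm]
    weights[layer + 1] = weights[layer + 1][:, perm]  # columns consume them
    return weights, biases

# Hypothetical sizes: 3 inputs, hidden layers of 5 and 4 units, 2 outputs.
rng = np.random.default_rng(0)
sizes = [3, 5, 4, 2]
weights = [rng.normal(size=(o, i)) for i, o in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=o) for o in sizes[1:]]
x = rng.normal(size=3)

perm = np.array([4, 2, 0, 3, 1])  # fixed, non-identity reordering of the 5 units
perm_weights, perm_biases = permute_hidden(weights, biases, 0, perm)

y_original = mlp_forward(x, weights, biases)
y_permuted = mlp_forward(x, perm_weights, perm_biases)
print(np.allclose(y_original, y_permuted))
```

The two weight sets differ as raw arrays yet represent the same function, which is why a network that processes weights directly should treat them as equivalent.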


Nigeria's fragile security architecture is collapsing

Al Jazeera

Earlier this month, attacks that took place within minutes of each other in different parts of Nigeria, and the apparent failure of the security forces to respond to them efficiently and in a timely manner, exposed how big a threat lawlessness and impunity currently pose to the country and its people. Late on July 5, heavily armed men on motorcycles raided the Kuje Medium Security Custodial Centre on the outskirts of Abuja and released more than 900 inmates, including more than 60 Boko Haram members in detention. The Islamic State West Africa Province (ISWAP), an offshoot of Boko Haram now allied with the ISIL (ISIS) group, claimed responsibility for the attack. Just hours before the Kuje incident, another group of heavily armed men had attacked a convoy carrying an advance security team for President Muhammadu Buhari in his home state of Katsina. A presidential spokesperson said the convoy carrying a team of security guards, as well as protocol and media officers, was on its way to Daura, Buhari's hometown, to prepare for a visit by him when the attack took place.


S$^2$-MLP: Spatial-Shift MLP Architecture for Vision

arXiv.org Artificial Intelligence

Recently, the visual Transformer (ViT) and its follow-up works abandoned convolution and exploited the self-attention operation, attaining accuracy comparable to or even higher than that of CNNs. More recently, MLP-Mixer abandoned both convolution and self-attention, proposing an architecture containing only MLP layers. To achieve cross-patch communication, it devises an additional token-mixing MLP besides the channel-mixing MLP. It achieves promising results when trained on an extremely large-scale dataset, but it cannot match the performance of its CNN and ViT counterparts when trained on medium-scale datasets such as ImageNet-1K and ImageNet-21K. The performance drop of MLP-Mixer motivates us to rethink the token-mixing MLP. We discover that the token-mixing operation in MLP-Mixer is a variant of depthwise convolution with a global receptive field and a spatial-specific configuration, and that these two properties make the token-mixing MLP prone to over-fitting. In this paper, we propose a novel pure MLP architecture, spatial-shift MLP (S$^2$-MLP). Unlike MLP-Mixer, our S$^2$-MLP contains only channel-mixing MLPs. We devise a spatial-shift operation for communication between patches. It has a local receptive field and is spatial-agnostic. It is also parameter-free and computationally efficient. The proposed S$^2$-MLP attains higher recognition accuracy than MLP-Mixer when trained on the ImageNet-1K dataset. Meanwhile, S$^2$-MLP matches the performance of ViT on ImageNet-1K with a considerably simpler architecture and fewer FLOPs and parameters.
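A parameter-free spatial shift of the kind this abstract describes can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the abstract does not give the details, so the split into four equal channel groups, the one-patch shift distance, the choice to leave border patches unchanged, and the function name `spatial_shift` are all assumptions. Each channel group is moved one step in a different direction, so the channel-mixing MLP that follows sees features from neighbouring patches.

```python
import numpy as np

def spatial_shift(x):
    """Shift four channel groups of an (H, W, C) patch grid by one patch each.

    Assumed layout: group 0 moves down, group 1 up, group 2 right, group 3 left.
    Border patches keep their original values; channels beyond 4 * (C // 4)
    are left untouched. The operation has no learnable parameters.
    """
    h, w, c = x.shape
    g = c // 4
    out = x.copy()
    out[1:, :, 0 * g:1 * g] = x[:-1, :, 0 * g:1 * g]   # take value from the patch above
    out[:-1, :, 1 * g:2 * g] = x[1:, :, 1 * g:2 * g]   # take value from the patch below
    out[:, 1:, 2 * g:3 * g] = x[:, :-1, 2 * g:3 * g]   # take value from the patch to the left
    out[:, :-1, 3 * g:4 * g] = x[:, 1:, 3 * g:4 * g]   # take value from the patch to the right
    return out

# Hypothetical 3x3 grid of patches with 4 channels (one channel per group).
grid = np.arange(3 * 3 * 4, dtype=float).reshape(3, 3, 4)
shifted = spatial_shift(grid)
print(shifted[1, 1], grid[1, 1])
```

After the shift, the centre patch holds one channel from each of its four neighbours, which illustrates the local receptive field: information travels only between adjacent patches per layer, and the same rule applies at every position.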


Evolving your security architecture for increased agility and resiliency

#artificialintelligence

When designing your cybersecurity defenses for the new normal, it's important to look beyond the technology. You'll need a true architecture-led approach, one that's driven by your business needs. The global pandemic has jolted many organizations into a new reality where virtual and remote become more important than physical and local. Security architectures deployed over decades have suddenly become irrelevant. At the same time, organizations are looking inward and challenging themselves by asking, "How do I justify new investment in tight economic times? In fact, how do I justify an entirely new security architecture for this remote work reality?"


Council Post: The Importance Of Security Architecture And Attack Surface Analysis

#artificialintelligence

Automation, cloud-based systems, internet-enabled devices, API-centric environments -- all of these things within software application development have paved the way for greater enterprise efficiency, productivity and innovation. But they have also opened up new avenues for cybercriminals to target private, sensitive information and compromise the systems that process it. Security pros and hackers tend to stay neck and neck in a race against each other. As new security innovations emerge, hackers crop up almost immediately, finding new ways to get around them. The only way for the good guys to pull ahead in the race is to shift their security and risk management approach from reactive to proactive.


#InfosecurityOnline: Utilizing Automation in New Security Architecture

#artificialintelligence

The shift to cloud networks and a wider attack surface brought about by new working practices during the COVID-19 pandemic have made traditional security strategies unfit for purpose, according to Steven Tee, principal solutions architect at Infoblox, speaking during a session at the Infosecurity Online event. He made the case that there needs to be much greater use of automated tools such as machine learning to effectively detect and combat cyber-attacks in the current age. Tee began by outlining the alarming increase and impact of cybercrime over recent years. "Cybercrime is a problem that either directly or indirectly affects everyone," he said. He noted that the average cost of a data breach in 2019 was almost $4m.


Security Architecture for Smart Factories

#artificialintelligence

Building smart factories is a substantial endeavor for organizations. The initial steps involve understanding what makes them unique and what new advantages they offer. However, a realistic view of smart factories also involves acknowledging the risks and threats that may arise in their converged virtual and physical environment. As with many systems that integrate with the industrial internet of things (IIoT), the convergence of information technology (IT) and operational technology (OT) in smart factories allows for capabilities such as real-time monitoring, interoperability, and virtualization. But this also means an expanded attack surface.


Macron Says Europe's Security Architecture Must Be Rethought

U.S. News

French President Emmanuel Macron says "we must rethink the European security architecture" as he pushed for a continent-wide effort to create "a strategic partnership, including in terms of defense, with our closest neighbors."