CubistMerge: Spatial-Preserving Token Merging For Diverse ViT Backbones

arXiv.org Artificial Intelligence

Many modern ViT backbones adopt spatial architectural designs, such as window attention, decomposed relative positional embeddings in SAM, and RoPE in DINOv3. Such architectures impose new challenges on token reduction, as the vast majority of existing methods fail to preserve the spatial structure these architectures depend on. In this paper, we introduce a simple yet effective token merging method that maintains spatial integrity, enabling seamless compatibility with spatial architectures. We reconcile two seemingly conflicting requirements: (i) exploiting the uneven information distribution across the spatial layout while (ii) preserving the spatial structure post-merging. Our approach employs (i) a 2D reduction strategy to enforce structured token layouts, (ii) a spatial-aware merging algorithm that maintains relative token positions, and (iii) a novel max-magnitude-per-dimension token representation that preserves salient features. Our method demonstrates strong performance both off-the-shelf and with fine-tuning, achieving state-of-the-art results on spatial and non-spatial architectures across various vision tasks. Specifically, we achieve 1.25x speedup on SAM-H with only 0.7% mIoU drop evaluated on COCO off-the-shelf, and 1.15x speedup on DeiT-B with no top-1 accuracy drop on ImageNet within just one epoch of fine-tuning.
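As a rough illustration of the "max-magnitude-per-dimension" idea named in the abstract, the sketch below merges a group of token vectors by keeping, in each feature dimension, the signed value with the largest absolute value. The abstract does not give the exact formulation, so this is one plausible reading rather than the authors' implementation; the function name `max_magnitude_merge` and the tie-breaking rule are assumptions made for this example.

```python
def max_magnitude_merge(tokens):
    """Merge equal-length token vectors into a single vector.

    Assumed reading of "max-magnitude-per-dimension": for every feature
    dimension, keep the signed value whose absolute value is largest
    across the group. Ties keep the value from the earliest token.
    """
    if not tokens:
        raise ValueError("need at least one token")
    dim = len(tokens[0])
    if any(len(t) != dim for t in tokens):
        raise ValueError("all tokens must have the same dimension")
    # max(..., key=abs) compares magnitudes but returns the signed value.
    return [max((t[d] for t in tokens), key=abs) for d in range(dim)]


if __name__ == "__main__":
    merged = max_magnitude_merge([[0.2, -3.0, 1.0], [-0.5, 1.0, 1.5]])
    print(merged)  # [-0.5, -3.0, 1.5]
```

Unlike averaging, this kind of merge does not shrink a strong activation when it is combined with a weak one, which is presumably what "preserves salient features" refers to.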


AiluRus: A Scalable ViT Framework for Dense Prediction

arXiv.org Artificial Intelligence

Vision transformers (ViTs) have emerged as a prevalent architecture for vision tasks owing to their impressive performance. However, when it comes to handling long token sequences, especially in dense prediction tasks that require high-resolution input, the complexity of ViTs increases significantly. Notably, dense prediction tasks, such as semantic segmentation or object detection, place more emphasis on the contours or shapes of objects, while the texture inside objects is less informative. Motivated by this observation, we propose to apply adaptive resolution to different regions of the image according to their importance. Specifically, at an intermediate layer of the ViT, we utilize a spatial-aware density-based clustering algorithm to select representative tokens from the token sequence. Once the representative tokens are determined, we merge the other tokens into their closest representative token. Consequently, semantically similar tokens are merged together to form low-resolution regions, while semantically unrelated tokens are preserved independently as high-resolution regions. This strategy effectively reduces the number of tokens, allowing subsequent layers to handle a reduced token sequence and achieve acceleration. We evaluate our proposed method on three different datasets and observe promising performance. For example, the "Segmenter ViT-L" model can be accelerated by 48% in FPS without fine-tuning, while maintaining the performance. Additionally, our method can be applied to accelerate fine-tuning as well. Experimental results demonstrate that we can save 52% of training time while increasing FPS by a factor of 2.46 with only a 0.09% performance drop. The code is available at https://github.com/caddyless/ailurus/tree/main.
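The merging step described in this abstract (assign each token to its closest representative, then merge) can be sketched as follows. This is a minimal illustration under stated assumptions, not the AiluRus code: the representative indices are taken as given (the density-based clustering that selects them is not reproduced), closeness is measured by Euclidean distance, and each group is merged by plain averaging. The function name `merge_into_representatives` is hypothetical.

```python
import math


def merge_into_representatives(tokens, rep_indices):
    """Merge tokens into their nearest representative token.

    Assumptions for this sketch: representatives are supplied as indices
    into `tokens`, distance is Euclidean, and a group is merged by
    averaging its members. Returns (merged_tokens, assignment), where
    assignment[i] is the group index that token i was merged into.
    """
    if not rep_indices:
        raise ValueError("need at least one representative")
    if len(set(rep_indices)) != len(rep_indices):
        raise ValueError("representative indices must be unique")
    reps = [tokens[i] for i in rep_indices]
    own_group = {tok_idx: g for g, tok_idx in enumerate(rep_indices)}
    groups = [[] for _ in reps]
    assignment = []
    for idx, tok in enumerate(tokens):
        if idx in own_group:
            # A representative always stays in its own group.
            g = own_group[idx]
        else:
            g = min(range(len(reps)), key=lambda r: math.dist(tok, reps[r]))
        groups[g].append(tok)
        assignment.append(g)
    # Average each group column by column.
    merged = [[sum(col) / len(group) for col in zip(*group)] for group in groups]
    return merged, assignment


if __name__ == "__main__":
    toks = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
    merged, assignment = merge_into_representatives(toks, [0, 2])
    print(merged)      # [[0.0, 1.0], [10.0, 11.0]]
    print(assignment)  # [0, 0, 1, 1]
```

Here four tokens collapse to two, so later layers would process a sequence half as long; the `assignment` list records which merged token each original token went into.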


Optimizing Large Language Models to Expedite the Development of Smart Contracts

arXiv.org Artificial Intelligence

Programming has always been at the heart of technological innovation in the 21st century. With the advent of blockchain technologies and the proliferation of web3 paradigms of decentralised applications, smart contracts have been instrumental in enabling developers to build applications that reside on decentralised blockchains. Despite the huge interest in and potential of smart contracts, there is still a significant knowledge and skill gap that developers need to cross in order to build web3 applications. In light of this, we introduce MazzumaGPT, a large language model that has been optimised to generate smart contract code and to help developers scaffold development and improve productivity. As part of this research, we outline the optimisation and fine-tuning parameters, evaluate the model's performance on functional correctness, and address the limitations and broader impacts of our research.


Artificial Intelligence: How the adoption of AI in healthcare is advancing in medical treatment, Health News, ET HealthWorld

#artificialintelligence

By Nilesh Jahagirdar. Artificial Intelligence (AI) has become prevalent in almost every business sector. However, in recent years, the technology has burst into the healthcare landscape, propelling innovations and showcasing the potential to support medical practitioners and patients. From early disease diagnosis, drug discovery and trials, and precision in patient monitoring to self-care, AI algorithms have augmented the expertise of healthcare providers. According to the stats, AI expenditure in India is estimated to reach $11.78 bn by 2025, and AI is expected to add $1 trillion to the Indian economy by 2035. The new-age technology is dominating the healthcare industry so much that it's being referred to as the new nervous system.


How You Can Expedite Your Venture With machine learning - Dataconomy

#artificialintelligence

Machine learning (ML) is a branch of artificial intelligence (AI) that uses algorithms to extract insights for solving complex, data-rich business problems. ML learns from past data, usually in raw form, to predict future outcomes. It is gaining more and more popularity in the IT space, and every organization is seeking to take advantage of ML advancements. According to Fortune Business Insights, the expected value of the global machine learning market will be $117.19 billion by 2027 at a CAGR of 39.2% during the forecast period. Easy data availability, growing data volumes, faster computational processing, and economical data storage are driving the growth of machine learning. With machine learning tools, organizations can identify profitable opportunities as well as possible risks more promptly.


How AI could enhance our human approach to creativity - Tech Wire Asia

#artificialintelligence

Unless you're a machine and you're reading this, you are in fact a human being. You perceive, make sense of, adapt, and respond to what life presents you. In doing so, you come up with creative acts and solutions, striving for stability or exciting new outcomes as you go. You do so even if you don't feel yourself to be all that creative. As humans, we make sense of, adapt, and respond to what life presents to us.


Can artificial intelligence prompt a creative revolution? - TechHQ

#artificialintelligence

Unless you're a machine and you're reading this, you are in fact a human being. You perceive, make sense of, adapt, and respond to what life presents you. In doing so, you come up with creative acts and solutions, striving for stability or exciting new outcomes as you go. You do so even if you don't feel yourself to be all that creative. To offset this side of us, we all need a certain degree of methodical focus too.


Inclusive AI: Are AI hiring tools hurting corporate diversity?

#artificialintelligence

In recent years, a growing number of organizations have utilized artificial intelligence (AI) to revolutionize their traditional workflows. These systems are implemented to enhance cost-efficiency, reduce employee burnout, and even identify premium talent. Many organizations are using AI tools to expedite arduous hiring processes. These algorithms have been viewed as objective tools capable of eliminating human subjectivity from the employment screening process. Paradoxically, many of these models are riddled with the same inherent biases they are intended to remove.