 hera


HERA: Hybrid Edge-cloud Resource Allocation for Cost-Efficient AI Agents

Liu, Shiyi, Shen, Haiying, Che, Shuai, Ghandi, Mahdi, Li, Mingqin

arXiv.org Artificial Intelligence

In the realm of AI, large language models (LLMs) like GPT-4, central to the operation of AI agents, predominantly operate in the cloud, incurring high operational costs. With local small language models (SLMs) becoming more accurate, the necessity of cloud-exclusive processing is being reconsidered. An AI agent's response to a user's request comprises a series of subtasks or iterations. Existing approaches only allocate a single request between an SLM and an LLM to ensure their outputs are similar, but applying this approach to each subtask in the AI agent scenario is not effective, since the SLM will output a different subsequent subtask, which affects the accuracy of the final output. In this paper, we first conduct experimental analysis to understand the features of AI agent operations. Leveraging our findings, we propose the Adaptive Iteration-level Model Selector (AIMS), a lightweight scheduler that automatically partitions an AI agent's subtasks between a local SLM and a cloud-based LLM. AIMS considers the varying subtask features and strategically decides the location for each subtask in order to use the SLM as much as possible while attaining the required accuracy level. Our experimental results demonstrate that AIMS increases accuracy by up to 9.1% and SLM usage by up to 10.8% compared to HybridLLM. It offloads 45.67% of subtasks to a local SLM while attaining similar accuracy on average compared with the cloud-only LLM approach.
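The abstract does not say how AIMS decides where each subtask runs. As a rough illustration of what iteration-level placement between a local SLM and a cloud LLM could look like, the sketch below routes each subtask by a threshold on a difficulty score. The `Subtask` type, the `difficulty` field, the threshold rule, and the stand-in model callables are all assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Subtask:
    prompt: str
    difficulty: float  # hypothetical score in [0, 1]; higher means harder


def route(subtask: Subtask, threshold: float = 0.5) -> str:
    """Pick 'slm' for subtasks below the threshold, 'llm' otherwise (assumed rule)."""
    return "slm" if subtask.difficulty < threshold else "llm"


def run_agent(
    subtasks: List[Subtask],
    slm: Callable[[str], str],
    llm: Callable[[str], str],
    threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Run every subtask on the model chosen by route(); return outputs and placements."""
    outputs, placements = [], []
    for st in subtasks:
        target = route(st, threshold)
        model = slm if target == "slm" else llm
        outputs.append(model(st.prompt))
        placements.append(target)
    return outputs, placements


def slm_usage(placements: List[str]) -> float:
    """Fraction of subtasks that ran on the local SLM."""
    return placements.count("slm") / len(placements) if placements else 0.0


if __name__ == "__main__":
    tasks = [Subtask("plan", 0.2), Subtask("reason", 0.8), Subtask("format", 0.4)]
    # Stand-in models that only tag the prompt with the model that handled it.
    outs, where = run_agent(tasks, lambda p: "slm:" + p, lambda p: "llm:" + p)
    print(where, slm_usage(where))
```

Raising the threshold sends more subtasks to the SLM at the risk of lower accuracy; a real scheduler would need to learn or estimate the score rather than receive it as input.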


HERA: Improving Long Document Summarization using Large Language Models with Context Packaging and Reordering

Li, Taiji, Chen, Hao, Yu, Fei, Zhang, Yin

arXiv.org Artificial Intelligence

Despite the rapid growth of the context length of large language models (LLMs), LLMs still perform poorly in long document summarization. An important reason for this is that relevant information about an event is scattered throughout long documents, and the messy narrative order impairs LLMs' ability to understand and use long documents accurately. To address these issues, we propose a novel summary generation framework, called HERA. Specifically, we first segment a long document by its semantic structure and retrieve text segments about the same event, and finally reorder them to form the input context. We evaluate our approach on two long document summarization datasets. The experimental results show that HERA outperforms foundation models in ROUGE, BERTScore and faithfulness metrics, while HERA does not require additional fine-tuning and resources.


Flow Exporter Impact on Intelligent Intrusion Detection Systems

Pinto, Daniela, Vitorino, João, Maia, Eva, Amorim, Ivone, Praça, Isabel

arXiv.org Artificial Intelligence

High-quality datasets are critical for training machine learning models, as inconsistencies in feature generation can hinder the accuracy and reliability of threat detection. For this reason, ensuring the quality of the data in network intrusion detection datasets is important. A key component of this is using reliable tools to generate the flows and features present in the datasets. This paper investigates the impact of flow exporters on the performance and reliability of machine learning models for intrusion detection. Using HERA, a tool designed to export flows and extract features, the raw network packets of two widely used datasets, UNSW-NB15 and CIC-IDS2017, were processed from PCAP files to generate new versions of these datasets. These were compared to the original ones in terms of their influence on the performance of several models, including Random Forest, XGBoost, LightGBM, and Explainable Boosting Machine. The results obtained were significant. Models trained on the HERA version of the datasets consistently outperformed those trained on the original dataset, showing improvements in accuracy and indicating a better generalisation. This highlighted the importance of flow generation in the model's ability to differentiate between benign and malicious traffic.


Reviews: Large Scale computation of Means and Clusters for Persistence Diagrams using Optimal Transport

Neural Information Processing Systems

This paper proposes a new method for the clustering of persistence diagrams using recent techniques in optimal transport. The problem is quite important; clustering provides a sensible way to group data according to their topological characterizations. It is also very challenging due to the Wasserstein distance between the persistence diagrams. This paper proposes to (1) approximate the Wasserstein distance between diagrams using the regularized optimal transport, and (2) treat the computation of the Fréchet means as another optimal transport problem, and find the optimal solution using gradient descent. Several major technical challenges are addressed, including: 1) the Wasserstein distance may involve matching points with the diagonal line. The proposed method is compared with the state-of-the-art (Hera) and is shown to be more efficient.
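For readers unfamiliar with regularized optimal transport, the sketch below runs standard Sinkhorn iterations for the entropy-regularized transport cost between two small histograms. It is a generic illustration, not the reviewed paper's implementation: it works on plain histograms with a given cost matrix and does not handle matching to the diagonal, which the review singles out as a challenge for persistence diagrams. The function name and the parameters `eps` and `n_iters` are choices made for this example.

```python
import math
from typing import List, Tuple


def sinkhorn(
    a: List[float],
    b: List[float],
    cost: List[List[float]],
    eps: float = 0.1,
    n_iters: int = 200,
) -> Tuple[List[List[float]], float]:
    """Entropy-regularized OT between histograms a and b.

    Returns the transport plan and its transport cost sum(plan * cost).
    Smaller eps gives a closer approximation but risks numerical underflow.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / eps).
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return plan, total


if __name__ == "__main__":
    plan, total = sinkhorn([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
    print(plan, total)
```

Each iteration costs O(nm) and consists only of matrix-vector style operations, which is the usual reason this scheme scales better than exact matching.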


HERA: High-efficiency Matrix Compression via Element Replacement

Wang, Yanshu, Li, Wang, Yang, Tong

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have significantly advanced natural language processing tasks such as machine translation, text generation, and sentiment analysis. However, their large size, often consisting of billions of parameters, poses challenges for storage, computation, and deployment, particularly in resource-constrained environments like mobile devices and edge computing platforms. Additionally, the key-value (k-v) cache used to speed up query processing requires substantial memory and storage, exacerbating these challenges. Vector databases have emerged as a crucial technology to efficiently manage and retrieve the high-dimensional vectors produced by LLMs, facilitating faster data access and reducing computational demands. Effective compression and quantization techniques are essential to address these challenges, as they reduce the memory footprint and computational requirements without significantly compromising performance. Traditional methods that uniformly map parameters to compressed spaces often fail to account for the uneven distribution of parameters, leading to considerable accuracy loss. Therefore, innovative approaches are needed to achieve better compression ratios while preserving model performance. In this work, we propose HERA, a novel algorithm that employs heuristic Element Replacement for compressing matrices. HERA systematically replaces elements within the model using heuristic methods, which simplifies the structure of the model and makes subsequent compression more effective. By hierarchically segmenting, compressing, and reorganizing the matrix dataset, our method can effectively reduce the quantization error to 12.3% of the original at the same compression ratio.
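The point about uniform mapping and uneven parameter distributions can be seen in a toy example. The sketch below is not HERA; it implements only plain min-max uniform quantization, the kind of baseline the abstract criticizes, and measures the mean squared error on made-up values. The function names and the sample data are assumptions for illustration.

```python
from typing import List


def uniform_quantize(values: List[float], bits: int) -> List[float]:
    """Map values onto 2**bits evenly spaced levels between min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # a constant input needs no quantization
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    # Snap each value to the nearest level and map it back to a float.
    return [lo + round((v - lo) / step) * step for v in values]


def mse(xs: List[float], ys: List[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    clustered = [0.0, 0.01, 0.02, 0.03]
    with_outlier = clustered + [10.0]  # one large value stretches the range
    print(mse(clustered, uniform_quantize(clustered, 2)))
    print(mse(with_outlier, uniform_quantize(with_outlier, 2)))
```

With the outlier present, the four small values all collapse onto the lowest level, so the error rises even though the bit width is unchanged. This is the failure mode that distribution-aware schemes aim to avoid.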


Hera: A Heterogeneity-Aware Multi-Tenant Inference Server for Personalized Recommendations

Choi, Yujeong, Kim, John, Rhu, Minsoo

arXiv.org Artificial Intelligence

While providing low latency is a fundamental requirement in deploying recommendation services, achieving high resource utility is also crucial in cost-effectively maintaining the datacenter. Co-locating multiple workers of a model is an effective way to maximize query-level parallelism and server throughput, but the interference caused by concurrent workers at shared resources can prevent server queries from meeting their SLA. Hera utilizes the heterogeneous memory requirement of multi-tenant recommendation models to intelligently determine a productive set of co-located models and their resource allocation, providing fast response time while achieving high throughput. We show that Hera achieves an average 37.3% improvement in effective machine utilization, enabling 26% reduction in required servers, significantly improving upon the baseline recommendation inference server.


ESA's first self-driving spacecraft heads to space for maiden tests – Fanatical Futurist by International Keynote Speaker Matthew Griffin

#artificialintelligence

You've heard of self-driving cars and trucks, flying cars, and probably even self-driving cargo ships, but soon we'll be able to add a new type of vehicle to the self-driving category – spacecraft. And if you're going to spend millions of dollars on a spacecraft, you might as well try to cram in as much as possible. With that in mind, the European Space Agency (ESA) has now detailed the "side mission" it's planning for the asteroid-visiting spacecraft Hera. After the project's main work is accomplished, which is to bump the asteroid off course, the new spacecraft will then test out some new autonomous navigation systems, which should help future spacecraft get around without relying on ground control all the way back on Earth – something that the agency sees as a necessity as we continue to explore the further reaches of space – and one day visit them.


HERA: Partial Label Learning by Combining Heterogeneous Loss with Sparse and Low-Rank Regularization

Lyu, Gengyu, Feng, Songhe, Jin, Yi, Dai, Guojun, Lang, Congyan, Li, Yidong

arXiv.org Machine Learning

Partial Label Learning (PLL) aims to learn from data where each training instance is associated with a set of candidate labels, among which only one is correct. Most existing methods deal with this problem by either treating each candidate label equally or identifying the ground-truth label iteratively. In this paper, we propose a novel PLL approach called HERA, which simultaneously incorporates the HeterogEneous Loss and the SpaRse and Low-rAnk procedure to estimate the labeling confidence for each instance while training the model. Specifically, the heterogeneous loss integrates the strengths of both the pairwise ranking loss and the pointwise reconstruction loss to provide informative label ranking and reconstruction information for label identification, while the embedded sparse and low-rank scheme constrains the sparsity of the ground-truth label matrix and the low rank of the noise label matrix to explore the global label relevance among the whole training data for improving the learning model. Extensive experiments on both artificial and real-world data sets demonstrate that our method can achieve superior or comparable performance against the state-of-the-art methods.


In Yemen Conflict, Some See A New Age Of Drone Warfare

NPR Technology

Iranian soldiers carry part of a target drone used in air-defense exercises. Iran is also turning some target drones into low-tech weapons for its proxies. In January, a group of high-level military commanders gathered at an air base in Yemen.


Self-driving spacecraft may save Earth from doomsday

FOX News

Hera uses infrared to scan impact crater. Judging by the valuations of companies such as Waymo, Lyft and Uber, humanity is placing a big bet on self-driving cars as the future of transportation. But the future of humanity itself may rest on the hopes of self-driving spacecraft. The European Space Agency is currently developing a self-driving craft for its Hera planetary defense mission to the Didymos asteroid, which could happen as soon as 2023. "If you think self-driving cars are the future on Earth, then Hera is the pioneer of autonomy in deep space," Paolo Martino, lead systems engineer of ESA's proposed Hera mission, said in a statement.