Goto

Collaborating Authors

 Oceania


HedgeAgents: A Balanced-aware Multi-agent Financial Trading System

arXiv.org Artificial Intelligence

As automated trading gains traction in the financial market, algorithmic investment strategies are increasingly prominent. While Large Language Models (LLMs) and Agent-based models exhibit promising potential in real-time market analysis and trading decisions, they still experience a significant -20% loss when confronted with rapid declines or frequent fluctuations, impeding their practical application. Hence, there is an imperative to explore a more robust and resilient framework. This paper introduces an innovative multi-agent system, HedgeAgents, aimed at bolstering system robustness via ``hedging'' strategies. In this well-balanced system, an array of hedging agents has been tailored, where HedgeAgents consist of a central fund manager and multiple hedging experts specializing in various financial asset classes. These agents leverage LLMs' cognitive capabilities to make decisions and coordinate through three types of conferences. Benefiting from the powerful understanding of LLMs, our HedgeAgents attained a 70% annualized return and a 400% total return over a period of 3 years. Moreover, we have observed with delight that HedgeAgents can even formulate investment experience comparable to those of human experts (https://hedgeagents.github.io/).


VLMs as GeoGuessr Masters: Exceptional Performance, Hidden Biases, and Privacy Risks

arXiv.org Artificial Intelligence

Visual-Language Models (VLMs) have shown remarkable performance across various tasks, particularly in recognizing geographic information from images. However, significant challenges remain, including biases and privacy concerns. To systematically address these issues in the context of geographic information recognition, we introduce a benchmark dataset consisting of 1,200 images paired with detailed geographic metadata. Evaluating four VLMs, we find that while these models demonstrate the ability to recognize geographic information from images, achieving up to $53.8\%$ accuracy in city prediction, they exhibit significant regional biases. Specifically, performance is substantially higher for economically developed and densely populated regions compared to less developed ($-12.5\%$) and sparsely populated ($-17.0\%$) areas. Moreover, the models exhibit regional biases, frequently overpredicting certain locations; for instance, they consistently predict Sydney for images taken in Australia. The strong performance of VLMs also raises privacy concerns, particularly for users who share images online without the intent of being identified. Our code and dataset are publicly available at https://github.com/uscnlp-lime/FairLocator.


Valentine's Day dangers: Dating app killers lure love seekers in unsuspecting ways

FOX News

Kurt "The Cyberguy" Knutsson explains how facial recognition technology can help you find your perfect match. From a poisonous date to finding love with a serial killer, these six chilling cases show how unsuspecting dating app users on the quest for romance led them into the clutches of danger. Dating apps โ€“ from Tinder to Grindr โ€“ are the modern way for people to connect with potential partners from the comfort of their own space. Brace yourself for stories that blur the line between love and terror. Here is Fox News Digital's list of some recent cases where love went wrong.


Balancing the Scales: A Theoretical and Algorithmic Framework for Learning from Imbalanced Data

arXiv.org Machine Learning

Class imbalance remains a major challenge in machine learning, especially in multi-class problems with long-tailed distributions. Existing methods, such as data resampling, cost-sensitive techniques, and logistic loss modifications, though popular and often effective, lack solid theoretical foundations. As an example, we demonstrate that cost-sensitive methods are not Bayes consistent. This paper introduces a novel theoretical framework for analyzing generalization in imbalanced classification. We propose a new class-imbalanced margin loss function for both binary and multi-class settings, prove its strong $H$-consistency, and derive corresponding learning guarantees based on empirical loss and a new notion of class-sensitive Rademacher complexity. Leveraging these theoretical results, we devise novel and general learning algorithms, IMMAX (Imbalanced Margin Maximization), which incorporate confidence margins and are applicable to various hypothesis sets. While our focus is theoretical, we also present extensive empirical results demonstrating the effectiveness of our algorithms compared to existing baselines.


Tempo: Helping Data Scientists and Domain Experts Collaboratively Specify Predictive Modeling Tasks

arXiv.org Artificial Intelligence

Temporal predictive models have the potential to improve decisions in health care, public services, and other domains, yet they often fail to effectively support decision-makers. Prior literature shows that many misalignments between model behavior and decision-makers' expectations stem from issues of model specification, namely how, when, and for whom predictions are made. However, model specifications for predictive tasks are highly technical and difficult for non-data-scientist stakeholders to interpret and critique. To address this challenge we developed Tempo, an interactive system that helps data scientists and domain experts collaboratively iterate on model specifications. Using Tempo's simple yet precise temporal query language, data scientists can quickly prototype specifications with greater transparency about pre-processing choices. Moreover, domain experts can assess performance within data subgroups to validate that models behave as expected. Through three case studies, we demonstrate how Tempo helps multidisciplinary teams quickly prune infeasible specifications and identify more promising directions to explore.


Learning Euler Factors of Elliptic Curves

arXiv.org Artificial Intelligence

We apply transformer models and feedforward neural networks to predict Frobenius traces $a_p$ from elliptic curves given other traces $a_q$. We train further models to predict $a_p \bmod 2$ from $a_q \bmod 2$, and cross-analysis such as $a_p \bmod 2$ from $a_q$. Our experiments reveal that these models achieve high accuracy, even in the absence of explicit number-theoretic tools like functional equations of $L$-functions. We also present partial interpretability findings.


User Profile with Large Language Models: Construction, Updating, and Benchmarking

arXiv.org Artificial Intelligence

User profile modeling plays a key role in personalized systems, as it requires building accurate profiles and updating them with new information. In this paper, we present two high-quality open-source user profile datasets: one for profile construction and another for profile updating. These datasets offer a strong basis for evaluating user profile modeling techniques in dynamic settings. We also show a methodology that uses large language models (LLMs) to tackle both profile construction and updating. Our method uses a probabilistic framework to predict user profiles from input text, allowing for precise and context-aware profile generation. Our experiments demonstrate that models like Mistral-7b and Llama2-7b perform strongly in both tasks. LLMs improve the precision and recall of the generated profiles, and high evaluation scores confirm the effectiveness of our approach.


Automatic Evaluation Metrics for Artificially Generated Scientific Research

arXiv.org Artificial Intelligence

Foundation models are increasingly used in scientific research, but evaluating AI-generated scientific work remains challenging. While expert reviews are costly, large language models (LLMs) as proxy reviewers have proven to be unreliable. To address this, we investigate two automatic evaluation metrics, specifically citation count prediction and review score prediction. We parse all papers of OpenReview and augment each submission with its citation count, reference, and research hypothesis. Our findings reveal that citation count prediction is more viable than review score prediction, and predicting scores is more difficult purely from the research hypothesis than from the full paper. Furthermore, we show that a simple prediction model based solely on title and abstract outperforms LLM-based reviewers, though it still falls short of human-level consistency.


Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow

arXiv.org Artificial Intelligence

This paper introduces Generalized Attention Flow (GAF), a novel feature attribution method for Transformer-based models to address the limitations of current approaches. By extending Attention Flow and replacing attention weights with the generalized Information Tensor, GAF integrates attention weights, their gradients, the maximum flow problem, and the barrier method to enhance the performance of feature attributions. The proposed method exhibits key theoretical properties and mitigates the shortcomings of prior techniques that rely solely on simple aggregation of attention weights. Our comprehensive benchmarking on sequence classification tasks demonstrates that a specific variant of GAF consistently outperforms state-of-the-art feature attribution methods in most evaluation settings, providing a more reliable interpretation of Transformer model outputs.


SmartEdge: Smart Healthcare End-to-End Integrated Edge and Cloud Computing System for Diabetes Prediction Enabled by Ensemble Machine Learning

arXiv.org Artificial Intelligence

The Internet of Things (IoT) revolutionizes smart city domains such as healthcare, transportation, industry, and education. The Internet of Medical Things (IoMT) is gaining prominence, particularly in smart hospitals and Remote Patient Monitoring (RPM). The vast volume of data generated by IoMT devices should be analyzed in real-time for health surveillance, prognosis, and prediction of diseases. Current approaches relying on Cloud computing to provide the necessary computing and storage capabilities do not scale for these latency-sensitive applications. Edge computing emerges as a solution by bringing cloud services closer to IoMT devices. This paper introduces SmartEdge, an AI-powered smart healthcare end-to-end integrated edge and cloud computing system for diabetes prediction. This work addresses latency concerns and demonstrates the efficacy of edge resources in healthcare applications within an end-to-end system. The system leverages various risk factors for diabetes prediction. We propose an Edge and Cloud-enabled framework to deploy the proposed diabetes prediction models on various configurations using edge nodes and main cloud servers. Performance metrics are evaluated using, latency, accuracy, and response time. By using ensemble machine learning voting algorithms we can improve the prediction accuracy by 5% versus a single model prediction.