Goto

Collaborating Authors

 nft




Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

Chen, Huayu, Zheng, Kaiwen, Zhang, Qinsheng, Cui, Ganqu, Cui, Yin, Ye, Haotian, Lin, Tsung-Yi, Liu, Ming-Yu, Zhu, Jun, Wang, Haoxiang

arXiv.org Artificial Intelligence

Reinforcement Learning (RL) has played a central role in the recent surge of LLMs' math abilities by enabling self-improvement through binary verifier signals. In contrast, Supervised Learning (SL) is rarely considered for such verification-driven training, largely due to its heavy reliance on reference answers and inability to reflect on mistakes. In this work, we challenge the prevailing notion that self-improvement is exclusive to RL and propose Negative-aware Fine-Tuning (NFT) -- a supervised approach that enables LLMs to reflect on their failures and improve autonomously with no external teachers. In online training, instead of throwing away self-generated negative answers, NFT constructs an implicit negative policy to model them. This implicit policy is parameterized with the same positive LLM we target to optimize on positive data, enabling direct policy optimization on all LLMs' generations. We conduct experiments on 7B and 32B models in math reasoning tasks. Results consistently show that through the additional leverage of negative feedback, NFT significantly improves over SL baselines like Rejection sampling Fine-Tuning, matching or even surpassing leading RL algorithms like GRPO and DAPO. Furthermore, we demonstrate that NFT and GRPO are actually equivalent in strict-on-policy training, even though they originate from entirely different theoretical foundations. Our experiments and theoretical findings bridge the gap between SL and RL methods in binary-feedback learning systems.


Clustering and analysis of user behaviour in blockchain: A case study of Planet IX

Zelenyanszki, Dorottya, Hou, Zhe, Biswas, Kamanashis, Muthukkumarasamy, Vallipuram

arXiv.org Artificial Intelligence

Decentralised applications (dApps) that run on public blockchains have the benefit of trustworthiness and transparency as every activity that happens on the blockchain can be publicly traced through the transaction data. However, this introduces a potential privacy problem as this data can be tracked and analysed, which can reveal user-behaviour information. A user behaviour analysis pipeline was proposed to present how this type of information can be extracted and analysed to identify separate behavioural clusters that can describe how users behave in the game. The pipeline starts with the collection of transaction data, involving smart contracts, that is collected from a blockchain-based game called Planet IX. Both the raw transaction information and the transaction events are considered in the data collection. From this data, separate game actions can be formed and those are leveraged to present how and when the users conducted their in-game activities in the form of user flows. An extended version of these user flows also presents how the Non-Fungible Tokens (NFTs) are being leveraged in the user actions. The latter is given as input for a Graph Neural Network (GNN) model to provide graph embeddings for these flows which then can be leveraged by clustering algorithms to cluster user behaviours into separate behavioural clusters. We benchmark and compare well-known clustering algorithms as a part of the proposed method. The user behaviour clusters were analysed and visualised in a graph format. It was found that behavioural information can be extracted regarding the users that belong to these clusters. Such information can be exploited by malicious users to their advantage. To demonstrate this, a privacy threat model was also presented based on the results that correspond to multiple potentially affected areas.


Content ARCs: Decentralized Content Rights in the Age of Generative AI

Balan, Kar, Gilbert, Andrew, Collomosse, John

arXiv.org Artificial Intelligence

The rise of Generative AI (GenAI) has sparked significant debate over balancing the interests of creative rightsholders and AI developers. As GenAI models are trained on vast datasets that often include copyrighted material, questions around fair compensation and proper attribution have become increasingly urgent. To address these challenges, this paper proposes a framework called \emph{Content ARCs} (Authenticity, Rights, Compensation). By combining open standards for provenance and dynamic licensing with data attribution, and decentralized technologies, Content ARCs create a mechanism for managing rights and compensating creators for using their work in AI training. We characterize several nascent works in the AI data licensing space within Content ARCs and identify where challenges remain to fully implement the end-to-end framework.


IITP-VDLand: A Comprehensive Dataset on Decentraland Parcels

Bhagat, Ankit K., Jha, Dipika, Halder, Raju, Paramanik, Rajendra N., Kumar, Chandra M.

arXiv.org Artificial Intelligence

This paper presents IITP-VDLand, a comprehensive dataset of Decentraland parcels sourced from diverse platforms. Unlike existing datasets which have limited attributes and records, IITP-VDLand offers a rich array of attributes, encompassing parcel characteristics, trading history, past activities, transactions, and social media interactions. Alongside, we introduce a key attribute in the dataset, namely Rarity score, which measures the uniqueness of each parcel within the virtual world. Addressing the significant challenge posed by the dispersed nature of this data across various sources, we employ a systematic approach, utilizing both available APIs and custom scripts, to gather it. Subsequently, we meticulously curate and organize the information into four distinct segments: (1) Characteristics Data-Fragment, (2) OpenSea Trading History Data-Fragment, (3) Ethereum Activity Transactions Data-Fragment, and (4) Social Media Data-Fragment. We envisage that this dataset would serve as a robust resource for training machine- and deep-learning models specifically designed to address real-world challenges within the domain of Decentraland parcels. The performance benchmarking of more than 20 state-of-the-art price prediction models on our dataset yields promising results, achieving a maximum R2 score of 0.8251 and an accuracy of 74.23% in case of Extra Trees Regressor and Classifier. The key findings reveal that the ensemble models performs better than both deep learning and linear models for our dataset. We observe a significant impact of coordinates, geographical proximity, rarity score, and few other economic indicators on the prediction of parcel prices.


A Recommender System for NFT Collectibles with Item Feature

Choi, Minjoo, Kim, Seonmi, Kim, Yejin, Lee, Youngbin, Hong, Joohwan, Lee, Yongjae

arXiv.org Artificial Intelligence

Recommender systems have been actively studied and applied in various domains to deal with information overload. Although there are numerous studies on recommender systems for movies, music, and e-commerce, comparatively less attention has been paid to the recommender system for NFTs despite the continuous growth of the NFT market. This paper presents a recommender system for NFTs that utilizes a variety of data sources, from NFT transaction records to external item features, to generate precise recommendations that cater to individual preferences. We develop a data-efficient graph-based recommender system to efficiently capture the complex relationship between each item and users and generate node(item) embeddings which incorporate both node feature information and graph structure. Furthermore, we exploit inputs beyond user-item interactions, such as image feature, text feature, and price feature. Numerical experiments verify the performance of the graph-based recommender system improves significantly after utilizing all types of item features as side information, thereby outperforming all other baselines.


Negative Feedback Training: A Novel Concept to Improve Robustness of NVCIM DNN Accelerators

Qin, Yifan, Yan, Zheyu, Wen, Wujie, Hu, Xiaobo Sharon, Shi, Yiyu

arXiv.org Artificial Intelligence

Compute-in-memory (CIM) accelerators built upon non-volatile memory (NVM) devices excel in energy efficiency and latency when performing Deep Neural Network (DNN) inference, thanks to their in-situ data processing capability. However, the stochastic nature and intrinsic variations of NVM devices often result in performance degradation in DNN inference. Introducing these non-ideal device behaviors during DNN training enhances robustness, but drawbacks include limited accuracy improvement, reduced prediction confidence, and convergence issues. This arises from a mismatch between the deterministic training and non-deterministic device variations, as such training, though considering variations, relies solely on the model's final output. In this work, we draw inspiration from the control theory and propose a novel training concept: Negative Feedback Training (NFT) leveraging the multi-scale noisy information captured from network. We develop two specific NFT instances, Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS). Extensive experiments show that our methods outperform existing state-of-the-art methods with up to a 46.71% improvement in inference accuracy while reducing epistemic uncertainty, boosting output confidence, and improving convergence probability. Their effectiveness highlights the generality and practicality of our NFT concept in enhancing DNN robustness against device variations.


A Survey of Blockchain, Artificial Intelligence, and Edge Computing for Web 3.0

Zhu, Jianjun, Li, Fan, Chen, Jinyuan

arXiv.org Artificial Intelligence

Web 3.0, as the third generation of the World Wide Web, aims to solve contemporary problems of trust, centralization, and data ownership. Driven by the latest advances in cutting-edge technologies, Web 3.0 is moving towards a more open, decentralized, intelligent, and interconnected network. However, increasingly widespread data breaches have raised awareness of online privacy and security of personal data. Additionally, since Web 3.0 is a sophisticated and complex convergence, the technical details behind it are not as clear as the characteristics it presents. In this survey, we conduct an in-depth exploration of Web 3.0 from the perspectives of blockchain, artificial intelligence, and edge computing. Specifically, we begin with summarizing the evolution of the Internet and providing an overview of these three key technological factors. Afterward, we provide a thorough analysis of each technology separately, including its relevance to Web 3.0, key technology components, and practical applications. We also propose decentralized storage and computing solutions by exploring the integration of technologies. Finally, we highlight the key challenges alongside potential research directions. Through the combination and mutual complementation of multiple technologies, Web 3.0 is expected to return more control and ownership of data and digital assets back to users.


Decentralized Energy Marketplace via NFTs and AI-based Agents

Nikbakht, Rasoul, Javed, Farhana, Rezazadeh, Farhad, Bartzoudis, Nikolaos, Mangues-Bafalluy, Josep

arXiv.org Artificial Intelligence

The paper introduces an advanced Decentralized Energy Marketplace (DEM) integrating blockchain technology and artificial intelligence to manage energy exchanges among smart homes with energy storage systems. The proposed framework uses Non-Fungible Tokens (NFTs) to represent unique energy profiles in a transparent and secure trading environment. Leveraging Federated Deep Reinforcement Learning (FDRL), the system promotes collaborative and adaptive energy management strategies, maintaining user privacy. A notable innovation is the use of smart contracts, ensuring high efficiency and integrity in energy transactions. Extensive evaluations demonstrate the system's scalability and the effectiveness of the FDRL method in optimizing energy distribution. This research significantly contributes to developing sophisticated decentralized smart grid infrastructures. Our approach broadens potential blockchain and AI applications in sustainable energy systems and addresses incentive alignment and transparency challenges in traditional energy trading mechanisms. The implementation of this paper is publicly accessible at \url{https://github.com/RasoulNik/DEM}.