 visualizer


MarkDiffusion: An Open-Source Toolkit for Generative Watermarking of Latent Diffusion Models

Pan, Leyi, Guan, Sheng, Fu, Zheyu, Si, Luyang, Wang, Huan, Wang, Zian, Li, Hanqian, Hu, Xuming, King, Irwin, Yu, Philip S., Liu, Aiwei, Wen, Lijie

arXiv.org Artificial Intelligence

We introduce MarkDiffusion, an open-source Python toolkit for generative watermarking of latent diffusion models. It comprises three key components: a unified implementation framework for streamlined watermarking algorithm integrations and user-friendly interfaces; a mechanism visualization suite that intuitively showcases added and extracted watermark patterns to aid public understanding; and a comprehensive evaluation module offering standard implementations of 24 tools across three essential aspects - detectability, robustness, and output quality - plus 8 automated evaluation pipelines. Through MarkDiffusion, we seek to assist researchers, enhance public awareness and engagement in generative watermarking, and promote consensus while advancing research and applications.


OWLViz: An Open-World Benchmark for Visual Question Answering

Nguyen, Thuy, Nguyen, Dang, Nguyen, Hoang, Luong, Thuan, Dang, Long Hoang, Lai, Viet Dac

arXiv.org Artificial Intelligence

We present a challenging benchmark for the Open WorLd VISual question answering (OWLViz) task. OWLViz presents concise, unambiguous queries that require integrating multiple capabilities, including visual understanding, web exploration, and specialized tool usage. While humans achieve 69.2% accuracy on these intuitive tasks, even state-of-the-art VLMs struggle, with the best model, Gemini 2.0, achieving only 26.6% accuracy. Current agentic VLMs, which rely on limited vision and vision-language models as tools, perform even worse. This performance gap reveals significant limitations in multimodal systems' ability to select appropriate tools and execute complex reasoning sequences, establishing new directions for advancing practical AI research.


MidiTok Visualizer: a tool for visualization and analysis of tokenized MIDI symbolic music

Wiszenko, Michał, Stefański, Kacper, Malesa, Piotr, Pokorzyński, Łukasz, Modrzejewski, Mateusz

arXiv.org Artificial Intelligence

Symbolic music research plays a crucial role in music-related machine learning, but MIDI data can be complex for those without musical expertise. To address this issue, we present MidiTok Visualizer, a web application designed to facilitate the exploration and visualization of various MIDI tokenization methods from the MidiTok Python package. MidiTok Visualizer offers numerous customizable parameters, enabling users to upload MIDI files to visualize tokenized data alongside an interactive piano roll.


pyBregMan: A Python library for Bregman Manifolds

Nielsen, Frank, Soen, Alexander

arXiv.org Artificial Intelligence

A Bregman manifold is a synonym for a dually flat space in information geometry which admits a Bregman divergence as its canonical divergence. Bregman manifolds are induced by smooth strictly convex functions like the cumulant or partition functions of regular exponential families, the negative entropy of mixture families, or the characteristic functions of regular cones, to list just a few such convex Bregman generators. We describe the design of pyBregMan, a library which implements generic operations on Bregman manifolds and instantiates several common Bregman manifolds used in information sciences. At the core of the library is the notion of Legendre-Fenchel duality inducing a canonical pair of dual potential functions and dual Bregman divergences. The library also implements the Fisher-Rao manifolds of categorical/multinomial distributions and multivariate normal distributions. To demonstrate the use of the pyBregMan kernel manipulating those Bregman and Fisher-Rao manifolds, the library also provides several core algorithms for various applications in statistics, machine learning, information fusion, and so on.
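
The duality described in this abstract can be illustrated with a minimal plain-Python sketch. It does not use pyBregMan and does not reflect its API; every function name below is an assumption made for illustration. The generator is the extended negative Shannon entropy on positive vectors, whose Legendre-Fenchel conjugate is a sum of exponentials. The sketch computes B_F(p : q) = F(p) - F(q) - <p - q, grad F(q)> and compares it with the dual divergence evaluated on swapped dual coordinates.

```python
import math

def bregman_divergence(F, grad_F, p, q):
    """B_F(p : q) = F(p) - F(q) - <p - q, grad F(q)>."""
    inner = sum((pi - qi) * gi for pi, qi, gi in zip(p, q, grad_F(q)))
    return F(p) - F(q) - inner

# Generator (illustrative choice): extended negative Shannon entropy.
def F(x):
    return sum(xi * math.log(xi) - xi for xi in x)

def grad_F(x):
    return [math.log(xi) for xi in x]

# Legendre-Fenchel conjugate of F: F*(y) = sum_i exp(y_i).
def F_star(y):
    return sum(math.exp(yi) for yi in y)

def grad_F_star(y):
    return [math.exp(yi) for yi in y]

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]

# Primal divergence in the original coordinates.
primal = bregman_divergence(F, grad_F, p, q)

# Dual divergence on the dual coordinates grad F(.), with arguments swapped.
dual = bregman_divergence(F_star, grad_F_star, grad_F(q), grad_F(p))

# For this generator the divergence is the extended Kullback-Leibler divergence.
kl = sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))
```

Under these assumptions, `primal`, `dual`, and `kl` agree up to floating-point error, which is the identity B_F(p : q) = B_F*(grad F(q) : grad F(p)).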


MineObserver 2.0: A Deep Learning & In-Game Framework for Assessing Natural Language Descriptions of Minecraft Imagery

Mahajan, Jay, Hum, Samuel, Henhapl, Jack, Yunus, Diya, Gadbury, Matthew, Brown, Emi, Ginger, Jeff, Lane, H. Chad

arXiv.org Artificial Intelligence

MineObserver 2.0 is an AI framework that uses Computer Vision and Natural Language Processing for assessing the accuracy of learner-generated descriptions of Minecraft images that include some scientifically relevant content. The system automatically assesses the accuracy of participant observations, written in natural language, made during science learning activities that take place in Minecraft. We demonstrate our system working in real-time and describe a teacher support dashboard to showcase observations, both of which advance our previous work. We present the results of a study showing that MineObserver 2.0 improves over its predecessor both in perceived accuracy of the system's generated descriptions and in usefulness of the system's feedback. In future work we intend to improve system-generated descriptions, give teachers more control, and upgrade the system to perform continuous learning to more effectively and rapidly respond to novel observations made by learners.


Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces

Sridharan, Srinivas, Heo, Taekyung, Feng, Louis, Wang, Zhaodong, Bergeron, Matt, Fu, Wenyin, Zheng, Shengbao, Coutinho, Brian, Rashidi, Saeed, Man, Changhai, Krishna, Tushar

arXiv.org Artificial Intelligence

Benchmarking and co-design are essential for driving optimizations and innovation around ML models, ML software, and next-generation hardware. Full workload benchmarks, e.g. MLPerf, play an essential role in enabling fair comparison across different software and hardware stacks especially once systems are fully designed and deployed. However, the pace of AI innovation demands a more agile methodology to benchmark creation and usage by simulators and emulators for future system co-design. We propose Chakra, an open graph schema for standardizing workload specification capturing key operations and dependencies, also known as Execution Trace (ET). In addition, we propose a complementary set of tools/capabilities to enable collection, generation, and adoption of Chakra ETs by a wide range of simulators, emulators, and benchmarks. For instance, we use generative AI models to learn latent statistical properties across thousands of Chakra ETs and use these models to synthesize Chakra ETs. These synthetic ETs can obfuscate key proprietary information and also target future what-if scenarios. As an example, we demonstrate an end-to-end proof-of-concept that converts PyTorch ETs to Chakra ETs and uses this to drive an open-source training system simulator (ASTRA-sim). Our end-goal is to build a vibrant industry-wide ecosystem of agile benchmarks and tools to drive future AI system co-design.
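
To make the idea of an execution trace as a graph of operations and dependencies concrete, here is a small sketch using only the Python standard library. It is not the Chakra schema: the `TraceNode` fields, the node names, and the durations are hypothetical and chosen only for illustration. It shows how a dependency graph can be ordered for replay and how the longest dependency chain can be measured.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter

@dataclass
class TraceNode:
    # Hypothetical fields; the real Chakra schema is not reproduced here.
    node_id: int
    name: str
    kind: str                 # illustrative labels such as "compute" or "comm"
    duration_us: float
    deps: list = field(default_factory=list)

def replay_order(nodes):
    """Return node ids in an order that respects every dependency."""
    graph = {n.node_id: set(n.deps) for n in nodes}
    return list(TopologicalSorter(graph).static_order())

def critical_path_us(nodes):
    """Length of the longest dependency chain, in microseconds."""
    by_id = {n.node_id: n for n in nodes}
    finish = {}
    for nid in replay_order(nodes):
        node = by_id[nid]
        start = max((finish[d] for d in node.deps), default=0.0)
        finish[nid] = start + node.duration_us
    return max(finish.values(), default=0.0)

trace = [
    TraceNode(0, "forward_matmul", "compute", 120.0),
    TraceNode(1, "backward_matmul", "compute", 240.0, deps=[0]),
    TraceNode(2, "allreduce_grads", "comm", 300.0, deps=[1]),
    TraceNode(3, "optimizer_step", "compute", 50.0, deps=[2]),
    TraceNode(4, "data_prefetch", "compute", 80.0),
]

order = replay_order(trace)
total = critical_path_us(trace)
```

A simulator consuming such a trace would replace the fixed durations with its own cost model; the dependency structure is what the trace contributes.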


10 Amazing Machine Learning Visualizations You Should Know in 2023 - KDnuggets

#artificialintelligence

Data visualization plays an important role in machine learning. Visualizations that directly support key machine learning tasks are called machine learning visualizations. Creating them can be complicated because it often requires a lot of code, even in Python. Thanks to Python's open-source Yellowbrick library, however, even complex machine learning visualizations can be created with less code. The library extends the Scikit-learn API and provides high-level functions for visual diagnostics that Scikit-learn does not provide.


A Flexible MATLAB/Simulink Simulator for Robotic Floating-base Systems in Contact with the Ground

Guedelha, Nuno, Pasandi, Venus, L'Erario, Giuseppe, Traversaro, Silvio, Pucci, Daniele

arXiv.org Artificial Intelligence

Physics simulators are widely used in robotics fields, from mechanical design to dynamic simulation, and controller design. This paper presents an open-source MATLAB/Simulink simulator for rigid-body articulated systems, including manipulators and floating-base robots. Thanks to MATLAB/Simulink features like MATLAB system classes and Simulink function blocks, the presented simulator combines a programmatic and block-based approach, resulting in a flexible design in the sense that different parts, including its physics engine, robot-ground interaction model, and state evolution algorithm are simply accessible and editable. Moreover, through the use of Simulink dynamic mask blocks, the proposed simulation framework supports robot models integrating open-chain and closed-chain kinematics with any desired number of links interacting with the ground. The simulator can also integrate second-order actuator dynamics. Furthermore, the simulator benefits from a one-line installation and an easy-to-use Simulink interface.


IMBENS: Ensemble Class-imbalanced Learning in Python

Liu, Zhining, Wei, Zhepei, Yu, Erxin, Huang, Qiang, Guo, Kai, Yu, Boyang, Cai, Zhaonian, Ye, Hangting, Cao, Wei, Bian, Jiang, Wei, Pengfei, Jiang, Jing, Chang, Yi

arXiv.org Artificial Intelligence

IMBENS provides access to multiple state-of-the-art ensemble imbalanced learning (EIL) methods, a visualizer, and utility functions for dealing with the class imbalance problem. These ensemble methods include resampling-based ones, e.g., under/over-sampling, and reweighting-based ones, e.g., cost-sensitive learning. Beyond the implementation, we also extend conventional binary EIL algorithms with new functionalities like multi-class support and a resampling scheduler, thereby enabling them to handle more complex tasks. The package was developed under a simple, well-documented API design that follows that of scikit-learn for increased ease of use.
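
As a rough illustration of the resampling-based family mentioned in this abstract, the sketch below draws one class-balanced, randomly under-sampled subset per ensemble member using only the Python standard library. It is not the IMBENS API; `balanced_subsets` and its parameters are hypothetical names, and a real ensemble would go on to fit one base learner per subset and combine their predictions.

```python
import random
from collections import Counter, defaultdict

def balanced_subsets(X, y, n_estimators, seed=0):
    """Yield one randomly under-sampled, class-balanced subset per member."""
    rng = random.Random(seed)

    # Group sample indices by class label (works for any number of classes).
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)

    # Every class is reduced to the size of the rarest class.
    n_min = min(len(idxs) for idxs in by_class.values())

    for _ in range(n_estimators):
        chosen = []
        for idxs in by_class.values():
            chosen.extend(rng.sample(idxs, n_min))  # sampling without replacement
        rng.shuffle(chosen)
        yield [X[i] for i in chosen], [y[i] for i in chosen]

# Toy imbalanced data set: 90 / 8 / 2 samples across three classes.
X = [[float(i)] for i in range(100)]
y = [0] * 90 + [1] * 8 + [2] * 2

subsets = list(balanced_subsets(X, y, n_estimators=5))
counts = [Counter(labels) for _, labels in subsets]
```

Because each member sees a different random draw of the majority classes, the ensemble as a whole uses more of the majority data than any single balanced subset would.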