 Generative AI


TRISKELION-1: Unified Descriptive-Predictive-Generative AI

arXiv.org Artificial Intelligence

TRISKELION-1 is a unified descriptive-predictive-generative architecture that integrates statistical, mechanistic, and generative reasoning within a single encoder-decoder framework. The model demonstrates how descriptive representation learning, predictive inference, and generative synthesis can be jointly optimized using variational objectives. Experiments on MNIST validate that descriptive reconstruction, predictive classification, and generative sampling can coexist stably within one model. The framework provides a blueprint toward universal intelligence architectures that connect interpretability, accuracy, and creativity.


Oitijjo-3D: Generative AI Framework for Rapid 3D Heritage Reconstruction from Street View Imagery

arXiv.org Artificial Intelligence

Cultural heritage restoration in Bangladesh faces a dual challenge of limited resources and scarce technical expertise. Traditional 3D digitization methods, such as photogrammetry or LiDAR scanning, require expensive hardware, expert operators, and extensive on-site access, which are often infeasible in developing contexts. As a result, many of Bangladesh's architectural treasures, from the Paharpur Buddhist Monastery to Ahsan Manzil, remain vulnerable to decay and inaccessible in digital form. This paper introduces Oitijjo-3D, a cost-free generative AI framework that democratizes 3D cultural preservation. By using publicly available Google Street View imagery, Oitijjo-3D reconstructs faithful 3D models of heritage structures through a two-stage pipeline - multimodal visual reasoning with Gemini 2.5 Flash Image for structure-texture synthesis, and neural image-to-3D generation through Hexagen for geometry recovery. The system produces photorealistic, metrically coherent reconstructions in seconds, achieving significant speedups compared to conventional Structure-from-Motion pipelines, without requiring any specialized hardware or expert supervision. Experiments on landmarks such as Ahsan Manzil, Choto Sona Mosque, and Paharpur demonstrate that Oitijjo-3D preserves both visual and structural fidelity while drastically lowering economic and technical barriers. By turning open imagery into digital heritage, this work reframes preservation as a community-driven, AI-assisted act of cultural continuity for resource-limited nations.


A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation

arXiv.org Artificial Intelligence

Diffusion Models have become a cornerstone of modern generative AI for their exceptional generation quality and controllability. However, their inherent multi-step iterations and complex backbone networks lead to prohibitive computational overhead and generation latency, forming a major bottleneck for real-time applications. Although existing acceleration techniques have made progress, they still face challenges such as limited applicability, high training costs, or quality degradation. Against this backdrop, Diffusion Caching offers a promising training-free, architecture-agnostic, and efficient inference paradigm. Its core mechanism identifies and reuses intrinsic computational redundancies in the diffusion process. By enabling feature-level cross-step reuse and inter-layer scheduling, it reduces computation without modifying model parameters. This paper systematically reviews the theoretical foundations and evolution of Diffusion Caching and proposes a unified framework for its classification and analysis. Through comparative analysis of representative methods, we show that Diffusion Caching evolves from static reuse to dynamic prediction. This trend enhances caching flexibility across diverse tasks and enables integration with other acceleration techniques such as sampling optimization and model distillation, paving the way for a unified, efficient inference framework for future multimodal and interactive applications. We argue that this paradigm will become a key enabler of real-time and efficient generative AI, injecting new vitality into both theory and practice of Efficient Generative Intelligence.
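
The "static reuse" idea mentioned in the abstract can be illustrated with a toy loop: an expensive feature computation is refreshed only every few steps and the cached result is reused in between. This is a minimal sketch under stated assumptions, not any method from the survey. The names `run_with_static_cache`, `toy_backbone`, `toy_update`, and the `refresh_every` parameter are hypothetical, and the arithmetic stands in for a real denoising network.

```python
def run_with_static_cache(x, num_steps, expensive_block, cheap_update, refresh_every=2):
    """Run an iterative loop, recomputing the expensive features only on
    every `refresh_every`-th step and reusing the cached value otherwise.

    Returns the final state and the number of expensive calls made.
    """
    if refresh_every < 1:
        raise ValueError("refresh_every must be at least 1")
    cached = None
    calls = 0
    for step in range(num_steps):
        if cached is None or step % refresh_every == 0:
            cached = expensive_block(x, step)  # full computation on refresh steps
            calls += 1
        x = cheap_update(x, cached, step)      # every step uses the cached features
    return x, calls


# Hypothetical stand-ins for a backbone and an update rule.
def toy_backbone(x, step):
    return 0.1 * x


def toy_update(x, features, step):
    return x - features


final_x, calls = run_with_static_cache(1.0, 10, toy_backbone, toy_update, refresh_every=2)
print(calls)  # expensive block runs on steps 0, 2, 4, 6, 8
```

A fixed refresh interval is the simplest possible schedule. The "dynamic prediction" direction the abstract describes would replace the fixed interval with a decision made per step, which this sketch does not attempt.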


Measuring Algorithmic Partisanship via Zero-Shot Classification and Its Implications on Political Discourse

arXiv.org Artificial Intelligence

Amidst the rapid normalization of generative artificial intelligence (GAI), intelligent systems have come to dominate political discourse across information media. However, internalized political biases stemming from training data skews, human prejudice, and algorithmic flaws continue to plague this novel technology. This study employs a zero-shot classification approach to evaluate algorithmic political partisanship through a methodical combination of ideological alignment, topicality, response sentiment, and objectivity. A total of 1800 model responses across six mainstream large language models (LLMs) were individually input into four distinct fine-tuned classification algorithms, each responsible for computing one of the aforementioned metrics. The results show an amplified liberal-authoritarian alignment across the six LLMs evaluated, with notable instances of reasoning supersessions and canned refusals. The study subsequently highlights the psychological influences underpinning human-computer interactions and how intrinsic biases can permeate public discourse. The resulting distortion of the political landscape can ultimately manifest as conformity or polarization, depending on the region's pre-existing socio-political structures.


Recognising, Anticipating, and Mitigating LLM Pollution of Online Behavioural Research

arXiv.org Artificial Intelligence

Online behavioural research faces an emerging threat as participants increasingly turn to large language models (LLMs) for advice, translation, or task delegation: LLM Pollution. We identify three interacting variants through which LLM Pollution threatens the validity and integrity of online behavioural research. First, Partial LLM Mediation occurs when participants make selective use of LLMs for specific aspects of a task, such as translation or wording support, leading researchers to (mis)interpret LLM-shaped outputs as human ones. Second, Full LLM Delegation arises when agentic LLMs complete studies with little to no human oversight, undermining the central premise of human-subject research at a more foundational level. Third, LLM Spillover signifies human participants altering their behaviour as they begin to anticipate LLM presence in online studies, even when none are involved. While Partial Mediation and Full Delegation form a continuum of increasing automation, LLM Spillover reflects second-order reactivity effects. Together, these variants interact and generate cascading distortions that compromise sample authenticity, introduce biases that are difficult to detect post hoc, and ultimately undermine the epistemic grounding of online research on human cognition and behaviour. Crucially, the threat of LLM Pollution is already co-evolving with advances in generative AI, creating an escalating methodological arms race. To address this, we propose a multi-layered response spanning researcher practices, platform accountability, and community efforts. As the challenge evolves, coordinated adaptation will be essential to safeguard methodological integrity and preserve the validity of online behavioural research.


Agentic Large Language Models for Conceptual Systems Engineering and Design

arXiv.org Artificial Intelligence

Early-stage engineering design involves complex, iterative reasoning, yet existing large language model (LLM) workflows struggle to maintain task continuity and generate executable models. We evaluate whether a structured multi-agent system (MAS) can more effectively manage requirements extraction, functional decomposition, and simulator code generation than a simpler two-agent system (2AS). The target application is a solar-powered water filtration system as described in a cahier des charges. We introduce the Design-State Graph (DSG), a JSON-serializable representation that bundles requirements, physical embodiments, and Python-based physics models into graph nodes. A nine-role MAS iteratively builds and refines the DSG, while the 2AS collapses the process to a Generator-Reflector loop. Both systems run a total of 60 experiments (2 LLMs, Llama 3.3 70B vs. reasoning-distilled DeepSeek R1 70B, x 2 agent configurations x 3 temperatures x 5 seeds). We report JSON validity, requirement coverage, embodiment presence, code compatibility, workflow completion, runtime, and graph size. Across all runs, both MAS and 2AS maintained perfect JSON integrity and embodiment tagging. Requirement coverage remained minimal (less than 20%). Code compatibility peaked at 100% under specific 2AS settings but averaged below 50% for MAS. Only the reasoning-distilled model reliably flagged workflow completion. Powered by DeepSeek R1 70B, the MAS generated more granular DSGs (average 5-6 nodes) whereas 2AS mode-collapsed. Structured multi-agent orchestration enhanced design detail. The reasoning-distilled LLM improved completion rates, yet low requirement coverage and fidelity gaps in code generation persisted.
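
The run count in the abstract follows from a full factorial design: 2 LLMs x 2 agent configurations x 3 temperatures x 5 seeds = 60. The sketch below enumerates that grid. The model and configuration names come from the abstract, but the temperature values and seed numbers are placeholders, since the abstract gives only how many of each were used.

```python
from itertools import product

# Factor levels. LLMs and agent configurations are named in the abstract;
# the temperature and seed values below are hypothetical placeholders.
LLMS = ["Llama 3.3 70B", "DeepSeek R1 70B"]
AGENT_CONFIGS = ["MAS", "2AS"]
TEMPERATURES = [0.0, 0.5, 1.0]  # assumed values, only the count (3) is given
SEEDS = [0, 1, 2, 3, 4]         # assumed values, only the count (5) is given


def build_runs():
    """Return one dict per experiment in the full factorial design."""
    return [
        {"llm": llm, "agents": agents, "temperature": temp, "seed": seed}
        for llm, agents, temp, seed in product(LLMS, AGENT_CONFIGS, TEMPERATURES, SEEDS)
    ]


runs = build_runs()
print(len(runs))  # 2 * 2 * 3 * 5 = 60
```

Each agent configuration therefore accounts for 30 of the 60 runs, which is the denominator behind the per-system averages quoted above.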


AI-Generated Video Detection via Perceptual Straightening

arXiv.org Artificial Intelligence

The rapid advancement of generative AI enables highly realistic synthetic videos, posing significant challenges for content authentication and raising urgent concerns about misuse. Existing detection methods often struggle with generalization and capturing subtle temporal inconsistencies. We propose ReStraV (Representation Straightening Video), a novel approach to distinguish natural from AI-generated videos. Inspired by the "perceptual straightening" hypothesis -- which suggests real-world video trajectories become straighter in the neural representation domain -- we analyze deviations from this expected geometric property. Using a pre-trained self-supervised vision transformer (DINOv2), we quantify the temporal curvature and stepwise distance in the model's representation domain. We aggregate statistics of these measures for each video and train a classifier. Our analysis shows that AI-generated videos exhibit significantly different curvature and distance patterns compared to real videos. A lightweight classifier achieves state-of-the-art detection performance (e.g., 97.17% accuracy and 98.63% AUROC on the VidProM benchmark), substantially outperforming existing image- and video-based methods. ReStraV is computationally efficient, offering a low-cost and effective detection solution. This work provides new insights into using neural representation geometry for AI-generated video detection.
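
The two quantities named in the abstract, stepwise distance and temporal curvature, can be sketched for any sequence of per-frame embedding vectors. The code below is an illustration under assumptions, not the authors' implementation: it takes curvature to be the angle between consecutive displacement vectors, uses plain Python lists in place of DINOv2 features, and the function names are hypothetical. The abstract does not say which aggregate statistics feed the classifier, so only the mean is shown.

```python
import math
from statistics import mean


def stepwise_distances(traj):
    """Euclidean distance between each pair of consecutive frame embeddings."""
    return [math.dist(a, b) for a, b in zip(traj, traj[1:])]


def curvatures(traj):
    """Angle in radians between consecutive displacement vectors.

    A perfectly straight trajectory yields angles of zero.
    """
    diffs = [[bi - ai for ai, bi in zip(a, b)] for a, b in zip(traj, traj[1:])]
    angles = []
    for u, v in zip(diffs, diffs[1:]):
        norm_u = math.hypot(*u)
        norm_v = math.hypot(*v)
        if norm_u == 0 or norm_v == 0:
            continue  # a repeated frame has no direction; skip it
        cos = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
        angles.append(math.acos(max(-1.0, min(1.0, cos))))  # clamp rounding error
    return angles


# Toy 2-D "embeddings": one straight trajectory and one with a right-angle turn.
straight = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
bent = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]

print(mean(curvatures(straight)), mean(stepwise_distances(straight)))
print(mean(curvatures(bent)), mean(stepwise_distances(bent)))
```

In a real pipeline the per-video statistics of these two lists would form the feature vector for the lightweight classifier the abstract mentions.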


OpenAI Signs $38 Billion Deal With Amazon

WIRED

OpenAI has committed to buying billions of dollars worth of compute from AWS--the latest in a string of major deals brokered by the AI startup. OpenAI has signed a multi-year deal with Amazon to buy $38 billion worth of AWS cloud infrastructure to train its models and serve its users. The deal is yet another sign of the AI industry becoming increasingly entangled, with OpenAI now at the center of major partnerships with industry players including Google, Oracle, Nvidia, and AMD. The AWS agreement is also notable because OpenAI rose to prominence in part through its partnership with Microsoft--Amazon's biggest cloud rival. Amazon is also a major backer of one of OpenAI's key competitors, Anthropic.


OpenAI, Amazon sign $38bn AI deal

Al Jazeera

OpenAI has signed a new deal valued at $38bn with Amazon that will allow the artificial intelligence giant to run AI workloads across Amazon Web Services (AWS) cloud infrastructure. The seven-year deal announced on Monday is the first big AI push for the e-commerce giant after a restructuring last week. Experts say this does not mean that it will allow OpenAI to train its model on websites hosted by AWS - which includes the websites of The New York Times, Reddit and United Airlines. "Running OpenAI training inside AWS doesn't change their ability to scrape content from AWS-hosted websites [which they could already do for anything publicly readable]. This is strictly speaking about the economics of rent vs buy for GPU [graphics processing unit] capacity," Joshua McKenty, CEO of the AI detection company PolyguardAI, told Al Jazeera. The deal is also a major vote of confidence for the e-commerce giant's cloud unit, AWS, which some investors feared had fallen behind rivals Microsoft and Google in the artificial intelligence (AI) race.


OpenAI signs $38bn cloud computing deal with Amazon

The Guardian

OpenAI said the deal would give it access to hundreds of thousands of Nvidia graphics processors to train and run its AI models. Agreement to use AWS datacentres, and Nvidia chips inside them, part of $1.4tn spending spree on AI infrastructure. Mon 3 Nov 2025 13.09 EST. Last modified on Mon 3 Nov 2025 15.16 EST. OpenAI has signed a $38bn (£29bn) deal to use Amazon infrastructure to operate its artificial intelligence products, as part of a more than $1tn spending spree on computing power. The agreement with Amazon Web Services means OpenAI will be able to use AWS datacentres, and the Nvidia chips inside them, immediately. Last week, OpenAI's chief executive, Sam Altman, said his company had committed to spending $1.4tn on AI infrastructure, amid concerns over the sustainability of the boom in using and building datacentres.