Goa
Modular Jets for Supervised Pipelines: Diagnosing Mirage vs Identifiability
Classical supervised learning evaluates models primarily via predictive risk on hold-out data. Such evaluations quantify how well a function behaves on a distribution, but they do not address whether the internal decomposition of a model is uniquely determined by the data and evaluation design. In this paper, we introduce \emph{Modular Jets} for regression and classification pipelines. Given a task manifold (input space), a modular decomposition, and access to module-level representations, we estimate empirical jets, which are local linear response maps that describe how each module reacts to small structured perturbations of the input. We propose an empirical notion of \emph{mirage} regimes, where multiple distinct modular decompositions induce indistinguishable jets and thus remain observationally equivalent, and contrast this with an \emph{identifiable} regime, where the observed jets single out a decomposition up to natural symmetries. In the setting of two-module linear regression pipelines we prove a jet-identifiability theorem. Under mild rank assumptions and access to module-level jets, the internal factorisation is uniquely determined, whereas risk-only evaluation admits a large family of mirage decompositions that implement the same input-to-output map. We then present an algorithm (MoJet) for empirical jet estimation and mirage diagnostics, and illustrate the framework using linear and deep regression as well as pipeline classification.
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
- Asia > India > Goa (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- North America > United States > Arizona > Maricopa County > Phoenix (0.04)
- (7 more...)
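The mirage idea in the abstract above can be illustrated with a toy two-module linear pipeline. This is a minimal sketch, not the paper's MoJet algorithm: the matrices `W1`, `W2`, and `A` are made-up values, and the helper functions are hypothetical. It relies only on the fact that, for a linear module, the local linear response (jet) is the weight matrix itself.

```python
# Toy two-module linear pipeline y = W2 @ (W1 @ x), written in pure Python.
# All numbers are illustrative assumptions, not values from the paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Invert a 2x2 matrix; raises ZeroDivisionError if it is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Original decomposition: module 1 (W1) feeds module 2 (W2).
W1 = [[1.0, 2.0], [0.0, 1.0]]
W2 = [[3.0, 1.0]]

# Any invertible A gives another decomposition of the same overall map.
A = [[2.0, 1.0], [1.0, 1.0]]
W1_alt = matmul(A, W1)               # module 1 becomes A @ W1
W2_alt = matmul(W2, inverse_2x2(A))  # module 2 becomes W2 @ A^{-1}

# Risk-only evaluation sees just the end-to-end map, which is unchanged.
end_to_end = matmul(W2, W1)
end_to_end_alt = matmul(W2_alt, W1_alt)

# Module-level jets differ: for a linear module the jet is its weight matrix.
jets_differ = W1 != W1_alt
```

Here the two decompositions agree on every prediction but disagree on the module-1 response, which is the kind of difference that module-level jets can expose and hold-out risk cannot.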
Temporal Fusion Transformer for Multi-Horizon Probabilistic Forecasting of Weekly Retail Sales
Punati, Santhi Bharath, Kanta, Sandeep, Cheerala, Udaya Bhasker, Lanjewar, Madhusudan G, Damacharla, Praveen
Accurate multi-horizon retail forecasts are critical for inventory and promotions. We present a novel study of weekly Walmart sales (45 stores, 2010-2012) using a Temporal Fusion Transformer (TFT) that fuses static store identifiers with time-varying exogenous signals (holidays, CPI, fuel price, temperature). The pipeline produces 1- to 5-week-ahead probabilistic forecasts via QuantileLoss, yielding calibrated 90% prediction intervals and interpretability through variable-selection networks, static enrichment, and temporal attention. On a fixed 2012 hold-out dataset, TFT achieves an RMSE of \$57.9k USD per store-week and an $R^2$ of 0.9875. Across 5-fold chronological cross-validation, the averages are RMSE = \$64.6k USD and $R^2$ = 0.9844, outperforming XGB, CNN, LSTM, and CNN-LSTM baseline models.
- North America > United States > Texas > Montgomery County > The Woodlands (0.04)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- North America > United States > South Carolina (0.04)
- (2 more...)
- Retail (1.00)
- Banking & Finance > Economy (0.47)
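A quantile objective such as the QuantileLoss mentioned in the abstract above is built on the pinball loss. The following is a minimal pure-Python sketch of that loss together with an empirical coverage check. The sales figures and quantile forecasts are invented for illustration, and treating the 5% and 95% quantiles as the bounds of a 90% interval is an assumption rather than something the abstract states.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean quantile (pinball) loss of predictions y_pred at quantile level q."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)

def interval_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)

# Hypothetical weekly sales (thousands of USD) with 5% and 95% quantile forecasts.
sales = [100.0, 120.0, 90.0, 110.0]
q05 = [90.0, 100.0, 95.0, 100.0]
q95 = [115.0, 125.0, 105.0, 120.0]

loss_q05 = pinball_loss(sales, q05, 0.05)
loss_q95 = pinball_loss(sales, q95, 0.95)
coverage = interval_coverage(sales, q05, q95)  # 3 of 4 points are inside
```

A well-calibrated 90% interval should show coverage close to 0.9 on a sufficiently large hold-out set; the four toy points here are only meant to show the mechanics.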
Perception Learning: A Formal Separation of Sensory Representation Learning from Decision Learning
We introduce Perception Learning (PeL), a paradigm that optimizes an agent's sensory interface $f_ϕ:\mathcal{X}\to\mathcal{Z}$ using task-agnostic signals, decoupled from downstream decision learning $g_θ:\mathcal{Z}\to\mathcal{Y}$. PeL directly targets label-free perceptual properties, such as stability to nuisances, informativeness without collapse, and controlled geometry, assessed via objective representation-invariant metrics. We formalize the separation of perception and decision, define perceptual properties independent of objectives or reparameterizations, and prove that PeL updates preserving sufficient invariants are orthogonal to Bayes task-risk gradients. Additionally, we provide a suite of task-agnostic evaluation metrics to certify perceptual quality.
- North America > United States > New York (0.04)
- Asia > India > Goa (0.04)
Symbolic Neural Generation with Applications to Lead Discovery in Drug Design
Srinivasan, Ashwin, Baskar, A, Dash, Tirtharaj, Bain, Michael, Dey, Sanjay Kumar, Banerjee, Mainak
We investigate a relatively underexplored class of hybrid neurosymbolic models integrating symbolic learning with neural reasoning to construct data generators meeting formal correctness criteria. In \textit{Symbolic Neural Generators} (SNGs), symbolic learners examine logical specifications of feasible data from a small set of instances -- sometimes just one. Each specification in turn constrains the conditional information supplied to a neural-based generator, which rejects any instance violating the symbolic specification. Like other neurosymbolic approaches, SNG exploits the complementary strengths of symbolic and neural methods. The outcome of an SNG is a triple $(H, X, W)$, where $H$ is a symbolic description of feasible instances constructed from data, $X$ a set of generated new instances that satisfy the description, and $W$ an associated weight. We introduce a semantics for such systems, based on the construction of appropriate \textit{base} and \textit{fibre} partially-ordered sets combined into an overall partial order, and outline a probabilistic extension relevant to practical applications. In this extension, SNGs result from searching over a weighted partial ordering. We implement an SNG combining a restricted form of Inductive Logic Programming (ILP) with a large language model (LLM) and evaluate it on early-stage drug design. Our main interest is the description and the set of potential inhibitor molecules generated by the SNG. On benchmark problems -- where drug targets are well understood -- SNG performance is statistically comparable to state-of-the-art methods. On exploratory problems with poorly understood targets, generated molecules exhibit binding affinities on par with leading clinical candidates. Experts further find the symbolic specifications useful as preliminary filters, with several generated molecules identified as viable for synthesis and wet-lab testing.
- North America > United States (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Asia > India > Goa (0.04)
- (4 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Immunology (0.45)
- Materials > Metals & Mining > Lead (0.40)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Gen-LangSplat: Generalized Language Gaussian Splatting with Pre-Trained Feature Compression
Modeling open-vocabulary language fields in 3D is essential for intuitive human-AI interaction and querying within physical environments. State-of-the-art approaches, such as LangSplat, leverage 3D Gaussian Splatting to efficiently construct these language fields, encoding features distilled from high-dimensional models like CLIP. However, this efficiency is currently offset by the requirement to train a scene-specific language autoencoder for feature compression, introducing a costly, per-scene optimization bottleneck that hinders deployment scalability. In this work, we introduce Gen-LangSplat, which eliminates this requirement by replacing the scene-wise autoencoder with a generalized autoencoder, pre-trained extensively on the large-scale ScanNet dataset. This architectural shift enables the use of a fixed, compact latent space for language features across any new scene without any scene-specific training. By removing this dependency, our entire language field construction process achieves an efficiency boost while delivering querying performance comparable to, or exceeding, the original LangSplat method. To validate our design choice, we perform a thorough ablation study empirically determining the optimal latent embedding dimension and quantifying representational fidelity using Mean Squared Error and cosine similarity between the original and reprojected 512-dimensional CLIP embeddings. Our results demonstrate that generalized embeddings can efficiently and accurately support open-vocabulary querying in novel 3D scenes, paving the way for scalable, real-time interactive 3D AI applications.
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Asia > Singapore (0.04)
- Asia > India > Goa (0.04)
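The two fidelity measures named in the Gen-LangSplat abstract above, Mean Squared Error and cosine similarity between original and reprojected embeddings, can be sketched as follows. This is an illustrative pure-Python version: the 4-dimensional vectors are toy stand-ins for 512-dimensional CLIP embeddings, and no autoencoder is involved.

```python
import math

def mean_squared_error(a, b):
    """Average squared difference between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins: an "original" embedding and its slightly perturbed
# "reprojected" version (hypothetical values, not real CLIP features).
original = [0.5, -0.5, 0.5, 0.5]
reprojected = [0.4, -0.5, 0.6, 0.5]

mse = mean_squared_error(original, reprojected)
cos = cosine_similarity(original, reprojected)
```

Lower MSE and cosine similarity closer to 1 both indicate that the compressed representation preserves the original feature; cosine similarity is the more relevant of the two when queries are matched by direction rather than magnitude.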
A Comprehensive Dataset for Human vs. AI Generated Text Detection
Roy, Rajarshi, Imanpour, Nasrin, Aziz, Ashhar, Bajpai, Shashwat, Singh, Gurpreet, Biswas, Shwetangshu, Wanaskar, Kapil, Patwa, Parth, Ghosh, Subhankar, Dixit, Shreyas, Pal, Nilesh Ranjan, Rawte, Vipula, Garimella, Ritvik, Jena, Gaytri, Sheth, Amit, Sharma, Vasu, Reganti, Aishwarya Naresh, Jain, Vinija, Chadha, Aman, Das, Amitava
The rapid advancement of large language models (LLMs) has led to increasingly human-like AI-generated text, raising concerns about content authenticity, misinformation, and trustworthiness. Addressing the challenge of reliably detecting AI-generated text and attributing it to specific models requires large-scale, diverse, and well-annotated datasets. In this work, we present a comprehensive dataset comprising over 58,000 text samples that combine authentic New York Times articles with synthetic versions generated by multiple state-of-the-art LLMs including Gemma-2-9b, Mistral-7B, Qwen-2-72B, LLaMA-8B, Yi-Large, and GPT-4-o. The dataset provides original article abstracts as prompts and full human-authored narratives. We establish baseline results for two key tasks: distinguishing human-written from AI-generated text, achieving an accuracy of 58.35\%, and attributing AI texts to their generating models with an accuracy of 8.92\%. By bridging real-world journalistic content with modern generative models, the dataset aims to catalyze the development of robust detection and attribution methods, fostering trust and transparency in the era of generative AI. Our dataset is available at: https://huggingface.co/datasets/gsingh1-py/train.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Washington (0.04)
- (15 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)
ZING-3D: Zero-shot Incremental 3D Scene Graphs via Vision-Language Models
Understanding and reasoning about complex 3D environments requires structured scene representations that capture not only objects but also their semantic and spatial relationships. While recent works on 3D scene graph generation have leveraged pretrained VLMs without task-specific fine-tuning, they are largely confined to single-view settings, fail to support incremental updates as new observations arrive, and lack explicit geometric grounding in 3D space, all of which are essential for embodied scenarios. In this paper, we propose ZING-3D, a framework that leverages the vast knowledge of pretrained foundation models to enable open-vocabulary recognition and generate a rich semantic representation of the scene in a zero-shot manner while also enabling incremental updates and geometric grounding in 3D space, making it suitable for downstream robotics applications. Our approach leverages VLM reasoning to generate a rich 2D scene graph, which is grounded in 3D using depth information. Nodes represent open-vocabulary objects with features, 3D locations, and semantic context, while edges capture spatial and semantic relations with inter-object distances. Our experiments on scenes from the Replica and HM3D datasets show that ZING-3D is effective at capturing spatial and relational knowledge without the need for task-specific training.
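One common way to ground a 2D detection in 3D using depth is pinhole back-projection, after which inter-object distances follow directly. The sketch below assumes a pinhole camera model; the intrinsics, pixel coordinates, depths, and object names are all hypothetical, and the abstract does not say that ZING-3D uses exactly this procedure.

```python
import math

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth to a 3D point in the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Hypothetical pinhole intrinsics for a 640x480 image.
fx, fy = 500.0, 500.0
cx, cy = 320.0, 240.0

# Hypothetical object centres in pixels, each with a depth reading in metres.
chair = backproject(320.0, 240.0, 2.0, fx, fy, cx, cy)
table = backproject(820.0, 240.0, 2.0, fx, fy, cx, cy)

# Edge attribute for a scene graph: Euclidean distance between the two nodes.
distance = math.dist(chair, table)
```

In an incremental setting the camera-frame points would additionally be transformed into a shared world frame using the camera pose before distances between objects seen in different views are compared.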
LLM-RG: Referential Grounding in Outdoor Scenarios using Large Language Models
Saxena, Pranav, Bhattacharya, Avigyan, Zhang, Ji, Wang, Wenshan
Referential grounding in outdoor driving scenes is challenging due to large scene variability, many visually similar objects, and dynamic elements that complicate resolving natural-language references (e.g., "the black car on the right"). We propose LLM-RG, a hybrid pipeline that combines off-the-shelf vision-language models for fine-grained attribute extraction with large language models for symbolic reasoning. LLM-RG processes an image and a free-form referring expression by using an LLM to extract relevant object types and attributes, detecting candidate regions, generating rich visual descriptors with a VLM, and then combining these descriptors with spatial metadata into natural-language prompts that are input to an LLM for chain-of-thought reasoning to identify the referent's bounding box. Evaluated on the Talk2Car benchmark, LLM-RG yields substantial gains over both LLM and VLM-based baselines. Additionally, our ablations show that adding 3D spatial cues further improves grounding. Our results demonstrate the complementary strengths of VLMs and LLMs, applied in a zero-shot manner, for robust outdoor referential grounding.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Asia > India > Goa (0.04)