inventory
ISOMORPH: A Supply Chain Digital Twin for Simulation, Dataset Generation, and Forecasting Benchmarks
Zhang, Zhizhen, Gu, Hyemin, Zhang, Benjamin J., Elenius, Daniel, Tyrrell, Michael, Bourdais, Theo J., Owhadi, Houman, Katsoulakis, Markos A., Sahai, Tuhin
Open time-series forecasting (TSF) benchmarks cover retail, energy, weather, and traffic, but supply-chain logistics remains underserved. We introduce ISOMORPH, the first public digital twin of a multi-echelon logistics network with fully interpretable, user-configurable parameters and modular topology, demand process, and control rules. The simulator advances a directed routing graph in discrete time: demand arrives at the destination, is served from stock or recorded as backlog, and triggers replenishment through the network. The state vector tracks per-node on-hand inventory with outstanding orders, in-transit shipments, and a smoothed demand estimate, so the dynamics close as a Markov chain on a tractable state space whose transition kernel acts linearly on the empirical distribution of the state. The released data reproduces the bullwhip effect at empirically consistent magnitudes, and three conservation laws encoded in the Markov chain serve as verification tools when users extend the simulator. We release datasets at two catalogue scales ($C=50$ and $C=200$) with six scenario sweeps producing 30 additional rollouts and 20 Latin-hypercube perturbations, exhibiting dynamics absent from fixed TSF benchmarks: variance amplification, cascading bottlenecks, regime shifts, and cross-channel coupling through shared macro shocks. Zero-shot evaluation of four foundation models (Chronos, Moirai, TimesFM, Lag-Llama) shows MASE values exceeding public GIFT-Eval references at low-to-moderate horizons, supporting incorporation into existing benchmarks. The same pairing produces forecast confidence bands via Latin-hypercube perturbation of demand-side knobs, forward UQ from parameter uncertainty unavailable on standard TSF datasets, demonstrating that foundation models can serve as fast surrogates for the digital twin's forward UQ. Code (MIT): https://github.com/tuhinsahai/ISOMORPH.
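The per-step dynamics described in the abstract (demand arrives, is served from stock or recorded as backlog, and triggers replenishment) can be illustrated with a minimal single-node sketch. This is not the ISOMORPH code: the function name, the order-up-to rule, the parameters `lead_time`, `alpha`, and `safety_factor`, and the three bookkeeping identities checked at the end are all assumptions chosen for illustration, and they are not claimed to be the three conservation laws the authors refer to.

```python
from collections import deque


def simulate_node(demands, lead_time=2, alpha=0.3, safety_factor=1.5, initial_stock=20):
    """Roll a single inventory node forward in discrete time.

    Hypothetical sketch: all names and the ordering policy are assumptions,
    not taken from the ISOMORPH repository.
    """
    on_hand = initial_stock
    backlog = 0
    pipeline = deque([0] * lead_time)  # in-transit shipments, oldest first
    forecast = float(demands[0]) if demands else 0.0  # smoothed demand estimate
    total_ordered = total_received = total_shipped = 0
    orders = []

    for d in demands:
        # 1. Receive the shipment that has completed its transit time.
        arrived = pipeline.popleft()
        on_hand += arrived
        total_received += arrived

        # 2. Serve old backlog plus new demand from stock; the rest stays backlogged.
        need = backlog + d
        shipped = min(on_hand, need)
        on_hand -= shipped
        backlog = need - shipped
        total_shipped += shipped

        # 3. Update the exponentially smoothed demand estimate.
        forecast = alpha * d + (1 - alpha) * forecast

        # 4. Order-up-to replenishment based on the inventory position.
        target = safety_factor * forecast * (lead_time + 1)
        position = on_hand + sum(pipeline) - backlog
        order = max(0, round(target - position))
        pipeline.append(order)
        total_ordered += order
        orders.append(order)

    return {
        "initial_stock": initial_stock,
        "on_hand": on_hand,
        "backlog": backlog,
        "in_transit": sum(pipeline),
        "total_ordered": total_ordered,
        "total_received": total_received,
        "total_shipped": total_shipped,
        "orders": orders,
    }


def check_conservation(out, demands):
    """Illustrative bookkeeping identities that any extension should preserve."""
    # Stock: everything that entered the node is either still there or was shipped.
    assert out["initial_stock"] + out["total_received"] == out["on_hand"] + out["total_shipped"]
    # Orders: every unit ordered has either arrived or is still in transit.
    assert out["total_ordered"] == out["total_received"] + out["in_transit"]
    # Demand: every unit demanded was either shipped or remains backlogged.
    assert sum(demands) == out["total_shipped"] + out["backlog"]
    return True


if __name__ == "__main__":
    demands = [5, 7, 3, 10, 0, 8, 6, 12, 4, 9]
    result = simulate_node(demands)
    check_conservation(result, demands)
    print(result["orders"])
```

Comparing the variance of `orders` with the variance of `demands` in such a sketch gives a rough handle on variance amplification, although a single node with this simple policy is only a toy stand-in for the multi-echelon network.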
e6c2e85db1f1039177c4495ccd399ac4-Supplemental-Conference.pdf
A.1 Preliminary Study

The basic GPT-2 model is trained from scratch on each corpus; it has 12 transformer blocks and 12 attention heads with 768 hidden dimensions. The Huggingface transformers [4] and PyTorch [2] toolkits are used to train the GPT-2 model in a distributed manner on an A100 GPU server. The hyper-parameters used during training are shown in Table 1.

Table 1: The hyper-parameters used during the GPT-2 training procedure.

Hyper-parameter            Value
Optimization steps         100K
Test interval              10K
Dropout rate               0.1
Grad clipping              1.0
Learning rate              5e-5
Batch size                 128
Maximum sequence length    256
Warmup steps               10K
Learning scheduler         Linear decay
Random seed                0
Number of GPUs             4
Learning objective         Cross-Entropy Loss

Most of the hyper-parameters for our proposed method are the same as those in Table 1, to keep the comparison controlled. The hyper-parameters specific to our proposed method are the length of the repetitive n-gram and its repetition dropout rate p, which are set to 2 and 0.6, respectively.
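As a rough illustration of how the warmup and linear-decay entries in Table 1 interact, the sketch below computes the learning rate at a given optimization step. It assumes a peak learning rate of 5e-5, a linear warmup over the first 10K steps, and a linear decay to zero at step 100K; the function name `linear_schedule_lr` is hypothetical, and the source does not state that the warmup itself is linear.

```python
def linear_schedule_lr(step, base_lr=5e-5, warmup_steps=10_000, total_steps=100_000):
    """Learning rate at `step` under linear warmup followed by linear decay.

    Assumed shape (not confirmed by the source): ramp from 0 to base_lr over
    warmup_steps, then decay linearly to 0 at total_steps.
    """
    if step < warmup_steps:
        # Warmup phase: scale the peak rate by the fraction of warmup completed.
        return base_lr * step / warmup_steps
    # Decay phase: scale by the fraction of post-warmup steps still remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at each 10K-step test interval listed in Table 1.
    for s in range(0, 100_001, 10_000):
        print(s, linear_schedule_lr(s))
```

Under these assumptions the rate peaks exactly when warmup ends and reaches zero at the final optimization step, which is the same shape produced by the common linear-schedule-with-warmup helpers in training libraries.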
Evaluating and Inducing Personality in Pre-trained Language Models
Standardized and quantified evaluation of machine behaviors is a crux of understanding LLMs. In this study, we draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors. Originating as a philosophical quest for human behaviors, the study of personality delves into how individuals differ in thinking, feeling, and behaving. Toward building and understanding human-like social machines, we are motivated to ask: Can we assess machine behaviors by leveraging human psychometric tests in a principled and quantitative manner? If so, can we induce a specific personality in LLMs? To answer these questions, we introduce the Machine Personality Inventory (MPI) tool for studying machine behaviors; MPI follows standardized personality tests, built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories.
Supplementary Material
Tab. 13 shows the parameters and variables used in this optimization.

Table 13: Parameters and variables used in credit optimization. Known Parameters / Description: ϱ = R ...

Eq. 5 presents the optimization formulation, where Eq. 5a calculates the total credits gained by the ...

The following examples illustrate the prompts used in LLM-C for each mini-game. The prompts vary slightly for different mini-games and also differ across stages within the same mini-game. Specifically, the prompt for the dynamic scenario in Social Structure is presented in Listing 1. The corresponding prompts are provided in Listing 4 and Listing 5.

Listing 1: Prompt example for dynamic scenario in Social Structure.

Instructions:
- The AdaSociety game is an open-ended multi-agent environment. The game consists of a complex crafting tree, where the agent needs to obtain as many resources as possible in the limited time and craft tools to mine more advanced resources to maximize its benefit. At the same time, agents can also take other actions to help them increase their returns. The numbers of resources are limited.
- Map: AdaSociety is a 2D grid-world game. The map size is 15*15.
- Some of them can only be discovered with some specific tools, which will be introduced next.
...
HHS Is Using AI Tools From Palantir to Target 'DEI' and 'Gender Ideology' in Grants
Since March of 2025, the Trump Administration has used tools from Palantir and the startup Credal AI to weed out "DEI" and "gender ideology" from child welfare programs. Since last March, the Department of Health and Human Services has been using AI tools from Palantir to screen and audit grants, grant applications, and job descriptions for noncompliance with President Donald Trump's executive orders targeting "gender ideology" and anything related to diversity, equity, and inclusion (DEI), according to a recently published inventory of all use cases HHS had for AI in 2025. Neither Palantir nor HHS has publicly announced that the company's software was being used for these purposes. During the first year of Trump's second term, Palantir earned more than $35 million in payments and obligations ...