information
What was Doge? How Elon Musk tried to gamify government
In 2025, when Elon Musk joined the government as the de facto head of something called the "department of government efficiency", he declared that governments were poorly configured "big dumb machines". To the senator Ted Cruz, he explained that "the only way to reconcile the databases and get rid of waste and fraud is to actually look at the computers". Muskism came to Washington soaked in memes, adolescent boasts and sadistic victory dances over mass firings. Leading a team of teenage coders and mid-level managers drawn from his suite of companies, Musk aimed to enter the codebase and rewrite regulations and budget lines from within. He would drag the paper-pushing bureaucracy kicking and screaming into the digital 21st century, scanning the contents of cavernous rooms of filing cabinets and feeding the data into a single interoperable system. The undertaking combined features of private equity-led restructuring with startup management, shot through with the sensibility of gaming and rightwing culture war. To succeed, he would need "God mode", an overview of the whole. If the mandate of Doge was to "[modernise] federal technology and software to maximise governmental efficiency and productivity", in the words of the executive order that launched the initiative on 20 January 2025, the reality was a strengthening of the state's surveillance capacities. Over time, Musk had become convinced that the real bugs in the code were people, especially the non-white illegal immigrants whom he saw as pawns in a liberal scheme to corrupt democracy and beneficiaries of what he called "suicidal empathy". He understood empathy itself in coding terms.
Relational recurrent neural networks
Memory-based neural networks model temporal data by leveraging an ability to remember information for long periods. It is unclear, however, whether they also have an ability to perform complex relational reasoning with the information they remember. Here, we first confirm our intuitions that standard memory architectures may struggle at tasks that heavily involve an understanding of the ways in which entities are connected -- i.e., tasks involving relational reasoning. We then improve upon these deficits by using a new memory module -- a Relational Memory Core (RMC) -- which employs multi-head dot product attention to allow memories to interact. Finally, we test the RMC on a suite of tasks that may profit from more capable relational reasoning across sequential information, and show large gains in RL domains (BoxWorld & Mini PacMan), program evaluation, and language modeling, achieving state-of-the-art results on the WikiText-103, Project Gutenberg, and GigaWord datasets.
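As a rough illustration of how multi-head dot product attention lets memory slots interact, the sketch below has every slot attend over all slots and concatenates the heads. It is not the RMC itself: the scaling by the square root of the head size, the random projection matrices, and all names (`attend`, `multi_head_attend`, `random_matrix`) are assumptions for illustration, and any further processing the full module applies is omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(memory, w_q, w_k, w_v):
    """One head: every memory slot queries every slot (itself included)."""
    q, k, v = matmul(memory, w_q), matmul(memory, w_k), matmul(memory, w_v)
    scale = math.sqrt(len(q[0]))  # assumed scaling, as in scaled dot-product attention
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # (slots, slots) pairwise dot products
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def multi_head_attend(memory, heads):
    """Run each head and concatenate the per-slot outputs."""
    outputs = [attend(memory, *head)[0] for head in heads]
    return [sum((out[i] for out in outputs), []) for i in range(len(memory))]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned parameters."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
slots, dim, head_dim, num_heads = 4, 8, 4, 2
memory = random_matrix(slots, dim, rng)
heads = [
    tuple(random_matrix(dim, head_dim, rng) for _ in range(3))
    for _ in range(num_heads)
]
updated_memory = multi_head_attend(memory, heads)
```

Each row of `updated_memory` mixes information from all slots, which is the sense in which the memories "interact"; in a trained model the projections would be learned rather than sampled.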
Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound
Unsupervised image-to-image translation is a class of computer vision problems that aims at modeling the conditional distribution of images in the target domain, given a set of unpaired images in the source and target domains. An image in the source domain might have multiple representations in the target domain. Therefore, ambiguity in modeling the conditional distribution arises, especially when the images in the source and target domains come from different modalities. Current approaches mostly rely on simplifying assumptions to map both domains into a shared latent space. Consequently, they are only able to model the domain-invariant information between the two modalities. These approaches cannot model domain-specific information which has no representation in the target domain. In this work, we propose an unsupervised image-to-image translation framework which maximizes a domain-specific variational information bound and learns the target domain-invariant representation of the two domains. The proposed framework makes it possible to map a single source image into multiple images in the target domain, utilizing several target domain-specific codes sampled randomly from the prior distribution, or extracted from reference images.
Estimators for Multivariate Information Measures in General Probability Spaces
Information-theoretic quantities play an important role in various settings in machine learning, including causality testing, structure inference in graphical models, time-series problems, and feature selection, as well as in providing privacy guarantees. A key quantity of interest is the mutual information and generalizations thereof, including conditional mutual information, multivariate mutual information, total correlation and directed information. While the aforementioned information quantities are well defined in arbitrary probability spaces, existing estimators employ a $\Sigma H$ method, which works only in purely discrete or purely continuous spaces, since entropy (or differential entropy) is well defined only in those regimes. In this paper, we define a general graph divergence measure ($\mathbb{GDM}$), generalizing the aforementioned information measures, and we construct a novel estimator via a coupling trick that directly estimates these multivariate information measures using the Radon-Nikodym derivative. These estimators are proven to be consistent in a general setting which includes several cases where the existing estimators fail, thus providing the only known estimators for the following settings: (1) the data has some discrete and some continuous-valued components; (2) some (or all) of the components themselves are discrete-continuous mixtures; (3) the data is real-valued but does not have a joint density on the entire space, and is instead supported on a low-dimensional manifold. We show that our proposed estimators significantly outperform known estimators on synthetic and real datasets.
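For orientation, the contrast drawn in this abstract can be written out for plain mutual information. The first line is the standard sum-of-entropies decomposition that a $\Sigma H$ approach relies on; the second is the standard measure-theoretic definition, which needs only that the joint law be absolutely continuous with respect to the product of the marginals. Both are textbook identities given here as a sketch, not the paper's $\mathbb{GDM}$ definition or its estimator.

```latex
% Sum-of-entropies form: requires each (differential) entropy to exist
\[
  I(X;Y) = H(X) + H(Y) - H(X,Y)
\]
% General form via the Radon-Nikodym derivative of the joint law
% with respect to the product of the marginals
\[
  I(X;Y) = \int \log \frac{dP_{XY}}{d\left(P_X \otimes P_Y\right)} \, dP_{XY}
\]
```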
AI firm Anthropic seeks weapons expert to stop users from 'misuse'
The US artificial intelligence (AI) firm Anthropic is looking to hire a chemical weapons and high-yield explosives expert to try to prevent catastrophic misuse of its software. In other words, it fears that its AI tools might tell someone how to make chemical or radioactive weapons, and wants an expert to ensure its guardrails are sufficiently robust. In the LinkedIn recruitment post, the firm says applicants should have a minimum of five years' experience in chemical weapons and/or explosives defence as well as knowledge of radiological dispersal devices - also known as dirty bombs. The firm told the BBC the role was similar to jobs in other sensitive areas that it has already created. Anthropic is not the only AI firm adopting this strategy.
Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach
Potential-based reward shaping is a powerful technique for accelerating convergence of reinforcement learning algorithms. Typically, the shaping information includes an estimate of the optimal value function and is often provided by a human expert or other sources of domain knowledge. However, this information is often biased or inaccurate and can mislead many reinforcement learning algorithms. In this paper, we apply Bayesian Model Combination with multiple experts in a way that learns to trust a good combination of experts as training progresses. This approach is both computationally efficient and general, and is shown numerically to improve convergence across discrete and continuous domains and different reinforcement learning algorithms.
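A minimal sketch of the shaping step may help. It uses the standard potential-based form, where the bonus added to the environment reward is the discounted potential of the next state minus the potential of the current state, and it mixes several expert potentials with fixed weights. The weights, the two toy experts, and every function name are assumptions for illustration; the Bayesian update that would adjust the weights during training is not shown.

```python
def combined_potential(state, experts, weights):
    """Weighted mixture of the experts' potential functions."""
    return sum(w * phi(state) for w, phi in zip(weights, experts))


def shaped_reward(reward, state, next_state, experts, weights, gamma=0.99):
    """Environment reward plus the shaping term gamma * Phi(s') - Phi(s)."""
    phi_s = combined_potential(state, experts, weights)
    phi_next = combined_potential(next_state, experts, weights)
    return reward + gamma * phi_next - phi_s


# Toy chain world with the goal at position 10 (hypothetical setup).
def good_expert(s):
    return -abs(10 - s)  # potential rises toward the goal


def misleading_expert(s):
    return -abs(s)  # potential rises toward the start, i.e. biased advice


experts = [good_expert, misleading_expert]
weights = [0.9, 0.1]  # assumed fixed here; a learned combination would change over time

# Moving from state 3 to state 4 with zero environment reward and no discounting.
bonus = shaped_reward(0.0, 3, 4, experts, weights, gamma=1.0)
```

With most of the weight on the informative expert, a step toward the goal earns a positive bonus (about 0.8 in this toy case) even though the second expert penalises it.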
Unsupervised Video Object Segmentation for Deep Reinforcement Learning
We present a new technique for deep reinforcement learning that automatically detects moving objects and uses the relevant information for action selection. The detection of moving objects is done in an unsupervised way by exploiting structure from motion. Instead of directly learning a policy from raw images, the agent first learns to detect and segment moving objects by exploiting flow information in video sequences. The learned representation is then used to focus the policy of the agent on the moving objects. Over time, the agent identifies which objects are critical for decision making and gradually builds a policy based on relevant moving objects.
Geometry-Aware Recurrent Neural Networks for Active Visual Recognition
We present recurrent geometry-aware neural networks that integrate visual information across multiple views of a scene into 3D latent feature tensors, while maintaining a one-to-one mapping between 3D physical locations in the world scene and latent feature locations. Object detection, object segmentation, and 3D reconstruction are then carried out directly using the constructed 3D feature memory, as opposed to any of the input 2D images. The proposed models are equipped with differentiable egomotion-aware feature warping and (learned) depth-aware unprojection operations to achieve geometrically consistent mapping between the features in the input frame and the constructed latent model of the scene. We empirically show the proposed model generalizes much better than geometry-unaware LSTM/GRU networks, especially in the presence of multiple objects and cross-object occlusions. Combined with active view selection policies, our model learns to select informative viewpoints to integrate information from by "undoing" cross-object occlusions, seamlessly combining geometry with learning from experience.
Watch Your Step: Learning Node Embeddings via Graph Attention
Graph embedding methods represent nodes in a continuous vector space, preserving different types of relational information from the graph. There are many hyper-parameters to these methods (e.g. the length of a random walk) which have to be manually tuned for every graph. In this paper, we replace previously fixed hyper-parameters with trainable ones that we automatically learn via backpropagation. In particular, we propose a novel attention model on the power series of the transition matrix, which guides the random walk to optimize an upstream objective. Unlike previous approaches to attention models, the method that we propose utilizes attention parameters exclusively on the data itself (e.g. on the random walk), and these parameters are not used by the model for inference. We experiment on link prediction tasks, as we aim to produce embeddings that best preserve the graph structure, generalizing to unseen information. We improve state-of-the-art results on a comprehensive suite of real-world graph datasets including social, collaboration, and biological networks, where we observe that our graph attention model can reduce the error by up to 20%-40%. We show that our automatically learned attention parameters can vary significantly per graph, and correspond to the optimal choice of hyper-parameter if we manually tune existing methods.
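One way to picture "attention on the power series of the transition matrix" is as a softmax-weighted sum of the first few powers of the row-normalised adjacency matrix, so that each weight says how much walks of that length contribute. The sketch below computes that sum for fixed logits. The function names, the tiny path graph, and the uniform logits are assumptions for illustration; in the approach described above the attention parameters would be trained by backpropagation rather than set by hand.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transition_matrix(adj):
    """Row-normalise an adjacency matrix (assumes no isolated nodes)."""
    return [[v / sum(row) for v in row] for row in adj]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attended_walk_matrix(adj, logits):
    """Return sum_k q_k * T^k for k = 1..len(logits), with q = softmax(logits)."""
    t = transition_matrix(adj)
    q = softmax(logits)
    n = len(adj)
    result = [[0.0] * n for _ in range(n)]
    power = t  # T^1
    for q_k in q:
        result = [
            [r + q_k * p for r, p in zip(res_row, pow_row)]
            for res_row, pow_row in zip(result, power)
        ]
        power = matmul(power, t)  # advance to the next power of T
    return result


# Path graph on three nodes: 0 - 1 - 2 (hypothetical example).
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
walk_stats = attended_walk_matrix(adjacency, logits=[0.0, 0.0, 0.0])
```

Because the weights sum to one and every power of a row-stochastic matrix is row-stochastic, each row of `walk_stats` is itself a probability distribution over nodes; raising one logit shifts mass toward walks of that length.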
FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction
The basic principles in designing convolutional neural network (CNN) structures for predicting objects on different levels, e.g., image-level, region-level, and pixel-level, are diverging. Generally, network structures designed specifically for image classification are directly used as the default backbone structure for other tasks including detection and segmentation, but a backbone structure is seldom designed with the aim of unifying the advantages of networks designed for pixel-level or region-level prediction tasks, which may require very deep features with high resolution. Towards this goal, we design a fish-like network, called FishNet. In FishNet, the information of all resolutions is preserved and refined for the final task. Besides, we observe that existing works still cannot directly propagate the gradient information from deep layers to shallow layers. Our design can better handle this problem. Extensive experiments have been conducted to demonstrate the remarkable performance of FishNet. In particular, on ImageNet-1k, the accuracy of FishNet is able to surpass the performance of DenseNet and ResNet with fewer parameters. FishNet was applied as one of the modules in the winning entry of the COCO Detection 2018 challenge.