convex cone
- North America > United States > Maryland (0.04)
- North America > Canada (0.04)
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper considers the estimation of an unknown vector v0 from noisy quadratic observations and some additional information regarding v0. Specifically, it considers that the unknown vector v0 is from a convex cone. It rigorously shows that the resulting optimization problem is tractable. Note that the resulting optimization problem in Eq.(3) is non-convex.
Measuring and Controlling Instruction (In)Stability in Language Model Dialogs
Li, Kenneth, Liu, Tianle, Bashkansky, Naomi, Bau, David, Viégas, Fernanda, Pfister, Hanspeter, Wattenberg, Martin
System-prompting is a standard tool for customizing language-model chatbots, enabling them to follow a specific instruction. An implicit assumption in the use of system prompts is that they will be stable, so the chatbot will continue to generate text according to the stipulated instructions for the duration of a conversation. We propose a quantitative benchmark to test this assumption, evaluating instruction stability via self-chats between two instructed chatbots. Testing popular models like LLaMA2-chat-70B and GPT-3.5, we reveal a significant instruction drift within eight rounds of conversations. An empirical and theoretical analysis of this phenomenon suggests the transformer attention mechanism plays a role, due to attention decay over long exchanges. To combat attention decay and instruction drift, we propose a lightweight method called split-softmax, which compares favorably against two strong baselines.
Information Processing by Neuron Populations in the Central Nervous System: Mathematical Structure of Data and Operations
In the intricate architecture of the mammalian central nervous system, neurons form populations. Axonal bundles communicate between these clusters using spike trains. However, these neuron populations' precise encoding and operations have yet to be discovered. In our analysis, the starting point is a state-of-the-art mechanistic model of a generic neuron endowed with plasticity. From this simple framework emerges a subtle mathematical construct: The representation and manipulation of information can be precisely characterized by an algebra of convex cones. Furthermore, these neuron populations are not merely passive transmitters. They act as operators within this algebraic structure, mirroring the functionality of a low-level programming language. When these populations interconnect, they embody succinct yet potent algebraic expressions. These networks allow them to implement many operations, such as specialization, generalization, novelty detection, dimensionality reduction, inverse modeling, prediction, and associative memory. In broader terms, this work illuminates the potential of matrix embeddings in advancing our understanding in fields like cognitive science and AI. These embeddings enhance the capacity for concept processing and hierarchical description over their vector counterparts.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (9 more...)
Embedding Ontologies in the Description Logic ALC by Axis-Aligned Cones
Özcep, Özgür Lütfü (University of Lübeck) | Leemhuis, Mena | Wolter, Diedrich
This paper is concerned with knowledge graph embedding with background knowledge, taking the formal perspective of logics. In knowledge graph embedding, knowledge— expressed as a set of triples of the form (a R b) (“a is R-related to b”)—is embedded into a real-valued vector space. The embedding helps exploiting geometrical regularities of the space in order to tackle typical inductive tasks of machine learning such as link prediction. Recent embedding approaches also consider incorporating background knowledge, in which the intended meanings of the symbols a, R, b are further constrained via axioms of a theory. Of particular interest are theories expressed in a formal language with a neat semantics and a good balance between expressivity and feasibility. In that case, the knowledge graph together with the background can be considered to be an ontology. This paper develops a cone-based theory for embedding in order to advance the expressivity of the ontology: it works (at least) with ontologies expressed in the description logic ALC, which comprises restricted existential and universal quantifiers, as well as concept negation and concept disjunction. In order to align the classical Tarskian Style semantics for ALC with the sub-symbolic representation of triples, we use the notion of a geometric model of an ALC ontology and show, as one of our main results, that an ALC ontology is satisfiable in the classical sense iff it is satisfiable by a geometric model based on cones. The geometric model, if treated as a partial model, can even be chosen to be faithful, i.e., to reflect all and only the knowledge captured by the ontology. We introduce the class of axis-aligned cones and show that modulo simple geometric operations any distributive logic (such as ALC) interpreted over cones employs this class of cones. Cones are also attractive from a machine learning perspective on knowledge graph embeddings since they give rise to applying conic optimization techniques.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Blackwell's Approachability with Time-Dependent Outcome Functions and Dot Products. Application to the Big Match
Blackwell's approachability is a very general sequential decision framework where a Decision Maker obtains vector-valued outcomes, and aims at the convergence of the average outcome to a given "target" set. Blackwell gave a sufficient condition for the decision maker having a strategy guaranteeing such a convergence against an adversarial environment, as well as what we now call the Blackwell's algorithm, which then ensures convergence. Blackwell's approachability has since been applied to numerous problems, in online learning and game theory, in particular. We extend this framework by allowing the outcome function and the dot product to be time-dependent. We establish a general guarantee for the natural extension to this framework of Blackwell's algorithm. In the case where the target set is an orthant, we present a family of time-dependent dot products which yields different convergence speeds for each coordinate of the average outcome. We apply this framework to the Big Match (one of the most important toy examples of stochastic games) where an $\epsilon$-uniformly optimal strategy for Player I is given by Blackwell's algorithm in a well-chosen auxiliary approachability problem.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Texas (0.04)
Refined approachability algorithms and application to regret minimization with global costs
Blackwell's approachability is a framework where two players, the Decision Maker and the Environment, play a repeated game with vector-valued payoffs. The goal of the Decision Maker is to make the average payoff converge to a given set called the target. When this is indeed possible, simple algorithms which guarantee the convergence are known. This abstract tool was successfully used for the construction of optimal strategies in various repeated games, but also found several applications in online learning. By extending an approach proposed by Abernethy et al. (2011), we construct and analyze a class of Follow the Regularized Leader algorithms (FTRL) for Blackwell's approachability which are able to minimize not only the Euclidean distance to the target set (as it is often the case in the context of Blackwell's approachability) but a wide range of distance-like quantities. This flexibility enables us to apply these algorithms to minimize the exact quantity of interest in various online learning problems. In particular, for regret minimization with global costs, we obtain novel guarantees for general norm cost functions, and for the case of $\ell_p$ cost functions, we obtain the first regret bounds with explicit dependence in $p$ and the dimension $d$.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
Orthologics for Cones
Leemhuis, Mena, Özçep, Özgür L., Wolter, Diedrich
In applications that use knowledge representation (KR) techniques, in particular those that combine data-driven and logic methods, the domain of objects is not an abstract unstructured domain, but it exhibits a dedicated, deep structure of geometric objects. One example is the class of convex sets used to model natural concepts in conceptual spaces, which also links via convex optimization techniques to machine learning. In this paper we study logics for such geometric structures. Using the machinery of lattice theory, we describe an extension of minimal orthologic with a partial modularity rule that holds for closed convex cones. This logic combines a feasible data structure (exploiting convexity/conicity) with sufficient expressivity, including full orthonegation (exploiting conicity).
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (4 more...)
Non-Sparse PCA in High Dimensions via Cone Projected Power Iteration
In this paper, we propose a cone projected power iteration algorithm to recover the principal eigenvector from a noisy positive semidefinite matrix. When the true principal eigenvector is assumed to belong to a convex cone, the proposed algorithm is fast and has a tractable error. Specifically, the method achieves polynomial time complexity for certain convex cones equipped with fast projection such as the monotone cone. It attains a small error when the noisy matrix has a small cone-restricted operator norm. We supplement the above results with a minimax lower bound of the error under the spiked covariance model. Our numerical experiments on simulated and real data, show that our method achieves shorter run time and smaller error in comparison to the ordinary power iteration and some sparse principal component analysis algorithms if the principal eigenvector is in a convex cone.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York (0.04)
- Asia > Middle East > Jordan (0.04)
Reinforcement Learning with Convex Constraints
Miryoosefi, Sobhan, Brantley, Kianté, Daumé, Hal III, Dudik, Miroslav, Schapire, Robert
In standard reinforcement learning (RL), a learning agent seeks to optimize the overall reward. However, many key aspects of a desired behavior are more naturally expressed as constraints. For instance, the designer may want to limit the use of unsafe actions, increase the diversity of trajectories to enable exploration, or approximate expert trajectories when rewards are sparse. In this paper, we propose an algorithmic scheme that can handle a wide class of constraints in RL tasks, specifically, any constraints that require expected values of some vector measurements (such as the use of an action) to lie in a convex set. This captures previously studied constraints (such as safety and proximity to an expert), but also enables new classes of constraints (such as diversity). Our approach comes with rigorous theoretical guarantees and only relies on the ability to approximately solve standard RL tasks. As a result, it can be easily adapted to work with any model-free or model-based RL algorithm. In our experiments, we show that it matches previous algorithms that enforce safety via constraints, but can also enforce new properties that these algorithms cannot incorporate, such as diversity.
- North America > United States > Maryland (0.04)
- Asia > Middle East > Jordan (0.04)