A Appendix: Optimality of One-Hot Encodings
Neural Information Processing Systems
We present a brief proof of the local optimality of one-hot encodings in the decision-theoretic framework presented in Section 3.2. We seek to prove that, under the assumptions of an identity reward matrix, tokens constrained to the unit hypercube, and additive Gaussian noise, one-hot tokens are an optimally robust communication strategy. We claim only local optimality, as one may trivially generate multiple, equally optimal tokens by, for example, flipping all bits. The following derivation uses the Karush-Kuhn-Tucker (KKT) conditions, a generalization of Lagrange multipliers [17]. We maximize the objective subject to the hypercube constraints. To show that one-hot vectors are an optimum, we verify that they satisfy the constraints and set the relevant derivatives to zero.
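As a sketch of the setup (the exact objective from Section 3.2 is not reproduced here, so the symbols below are illustrative), the constrained problem and the associated KKT conditions take the following form, writing $f$ for the robustness objective and $\lambda, \mu \ge 0$ for the multipliers attached to the box constraints $0 \le x_i \le 1$:

```latex
% Illustrative KKT system for a token x in the unit hypercube.
% f is the (assumed differentiable) robustness objective; lambda and mu
% are the multiplier vectors for the lower and upper box constraints.
\begin{align*}
  \max_{x \in [0,1]^d} \quad & f(x) \\
  \mathcal{L}(x, \lambda, \mu) &= f(x) + \lambda^\top x + \mu^\top (\mathbf{1} - x) \\
  \text{stationarity:} \quad & \nabla f(x) + \lambda - \mu = 0 \\
  \text{complementary slackness:} \quad & \lambda_i x_i = 0, \qquad \mu_i (1 - x_i) = 0 \\
  \text{dual feasibility:} \quad & \lambda_i \ge 0, \qquad \mu_i \ge 0
\end{align*}
```

For a one-hot vector $e_k$, every coordinate lies on the boundary ($x_k = 1$, $x_i = 0$ for $i \ne k$), so every box constraint is active. The stationarity condition then reduces to sign conditions on the gradient: $\partial f / \partial x_k = \mu_k \ge 0$ and $\partial f / \partial x_i = -\lambda_i \le 0$ for $i \ne k$, which is what the verification in this appendix amounts to checking.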