What Happens to a Dataset Transformed by a Projection-based Concept Removal Method?
–arXiv.org Artificial Intelligence
We investigate the behavior of methods that use linear projections to remove information about a concept from a language representation, and we consider the question of what happens to a dataset transformed by such a method. A theoretical analysis and experiments on real-world and synthetic data show that these methods inject strong statistical dependencies into the transformed datasets. After applying such a method, the representation space is highly structured: in the transformed space, an instance tends to be located near instances of the opposite label. As a consequence, the original labeling can in some cases be reconstructed by applying an anti-clustering method.
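To make the setting concrete, the following is a minimal sketch of one very simple projection-based removal step on synthetic data. It is an illustration under stated assumptions, not the methods studied in the paper: the concept direction is taken to be the difference of class means, and the names `remove_concept_direction` and `opposite_label_neighbor_rate` are hypothetical helpers introduced here. The second helper measures how often an instance's nearest neighbor carries the opposite label, which is the kind of structure the abstract describes; how large that rate turns out to be depends on the data and the removal method.

```python
import numpy as np


def remove_concept_direction(X, y):
    """Project out the difference-of-class-means direction.

    Assumption for illustration: the concept is binary (labels 0/1) and is
    removed with a single rank-one orthogonal projection.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    w = w / np.linalg.norm(w)
    # Orthogonal projection onto the complement of w.
    P = np.eye(X.shape[1]) - np.outer(w, w)
    return X @ P, P


def opposite_label_neighbor_rate(X, y):
    """Fraction of instances whose nearest neighbor has the other label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    sq = (X ** 2).sum(axis=1)
    # Pairwise squared Euclidean distances.
    d = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d, np.inf)  # an instance is not its own neighbor
    nn = d.argmin(axis=1)
    return float((y[nn] != y).mean())


# Synthetic data (assumed setup): the label shifts the first coordinate.
rng = np.random.default_rng(0)
n, dim = 200, 50
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, dim))
X[:, 0] += 2.0 * y

X_proj, P = remove_concept_direction(X, y)
rate_before = opposite_label_neighbor_rate(X, y)
rate_after = opposite_label_neighbor_rate(X_proj, y)
print(f"opposite-label neighbor rate: before={rate_before:.2f}, after={rate_after:.2f}")
```

After the projection the two class means coincide, so a linear probe along the removed direction no longer separates the labels. Comparing `rate_before` and `rate_after` gives a quick check of whether the transformed space shows the opposite-label neighborhood structure on a given dataset.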
Mar-24-2024
- Country:
  - Oceania > Australia
  - North America > United States
    - Washington > King County
      - Seattle (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.04)
    - California > San Francisco County
      - San Francisco (0.14)
  - Europe
    - Italy (0.04)
    - Czechia > Prague (0.04)
    - Sweden > Västra Götaland
      - Gothenburg (0.04)
    - France > Hauts-de-France
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
  - Asia
    - China > Hong Kong (0.04)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - Africa > Rwanda
- Genre:
  - Research Report > New Finding (0.68)
- Technology: