APPENDIX A: Overview of group representations
In this section we briefly introduce the representation theory of the three groups used in this work.

Planar rotations group SO(2)
The standard representation of $r_\theta \in SO(2)$ is as a $2 \times 2$ rotation matrix
$$\rho(r_\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$
The complex irreducible representations are often used and correspond to the circular harmonics.

Planar rotations and reflections group O(2)
The standard representation of $O(2)$ is as a $2 \times 2$ orthogonal matrix
$$\rho(r_\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad \text{and} \quad \rho(r_\theta f) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
Apart from the trivial representation $\rho_{0,0}(h) = 1 \;\; \forall h \in O(2)$ and the sign-flip representation, with $\rho_{1,0}(r_\theta) = 1$ and $\rho_{1,0}(f) = -1$, all other irreps are 2-dimensional.

The irreducible representations of $SO(3)$ are isomorphic to the Wigner D matrices. In particular, $\rho_0$ is the trivial representation and $\rho_1$ is isomorphic to the standard representation of $SO(3)$ as $3 \times 3$ rotation matrices.

An element $g = (m, r) \in O(3)$ is a pair of a mirroring $m \in \{e, m_z\}$ and a rotation $r \in SO(3)$.

In general, if $G$ is a group, we denote by $\widehat{G}$ the set of its irreducible representations.

Recall the generative process for cryo-EM images:
$$o_i = [\ldots](g_i^{-1}) \quad \text{with } g_i \in SO(3) \tag{12}$$
Let $R_z = SO(2) < SO(3)$ be the subgroup of $SO(3)$ containing the rotations around the Z axis, and let $H = O(2) < SO(3)$ be the subgroup that also contains the rotation $r_y$ by $\pi$ around the Y axis.
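The defining properties of the standard representations above can be checked numerically. The following is a minimal sketch in plain Python. It assumes the reflection f acts as diag(1, -1), which is a common convention but is not confirmed by the text; the helper names (`rot`, `FLIP`, `matmul`, `close`, `det`) are illustrative only.

```python
import math

def rot(theta):
    """Standard 2x2 representation of the rotation r_theta in SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Assumption: the reflection f is represented by diag(1, -1).
FLIP = [[1.0, 0.0], [0.0, -1.0]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

def det(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Homomorphism property on SO(2): rho(r_a) rho(r_b) = rho(r_{a+b}).
print(close(matmul(rot(0.3), rot(0.5)), rot(0.8)))

# O(2) relation: conjugating a rotation by f inverts it, f r_theta f = r_{-theta}.
print(close(matmul(matmul(FLIP, rot(0.7)), FLIP), rot(-0.7)))

# Elements of the form r_theta f have determinant -1 (they are reflections).
print(abs(det(matmul(rot(0.4), FLIP)) + 1.0) < 1e-12)
```

Under the stated assumption, each of the three checks should print `True`. The same pattern extends to the one-dimensional sign-flip representation, which sends every rotation to 1 and every reflection to -1.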
CAK: Emergent Audio Effects from Minimal Deep Learning
We demonstrate that a single 3x3 convolutional kernel can produce emergent audio effects when trained on 200 samples from a personalized corpus. We achieve this through two key techniques: (1) Conditioning Aware Kernels (CAK), where output = input + (learned_pattern × control), with a soft-gate mechanism supporting identity preservation at zero control; and (2) AuGAN (Audit GAN), which reframes adversarial training from "is this real?" to "did you apply the requested value?" Rather than learning to generate or detect forgeries, our networks cooperate to verify that the requested control was applied, and in doing so discover unique transformations. The learned kernel exhibits a diagonal structure that creates frequency-dependent temporal shifts, which can produce musical effects that depend on input characteristics. Our results show the potential of adversarial training to discover audio transformations from minimal data, enabling new approaches to effect design.
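The CAK relation output = input + (learned_pattern × control) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the input is treated as a small 2D magnitude grid (rows as frequency bins, columns as time frames), the learned pattern is taken to be a zero-padded 3x3 cross-correlation of the input, and the soft gate is a hypothetical `tanh` function chosen only because it vanishes at zero control. The kernel values and all function names are invented for the example.

```python
import math

def conv3x3(x, kernel):
    """Zero-padded 3x3 cross-correlation over a 2D list
    (rows = frequency bins, columns = time frames)."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = acc
    return out

def soft_gate(control, sharpness=5.0):
    """Hypothetical soft gate: exactly 0 at control = 0, near 1 for large control.
    The gate actually used by CAK is not specified in the text."""
    return math.tanh(sharpness * control)

def cak_apply(x, kernel, control):
    """output = input + learned_pattern * control, scaled by the assumed gate."""
    pattern = conv3x3(x, kernel)
    scale = control * soft_gate(control)
    return [[x[i][j] + pattern[i][j] * scale for j in range(len(x[0]))]
            for i in range(len(x))]

# Illustrative diagonal kernel (made-up values, not the learned kernel).
kernel = [[0.5, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, -0.5]]

# A single unit of energy in the centre bin.
x = [[0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0]]

print(cak_apply(x, kernel, 0.0) == x)   # zero control leaves the input unchanged
print(cak_apply(x, kernel, 1.0))        # energy spreads along the diagonal
```

With this residual form, identity at zero control holds by construction, since the pattern term is multiplied by zero. A diagonal kernel couples neighbouring frequency bins with neighbouring time frames, which is one way to picture the frequency-dependent temporal shifts described above.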