Benchmarking atmospheric circulation variability in an AI emulator, ACE2, and a hybrid model, NeuralGCM