Memory-augmented Transformers can implement Linear First-Order Optimization Methods