How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not