Design, Results and Industry Implications of the World's First Insurance Large Language Model Evaluation Benchmark