Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models