SI-Bench: Benchmarking Social Intelligence of Large Language Models in Human-to-Human Conversations