Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack