Visual-information-driven model for crowd simulation using temporal convolutional network