On the Linguistic and Computational Requirements for Creating Face-to-Face Multimodal Human-Machine Interaction