DiVISe: Direct Visual-Input Speech Synthesis Preserving Speaker Characteristics And Intelligibility