Deriving the Scaled-Dot-Function via Maximum Likelihood Estimation and Maximum Entropy Approach