Can we use the gradient descent method in a maximum entropy model?