Bayesian optimization for sparse neural networks with trainable activation functions