The problem comes when he talks about the hidden layers. He basically says that the more hidden layers the better (at the price of being more 'expensive' to compute), and that the number of neurons in each layer should be comparable to the number of inputs, or greater. But this explanation seems kind of vague/random: there is an infinite number of combinations to choose from. Do you just try them one by one until some architecture seems to work? For example, what architecture would you use to make a program that distinguishes numbers from 1 to 10, say, on a 50x50 pixel window? How would you come up with that?
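To put a rough number on the 'expensive' part, here is a small sketch that counts the weights and biases for the 50x50 example. It rests on assumptions that are mine rather than his: the network is plain fully connected, each pixel is one input (2500 inputs), every hidden layer has as many neurons as there are inputs, and there are 10 outputs, one per number. The helper name `count_parameters` is made up for illustration.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of neurons per layer, inputs first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # One weight per connection, plus one bias per neuron in the next layer.
        total += n_in * n_out + n_out
    return total


n_inputs = 50 * 50  # assumption: one input per pixel of the 50x50 window
n_outputs = 10      # assumption: one output per number to distinguish

for n_hidden in (1, 2, 3):
    # Assumption: each hidden layer is as wide as the input layer.
    sizes = [n_inputs] + [n_inputs] * n_hidden + [n_outputs]
    print(n_hidden, "hidden layer(s):", count_parameters(sizes), "parameters")
```

Under these assumptions each extra hidden layer adds about 6.25 million parameters, so the cost of "more layers" grows quickly even for such a small image. It says nothing about which of these architectures would actually work best.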
Jun-10-2016, 05:05:24 GMT