Reviews: Greedy Feature Construction

Neural Information Processing Systems 

The paper is well written and easy to understand. My biggest concern is the lack of comparison to other related approaches, such as single index models (SIMs). It seems to me that the method you propose lies somewhere between SIMs and kernel methods, and while you do better than the latter, the former might be better in terms of error, although slower algorithmically or in need of more data. So a comparison is warranted.

- Line 17: here and elsewhere, you use the term "capacity". Can you make the notion of capacity precise?