We aim to jointly optimize the antenna tilt angle and the vertical and horizontal half-power beamwidths of the macrocells in a heterogeneous cellular network (HetNet) via a synergistic combination of deep learning (DL) and reinforcement learning (RL). The interactions between the cells, arising most notably from their coupled interference and the large number of users, render this optimization problem prohibitively complex. This complexity makes the proposed deep RL technique attractive as a practical online solution for real deployments, which should automatically adapt to new base stations being added and to other environmental changes in the network. In the proposed algorithm, DL is used to extract features by learning the locations of the users, and mean field RL is used to learn the average interference values for different antenna settings. Our results illustrate that the proposed deep RL algorithm can approach the optimum weighted sum rate with hundreds of online trials, as opposed to millions of trials for standard Q-learning, assuming relatively low environmental dynamics. Furthermore, the proposed algorithm is compact and implementable, and empirically appears to provide a performance guarantee regardless of the amount of environmental dynamics.