cf9a242b70f45317ffd281241fa66502-AuthorFeedback.pdf
We thank the reviewers for their close reading of the paper and helpful feedback. For example, one can use the density ratio estimates provided by DualDICE to modify (importance-weight) the off-policy data distribution before passing it to a policy gradient or Q-learning method. The figures are overall too small... In Figure 2 the x-axis label is missing. The x-axis is the training step.
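The importance-weighting idea mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the density ratio estimates are already available as one non-negative number per logged sample, and the function names `importance_weighted_mean` and `resample_by_ratio` are hypothetical.

```python
import random


def importance_weighted_mean(values, ratios):
    """Self-normalized importance-weighted average of per-sample values.

    `ratios` stands in for density ratio estimates (assumed to be given,
    e.g. by an estimator such as DualDICE); `values` could be rewards or
    per-sample losses.
    """
    if len(values) != len(ratios):
        raise ValueError("values and ratios must have the same length")
    total = sum(ratios)
    if total <= 0:
        raise ValueError("ratios must have a positive sum")
    return sum(w * v for w, v in zip(ratios, values)) / total


def resample_by_ratio(transitions, ratios, k, seed=0):
    """Draw k transitions with probability proportional to their ratios.

    The resampled batch approximates the re-weighted data distribution and
    could then be handed to a downstream learner (hypothetical usage).
    """
    rng = random.Random(seed)
    return rng.choices(transitions, weights=ratios, k=k)
```

Weighting per-sample losses and resampling the data are two interchangeable ways to apply the same correction; which one is more convenient depends on the downstream learner.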
0e900ad84f63618452210ab8baae0218-AuthorFeedback.pdf
All hyper-parameters are the same as the ones used in the paper or are the A2C defaults. The same set of auxiliary tasks is also used. Ability to separate harmful auxiliary tasks: In Figure 3 of the original paper, we show that AutoEncoder is a harmful auxiliary task for the Finger Turn environment. Here, a toy example in Figure 7 with one positive auxiliary task and one harmful auxiliary task shows that our algorithm is able to avoid adversarial auxiliary tasks without any prior knowledge.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.06)
- North America > United States > Oregon > Multnomah County > Portland (0.05)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
Adversarial Crowdsourcing Through Robust Rank-One Matrix Completion
Notation and conventions: $[n] = \{1, \dots, n\}$; $|S|$ is the size of set $S$; $\lceil x \rceil$ is the smallest integer greater than $x$; $\lfloor x \rfloor$ is the largest integer smaller than $x$; $\|X\|_*$ is the nuclear norm of matrix $X$, i.e., the sum of the singular values of matrix $X$; $\mathbb{Z}^+$ is the set of positive integers; $\mathbb{Z}_{>i}$ is the set of integers which are greater than $i$; given $S_1$, $S_2$, the reduction of $S_1$ by $S_2$ is denoted as $S_1 \setminus S_2 = \{i \in S_1 : i \notin S_2\}$; finally, $A(n) \sim B(n)$ means $A(n)/B(n) \to 1$ as $n \to \infty$.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England > Lancashire > Lancaster (0.04)
- Asia > Middle East > Jordan (0.04)
Supplementary Material
We adopt four bioinformatics datasets in the experiment. Given the input graph, it will randomly add or cut a certain portion of connections between nodes with a probability of 0.2. It will set the features of 20% of the nodes in the graph to Gaussian noise whose mean and standard deviation are both 0.5. We adopt the Adam [5] optimizer, which is a variant of Stochastic Gradient Descent (SGD) with adaptive moment estimation.
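The two perturbations described above can be sketched in plain Python. This is one possible reading under stated assumptions, not the authors' code: the graph is treated as undirected and simple, "add or cut with probability 0.2" is interpreted as independently flipping each node pair with probability `p`, and the function names `perturb_edges` and `perturb_features` are hypothetical.

```python
import random


def perturb_edges(num_nodes, edges, p=0.2, seed=0):
    """Flip each undirected node pair with probability p.

    An existing edge is cut, a missing edge is added (assumed reading of
    the perturbation; self-loops are ignored).
    """
    rng = random.Random(seed)
    current = {tuple(sorted(e)) for e in edges}
    perturbed = set()
    for u in range(num_nodes):
        for v in range(u + 1, num_nodes):
            present = (u, v) in current
            if rng.random() < p:
                present = not present
            if present:
                perturbed.add((u, v))
    return perturbed


def perturb_features(features, fraction=0.2, mean=0.5, std=0.5, seed=0):
    """Replace the features of a random fraction of nodes with Gaussian noise."""
    rng = random.Random(seed)
    n = len(features)
    k = round(fraction * n)
    chosen = set(rng.sample(range(n), k))
    noisy = [
        [rng.gauss(mean, std) for _ in row] if i in chosen else list(row)
        for i, row in enumerate(features)
    ]
    return noisy, chosen
```

Fixing the seed keeps the perturbed graph reproducible across runs, which helps when comparing methods on the same corrupted input.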
- Asia > China (0.05)
- Oceania > Australia > Western Australia > Perth (0.05)
- North America > United States > Massachusetts > Suffolk County > Boston (0.05)
- North America > United States > New York (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > China (0.04)