Joint Modeling of Visual Objects and Relations for Scene Graph Generation (Supplementary Material)

Neural Information Processing Systems 

Based on the formulation of the likelihood function $p_\Theta(G|I) = f_\Theta(G,I)/Z_\Theta(I)$, we can reformulate the gradient of the log-likelihood function as: $\nabla_\Theta L(\Theta) = \mathbb{E}_{G \sim p_d}[\nabla_\Theta \log f_\Theta(G,I)] - \nabla_\Theta \log Z_\Theta(I)$.

Theorem 2. In the initialization phase, the potential function $\psi_{\text{triplet}}(r, y_{o_h}, y_{o_t})$ for modeling label dependency is omitted in $p(G|I)$, yielding a simplified model distribution $\hat{p}(G|I)$. In this case, we can derive exactly that $q(G) = \hat{p}(G|I)$.

Theorem 3. In the update phase, we use the full expression of $p(G|I)$ with the potential function $\psi_{\text{triplet}}(r, y_{o_h}, y_{o_t})$ for modeling label dependency. In this case, maximizing $L(q)$ is equivalent to minimizing the KL divergence term, and the minimum occurs when $q(y_o) = p(y_o, I)$.
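The gradient identity above can be checked numerically on a toy model. The sketch below is illustrative only and is not the paper's model: it assumes a graph variable G with three discrete states, drops the image I, and uses a hypothetical unnormalized score log f(G) = theta[G]. It also relies on the standard fact, not stated in the excerpt, that the gradient of log Z equals the expectation of the gradient of log f under the model distribution.

```python
import math

def log_likelihood(theta, p_data):
    """Expected log-likelihood E_{G~p_d}[log f(G) - log Z], with log f(G) = theta[G]."""
    log_z = math.log(sum(math.exp(t) for t in theta))
    return sum(p * (t - log_z) for p, t in zip(p_data, theta))

def analytic_gradient(theta, p_data):
    """E_{p_d}[grad log f] - grad log Z, using grad log Z = E_{p_theta}[grad log f]."""
    z = sum(math.exp(t) for t in theta)
    p_model = [math.exp(t) / z for t in theta]
    # The gradient of log f(G) w.r.t. theta[k] is the indicator 1[G == k],
    # so each expectation reduces to the probability of state k.
    return [pd - pm for pd, pm in zip(p_data, p_model)]

def numeric_gradient(theta, p_data, eps=1e-6):
    """Central finite differences of the expected log-likelihood."""
    grads = []
    for k in range(len(theta)):
        up = list(theta)
        up[k] += eps
        down = list(theta)
        down[k] -= eps
        diff = log_likelihood(up, p_data) - log_likelihood(down, p_data)
        grads.append(diff / (2 * eps))
    return grads

# Hypothetical parameters and data distribution, chosen only for illustration.
theta = [0.3, -1.2, 0.8]
p_data = [0.5, 0.2, 0.3]

analytic = analytic_gradient(theta, p_data)
numeric = numeric_gradient(theta, p_data)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_gap)
```

In this toy setting the two gradients should agree up to finite-difference error, and the gradient vanishes exactly when the model distribution matches the data distribution.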
