Weiss, Martin
Language Models Can Reduce Asymmetry in Information Markets
Rahaman, Nasim, Weiss, Martin, Wüthrich, Manuel, Bengio, Yoshua, Li, Li Erran, Pal, Chris, Schölkopf, Bernhard
This work addresses the buyer's inspection paradox for information markets. The paradox is that buyers need to access information to determine its value, while sellers need to limit access to prevent theft. To study this, we introduce an open-source simulated digital marketplace where intelligent agents, powered by language models, buy and sell information on behalf of external participants. The central mechanism enabling this marketplace is the agents' dual capabilities: they not only have the capacity to assess the quality of privileged information but also come equipped with the ability to forget. This ability to induce amnesia allows vendors to grant temporary access to proprietary information, significantly reducing the risk of unauthorized retention while enabling agents to accurately gauge the information's relevance to specific queries or tasks. To perform well, agents must make rational decisions, strategically explore the marketplace through generated sub-queries, and synthesize answers from purchased information. Concretely, our experiments (a) uncover biases in language models leading to irrational behavior and evaluate techniques to mitigate these biases, (b) investigate how price affects demand in the context of informational goods, and (c) show that inspection and higher budgets both lead to higher quality outcomes.
Visual Question Answering From Another Perspective: CLEVR Mental Rotation Tests
Beckham, Christopher, Weiss, Martin, Golemo, Florian, Honari, Sina, Nowrouzezahrai, Derek, Pal, Christopher
Different types of mental rotation tests have been used extensively in psychology to understand human visual reasoning and perception. Understanding what an object or visual scene would look like from another viewpoint is a challenging problem that is made even harder if it must be performed from a single image. We explore a controlled setting whereby questions are posed about the properties of a scene if that scene was observed from another viewpoint. To do this we have created a new version of the CLEVR dataset that we call CLEVR Mental Rotation Tests (CLEVR-MRT). Using CLEVR-MRT we examine standard methods, show how they fall short, then explore novel neural architectures that involve inferring volumetric representations of a scene. These volumes can be manipulated via camera-conditioned transformations to answer the question. We examine the efficacy of different model variants through rigorous ablations and demonstrate the efficacy of volumetric representations.
gradSim: Differentiable simulation for system identification and visuomotor control
Jatavallabhula, Krishna Murthy, Macklin, Miles, Golemo, Florian, Voleti, Vikram, Petrini, Linda, Weiss, Martin, Considine, Breandan, Parent-Levesque, Jerome, Xie, Kevin, Erleben, Kenny, Paull, Liam, Shkurti, Florian, Nowrouzezahrai, Derek, Fidler, Sanja
We consider the problem of estimating an object's physical properties such as mass, friction, and elasticity directly from video sequences. Such a system identification problem is fundamentally ill-posed due to the loss of information during image formation. Current solutions require precise 3D labels which are labor-intensive to gather, and infeasible to create for many systems such as deformable solids or cloth. We present gradSim, a framework that overcomes the dependence on 3D supervision by leveraging differentiable multiphysics simulation and differentiable rendering to jointly model the evolution of scene dynamics and image formation. This novel combination enables backpropagation from pixels in a video sequence through to the underlying physical attributes that generated them. Moreover, our unified computation graph -- spanning from the dynamics and through the rendering process -- enables learning in challenging visuomotor control tasks, without relying on state-based (3D) supervision, while obtaining performance competitive with or better than techniques that rely on precise 3D labels.
Predicting Infectiousness for Proactive Contact Tracing
Bengio, Yoshua, Gupta, Prateek, Maharaj, Tegan, Rahaman, Nasim, Weiss, Martin, Deleu, Tristan, Muller, Eilif, Qu, Meng, Schmidt, Victor, St-Charles, Pierre-Luc, Alsdurf, Hannah, Bilanuik, Olexa, Buckeridge, David, Caron, Gáetan Marceau, Carrier, Pierre-Luc, Ghosn, Joumana, Ortiz-Gagne, Satya, Pal, Chris, Rish, Irina, Schölkopf, Bernhard, Sharma, Abhinav, Tang, Jian, Williams, Andrew
The COVID-19 pandemic has spread rapidly worldwide, overwhelming manual contact tracing in many countries and resulting in widespread lockdowns for emergency containment. Various digital contact tracing (DCT) methods have been proposed, each making tradeoffs between privacy, mobility restrictions, and public health. The most common approach, binary contact tracing (BCT), models infection as a binary event, informed only by an individual's test results, with corresponding binary recommendations that either all or none of the individual's contacts quarantine. BCT ignores the inherent uncertainty in contacts and the infection process, which could be used to tailor messaging to high-risk individuals, and prompt proactive testing or earlier warnings. It also does not make use of observations such as symptoms or preexisting medical conditions, which could be used to make more accurate infectiousness predictions. In this paper, we use a recently-proposed COVID-19 epidemiological simulator to develop and test methods that can be deployed to a smartphone to locally and proactively predict an individual's infectiousness (risk of infecting others) based on their contact history and other information, while respecting strong privacy constraints. Predictions are used to provide personalized recommendations to the individual via an app, as well as to send anonymized messages to the individual's contacts, who use this information to better predict their own infectiousness, an approach we call proactive contact tracing (PCT). Similarly to other works, we find that compared to no tracing, all DCT methods tested are able to reduce spread of the disease and thus save lives, even at low adoption rates, strongly supporting a role for DCT methods in managing the pandemic. Further, we find a deep-learning based PCT method which improves over BCT for equivalent average mobility, suggesting PCT could help in safe reopening and second-wave prevention.
Until pharmaceutical interventions such as a vaccine become available, control of the COVID-19 pandemic relies on nonpharmaceutical interventions such as lockdown and social distancing. While these have often been successful in limiting spread of the disease in the short term, these restrictive measures have important negative social, mental health, and economic impacts. Digital contact tracing (DCT), a technique to track the spread of the virus among individuals in a population using smartphones, is an attractive potential solution to help reduce growth in the number of cases and thereby allow more economic and social activities to resume while keeping the number of cases low.
COVI White Paper
Alsdurf, Hannah, Belliveau, Edmond, Bengio, Yoshua, Deleu, Tristan, Gupta, Prateek, Ippolito, Daphne, Janda, Richard, Jarvie, Max, Kolody, Tyler, Krastev, Sekoul, Maharaj, Tegan, Obryk, Robert, Pilat, Dan, Pisano, Valerie, Prud'homme, Benjamin, Qu, Meng, Rahaman, Nasim, Rish, Irina, Rousseau, Jean-Francois, Sharma, Abhinav, Struck, Brooke, Tang, Jian, Weiss, Martin, Yu, Yun William
The SARS-CoV-2 (Covid-19) pandemic has caused significant strain on public health institutions around the world. Contact tracing is an essential tool to change the course of the Covid-19 pandemic. Manual contact tracing of Covid-19 cases has significant challenges that limit the ability of public health authorities to minimize community infections. Personalized peer-to-peer contact tracing through the use of mobile apps has the potential to shift the paradigm. Some countries have deployed centralized tracking systems, but more privacy-protecting decentralized systems offer much of the same benefit without concentrating data in the hands of a state authority or for-profit corporations. Machine learning methods can circumvent some of the limitations of standard digital tracing by incorporating many clues and their uncertainty into a more graded and precise estimation of infection risk. The estimated risk can provide early risk awareness, personalized recommendations and relevant information to the user. Finally, non-identifying risk data can inform epidemiological models trained jointly with the machine learning predictor. These models can provide statistical evidence for the importance of factors involved in disease transmission. They can also be used to monitor, evaluate and optimize health policy and (de)confinement scenarios according to medical and economic productivity indicators. However, such a strategy based on mobile apps and machine learning should proactively mitigate potential ethical and privacy risks, which could have substantial impacts on society (not only impacts on health but also impacts such as stigmatization and abuse of personal data). Here, we present an overview of the rationale, design, ethical considerations and privacy strategy of COVI, a Covid-19 public peer-to-peer contact tracing and risk awareness mobile application developed in Canada.
Analysis of Gene Interaction Graphs for Biasing Machine Learning Models
Bertin, Paul, Hashir, Mohammad, Weiss, Martin, Boucher, Geneviève, Frappier, Vincent, Cohen, Joseph Paul
Gene interaction graphs aim to capture various relationships between genes and can be used to create more biologically-intuitive models for machine learning. There are many such graphs available, which can differ in the number of genes and edges covered. In this work, we attempt to evaluate the biases provided by those graphs by using them for 'Single Gene Inference' (SGI), which we believe serves as a proxy for more relevant prediction tasks. The SGI task assesses how well a gene's neighbors in a particular graph can 'explain' the gene itself in comparison to the baseline of using all the genes in the dataset. We evaluate seven major gene interaction graphs created by different research groups on two distinct datasets, TCGA and GTEx. We find that some graphs perform on par with the unbiased baseline for most genes with a significantly smaller feature set.
A Survey of Mobile Computing for the Visually Impaired
Weiss, Martin, Luck, Margaux, Girgis, Roger, Pal, Chris, Cohen, Joseph Paul
The number of visually impaired or blind (VIB) people in the world is estimated at several hundred million [4]. Based on a series of interviews with the VIB and developers of assistive technology, this paper provides a survey of machine-learning-based mobile applications and identifies the most relevant applications. We discuss the functionality of these apps, how they align with the needs and requirements of the VIB users, and how they can be improved with techniques such as federated learning and model compression. As a result of this study we identify promising future directions of research in mobile perception, micro-navigation, and content summarization.
Towards Gene Expression Convolutions using Gene Interaction Graphs
Dutil, Francis, Cohen, Joseph Paul, Weiss, Martin, Derevyanko, Georgy, Bengio, Yoshua
We study the challenges of applying deep learning to gene expression data. We find experimentally that there exists non-linear signal in the data; however, it is not discovered automatically given the noise and low numbers of samples used in most research. We discuss how gene interaction graphs (same pathway, protein-protein, co-expression, or research paper text association) can be used to impose a bias on a deep model similar to the spatial bias imposed by convolutions on an image. We explore the usage of Graph Convolutional Neural Networks coupled with dropout and gene embeddings to utilize the graph information. We find this approach provides an advantage for particular tasks in a low data regime but is very dependent on the quality of the graph used. We conclude that more work should be done in this direction. We design experiments that show why existing methods fail to capture signal that is present in the data when features are added, which clearly isolates the problem that needs to be addressed.