Privacy fears as MILLIONS of photos used to train facial recognition AI without users' consent
Many facial recognition systems are being trained using millions of online photos uploaded by everyday people and, more often than not, the photos are being taken without users' consent, an NBC News investigation has found. In one worrying case, IBM scraped almost a million photos from unsuspecting users on Flickr to build its facial recognition database. The practice not only raises privacy concerns, but also fuels fears that the systems could one day be used to disproportionately target minorities. IBM's database, called 'Diversity in Faces,' was released in January as part of the company's efforts to 'advance the study of fairness and accuracy in facial recognition technology.' The database was released following a study from MIT Media Lab researcher Joy Buolamwini, which found that popular facial recognition services from Microsoft, IBM and Face++ vary in accuracy based on gender and race.
Tensor Grid Decomposition with Application to Tensor Completion
Huang, Huyan, Liu, Yipeng, Zhu, Ce
The recently prevalent tensor train (TT) and tensor ring (TR) decompositions can be graphically interpreted as (locally) linearly interconnected latent factors and possess exponential decay of correlation. The projected entangled pair state (PEPS, also called two-dimensional TT) extends the spatial dimension of TT, and its polycyclic structure can be considered as a square grid. Compared with TT, its algebraic decay of correlation means enhanced interaction between tensor modes. In this paper we adopt the PEPS and develop a tensor grid (TG) decomposition with an efficient realization termed splitting singular value decomposition (SSVD). Using alternating least squares (ALS), a method called TG-ALS interpolates the missing entries of a tensor from its partial observations. Different kinds of data are used in the experiments, including synthetic data, color images and real-world videos. Experimental results demonstrate that TG has much more representational power than TT and TR.
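For readers unfamiliar with the TT baseline mentioned in the abstract, the following is a minimal sketch of a tensor train decomposition computed by sequential truncated SVDs. It is not the paper's TG decomposition or its SSVD routine; the function names `tt_svd` and `tt_reconstruct` and the `max_rank` parameter are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Split `tensor` into tensor-train cores using sequential truncated SVDs.

    Illustrative baseline only: this is plain TT, not the TG/SSVD method.
    """
    dims = tensor.shape
    cores = []
    rank_prev = 1
    # Start with the mode-1 unfolding: (d0) x (d1 * d2 * ...).
    unfolding = tensor.reshape(rank_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(unfolding, full_matrices=False)
        rank = min(max_rank, len(s))
        # Left singular vectors become the k-th core.
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        # Carry the remainder forward and fold in the next mode.
        unfolding = (s[:rank, None] * vt[:rank]).reshape(rank * dims[k + 1], -1)
        rank_prev = rank
    cores.append(unfolding.reshape(rank_prev, dims[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Contract the chain of cores back into a full tensor."""
    result = cores[0]
    for core in cores[1:]:
        result = np.tensordot(result, core, axes=([-1], [0]))
    # Drop the two boundary ranks, which are both of size 1.
    return result[0, ..., 0]


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4, 5))
cores = tt_svd(x, max_rank=20)
x_hat = tt_reconstruct(cores)
```

With `max_rank` large enough that no singular values are discarded, the reconstruction matches the input up to floating-point error; lowering `max_rank` trades accuracy for fewer parameters.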
On the Pitfalls of Measuring Emergent Communication
Lowe, Ryan, Foerster, Jakob, Boureau, Y-Lan, Pineau, Joelle, Dauphin, Yann
How do we know if communication is emerging in a multi-agent system? The vast majority of recent papers on emergent communication show that adding a communication channel leads to an increase in reward or task success. This is a useful indicator, but provides only a coarse measure of the agents' learned communication abilities. As we move towards more complex environments, it becomes imperative to have a set of finer tools that allow qualitative and quantitative insights into the emergence of communication. This may be especially useful to allow humans to monitor agents' behaviour, whether for fault detection, assessing performance, or even building trust. In this paper, we examine a few intuitive existing metrics for measuring communication, and show that they can be misleading. Specifically, by training deep reinforcement learning agents to play simple matrix games augmented with a communication channel, we find a scenario where agents appear to communicate (their messages provide information about their subsequent action), and yet the messages do not impact the environment or the other agent in any way. We explain this phenomenon using ablation studies and by visualizing the representations of the learned policies. We also survey some commonly used metrics for measuring emergent communication, and provide recommendations as to when these metrics should be used.
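The scenario described in the abstract can be illustrated with a small sketch, assuming the metric of interest is the empirical mutual information between a message and an action. The function name `mutual_information` and the toy data are hypothetical and are not taken from the paper; they only show how messages can be informative about the sender's own action while carrying no information about the receiver's behaviour.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information, in bits, between paired discrete samples."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(m for m, _ in pairs)
    right = Counter(a for _, a in pairs)
    mi = 0.0
    for (m, a), count in joint.items():
        p_joint = count / n
        # p(m, a) / (p(m) * p(a)) written with raw counts.
        mi += p_joint * math.log2(count * n / (left[m] * right[a]))
    return mi


# Hypothetical logs: (message, sender's next action).
# The message perfectly predicts what the sender does next.
sender_pairs = [(0, 0), (1, 1)] * 50

# Hypothetical logs: (message, receiver's action).
# The receiver acts the same way regardless of the message.
receiver_pairs = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25

sender_mi = mutual_information(sender_pairs)
receiver_mi = mutual_information(receiver_pairs)
```

In this toy data the sender-side score is one bit while the receiver-side score is zero, so a metric that looks only at the sender would suggest communication where none influences the listener.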
Is Nigeria's Compliance Industry Ready for Challenges of Regulatory Technology? - THISDAYLIVE
Today's customers demand more options, more creative solutions, greater flexibility and faster responses from banks and other financial institutions. Survival and success for financial institutions in this new world require that they operate with intelligence, agility and speed to keep up with evolving customer preferences and technologies. Consequently, more and more customer interactions and financial transactions are going digital as online and mobile payments, customer on-boarding and account opening are on the rise. Yet, while digital interfaces present an opening for innovative business services, they also yield new challenges, such as pressure on back office operations or increased regulatory scrutiny. Largely automated interactions generate more data to analyse, demand higher volumes of sample testing, and expand the compliance burden. To create a flawless customer experience, the back office has to keep up as well.
Bayesian Allocation Model: Inference by Sequential Monte Carlo for Nonnegative Tensor Factorizations and Topic Models using Polya Urns
Cemgil, Ali Taylan, Kurutmaz, Mehmet Burak, Yildirim, Sinan, Barsbey, Melih, Simsekli, Umut
We introduce a dynamic generative model, the Bayesian allocation model (BAM), which establishes explicit connections between nonnegative tensor factorization (NTF), graphical models of discrete probability distributions and their Bayesian extensions, and topic models such as latent Dirichlet allocation. BAM is based on a Poisson process whose events are marked using a Bayesian network, where the conditional probability tables of this network are then integrated out analytically. We show that the resulting marginal process turns out to be a Polya urn, an integer-valued self-reinforcing process. This urn process, which we name a Polya-Bayes process, obeys certain conditional independence properties that provide further insight into the nature of NTF. These insights also let us develop space-efficient simulation algorithms that respect the potential sparsity of data: we propose a class of sequential importance sampling algorithms for computing NTF and approximating their marginal likelihood, which would be useful for model selection. The resulting methods can also be viewed as a model scoring method for topic models and discrete Bayesian networks with hidden variables. The new algorithms have favourable properties in the sparse data regime when contrasted with variational algorithms, which become more accurate when the total sum of the elements of the observed tensor goes to infinity. We illustrate the performance on several examples and numerically study the behaviour of the algorithms for various data regimes.
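As a rough illustration of the "integer-valued self-reinforcing process" mentioned above, here is a minimal sketch of the classic Polya urn, in which each drawn colour is returned to the urn together with one extra ball of the same colour. It is not the paper's Polya-Bayes process; the function name `polya_urn` and its parameters are assumptions made for this example.

```python
import random


def polya_urn(initial_counts, draws, seed=0):
    """Simulate a classic Polya urn and return the final ball counts per colour."""
    rng = random.Random(seed)
    counts = list(initial_counts)
    for _ in range(draws):
        # Draw a colour with probability proportional to its current count.
        colour = rng.choices(range(len(counts)), weights=counts)[0]
        # Reinforce: the drawn colour gains one ball.
        counts[colour] += 1
    return counts


start = [1, 1, 1]
final = polya_urn(start, draws=1000)
```

Colours that happen to be drawn early become more likely to be drawn again, which is the self-reinforcing behaviour the abstract refers to.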
The Symbiotic Nature of AI and Neuroscience
Neuroscience and artificial intelligence (AI) are two very different scientific disciplines. Neuroscience traces back to ancient civilizations, and AI is a decidedly modern phenomenon. At a cursory glance, it would seem that a branch of science of living systems would have little in common with one that springs from inanimate machines wholly created by humans. Yet discoveries in one field may result in breakthroughs in the other: the two fields share a significant problem, and future opportunities. The origins of modern neuroscience are rooted in ancient human civilizations. One of the first descriptions of the brain's structure and neurosurgery can be traced back to 3000 - 2500 B.C. largely due to the efforts of the American Egyptologist Edwin Smith.
Algorithms for an Efficient Tensor Biclustering
Faneva, Andriantsiory Dina, Lebbah, Mustapha, Azzag, Hanane, Beck, Gaël
Consider a data set of (individual, feature) pairs collected at different times. It can be represented as a tensor of three dimensions (individuals, features and times). The tensor biclustering problem computes a subset of individuals and a subset of features whose signal trajectories over time lie in a low-dimensional subspace, modeling similarity among the signal trajectories while allowing different scalings across different individuals or different features. This approach is based on spectral decomposition in order to build the desired biclusters. We evaluate the quality of the results from each algorithm with both synthetic and real data sets.
Getting smart about artificial intelligence
Genomics is set to become the biggest source of data on the planet, overtaking the current leading heavyweights: astronomy, YouTube and Twitter. Genome sequencing currently produces a staggering 25 petabytes of digital information per year. A petabyte is 10^15 bytes, or about 1,000 times the average storage on a personal computer. And there is no sign of a slowdown. The amount of DNA sequencing data produced around the world is doubling approximately every seven months.
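To get a feel for what a seven-month doubling time implies, the arithmetic can be sketched as follows. This is a back-of-the-envelope extrapolation that assumes the doubling rate stays constant and applies it to the 25-petabyte annual figure quoted above; the function name `projected_petabytes` is hypothetical.

```python
def projected_petabytes(current_pb, months, doubling_months=7.0):
    """Extrapolate a data volume that doubles every `doubling_months` months."""
    return current_pb * 2 ** (months / doubling_months)


after_one_doubling = projected_petabytes(25, months=7)
after_two_doublings = projected_petabytes(25, months=14)
after_five_years = projected_petabytes(25, months=60)
```

Under this assumption the figure reaches 50 petabytes after seven months and 100 after fourteen, and after five years it has grown by a factor of roughly 2^(60/7), a few hundred times the starting volume.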
Is Artificial Intelligence the future tool for anti-corruption?
Artificial Intelligence (AI) can be an effective tool in anti-corruption work. Its potential for handling big data is unique, and its ability to detect anomalies or patterns, for example in financial transaction data, is unparalleled. Some of the ways AI is applied in society also draw sceptical voices, who fear a society under ever more surveillance where privacy and individual freedom are at risk. The risks and opportunities of new technologies for anti-corruption are up for discussion at the OECD Global Anti-Corruption & Integrity Forum conference in Paris on March 20-21. Ethical dilemmas, perils and promises, in short, AI's potential and pitfalls as a tool in and for anti-corruption in development programming are up for debate.
BCI: Will you connect your brain to a computer? – Trust This Robot
Brain to Computer Interfaces (BCI) are a tough subject to write on. The most current technology is likely in the research stage and is not yet being publicly reported. So let's take a look at what has been reported over the last couple of years with the understanding that scientists are likely years ahead. In other words, the technology is here… The first publicly reported successful and non-invasive BCI was announced in a press release by the University of Witwatersrand, Johannesburg, and titled the "Brainternet".