DeepMind's new AI system can perform over 600 tasks – TechCrunch


The ultimate achievement to some in the AI industry is creating a system with artificial general intelligence (AGI) – the ability to understand and learn any task that a human can. Long relegated to the domain of science fiction, it's been suggested that AGI would bring about systems with the ability to reason, plan, learn, represent knowledge, and communicate in natural language. Not every expert is convinced that AGI is a realistic goal – or even possible. Gato is what DeepMind describes as a "general-purpose" system: one that can be taught to perform many different types of tasks. Researchers at DeepMind trained Gato to complete 604 of them, to be exact, including captioning images, engaging in dialogue, stacking blocks with a real robot arm, and playing Atari games. Jack Hessel, a research scientist at the Allen Institute for AI, points out that a single AI system that can solve many tasks isn't new.

Building capacity for artificial intelligence


In support of the Alberta Technology and Innovation Strategy (ATIS) and in partnership with AltaML, a leading Canadian artificial intelligence company, the AI lab will see government staff work alongside post-secondary students and graduates to develop smart products and models that leverage AI to solve complex, real-world problems. The lab will create opportunities for Alberta's public and private sectors to create intellectual property while accelerating Alberta's recovery and economic diversification. "Alberta is a world leader in AI and machine learning research. … Ultimately this will help Alberta's government offer better services, better results and better value to Albertans."

Artificial intelligence beats eight world champions at bridge

The Guardian

An artificial intelligence has beaten eight world champions at bridge, a game in which human supremacy has resisted the march of the machines until now. The victory represents a new milestone for AI because in bridge players work with incomplete information and must react to the behaviour of several other players – a scenario far closer to human decision-making. By contrast, in chess and Go – in both of which AIs have already beaten human champions – a player faces a single opponent at a time and both players are in possession of all the information. "What we've seen represents a fundamentally important advance in the state of artificial intelligence systems," said Stephen Muggleton, a professor of machine learning at Imperial College London. French startup NukkAI announced the news of its AI's victory on Friday, at the end of a two-day tournament in Paris.

Working in Artificial Intelligence and Machine Learning at Electronic Arts and Bioware Presentation, March 25, 2022 (University of Alberta)


He has been involved in many areas that make use of AI and ML at EA, particularly AI for game development and verification. He started out in game development but is now with the AI support team, which supports all of the company's teams.

Top resources to learn reinforcement learning in 2022


Rich S. Sutton, a research scientist at DeepMind and computing science professor at the University of Alberta, explains the underlying formal problem – Markov decision processes – and core solution methods, including dynamic programming, Monte Carlo methods, and temporal-difference learning, in this in-depth tutorial.
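Temporal-difference learning, one of the core methods the tutorial covers, can be illustrated with a short sketch. The random-walk environment, step size, and episode count below are illustrative choices for this digest, not taken from the tutorial itself:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def random_walk_episode(start=2, n_states=5):
    """Yield (state, reward, next_state) transitions of a symmetric
    random walk on states 0..n_states-1; reward 1 only when the walk
    exits off the right edge, 0 otherwise."""
    s = start
    while True:
        s2 = s + random.choice([-1, 1])
        if s2 < 0:
            yield s, 0.0, None       # terminate off the left edge
            return
        if s2 >= n_states:
            yield s, 1.0, None       # terminate off the right edge
            return
        yield s, 0.0, s2
        s = s2

def td0(num_episodes=5000, alpha=0.1, gamma=1.0, n_states=5):
    """Tabular TD(0) prediction: estimate state values from sampled
    transitions, one bootstrapped update per step."""
    V = [0.0] * n_states
    for _ in range(num_episodes):
        for s, r, s2 in random_walk_episode(n_states=n_states):
            target = r + (gamma * V[s2] if s2 is not None else 0.0)
            V[s] += alpha * (target - V[s])   # TD(0) update rule
    return V

values = td0()
```

On this chain the true values rise linearly from left to right (1/6 up to 5/6), and the TD(0) estimates recover that ordering from individual transitions without waiting for episodes to finish.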

Scientists create cube robots that can shapeshift in space


Scientists from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and the University of Calgary have developed a modular robot system that can morph into different shapes. ElectroVoxels don't have any motors or moving parts. Instead, they use electromagnets to shift around each other. Each edge of an ElectroVoxel cube is an electromagnetic ferrite core wrapped with copper wire. The length of each ElectroVoxel side is around 60 millimeters.

It's all in the research: Using AI to solve issues in health care


The University of Alberta uses SAS Viya to help its researchers expand their capacity for big data analysis and support the use of open source software and other tools popular among students. Conducting research is not a straightforward process, and the terabytes of data cascading into labs (both physical and virtual) require serious horsepower to analyze. Personal desktops and small servers are increasingly coming up short in meeting the demands of artificial intelligence and machine learning projects. Data also comes in various shapes and sizes: researchers often combine data related to diagnostic imaging, risk prediction, clinical trials and much more.

Reward-Respecting Subtasks for Model-Based Reinforcement Learning

To achieve the ambitious goals of artificial intelligence, reinforcement learning must include planning with a model of the world that is abstract in state and time. Deep learning has made progress in state abstraction, but, although the theory of time abstraction has been extensively developed based on the options framework, in practice options have rarely been used in planning. One reason for this is that the space of possible options is immense and the methods previously proposed for option discovery do not take into account how the option models will be used in planning. Options are typically discovered by posing subsidiary tasks such as reaching a bottleneck state, or maximizing a sensory signal other than the reward. Each subtask is solved to produce an option, and then a model of the option is learned and made available to the planning process. The subtasks proposed in most previous work ignore the reward on the original problem, whereas we propose subtasks that use the original reward plus a bonus based on a feature of the state at the time the option stops. We show that options and option models obtained from such reward-respecting subtasks are much more likely to be useful in planning and can be learned online and off-policy using existing learning algorithms. Reward-respecting subtasks strongly constrain the space of options and thereby also provide a partial solution to the problem of option discovery. Finally, we show how the algorithms for learning values, policies, options, and models can be unified using general value functions.
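The subtask construction described above (the original reward along the way, plus a bonus tied to a feature of the state where the option stops) can be sketched in a few lines. The trajectory format, feature function, and bonus weight here are illustrative assumptions, not the paper's exact formulation:

```python
def subtask_return(trajectory, stop_feature, bonus_weight=1.0, gamma=0.99):
    """Discounted return of a reward-respecting subtask.

    trajectory: list of (reward, state) pairs ending where the option stops
                (illustrative format, assumed for this sketch).
    stop_feature: maps the stopping state to a scalar feature value.
    """
    g = 0.0
    discount = 1.0
    for reward, _state in trajectory:
        g += discount * reward          # the original reward is respected
        discount *= gamma
    stop_state = trajectory[-1][1]
    # bonus based on a feature of the state at the time the option stops
    g += discount * bonus_weight * stop_feature(stop_state)
    return g

# Hypothetical two-step trajectory stopping in state 'b', undiscounted:
example = subtask_return([(1.0, 'a'), (0.0, 'b')],
                         lambda s: 2.0 if s == 'b' else 0.0,
                         gamma=1.0)
```

Because the subtask's return keeps the original reward term, an option that maximizes it cannot ignore the main task the way bottleneck-style subtasks can.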

Legal Innovation Data Institute joint venture launches machine learning research tool


AltaML was founded in 2018, employs around 130 people and has offices in Edmonton, Calgary and Toronto. The company works in industries such as oil and gas, banking, forestry, agriculture and health. "We work with them to uncover those possibilities for the application of machine learning," says Rabelo. "When we identify those opportunities, we develop studies – basically, experiments – to see if our hypotheses really hold true when we apply them to real-world data. As we validate those hypotheses, those opportunities move along a chain and eventually they reach the solution phase, where they are deployed to production. They are developed as part of a software system or an [application programming interface] or something that can be directly deployed to those industries, to those clients."

A Temporal-Difference Approach to Policy Gradient Estimation

The policy gradient theorem (Sutton et al., 2000) prescribes the use of a cumulative discounted state distribution under the target policy to approximate the gradient. In practice, most algorithms based on this theorem break this assumption, introducing a distribution shift that can cause convergence to poor solutions. In this paper, we propose a new approach to reconstructing the policy gradient from the start state without requiring a particular sampling strategy. The policy gradient calculation in this form can be simplified in terms of a gradient critic, which can be recursively estimated due to a new Bellman equation of gradients. By using temporal-difference updates of the gradient critic from an off-policy data stream, we develop the first estimator that sidesteps the distribution shift issue in a model-free way. We prove that, under certain realizability conditions, our estimator is unbiased regardless of the sampling strategy. We empirically show that our technique achieves a superior bias-variance trade-off and performance in the presence of off-policy samples.
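The gradient-critic idea can be illustrated in a toy setting: a quantity g (a vector in general; a scalar here, for a one-parameter policy) satisfies a Bellman-like recursion g = E[grad_log_pi(a) * q(a)] + gamma * g, so it can be estimated with TD-style updates from sampled actions. The single-state environment, fixed action values, and sigmoid policy below are illustrative assumptions, not the paper's setup:

```python
import math
import random

random.seed(0)  # fixed seed so the run is reproducible

theta = 0.0                              # scalar policy parameter
p1 = 1.0 / (1.0 + math.exp(-theta))      # pi(a=1) under a sigmoid policy
q = {0: 0.0, 1: 1.0}                     # assumed-known action values
gamma = 0.9
alpha = 0.01

def grad_log_pi(a):
    """d/d(theta) of log pi(a) for the sigmoid policy."""
    return (1.0 - p1) if a == 1 else -p1

g = 0.0                                  # gradient-critic estimate
for _ in range(20000):
    a = 1 if random.random() < p1 else 0
    td_target = grad_log_pi(a) * q[a] + gamma * g
    g += alpha * (td_target - g)         # TD update on the gradient critic

# Closed-form fixed point for comparison: E[grad_log_pi * q] / (1 - gamma)
g_true = (p1 * (1.0 - p1) * q[1] + (1.0 - p1) * (-p1) * q[0]) / (1.0 - gamma)
```

In this toy case the fixed point is exactly 2.5, and the TD iterate settles near it; the paper's contribution is doing this with function approximation from an off-policy data stream.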