

13 Free Model Zoos for Deep Learning and Computer Vision Models


Computer vision is a fast-growing subfield of AI and deep learning. From cashierless stores in retail to crop detection in agriculture, interest in CV applications keeps increasing. This has created a vibrant community that gladly shares architectures, code, pre-trained models, and even tips for every stage of the development cycle. Starting a CV project from scratch takes time. So the usual process is that, given a problem or a use case, you look for models that partially solve it.

Manipulating the future


As robots evolve, society's collective imagination forever ponders what else robots can do, with recent fascinations coming to life as self-driving cars or robots that can walk and interact with objects as humans do. These sophisticated systems are powered by advances in deep learning that triggered breakthroughs in robotic perception, so that robots today have greater potential for better decision-making and improved functioning in real-world environments. But tomorrow's roboticists need to understand how to combine deep learning with dynamics, controls, and long-term planning. To keep this momentum in robotic manipulation going, engineers today must learn to hover above the whole field, connecting an increasingly diverse set of ideas with the interdisciplinary focus needed to design increasingly complex robotic systems. Last fall, MIT's Department of Electrical Engineering and Computer Science launched a new course, 6.800 (Robotic Manipulation), to help engineering students broadly survey the latest advancements in robotics while troubleshooting real industry problems.

Deep Learning Code Generation from Simulink Applications - MATLAB & Simulink


You can accelerate the simulation of your algorithms in Simulink by using different execution environments. By using support packages, you can also generate and deploy C/C++ and CUDA code on target hardware. Simulate and generate code for deep learning models in Simulink using MATLAB function blocks. Simulate and generate code for deep learning models in Simulink using library blocks. This example shows how to develop a CUDA application from a Simulink model that performs lane and vehicle detection using convolutional neural networks (CNN).

My 2-year journey into deep learning as a medical student -- Part II: Courses


The deep learning and machine learning courses that I've taken while learning deep learning. It's time to introduce the courses that helped me get started and grow in the field. Keep in mind that there are probably many more and newer courses out there, as the community keeps providing interesting educational material every day. So keep on searching too. That aside, I believe the following list introduces high-quality courses across many fields that most of you can start with and learn a lot from.

Deep Learning Computer Vision CNN, OpenCV, YOLO, SSD & GANs


If you want to learn all the latest 2019 concepts in applying Deep Learning to Computer Vision, look no further - this is the course for you! You'll get hands-on experience with the following Deep Learning frameworks in Python:

Molecular Deep Learning using DeepChem


I vividly remember my high school Chemistry teacher teaching us about covalent bonds using a 3D model of a water molecule. I also remember enjoying my time in the Chemistry lab trying to determine if a given salt is more acidic or alkaline by performing many tests. How would this setup change if we needed to replace the human performing these experiments with a machine? Recently, my curiosity about applying deep learning architectures in the life sciences resulted in an interesting learning opportunity. I stumbled onto some libraries like RDKit and DeepChem that help with training and developing deep learning data models for use in Drug Discovery.

AI language processing startup Cohere raises US$125 million: The Globe and Mail


Cohere Inc., an AI startup founded by University of Toronto alumni that uses natural language processing to improve human-machine interactions, has raised US$125 million as it looks to open a new office in Silicon Valley, the Globe and Mail reports. The latest financing round, led by New York-based Tiger Global Management, comes only five months after Cohere secured US$40 million in venture capital financing, according to the Globe. Cohere's software platform helps companies infuse natural language processing capabilities into their business using tools like chatbots, without requiring AI expertise of their own. The company originated in a 2017 paper co-authored by CEO Aidan Gomez, who interned at the Google Brain lab of deep learning pioneer and University Professor Emeritus Geoffrey Hinton, a Cohere investor. Cohere's other co-founders are alumnus Nick Frosst, who also worked with Hinton at Google, and Ivan Zhang, a former U of T computer science student.

State of AI Ethics Report (Volume 6, February 2022)

This report from the Montreal AI Ethics Institute (MAIEI) covers the most salient progress in research and reporting over the second half of 2021 in the field of AI ethics. Particular emphasis is placed on an "Analysis of the AI Ecosystem", "Privacy", "Bias", "Social Media and Problematic Information", "AI Design and Governance", "Laws and Regulations", "Trends", and other areas covered in the "Outside the Boxes" section. The two AI spotlights feature application pieces on "Constructing and Deconstructing Gender with AI-Generated Art" as well as "Will an Artificial Intellichef be Cooking Your Next Meal at a Michelin Star Restaurant?". Given MAIEI's mission to democratize AI, submissions from external collaborators have featured, such as pieces on the "Challenges of AI Development in Vietnam: Funding, Talent and Ethics" and using "Representation and Imagination for Preventing AI Harms". The report is a comprehensive overview of what the key issues in the field of AI ethics were in 2021, what trends are emergent, what gaps exist, and a peek into what to expect from the field of AI ethics in 2022. It is a resource for researchers and practitioners alike in the field to set their research and development agendas to make contributions to the field of AI ethics.


AAAI Conferences

Recent years have seen a growing interest in player modeling to create player-adaptive digital games. As a core player-modeling task, goal recognition aims to recognize players' latent, high-level intentions in a non-invasive fashion to deliver goal-driven, tailored game experiences. This paper reports on an investigation of multimodal data streams that provide rich evidence about players' goals. Two data streams, game event traces and player gaze traces, are utilized to devise goal recognition models from a corpus collected from an open-world serious game for science education. Empirical evaluations of 140 players' trace data suggest that multimodal LSTM-based goal recognition models outperform competitive baselines, including unimodal LSTMs as well as multimodal and unimodal CRFs, with respect to predictive accuracy and early prediction. The results demonstrate that player gaze traces have the potential to significantly enhance goal recognition models' performance.

Language Model-Based Paired Variational Autoencoders for Robotic Language Learning

Human infants learn language while interacting with their environment in which their caregivers may describe the objects and actions they perform. Similar to human infants, artificial agents can learn language while interacting with their environment. In this work, first, we present a neural model that bidirectionally binds robot actions and their language descriptions in a simple object manipulation scenario. Building on our previous Paired Variational Autoencoders (PVAE) model, we demonstrate the superiority of the variational autoencoder over standard autoencoders by experimenting with cubes of different colours, and by enabling the production of alternative vocabularies. Additional experiments show that the model's channel-separated visual feature extraction module can cope with objects of different shapes. Next, we introduce PVAE-BERT, which equips the model with a pretrained large-scale language model, i.e., Bidirectional Encoder Representations from Transformers (BERT), enabling the model to go beyond comprehending only the predefined descriptions that the network has been trained on; the recognition of action descriptions generalises to unconstrained natural language as the model becomes capable of understanding unlimited variations of the same descriptions. Our experiments suggest that using a pretrained language model as the language encoder allows our approach to scale up for real-world scenarios with instructions from human users.
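
As a point of reference for the comparison between variational and standard autoencoders mentioned above, the textbook objectives can be written as follows. This is a generic sketch, not the PVAE loss from the paper: the symbols $f$, $g$, $q_\phi$, $p_\theta$ and the standard normal prior $p(z)$ are assumptions introduced here for illustration, and the paper's model may add further terms (for example, to bind the action and language streams).

```latex
% Standard autoencoder: deterministic encoder f and decoder g,
% trained on reconstruction error only.
\mathcal{L}_{\mathrm{AE}}(x) = \lVert x - g(f(x)) \rVert^{2}

% Variational autoencoder: stochastic encoder q_\phi(z \mid x),
% decoder p_\theta(x \mid z), and prior p(z) = \mathcal{N}(0, I).
% The KL term regularises the latent space towards the prior.
\mathcal{L}_{\mathrm{VAE}}(x)
  = -\,\mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  + D_{\mathrm{KL}}\bigl(q_\phi(z \mid x)\,\Vert\,p(z)\bigr)
```

The extra KL term and the sampled latent variable are what distinguish the variational objective from the plain reconstruction loss. A plausible reading, not stated in the abstract, is that the sampled latent is what lets such a model map one action to several alternative descriptions.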