Collaborating Authors

Computer Literacy & Computer Science

Teaching computers to read health records is helping fight COVID-19 – here's how


Medical records are a rich source of health data. When combined, the information they contain can help researchers better understand diseases and treat them more effectively. But to unlock this rich resource, researchers first need to read it. We may have moved on from the days of handwritten medical notes, but the information recorded in modern electronic health records can be just as hard to access and interpret. It's an old joke that doctors' handwriting is illegible, but it turns out their typing isn't much better.

Digital diagnosis: Why teaching computers to read medical records could help against COVID-19


These algorithms are extremely complex. They need to understand context, long strings of words and medical concepts, distinguish current events from historic ones, identify family relationships and more. We teach them to do this by feeding them existing written information so they can learn the structure and meaning of language – in this case, publicly available English text from the internet – and then use real medical records for further improvement and testing.

PNNL machine learning scientists teach computers to read X-ray images


If a person in the developing world severely fractures a limb, they face an impossible choice. An improperly healed fracture could mean a lifetime of pain, but lengthy healing time in traction or a bulky cast results in immediate financial hardship. That's why Pacific Northwest National Laboratory (PNNL) machine learning scientists leaped into action when they learned they could help a local charity enable patients in the developing world to walk within one week of surgery – even when fractures are severe. For more than 20 years, the Richland, Washington-based charity SIGN Fracture Care has pioneered orthopedic care, including training and innovatively designed implants that speed healing without real-time operating room X-ray machines. During those 20 years, they've built a database of 500,000 procedure images and outcomes that serves as a learning hub for doctors around the world.

10 Days With "Deep Learning for Coders" - KDnuggets


I started Practical Deep Learning for Coders 10 days ago. I am compelled to say their pragmatic approach is exactly what I needed. I started data science by learning Python, Pandas, NumPy, and whatever else I needed in a short few months. I did whatever courses I needed to do (e.g. Kaggle micro-courses) and read whatever books I needed to read.

Machine Learning Scientists Teach Computers to Read X-Ray Images – IAM Network


Increasing use of motorized vehicles, such as these in Cameroon, is the dominant cause of injury in low- and middle-income countries.

Using Computer Programs and Search Problems for Teaching Theory of Computation

Communications of the ACM

The theory of computation is one of the crown jewels of the computer science curriculum. It stretches from the discovery of mathematical problems, such as the halting problem, that cannot be solved by computers, to the most celebrated open problem in computer science today: the P vs. NP question. Since the founding of our discipline by Church and Turing in the 1930s, the theory of computation has addressed some of the most fundamental questions about computers: What does it mean to compute the solution to a problem? Which problems can be solved by computers? Which problems can be solved efficiently, in theory and in practice?

How to Self-Teach Computer Science


My first encounter with computer science was in grade 5, when my mom put me in my local library's C and HTML classes. At only grade 5, computer science seemed like an alien language. After struggling to write my program for hours, I gave up. I told myself that computer science was simply not for me. Fast-forward to high school, and I didn't choose any computer science courses.

Analogy-Making as a Core Primitive in the Software Engineering Toolbox Artificial Intelligence

An analogy is an identification of structural similarities and correspondences between two objects. Computational models of analogy-making have been studied extensively in the field of cognitive science to better understand high-level human cognition. For instance, Melanie Mitchell and Douglas Hofstadter sought to better understand high-level perception by developing the Copycat algorithm for completing analogies between letter sequences. In this paper, we argue that analogy-making should be seen as a core primitive in software engineering. We motivate this argument by showing how complex software engineering problems such as program understanding and source-code transformation learning can be reduced to an instance of the analogy-making problem. We demonstrate this idea using Sifter, a new analogy-making algorithm suitable for software engineering applications that adapts and extends ideas from Copycat. In particular, Sifter reduces analogy-making to searching for a sequence of update rule applications. Sifter uses a novel representation for mathematical structures capable of effectively representing the wide variety of information embedded in software. We conclude by listing major areas of future work for Sifter and analogy-making in software engineering.
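To make the letter-sequence setting concrete, here is a toy sketch (an assumption-laden illustration, not the Copycat or Sifter implementation) of completing an analogy like "abc is to abd as ijk is to ?" by inferring a simple positional update rule and re-applying it:

```python
# Toy analogy completion: infer a rule of the form "shift the letter at
# position i by k" from one example pair, then apply it to a new string.
# This is a deliberately simplified stand-in for the richer perceptual
# structures used by Copycat and Sifter.

def infer_rule(src, dst):
    """Return (position, offset) pairs where src and dst differ."""
    assert len(src) == len(dst)
    return [(i, ord(b) - ord(a))
            for i, (a, b) in enumerate(zip(src, dst)) if a != b]

def apply_rule(s, rules):
    """Apply each positional shift to the target string."""
    chars = list(s)
    for i, k in rules:
        chars[i] = chr(ord(chars[i]) + k)
    return "".join(chars)

rules = infer_rule("abc", "abd")   # one rule: shift position 2 up by one
print(apply_rule("ijk", rules))    # completes the analogy as "ijl"
```

The interesting part of real analogy-making algorithms is that the "rule" is rarely this literal; Copycat searches over competing structural descriptions rather than raw character offsets.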

Efficient Framework for Learning Code Representations through Semantic-Preserving Program Transformations Artificial Intelligence

Recent learning techniques for the representation of code depend mostly on human-annotated (labeled) data. In this work, we propose Corder, a self-supervised learning system that can learn to represent code without labeled data. The key innovation is that we train the source code model by asking it to recognize similar and dissimilar code snippets through a contrastive learning paradigm. We use a set of semantic-preserving transformation operators to generate snippets that are syntactically diverse but semantically equivalent. The contrastive learning objective simultaneously maximizes agreement between different views of the same snippet and minimizes agreement between transformed views of different snippets. We train different instances of Corder on 3 neural network encoders – Tree-based CNN, ASTNN, and Code2vec – over 2.5 million unannotated Java methods mined from GitHub. Our results show that Corder pre-training improves code classification and method name prediction by large margins. Furthermore, the code vectors generated by Corder, when adapted to code clustering, significantly outperform the other baselines.
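The contrastive objective described above can be sketched as an InfoNCE-style loss. The following is a minimal NumPy illustration under stated assumptions (it is not the actual Corder training code): row i of each array is the embedding of one view of snippet i, e.g. the original snippet and its semantics-preserving transformation.

```python
import numpy as np

def info_nce(view_a, view_b, temperature=0.1):
    """Contrastive loss over two sets of views.

    view_a, view_b: (n, d) arrays; row i of each is an embedding of
    snippet i. Matching rows are positive pairs; all other rows in the
    batch serve as negatives.
    """
    # Cosine similarity: normalize rows, then take scaled dot products.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature

    # Cross-entropy where the "correct class" for row i is column i,
    # i.e. the transformed view of the same snippet.
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Minimizing this loss pulls the two views of each snippet together while pushing apart views of different snippets, which is exactly the "maximize agreement / minimize agreement" behavior the abstract describes.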