Version Control for Data Science -- Tracking Machine Learning models and datasets


Undoubtedly, Git is the holy grail of versioning systems! Git is great at versioning source code. But unlike software engineering, Data Science projects have additional big-ass files such as datasets, trained model files, label encodings etc., which can easily reach a few GBs in size and therefore cannot practically be tracked using Git. DVC helps us version large data files, similar to how we version-control source code files using Git. Also, DVC works flawlessly on top of Git, which makes it even better!
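To make the idea concrete, here is a minimal sketch of the general approach such tools build on: keep the large file out of Git and commit only a small pointer file that records its hash. This is an illustration only, not DVC's implementation or file format; the function names, the `.pointer.json` suffix, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(data_path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with data_path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_pointer(data_path: Path) -> Path:
    """Write a small JSON pointer next to the data file (hypothetical format).

    The pointer is tiny, so it can be committed to Git while the data
    file itself is ignored and stored elsewhere.
    """
    pointer = {
        "path": data_path.name,
        "sha256": file_sha256(data_path),
        "size": data_path.stat().st_size,
    }
    pointer_path = data_path.with_name(data_path.name + ".pointer.json")
    pointer_path.write_text(json.dumps(pointer, indent=2))
    return pointer_path


def is_unchanged(data_path: Path, pointer_path: Path) -> bool:
    """Return True if the data file still matches the recorded hash."""
    recorded = json.loads(pointer_path.read_text())
    return recorded["sha256"] == file_sha256(data_path)
```

Calling `write_pointer(Path("train.csv"))` (a hypothetical file name) would produce `train.csv.pointer.json`; committing that pointer gives each Git revision a record of exactly which version of the data it was built against.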

Recommender Systems in Requirements Engineering

AI Magazine

Requirements engineering in large-scale industrial, government, and international projects can be a highly complex process involving thousands, or even hundreds of thousands, of potentially distributed stakeholders. As a result, many human-intensive tasks in requirements elicitation, analysis, and management processes can be augmented and supported through the use of recommender systems and machine learning techniques. In this article we describe several areas in which recommendation technologies have been applied to the requirements engineering domain, namely stakeholder identification, domain analysis, requirements elicitation, and decision support across several requirements analysis and prioritization tasks. We also highlight ongoing challenges and opportunities for applying recommender systems in the requirements engineering domain.

A. Levenchuk -- Machine learning engineering


Permission granted to DeepHack and INCOSE to publish and use. Why is my program not working? You need to know why? Software engineer (systems) To advance theory? Computer Scientist You need the program working properly? Programmers are engineers: software is a physical system!

Quick Feature Engineering with Dates Using


As you are no doubt aware, simple date fields are potential treasure troves of data. While, at first glance, a date gives us nothing more than a specific point on a timeline, knowing where this point on the line is relative to other points can generate all sorts of insights into a dataset. What you want out of a date depends on what you are doing. Having external resources containing the answer to some of the less-intrinsic questions above ("Were the Olympics taking place on that date?" -- perhaps a perfectly valid question given your project) would certainly be necessary, but even sussing out the more elementary questions could prove immensely useful. Simple feature engineering on dates can mindlessly take care of the latter.
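As a rough illustration of those elementary questions, the sketch below derives a handful of calendar features from a single date using only Python's standard library. The function names and the particular feature set are assumptions chosen for the example; this is not the library the article itself goes on to use.

```python
import calendar
from datetime import date


def date_features(d: date) -> dict:
    """Expand one date into simple, model-friendly calendar features."""
    last_day = calendar.monthrange(d.year, d.month)[1]  # days in this month
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_week": d.weekday(),          # Monday = 0, Sunday = 6
        "day_of_year": d.timetuple().tm_yday,
        "quarter": (d.month - 1) // 3 + 1,
        "is_weekend": d.weekday() >= 5,
        "is_month_start": d.day == 1,
        "is_month_end": d.day == last_day,
    }


def days_since(d: date, reference: date) -> int:
    """Position of a date relative to another point on the timeline."""
    return (d - reference).days
```

For instance, `date_features(date(2024, 2, 29))` reports a Thursday (`day_of_week` of 3) that falls on a month end, and `days_since(date(2024, 3, 1), date(2024, 2, 29))` gives 1. Questions that need outside knowledge, such as whether the Olympics were on, would still require an external lookup.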

Teaching Software Engineering for AI-Enabled Systems Artificial Intelligence

Software engineers have significant expertise to offer when building intelligent systems, drawing on decades of experience and methods for building systems that are scalable, responsive and robust, even when built on unreliable components. Systems with artificial-intelligence or machine-learning (ML) components raise new challenges and require careful engineering. We designed a new course to teach software-engineering skills to students with a background in ML. We specifically go beyond traditional ML courses that teach modeling techniques under artificial conditions and focus, in lecture and assignments, on realism with large and changing datasets, robust and evolvable infrastructure, and purposeful requirements engineering that considers ethics and fairness as well. We describe the course and our infrastructure and share experience and all material from teaching the course for the first time.