ML Development
From Data to Decision: Data-Centric Infrastructure for Reproducible ML in Collaborative eScience
Li, Zhiwei, Kesselman, Carl, Nguyen, Tran Huy, Xu, Benjamin Yixing, Bolo, Kyle, Yu, Kimberley
--Reproducibility remains a central challenge in machine learning (ML), especially in collaborative eScience projects where teams iterate over data, features, and models. Current ML workflows are often dynamic yet fragmented, relying on informal data sharing, ad hoc scripts, and loosely connected tools. This fragmentation impedes transparency, reproducibility, and the adaptability of experiments over time. This paper introduces a data-centric framework for lifecycle-aware reproducibility, centered around six structured artifacts: Dataset, Feature, Workflow, Execution, Asset, and Controlled Vocabulary. These artifacts formalize the relationships between data, code, and decisions, enabling ML experiments to be versioned, interpretable, and traceable over time. The approach is demonstrated through a clinical ML use case of glaucoma detection, illustrating how the system supports iterative exploration, improves reproducibility, and preserves the provenance of collaborative decisions across the ML lifecycle.

As machine learning (ML) becomes increasingly central to scientific discovery, concerns about correctness and reproducibility have grown [1]. In eScience, ML development is typically a collaborative and iterative process involving domain experts, data engineers, and ML researchers. These teams refine models based on evolving hypotheses and new data, creating feedback loops across data curation, feature engineering, modeling, and evaluation [2]. This dynamic process frequently introduces data cascades, where early curation errors propagate downstream, compounding over time [3]. In practice, ML workflows remain fragmented: datasets are shared informally, experiments span personal and cloud environments, and data, code, and configurations are often loosely coupled [4].
While MLOps and data management tools address parts of this problem, such as code versioning, pipeline orchestration, or environment encapsulation, they often overlook the full scientific lifecycle and the socio-technical realities of collaborative ML projects [5]. In prior work, we introduced Deriva-ML [6], a socio-technical platform that extends the FAIR principles (Findable, Accessible, Interoperable, Reusable) [7] across the ML development lifecycle.
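The six artifact types named in the abstract can be pictured as linked records, where an Execution ties a versioned Workflow to its input Datasets and output Assets. The sketch below is a hypothetical illustration of that linkage; the field layouts are our own assumptions, not the actual Deriva-ML schema.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative sketch of the six artifact types; field names are assumptions.

@dataclass
class ControlledVocabulary:
    name: str
    terms: List[str]          # the allowed terms, e.g. diagnosis labels

@dataclass
class Dataset:
    rid: str                  # resource identifier
    version: str
    members: List[str] = field(default_factory=list)

@dataclass
class Feature:
    name: str
    source_dataset: str       # rid of the Dataset it was derived from
    vocabulary: str           # ControlledVocabulary constraining its values

@dataclass
class Asset:
    rid: str
    url: str                  # location of the file (model weights, plots, ...)

@dataclass
class Workflow:
    rid: str
    code_url: str             # versioned code defining the computation

@dataclass
class Execution:
    rid: str
    workflow: str             # rid of the Workflow that was run
    inputs: List[str]         # rids of input Datasets/Assets
    outputs: List[str]        # rids of produced Assets

def provenance_chain(execution: Execution) -> dict:
    """Summarize how an Execution's outputs trace back to inputs and code."""
    return {"workflow": execution.workflow,
            "inputs": execution.inputs,
            "outputs": execution.outputs}
```

Because every record points at the others by identifier, replaying or auditing an experiment reduces to walking these links rather than reconstructing it from scattered scripts.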
- North America > United States > California > Los Angeles County > Los Angeles (0.15)
- North America > United States > California > Monterey County > Marina (0.04)
- Research Report (1.00)
- Workflow (0.83)
- Information Technology > Information Management (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Quality > Data Cleaning (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.34)
MLScent: A Tool for Anti-pattern Detection in ML Projects
Shivashankar, Karthik, Martini, Antonio
--Machine learning (ML) codebases face unprecedented challenges in maintaining code quality and sustainability as their complexity grows. While traditional code smell detection tools exist, they fail to address ML-specific issues that can significantly impact model performance, reproducibility, and maintainability. This paper introduces MLScent, a novel static analysis tool that leverages sophisticated Abstract Syntax Tree (AST) analysis to detect anti-patterns and code smells specific to ML projects. MLScent implements 76 distinct detectors across major ML frameworks including TensorFlow (13 detectors), PyTorch (12 detectors), Scikit-learn (9 detectors), and Hugging Face (10 detectors), along with data science libraries like Pandas and NumPy (8 detectors each). Our evaluation demonstrates MLScent's effectiveness through both quantitative classification metrics and qualitative assessment via user-study feedback from ML practitioners. Results show high accuracy in identifying framework-specific anti-patterns, data handling issues, and general ML code smells across real-world projects.

The software development landscape has undergone a dramatic transformation with the integration of Machine Learning (ML). Recent statistics from Gartner highlight this shift, revealing a striking 270% increase in ML adoption within enterprise software projects over the last four years [1]. This rapid adoption, however, brings its own set of complexities. Traditional software development practices have had to evolve significantly to accommodate ML's unique requirements, including the need for extensive datasets, sophisticated algorithms, and iterative development cycles [3]. These fundamental differences have catalyzed a complete reimagining of software development methodologies, from initial design through testing and maintenance [4], [5], which is also highlighted by Tang et al. [6] in their empirical study of ML systems refactoring and technical debt.
ML projects introduce distinct code quality challenges that set them apart from conventional software development. The complexity stems from their inherent characteristics: intricate mathematical operations, extensive data preprocessing requirements, and sophisticated model architectures that challenge traditional code maintenance approaches [7].
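To make the AST-based detection idea concrete, here is a minimal sketch of a single detector in the style the abstract describes: it flags calls to scikit-learn's `train_test_split` that omit `random_state`, a classic reproducibility smell. This detector is our own illustration, not one of MLScent's 76 actual detectors.

```python
import ast

def find_unseeded_splits(source: str) -> list:
    """Return line numbers where train_test_split is called without random_state."""
    smells = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Handle both bare names and attribute access (module.func style).
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name == "train_test_split":
                kwargs = {kw.arg for kw in node.keywords}
                if "random_state" not in kwargs:
                    smells.append(node.lineno)
    return smells

code = """from sklearn.model_selection import train_test_split
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2)
"""
print(find_unseeded_splits(code))  # → [2]: the unseeded call is on line 2
```

Because the check runs on the syntax tree rather than on text, it is robust to formatting and catches the pattern regardless of how the call is laid out.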
- North America > United States (0.04)
- Europe > Norway > Eastern Norway > Oslo (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.88)
TAACKIT: Track Annotation and Analytics with Continuous Knowledge Integration Tool
Lee, Lily, Fontes, Julian, Weinert, Andrew, Schomacker, Laura, Stabile, Daniel, Hou, Jonathan
Machine learning (ML) is a powerful tool for efficiently analyzing data, detecting patterns, and forecasting trends across various domains such as text, audio, and images. The availability of annotation tools to generate reliably annotated data is crucial for advances in ML applications. In the domain of geospatial tracks, the lack of such tools to annotate and validate data impedes rapid and accessible ML application development. This paper presents Track Annotation and Analytics with Continuous Knowledge Integration Tool (TAACKIT) to serve the critically important functions of annotating geospatial track data and validating ML models. We demonstrate an ML application use case in the air traffic domain to illustrate its data annotation and model evaluation power and quantify the annotation effort reduction.
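Annotating a geospatial track typically means attaching a label to a time window of the trajectory. The sketch below is a hypothetical illustration of that pattern in the spirit of TAACKIT; the field names and selection logic are our assumptions, not the tool's actual data model.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical track/annotation records; field names are illustrative only.

@dataclass
class TrackPoint:
    t: float        # seconds since track start
    lat: float
    lon: float
    alt_ft: float

@dataclass
class Annotation:
    label: str      # e.g. "level", "climb", "descent"
    t_start: float
    t_end: float

def points_in_annotation(track: List[TrackPoint], ann: Annotation) -> List[TrackPoint]:
    """Select the track points covered by an annotation's time window."""
    return [p for p in track if ann.t_start <= p.t <= ann.t_end]

# A toy track climbing 100 ft per second, annotated over t = 2..6.
track = [TrackPoint(t, 42.0, -71.0, 5000 + 100 * t) for t in range(10)]
climb = Annotation("climb", 2.0, 6.0)
print(len(points_in_annotation(track, climb)))  # → 5 points fall in the window
```

Labeled windows like these can then serve double duty: as training data for an ML model and as ground truth when validating its predictions on held-out tracks.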
- North America > United States > Massachusetts > Middlesex County > Lexington (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
What Makes An Expert? Reviewing How ML Researchers Define "Expert"
Human experts are often engaged in the development of machine learning systems to collect and validate data, consult on algorithm development, and evaluate system performance. At the same time, who counts as an 'expert' and what constitutes 'expertise' is not always explicitly defined. In this work, we review 112 academic publications that explicitly reference 'expert' and 'expertise' and that describe the development of machine learning (ML) systems to survey how expertise is characterized and the role experts play. We find that expertise is often undefined and forms of knowledge outside of formal education and professional certification are rarely sought, which has implications for the kinds of knowledge that are recognized and legitimized in ML development. Moreover, we find that expert knowledge tends to be utilized in ways focused on mining textbook knowledge, such as through data annotation. We discuss the ways experts are engaged in ML development in relation to deskilling, the social construction of expertise, and implications for responsible AI development. We point to a need for reflection and specificity in justifications of domain expert engagement, both as a matter of documentation and reproducibility, as well as a matter of broadening the range of recognized expertise.
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts > Middlesex County > Waltham (0.04)
- (3 more...)
- Overview (0.87)
- Research Report (0.82)
- Health & Medicine (1.00)
- Education (1.00)
Service in review: Sagemaker Modeling Pipelines - DEV Community
Welcome back to my blog, where I share insights and tips on machine learning workflows using SageMaker Pipelines. If you're new here, I recommend checking out my first post to learn more about this AWS fully managed machine learning service. In my second post, I discussed how parameterization can help you customize a workflow and make it more flexible and efficient. After using SageMaker Pipelines extensively in real-life projects, I've gained a comprehensive understanding of the service. In this post, I'll summarize the key benefits of SageMaker Pipelines and the limitations you should consider before adopting it. Because the service is integrated directly with SageMaker, users don't have to wire together other AWS services themselves.
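The parameterization pattern the post refers to boils down to declaring named parameters with defaults and overriding them per run. The real SDK exposes this via `sagemaker.workflow.parameters` (e.g. `ParameterString`); the standalone sketch below mimics only the pattern, with no AWS dependency, so the names here are illustrative stand-ins rather than the SageMaker API.

```python
# Minimal analogue of pipeline parameterization; not the SageMaker SDK.

class Parameter:
    def __init__(self, name, default):
        self.name, self.default = name, default

class Pipeline:
    def __init__(self, name, parameters, steps):
        self.name = name
        self.parameters = {p.name: p.default for p in parameters}
        self.steps = steps

    def start(self, **overrides):
        """Resolve parameters (defaults plus per-run overrides), then run steps."""
        resolved = {**self.parameters, **overrides}
        return [step(resolved) for step in self.steps]

instance_type = Parameter("TrainingInstanceType", "ml.m5.xlarge")
pipeline = Pipeline(
    "demo",
    parameters=[instance_type],
    steps=[lambda p: f"train on {p['TrainingInstanceType']}"],
)
print(pipeline.start())                                      # → ['train on ml.m5.xlarge']
print(pipeline.start(TrainingInstanceType="ml.p3.2xlarge"))  # → ['train on ml.p3.2xlarge']
```

The payoff is the same as in the managed service: one pipeline definition serves dev and production runs by swapping parameter values at start time instead of editing the workflow.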
Does your business need AI/ML Development?
AI/ML is an innovative technology that many see as the next wave: it can analyze heaps of data (structured as well as unstructured) and make intelligent decisions without needing human intervention. AI/ML development companies can help you increase productivity and minimize operational inefficiencies across several facets of your business. Many businesses can leverage AI for complex use cases and to support business development lifecycles, from customer support to marketing and lead generation. Here are some areas where hiring an AI/ML development company can prove fruitful for your business. Digital channels are a crucial part of any marketing program.
Introducing the Private Hub: A New Way to Build With Machine Learning
Machine learning is changing how companies are building technology. From powering a new generation of disruptive products to enabling smarter features in well-known applications we all use and love, ML is at the core of the development process. But with every technology shift comes new challenges. Around 90% of machine learning models never make it into production. Efforts get duplicated as models and datasets aren't shared internally, and similar artifacts are built from scratch across teams all the time.
How to find the business value in AI and ML
There's no doubt that, when applied effectively, machine learning (ML) and artificial intelligence (AI) have proven potential to deliver significant value and cutting-edge technological innovation. But many organizations are struggling with the "effectively" part, according to a new survey. Despite the fact that businesses are increasingly undertaking initiatives to leverage ML and AI, many tools and projects lack appropriate resources, are far less productive than they should be, lag in deployment, and more often than not fail or are abandoned. In short, business value is rarely captured, and very often falls short of expectations, because significant time, resources, and budgets are being wasted, according to a 2021 survey of ML practitioners, "Too Much Friction, Too Little ML." "Building AI is hard," said Gideon Mendels, CEO and cofounder of Comet, the enterprise ML development platform company that commissioned the survey. "ML is often a slow, iterative process with many potential pitfalls and moving parts."
Top Emerging Machine Learning Trends For 2022
Machine learning creates algorithms that help machines better comprehend data and make data-driven judgments. According to some observers, machine learning will become quite widespread by 2024, with the most emphasis in 2022 and 2023. Machine learning (ML) applications can be found in a variety of industries, including banks, restaurants, industrial plants, and even gas stations. The first and most important ML development is in IoT, which the majority of computing professionals are looking forward to. As a cornerstone of IoT, a breakthrough in this area will significantly impact 5G adoption.
- Information Technology (0.52)
- Transportation > Ground > Road (0.37)
How to scale AI with a high degree of customization
In a previous post, I outlined four challenges to scaling AI: customization, data, talent, and trust. In this post, I'm going to dig deeper into that first challenge of customization. Scaling machine learning programs is very different from scaling traditional software, because ML programs have to be adapted to fit each new problem you approach. As the data you're using changes (whether because you're attacking a new problem or simply because time has passed), you will likely need to build and train new models. This takes human input and supervision. The degree of supervision varies, and that is critical to understanding the scalability challenge.