AITopics | refactor

refactor

CODECLEANER: Elevating Standards with A Robust Data Contamination Mitigation Toolkit

Cao, Jialun, Chen, Songqiang, Zhang, Wuqi, Lo, Hau Ching, Cheung, Shing-Chi

arXiv.org Artificial Intelligence, Nov-16-2024

Data contamination presents a critical barrier preventing widespread industrial adoption of advanced software engineering techniques that leverage code language models (CLMs). This phenomenon occurs when evaluation data inadvertently overlaps with the public code repositories used to train CLMs, severely undermining the credibility of performance evaluations. For software companies considering the integration of CLM-based techniques into their development pipeline, this uncertainty about true performance metrics poses an unacceptable business risk. Code refactoring, which comprises code restructuring and variable renaming, has emerged as a promising measure to mitigate data contamination. It provides a practical alternative to the resource-intensive process of building contamination-free evaluation datasets, which would require companies to collect, clean, and label code created after the CLMs' training cutoff dates. However, the lack of automated code refactoring tools and scientifically validated refactoring techniques has hampered widespread industrial implementation. To bridge the gap, this paper presents the first systematic study to examine the efficacy of code refactoring operators at multiple scales (method-level, class-level, and cross-class level) and in different programming languages. In particular, we develop an open-source toolkit, CODECLEANER, which includes 11 operators for Python, with nine method-level, one class-level, and one cross-class-level operator. Applying all operators in CODECLEANER yields a 65% drop in overlap ratio, demonstrating their effectiveness in addressing data contamination. Additionally, we migrate four operators to Java, showing their generalizability to another language. We make CODECLEANER available online to facilitate further studies on mitigating CLM data contamination.

data contamination, effectiveness, operator, (11 more...)

2411.10842

Country:

Asia > China > Hong Kong (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > Middle East > Malta > Eastern Region > Northern Harbour District > St. Julian's (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Software (0.86)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Software (0.91)
(2 more...)

REFACTOR: Learning to Extract Theorems from Proofs

Zhou, Jin Peng, Wu, Yuhuai, Li, Qiyang, Grosse, Roger

arXiv.org Artificial Intelligence, Feb-26-2024

Human mathematicians are often good at recognizing modular and reusable theorems that bring complex mathematical results within reach. In this paper, we propose a novel method called theoREm-from-prooF extrACTOR (REFACTOR) for training neural networks to mimic this ability in formal mathematical theorem proving. We show that, on a set of unseen proofs, REFACTOR is able to extract 19.6% of the theorems that humans would use to write the proofs. When applying the model to the existing Metamath library, REFACTOR extracted 16 new theorems. With newly extracted theorems, we show that the existing proofs in the Metamath database can be refactored. The new theorems are used very frequently after refactoring, with an average usage of 733.5 times, and help shorten the proof lengths. Lastly, we demonstrate that the prover trained on the new-theorem refactored dataset proves more test theorems and outperforms state-of-the-art baselines by frequently leveraging a diverse set of newly extracted theorems. Code can be found at https://github.com/jinpz/refactor.

prop, theorem, wffph prop, (16 more...)

2402.17032

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Oceania > Fiji > Central Division > Suva (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Adaptive Reconvergence-driven AIG Rewriting via Strategy Learning

Ni, Liwei, Yang, Zonglin, Zhang, Jiaxi, Liu, Junfeng, Li, Huawei, Xie, Biwei, Li, Xinquan

arXiv.org Artificial Intelligence, Dec-22-2023

Rewriting is a common procedure in logic synthesis aimed at improving the performance, power, and area (PPA) of circuits. The traditional reconvergence-driven And-Inverter Graph (AIG) rewriting method focuses solely on optimizing the reconvergence cone through Boolean algebra minimization. However, there exist opportunities to incorporate other node-rewriting algorithms that are better suited for specific cones. In this paper, we propose an adaptive reconvergence-driven AIG rewriting algorithm that combines two key techniques: multi-strategy-based AIG rewriting and strategy learning-based algorithm selection. The multi-strategy-based rewriting method expands upon the traditional approach by incorporating support for multi-node-rewriting algorithms, thus expanding the optimization space. Additionally, the strategy learning-based algorithm selection method determines the most suitable node-rewriting algorithm for a given cone. Experimental results demonstrate that our proposed method yields a significant average improvement of 5.567% in size and 5.327% in depth.

algorithm, cone, node-rewriting algorithm, (15 more...)

2312.14536

Country:

Asia > China > Beijing > Beijing (0.05)
Europe (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Why AI is your friend when it comes to cloud migration

#artificialintelligence, Nov-12-2019, 05:47:22 GMT

But the drawbacks of not making the investment to rebuild your legacy apps for the cloud are technological debt, competitive disadvantages in agility and frustrated customers left suffering poor user experiences. Organisations need to decide which applications to move to the cloud and which to keep on-premise. Then, they must decide how to refactor those apps with cloud-native technologies or create a hybrid-cloud setup - it's a complicated process. Successful cloud migrations and transformation rely on automating continuous builds, integration and delivery as well as automating performance monitoring, root-cause analysis and remediation. This 'automate everything' approach goes hand in hand with leveraging AI.

ai and automation, application, cloud migration, (11 more...)

Industry: Information Technology > Services (0.73)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence (1.00)

Coding habits for data scientists

#artificialintelligence, Nov-4-2019, 01:39:00 GMT

While this may be fine for notebooks targeted at teaching people about the machine learning process, in real projects it's a recipe for an unmaintainable mess. The lack of good coding habits makes code hard to understand and, consequently, modifying code becomes painful and error-prone. This makes it increasingly difficult for data scientists and developers to evolve their ML solutions. In this article, we'll share techniques for identifying bad habits that add to complexity in code as well as habits that can help us partition complexity.

complexity, notebook, print statement, (16 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.90)

Technical Debt in Data Science Series -- Part 1 – Acing AI – Medium

#artificialintelligence, Aug-15-2018, 15:41:13 GMT

A Data Science Interview involves different challenges for a potential data scientist. As much as the interview is for the company to decide if the person is a fit, it is also for the person to decide if the company is a fit. Understanding whether a company is a fit requires one to ask the interviewers some important questions and understand how the data team functions in different areas. Technical Debt in Data Science is one such area. My AI Interview Questions articles for Microsoft, Google, Amazon, Netflix, LinkedIn, Ebay, Twitter, Walmart, Apple, Facebook, Salesforce and Uber have been very helpful to the readers.

artificial intelligence, social media, technical debt, (16 more...)

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > India (0.05)

Industry: Information Technology > Services (0.55)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Barriers to Refactoring

Communications of the ACM, Sep-25-2017, 19:55:15 GMT

Refactoring [6] is something software developers like to do. But do they refactor as much as they would like? Are there barriers that prevent them from doing so? Refactoring is an important tool for improving quality. Many development methodologies rely on refactoring, especially agile methodologies, but so do more plan-driven organizations. If barriers exist, they would undermine the effectiveness of many product-development organizations. We conducted a large-scale survey in 2009 of 3,785 practitioners' use of object-oriented concepts [7], including questions as to whether they would refactor to deal with certain design problems. We expected either that practitioners would tell us our choice of design principles was inappropriate for basing a refactoring decision or that refactoring is the right decision to take when designs were believed to have quality problems. However, we were told the decision of whether or not to refactor was due to non-design considerations. It is now eight years since the survey, but little has changed in integrated development environment (IDE) support for refactoring, and what has changed has done little to address the barriers we identified.

object-oriented architecture, programming language, refactor, (17 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.05)
Europe > Greece > Central Macedonia > Thessaloniki (0.04)
(9 more...)

Genre:

Questionnaire & Opinion Survey (0.94)
Research Report > New Finding (0.69)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.89)
Information Technology > Software > Programming Languages (0.67)

ReFACTor: Practical Low-Rank Matrix Estimation Under Column-Sparsity

Gavish, Matan, Schweiger, Regev, Rahmani, Elior, Halperin, Eran

arXiv.org Machine Learning, May-22-2017

Various problems in data analysis and statistical genetics call for recovery of a column-sparse, low-rank matrix from noisy observations. We propose ReFACTor, a simple variation of the classical Truncated Singular Value Decomposition (TSVD) algorithm. In contrast to previous sparse principal component analysis (PCA) algorithms, our algorithm can provably reveal a low-rank signal matrix better, and often significantly better, than the widely used TSVD, making it the algorithm of choice whenever column-sparsity is suspected. Empirically, we observe that ReFACTor consistently outperforms TSVD even when the underlying signal is not sparse, suggesting that it is generally safe to use ReFACTor instead of TSVD and PCA. The algorithm is extremely simple to implement and its running time is dominated by the runtime of PCA, making it as practical as standard principal component analysis.
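
For orientation, the classical TSVD baseline that the abstract compares against can be sketched in a few lines of NumPy. The abstract does not spell out how ReFACTor modifies TSVD, so this sketch covers only the baseline; the function name `truncated_svd` and the synthetic column-sparse test matrix are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def truncated_svd(Y: np.ndarray, k: int) -> np.ndarray:
    """Rank-k approximation of Y obtained by keeping the k largest singular values."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # Scale the first k left singular vectors by their singular values,
    # then project back with the first k right singular vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


# Hypothetical setup: a rank-1 signal that is nonzero in only 5 of 50 columns.
rng = np.random.default_rng(0)
u = rng.normal(size=(20, 1))
v = np.zeros((1, 50))
v[0, :5] = rng.normal(size=5)
X = u @ v                                   # column-sparse, low-rank signal
Y = X + 0.1 * rng.normal(size=X.shape)      # noisy observations

X_hat = truncated_svd(Y, k=1)
print("relative error:", np.linalg.norm(X_hat - X) / np.linalg.norm(X))
```

Plain TSVD treats all columns alike, which is why a method that exploits column-sparsity has room to improve on it when most columns carry only noise.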

artificial intelligence, machine learning, refactor, (14 more...)

1705.07654

Country:

Asia > Middle East > Israel (0.28)
North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > Experimental Study (0.47)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)