AITopics

2206.11303

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
North America > United States > New York > Tompkins County > Ithaca (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Journal of Artificial Intelligence ResearchNov-16-2023

Maintenance of Plan Libraries for Case-Based Planning: Offline and Online Policies

Gerevini, Alfonso Emilio | Saetti, Alessandro (a:1:{s:5:"en_US";s:21:"University of Brescia";}) | Serina, Ivan | Loreggia, Andrea | Putelli, Luca | Roubickova, Anna

Case-based planning is an approach to planning where previous planning experience provides guidance to solving new problems. Such a guidance can be extremely useful, or even necessary, when the new problem is very hard to solve, or the stored previous experience is highly valuable, because, e.g., it was provided or validated by human experts, and the system should try to reuse it as much as possible. To do so, a case-based planning system stores in a library previous planning experience in the form of already encountered problems and their solutions. The quality of such a plan library critically influences the performance of the planner, and therefore it needs to be carefully designed and created. For this reason, it is also important to update the library during the lifetime of the system, as the type of problems being addressed may evolve or differ from the ones the library was originally designed for. Moreover, like in general case-based reasoning, the library needs to be maintained at a manageable size, otherwise the computational cost of querying it grows excessively, making the entire approach ineffective. In this paper, we formally define the problem of maintaining a library of cases, discuss which criteria should drive the maintenance, study the computational complexity of the maintenance problem, and propose offline techniques to reduce an oversized library that optimize different criteria. Moreover, we introduce a complementary online approach that attempts to limit the growth of the library, and we consider the combination of offline and online techniques to ensure the best performance of the case-based planner. Finally, we experimentally show the practical effectiveness of the offline and online methods for reducing the library.

case base, library, plan library, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.14797

AI Access Foundation

14797

Journal of Artificial Intelligence Research

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Prakash, Nikhil Potu Surya, Seo, Joohwan, Choi, Jongeun, Horowitz, Roberto

Clustering Techniques for Stable Linear Dynamical Systems with applications to Hard Disk Drives

arXiv.org Artificial IntelligenceNov-16-2023

Abstract: In Robust Control and Data Driven Robust Control design methodologies, multiple plant transfer functions or a family of transfer functions are considered and a common controller is designed such that all the plants that fall into this family are stabilized. Though the plants are stabilized, the controller might be sub-optimal for each of the plants when the variations in the plants are large. This paper presents a way of clustering stable linear dynamical systems for the design of robust controllers within each of the clusters such that the controllers are optimal for each of the clusters. First a k-medoids algorithm for hard clustering will be presented for stable Linear Time Invariant (LTI) systems and then a Gaussian Mixture Models (GMM) clustering for a special class of LTI systems, common for Hard Disk Drive plants, will be presented. Keywords: Robust Control, Clustering, k-medoids, Gaussian Mixture Models, Hard Disk Drives 1. INTRODUCTION Recently, in most of the personal computers, Solid State Drives (SSD) have replaced HDDs as the data transfer rate of SSDs is much higher than that of HDDs.

algorithm, controller, transfer function, (12 more...)

2311.10322

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Virginia (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.50)

arXiv.org Artificial IntelligenceNov-16-2023

Interpretable pap smear cell representation for cervical cancer screening

Ando, Yu, and, Nora Jee-Young Park, Chong, Gun Oh, Ko, Seokhwan, Lee, Donghyeon, Cho, Junghwan, Han, Hyungsoo

Screening is critical for prevention and early detection of cervical cancer but it is time-consuming and laborious. Supervised deep convolutional neural networks have been developed to automate pap smear screening and the results are promising. However, the interest in using only normal samples to train deep neural networks has increased owing to class imbalance problems and high-labeling costs that are both prevalent in healthcare. In this study, we introduce a method to learn explainable deep cervical cell representations for pap smear cytology images based on one class classification using variational autoencoders. Findings demonstrate that a score can be calculated for cell abnormality without training models with abnormal samples and localize abnormality to interpret our results with a novel metric based on absolute difference in cross entropy in agglomerative clustering. The best model that discriminates squamous cell carcinoma (SCC) from normals gives 0.908 +- 0.003 area under operating characteristic curve (AUC) and one that discriminates high-grade epithelial lesion (HSIL) 0.920 +- 0.002 AUC. Compared to other clustering methods, our method enhances the V-measure and yields higher homogeneity scores, which more effectively isolate different abnormality regions, aiding in the interpretation of our results. Evaluation using in-house and additional open dataset show that our model can discriminate abnormality without the need of additional training of deep models.

abnormality, dataset, latent space, (16 more...)

2311.10269

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > South Korea > Daegu > Daegu (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Cervical Cancer (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Park, Yu Min, Tun, Yan Kyaw, Hong, Choong Seon

Joint User Pairing and Beamforming Design of Multi-STAR-RISs-Aided NOMA in the Indoor Environment via Multi-Agent Reinforcement Learning

arXiv.org Artificial IntelligenceNov-16-2023

The development of 6G/B5G wireless networks, which have requirements that go beyond current 5G networks, is gaining interest from academia and industry. However, to increase 6G/B5G network quality, conventional cellular networks that rely on terrestrial base stations are constrained geographically and economically. Meanwhile, NOMA allows multiple users to share the same resources, which improves the spectral efficiency of the system and has the advantage of supporting a larger number of users. Additionally, by intelligently manipulating the phase and amplitude of both the reflected and transmitted signals, STAR-RISs can achieve improved coverage, increased spectral efficiency, and enhanced communication reliability. However, STAR-RISs must simultaneously optimize the amplitude and phase shift corresponding to reflection and transmission, which makes the existing terrestrial networks more complicated and is considered a major challenging issue. Motivated by the above, we study the joint user pairing for NOMA and beamforming design of Multi-STAR-RISs in an indoor environment. Then, we formulate the optimization problem with the objective of maximizing the total throughput of MUs by jointly optimizing the decoding order, user pairing, active beamforming, and passive beamforming. However, the formulated problem is a MINLP. To address this challenge, we first introduce the decoding order for NOMA networks. Next, we decompose the original problem into two subproblems, namely: 1) MU pairing and 2) Beamforming optimization under the optimal decoding order. For the first subproblem, we employ correlation-based K-means clustering to solve the user pairing problem. Then, to jointly deal with beamforming vector optimizations, we propose MAPPO, which can make quick decisions in the given environment owing to its low complexity.

algorithm, optimization, star-ris, (14 more...)

2311.08708

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.04)
Europe > Hungary > Budapest > Budapest (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Telecommunications (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.35)

HAL 9000: Skynet's Risk Manager

Freitas, Tadeu, Neto, Mário, Dutra, Inês, Soares, João, Correia, Manuel, Martins, Rolando

Intrusion Tolerant Systems (ITSs) are a necessary component for cyber-services/infrastructures. Additionally, as cyberattacks follow a multi-domain attack surface, a similar defensive approach should be applied, namely, the use of an evolving multi-disciplinary solution that combines ITS, cybersecurity and Artificial Intelligence (AI). With the increased popularity of AI solutions, due to Big Data use-case scenarios and decision support and automation scenarios, new opportunities to apply Machine Learning (ML) algorithms have emerged, namely ITS empowerment. Using ML algorithms, an ITS can augment its intrusion tolerance capability, by learning from previous attacks and from known vulnerabilities. As such, this work's contribution is twofold: (1) an ITS architecture (Skynet) based on the state-of-the-art and incorporates new components to increase its intrusion tolerance capability and its adaptability to new adversaries; (2) an improved Risk Manager design that leverages AI to improve ITSs by automatically assessing OS risks to intrusions, and advise with safer configurations. One of the reasons that intrusions are successful is due to bad configurations or slow adaptability to new threats. This can be caused by the dependency that systems have for human intervention. One of the characteristics in Skynet and HAL 9000 design is the removal of human intervention. Being fully automatized lowers the chance of successful intrusions caused by human error. Our experiments using Skynet, shows that HAL is able to choose 15% safer configurations than the state-of-the-art risk manager.

configuration, information, replica, (16 more...)

2311.09449

Country:

Europe > Portugal > Porto > Porto (0.04)
South America > Brazil > Pará > Belém (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre:

Research Report (0.63)
Overview (0.45)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
(3 more...)

Linear time Evidence Accumulation Clustering with KMeans

Candel, Gaëlle

Among ensemble clustering methods, Evidence Accumulation Clustering is one of the simplest technics. In this approach, a co-association (CA) matrix representing the co-clustering frequency is built and then clustered to extract consensus clusters. Compared to other approaches, this one is simple as there is no need to find matches between clusters obtained from two different partitionings. Nevertheless, this method suffers from computational issues, as it requires to compute and store a matrix of size n x n, where n is the number of items. Due to the quadratic cost, this approach is reserved for small datasets. This work describes a trick which mimic the behavior of average linkage clustering. We found a way of computing efficiently the density of a partitioning, reducing the cost from a quadratic to linear complexity. Additionally, we proved that the k-means maximizes naturally the density. We performed experiments on several benchmark datasets where we compared the k-means and the bisecting version to other state-of-the-art consensus algorithms. The k-means results are comparable to the best state of the art in terms of NMI while keeping the computational cost low. Additionally, the k-means led to the best results in terms of density. These results provide evidence that consensus clustering can be solved with simple algorithms.

algorithm, ensemble, matrix, (12 more...)

2311.09272

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Khalilian, Saeed, Tsouvalas, Vasileios, Ozcelebi, Tanir, Meratnia, Nirvana

FedCode: Communication-Efficient Federated Learning via Transferring Codebooks

Federated Learning (FL) is a distributed machine learning paradigm that enables learning models from decentralized local data. While FL offers appealing properties for clients' data privacy, it imposes high communication burdens for exchanging model weights between a server and the clients. Existing approaches rely on model compression techniques, such as pruning and weight clustering to tackle this. However, transmitting the entire set of weight updates at each federated round, even in a compressed format, limits the potential for a substantial reduction in communication volume. We propose FedCode where clients transmit only codebooks, i.e., the cluster centers of updated model weight values. To ensure a smooth learning curve and proper calibration of clusters between the server and the clients, FedCode periodically transfers model weights after multiple rounds of solely communicating codebooks. This results in a significant reduction in communication volume between clients and the server in both directions, without imposing significant computational overhead on the clients or leading to major performance degradation of the models. We evaluate the effectiveness of FedCode using various publicly available datasets with ResNet-20 and MobileNet backbone model architectures. Our evaluations demonstrate a 12.2-fold data transmission reduction on average while maintaining a comparable model performance with an average accuracy loss of 1.3% compared to FedAvg. Further validation of FedCode performance under non-IID data distributions showcased an average accuracy loss of 2.0% compared to FedAvg while achieving approximately a 12.7-fold data transmission reduction.

accuracy, codebook, fedcode, (14 more...)

2311.0927

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.04)
North America > United States > Virginia (0.04)
North America > United States > District of Columbia > Washington (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Das, Devleena, Khetan, Vivek

DEFT: Data Efficient Fine-Tuning for Large Language Models via Unsupervised Core-Set Selection

Recent advances have led to the availability of many pre-trained language models (PLMs); however, a question that remains is how much data is truly needed to fine-tune PLMs for downstream tasks? In this work, we introduce DEFT, a data-efficient fine-tuning framework that leverages unsupervised core-set selection to minimize the amount of data needed to fine-tune PLMs for downstream tasks. We demonstrate the efficacy of our DEFT framework in the context of text-editing LMs, and compare to the state-of-the art text-editing model, CoEDIT. Our quantitative and qualitative results demonstrate that DEFT models are just as accurate as CoEDIT while being finetuned on ~70% less data.

arxiv preprint arxiv, coedit, dataset, (13 more...)

2310.16776

Country:

Europe > France (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

arXiv.org Artificial IntelligenceNov-14-2023

In the Red(dit): Social Media and Stock Prices

Baker, James

I spent most of the summer sifting through topics, including patents, blood diamonds and police brutality. None of them really stuck. However, on September 16, 2021 I sent Dr. White this email: " In a shocking turn of events, I have found another thing I would like to research. I would like to see if Twitter "coverage" of a publicly traded stock or related phrase ("google" and "search engine") can predict the daily returns of the stock, or changes in the highs/lows/volume of trades. The theory here being that investors' valuation of a stock may be reinforced or informed based on their perception of how others think about the companies (sort of like herd behavior) or their familiarity with the firm in general. If there is an effect, I would particularly like to examine whether this effect has gotten stronger during corona times, as many journalists are claiming that now that everyone is sitting at home with their commission-free trading apps like RobinHood, there are tons of amateur investors in the scene, who may be prone to "window shopping" for hot stocks that make the news.

benchmark, information, table 6, (16 more...)

2311.09252

Country:

North America > United States > New York > New York County > New York City (0.05)
Africa > Sudan (0.04)
South America > Argentina (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Information Technology > Services (1.00)
Banking & Finance > Trading (1.00)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)