 Generative AI


Cancer-Net SCa-Synth: An Open Access Synthetically Generated 2D Skin Lesion Dataset for Skin Cancer Classification

arXiv.org Artificial Intelligence

In the United States, skin cancer ranks as the most commonly diagnosed cancer, presenting a significant public health issue due to its high rates of occurrence and the risk of serious complications if not caught early. Recent advancements in dataset curation and deep learning have shown promise in quick and accurate detection of skin cancer. However, current open-source datasets have significant class imbalances, which impede the effectiveness of these deep learning models. In healthcare, generative artificial intelligence (AI) models have been employed to create synthetic data, addressing data imbalance in datasets by augmenting underrepresented classes and enhancing the overall quality and performance of machine learning models. In this paper, we build on previous work by leveraging new advancements in generative AI, notably Stable Diffusion and DreamBooth. We introduce Cancer-Net SCa-Synth, an open access synthetically generated 2D skin lesion dataset for skin cancer classification. Further analysis of the data's effectiveness, comparing ISIC 2020 test set performance for a simple model trained with and without these synthetic images, highlights the benefits of leveraging synthetic data to improve performance.


Enhancing Investment Analysis: Optimizing AI-Agent Collaboration in Financial Research

arXiv.org Artificial Intelligence

In recent years, the application of generative artificial intelligence (GenAI) in financial analysis and investment decision-making has gained significant attention. However, most existing approaches rely on single-agent systems, which fail to fully utilize the collaborative potential of multiple AI agents. In this paper, we propose a novel multi-agent collaboration system designed to enhance decision-making in financial investment research. The system incorporates agent groups with both configurable group sizes and collaboration structures to leverage the strengths of each agent group type. By utilizing a sub-optimal combination strategy, the system dynamically adapts to varying market conditions and investment scenarios, optimizing performance across different tasks. We focus on three sub-tasks: fundamentals, market sentiment, and risk analysis, by analyzing the 2023 SEC 10-K forms of 30 companies listed on the Dow Jones Index. Our findings reveal significant performance variations based on the configurations of AI agents for different tasks. The results demonstrate that our multi-agent collaboration system outperforms traditional single-agent models, offering improved accuracy, efficiency, and adaptability in complex financial environments. This study highlights the potential of multi-agent systems in transforming financial analysis and investment decision-making by integrating diverse analytical perspectives.


Learn to Solve Vehicle Routing Problems ASAP: A Neural Optimization Approach for Time-Constrained Vehicle Routing Problems with Finite Vehicle Fleet

arXiv.org Artificial Intelligence

Finding a feasible and prompt solution to the Vehicle Routing Problem (VRP) is a prerequisite for efficient freight transportation, seamless logistics, and sustainable mobility. Traditional optimization methods reach their limits when confronted with the real-world complexity of VRPs, which involve numerous constraints and objectives. Recently, the ability of generative Artificial Intelligence (AI) to solve combinatorial tasks, known as Neural Combinatorial Optimization (NCO), demonstrated promising results, offering new perspectives. In this study, we propose an NCO approach to solve a time-constrained capacitated VRP with a finite vehicle fleet size. The approach is based on an encoder-decoder architecture, formulated in line with the Policy Optimization with Multiple Optima (POMO) protocol and trained via a Proximal Policy Optimization (PPO) algorithm. We successfully trained the policy with multiple objectives (minimizing the total distance while maximizing vehicle utilization) and evaluated it on medium and large instances, benchmarking it against state-of-the-art heuristics. The method is able to find adequate and cost-efficient solutions, showing both flexibility and robust generalization. Finally, we provide a critical analysis of the solution generated by NCO and discuss the challenges and opportunities of this new branch of intelligent learning algorithms emerging in optimization science, focusing on freight transportation.


OpenAI bought the web domain Chat.com

Engadget

OpenAI has scooped up a domain name that sounds like a logical fit. TechCrunch reports that Chat.com, which was previously bought for over $15 million, is now in the hands of the ChatGPT maker. According to the domain history website who.is, Chat.com was first registered way back in September 1996. Before OpenAI's acquisition, it last changed hands in 2023, when HubSpot co-founder and CTO Dharmesh Shah reportedly bought it for $15.5 million.


Goodbye Google? How to use ChatGPT's new web search.

Popular Science

After a limited testing period, OpenAI is opening up its ChatGPT web search tool to all users. The rollout starts with those paying $20 a month for ChatGPT Plus, but OpenAI says it'll show up for everyone, whether they're subscribers or not, in the coming months. While it uses the familiar ChatGPT interface, web search works a little differently. Instead of the bot relying on its training data to come up with answers, it scours the web for relevant and timely information, and then sifts through and summarizes what it's found to generate a coherent response. It means you can type all the queries you would normally plug into Google into ChatGPT instead--from "what time is the Superbowl?" to "what are the best places to visit in Florence?"


Windows Terminal now supports AI chatbots like ChatGPT, GitHub Copilot

PCWorld

Less than a month ago, OpenAI released its Windows app for ChatGPT, allowing you to use the AI chatbot on a desktop PC without running it in a browser tab. But it was an "early version" reserved for paid subscribers, so many users were left out in the cold. Fortunately, there is a way to use ChatGPT on Windows without paying for a premium plan -- if you get the nightly version of Windows Terminal. Windows Terminal is a free app provided by Microsoft that's basically a faster, more modern, more powerful, and more efficient alternative to Command Prompt but still more accessible and easier to use than the built-for-power-users PowerShell. The regular version of Windows Terminal is available on the Microsoft Store, but the nightly version -- which you'll need to access the newly implemented support for both ChatGPT and GitHub Copilot -- can only be downloaded on the project's GitHub page.


The $50 Million Movie 'Here' De-Aged Tom Hanks With Generative AI

WIRED

On Friday, TriStar Pictures released Here, a $50 million Robert Zemeckis-directed film that used real-time generative AI face transformation techniques to portray actors Tom Hanks and Robin Wright across a 60-year span, marking one of Hollywood's first full-length features built around AI-powered visual effects. The film adapts a 2014 graphic novel set primarily in a New Jersey living room across multiple time periods. Rather than cast different actors for various ages, the production used AI to modify Hanks' and Wright's appearances throughout. The de-aging technology comes from Metaphysic, a visual effects company that creates real-time face-swapping and aging effects. During filming, the crew watched two monitors simultaneously: one showing the actors' actual appearances and another displaying them at whatever age the scene required.


PhDGPT: Introducing a psychometric and linguistic dataset about how large language models perceive graduate students and professors in psychology

arXiv.org Artificial Intelligence

Machine psychology aims to reconstruct the mindset of Large Language Models (LLMs), i.e. how these artificial intelligences perceive and associate ideas. This work introduces PhDGPT, a prompting framework and synthetic dataset that encapsulates the machine psychology of PhD researchers and professors as perceived by OpenAI's GPT-3.5. The dataset consists of 756,000 datapoints, counting 300 iterations repeated across 15 academic events, 2 biological genders, 2 career levels and 42 unique item responses of the Depression, Anxiety, and Stress Scale (DASS-42). PhDGPT integrates these psychometric scores with their explanations in plain language. This synergy of scores and texts offers a dual, comprehensive perspective on the emotional well-being of simulated academics, e.g. male/female PhD students or professors. By combining network psychometrics and psycholinguistic dimensions, this study identifies several similarities and distinctions between human and LLM data. The psychometric networks of simulated male professors do not differ between physical and emotional anxiety subscales, unlike humans. Other LLMs' personification can reconstruct human DASS factors with a purity up to 80%. Furthermore, LLM-generated personifications across different scenarios are found to elicit explanations lower in concreteness and imageability in items coding for anxiety, in agreement with past studies about human psychology. Our findings indicate an advanced yet incomplete ability for LLMs to reproduce the complexity of human psychometric data, unveiling convenient advantages and limitations in using LLMs to replace human participants. PhDGPT also intriguingly captures the ability for LLMs to adapt and change language patterns according to prompted mental distress contextual features, opening new quantitative opportunities for assessing the machine psychology of these artificial intelligences.
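
The dataset size quoted in the abstract above can be checked by multiplying the listed factors. The following is a minimal sketch of that arithmetic only; the dictionary keys are illustrative labels chosen here, not names taken from the paper or its dataset.

```python
from math import prod

# Factors as listed in the abstract; the key names are hypothetical labels.
factors = {
    "iterations": 300,
    "academic_events": 15,
    "genders": 2,
    "career_levels": 2,
    "dass42_items": 42,
}

# The total number of datapoints is the product of all factors.
total_datapoints = prod(factors.values())
print(total_datapoints)  # 756000
```

The product matches the 756,000 datapoints stated in the abstract, assuming every combination of factors yields exactly one item response.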


Enhancing Security Control Production With Generative AI

arXiv.org Artificial Intelligence

Security controls are mechanisms or policies designed for cloud-based services to reduce risk, protect information, and ensure compliance with security regulations. The development of security controls is traditionally a labor-intensive and time-consuming process. This paper explores the use of Generative AI to accelerate the generation of security controls. We specifically focus on generating Gherkin code, the domain-specific language used to define the behavior of security controls in a structured and understandable format. By leveraging large language models and in-context learning, we propose a structured framework that reduces the time required for developing security controls from 2-3 days to less than one minute. Our approach integrates detailed task descriptions, step-by-step instructions, and retrieval-augmented generation to enhance the accuracy and efficiency of the generated Gherkin code. Initial evaluations on AWS cloud services demonstrate promising results, indicating that GenAI can effectively streamline the security control development process, thus providing a robust and dynamic safeguard for cloud-based infrastructures.


Understanding Generative AI in Robot Logic Parametrization

arXiv.org Artificial Intelligence

Leveraging generative AI (for example, Large Language Models) for language understanding within robotics opens up possibilities for LLM-driven robot end-user development (EUD). Despite the numerous design opportunities it provides, little is understood about how this technology can be utilized when constructing robot program logic. In this paper, we outline the background in capturing natural language end-user intent and summarize previous use cases of LLMs within EUD. Taking the context of filmmaking as an example, we explore how a cinematography practitioner's intent to film a certain scene can be articulated using natural language, captured by an LLM, and further parametrized as low-level robot arm movement. We explore the capabilities of an LLM interpreting end-user intent and mapping natural language to predefined, cross-modal data in the process of iterative program development. We conclude by suggesting future opportunities for domain exploration beyond cinematography to support language-driven robotic camera navigation.