Anytime-Competitive Reinforcement Learning with Policy Prior

Neural Information Processing Systems

This paper studies the problem of Anytime-Competitive Markov Decision Process (A-CMDP). Existing works on Constrained Markov Decision Processes (CMDPs) aim to optimize the expected reward while constraining the expected cost over random dynamics, but the cost in a specific episode can still be unsatisfactorily high. In contrast, the goal of A-CMDP is to optimize the expected reward while guaranteeing a bounded cost in each round of any episode against a policy prior. We propose a new algorithm, called Anytime-Competitive Reinforcement Learning (ACRL), which provably guarantees the anytime cost constraints. The regret analysis shows that the learned policy asymptotically matches the optimal reward achievable under the anytime competitive constraints. Experiments on the application of carbon-intelligent computing verify the reward performance and cost constraint guarantee of ACRL.
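As a rough illustration of what an anytime competitive constraint looks like in code, the sketch below guards each action so that the realized cumulative cost stays within a (1 + λ)-competitive bound against the policy prior, falling back to the prior's action otherwise. The function names, the one-step cost model, and the fallback rule are assumptions for illustration; this simplifies rather than reproduces ACRL's actual projection.

```python
# A minimal, illustrative anytime cost guard against a policy prior,
# enforcing roughly J_h(pi) <= (1 + lam) * J_h(prior) + b at every step h.
# All names and the fallback rule are assumptions, not ACRL's exact method.

def anytime_guarded_action(state, rl_policy, prior_policy, cost_fn,
                           cum_cost, cum_prior_cost, lam=0.1, b=1.0):
    """Pick the RL action unless it would break the anytime cost bound;
    otherwise fall back to the policy prior's action."""
    prior_action = prior_policy(state)
    rl_action = rl_policy(state)

    candidate_cost = cost_fn(state, rl_action)
    budget = (1 + lam) * (cum_prior_cost + cost_fn(state, prior_action)) + b

    if cum_cost + candidate_cost <= budget:
        return rl_action
    return prior_action  # preserve the per-step competitive bound

# Toy usage with stub policies and a constant cost function.
rl = lambda s: "fast"
prior = lambda s: "safe"
cost = lambda s, a: 2.0 if a == "fast" else 1.0
print(anytime_guarded_action("s0", rl, prior, cost,
                             cum_cost=0.0, cum_prior_cost=0.0))
```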


TradeMaster Appendix

Neural Information Processing Systems

Is there a label or target associated with each instance? No, there is no label or target associated with each instance, as our focus is not on supervised learning settings. Is any information missing from individual instances? Yes, it is common to have missing values in financial datasets. We provide scripts to preprocess and conduct data imputation with diffusion models [26]. Are relationships between individual instances made explicit?
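For illustration, a trivial way to locate and patch missing values in a financial series is sketched below using pandas time interpolation; this is a simple stand-in for, not a reproduction of, the diffusion-model imputation scripts the datasheet references [26].

```python
# A trivial sketch of locating and filling missing values in a financial
# time series with pandas interpolation -- a simple stand-in for the
# diffusion-model-based imputation referenced in [26].
import numpy as np
import pandas as pd

prices = pd.DataFrame(
    {"close": [101.2, np.nan, 102.8, 103.1, np.nan, 104.0]},
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
)

print(prices["close"].isna().sum())          # count missing entries
filled = prices.interpolate(method="time")   # time-aware linear fill
print(filled)
```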


The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Neural Information Processing Systems

The performance of a large language model (LLM) depends heavily on the quality and size of its pretraining dataset. However, the pretraining datasets for state-of-the-art open LLMs like Llama 3 and Mixtral are not publicly available and very little is known about how they were created. In this work, we introduce FineWeb, a 15-trillion token dataset derived from 96 Common Crawl snapshots that produces better-performing LLMs than other open pretraining datasets. To advance the understanding of how best to curate high-quality pretraining datasets, we carefully document and ablate all of the design choices used in FineWeb, including in-depth investigations of deduplication and filtering strategies. In addition, we introduce FineWeb-Edu, a 1.3-trillion token collection of educational text filtered from FineWeb.
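As a concrete flavor of the deduplication question those ablations study, here is a minimal sketch of MinHash-based fuzzy deduplication over word shingles, a standard web-scale technique; it is shown for illustration only and is not claimed to match FineWeb's exact pipeline or parameters.

```python
# A minimal sketch of MinHash-style fuzzy deduplication over word
# 5-gram shingles -- a standard web-scale dedup technique, shown for
# illustration rather than as FineWeb's exact pipeline.
import hashlib

def minhash_signature(text, num_perm=64, shingle_size=5):
    words = text.lower().split()
    shingles = {" ".join(words[i:i + shingle_size])
                for i in range(max(1, len(words) - shingle_size + 1))}
    sig = []
    for seed in range(num_perm):
        # One hash function per "permutation", derived from a seed.
        sig.append(min(
            int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingles))
    return sig

def estimated_jaccard(sig_a, sig_b):
    # The fraction of matching minima estimates Jaccard similarity.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

# Near-duplicate documents score high; a threshold flags them for removal.
a = minhash_signature("the quick brown fox jumps over the lazy dog today")
b = minhash_signature("the quick brown fox jumps over the lazy dog now")
print(round(estimated_jaccard(a, b), 2))
```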


MetaTeacher: Coordinating Multi-Model Domain Adaptation for Medical Image Classification (Appendix)

Neural Information Processing Systems

We follow the derivation route in [7] except for the coordinating weight part. According to Eq.(7), we update θ. By the chain rule, Eq.(15) can be rewritten, and the right part of Eq.(16) follows from it. Figure 3: The Class Activation Map (CAM) [10] is used to perform visual ablation analysis on a chest X-ray image from the Open-i dataset. The background color is blue, with red or yellow representing the disease location. The number in the top-left corner of each image is the predicted probability for the corresponding disease. We visualize the domain adaptation performance on the transfer scenario NIH-CXR14, CheXpert, MIMIC-CXR to Open-i. The visualization sample from Open-i suffers from atelectasis and effusion.
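Since Figure 3 relies on CAM for the visual ablation, a minimal sketch of the standard CAM computation may help: the final conv feature maps are weighted by the target class's linear-classifier weights. The formulation below is the original CAM recipe; the tensor shapes and names are assumptions for illustration, not MetaTeacher's actual code.

```python
# A minimal sketch of a Class Activation Map (CAM) in its standard
# formulation: weight the last conv feature maps by the classifier
# weights of the target class. Shapes and names are assumptions for
# illustration, not MetaTeacher's actual implementation.
import torch
import torch.nn.functional as F

def class_activation_map(feature_maps, fc_weight, class_idx, out_size):
    """feature_maps: (C, H, W) from the last conv layer (pre-GAP);
    fc_weight: (num_classes, C) weights of the final linear layer."""
    cam = torch.einsum("c,chw->hw", fc_weight[class_idx], feature_maps)
    cam = F.relu(cam)                                 # keep positive evidence
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    # Upsample to the input image resolution for overlay on the X-ray.
    return F.interpolate(cam[None, None], size=out_size,
                         mode="bilinear", align_corners=False)[0, 0]

feats = torch.randn(512, 7, 7)   # hypothetical backbone output
w = torch.randn(14, 512)         # e.g., 14 thorax disease classes
cam = class_activation_map(feats, w, class_idx=3, out_size=(224, 224))
print(cam.shape)
```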



A Provably Efficient Sample Collection Strategy for Reinforcement Learning

Neural Information Processing Systems

One of the challenges in online reinforcement learning (RL) is that the agent needs to trade off the exploration of the environment and the exploitation of the samples to optimize its behavior. Whether we optimize for regret, sample complexity, state-space coverage, or model estimation, we need to strike a different exploration-exploitation trade-off. In this paper, we propose to tackle the exploration-exploitation problem following a decoupled approach composed of: 1) an "objective-specific" algorithm that (adaptively) prescribes how many samples to collect at which states, as if it had access to a generative model (i.e., a simulator of the environment); 2) an "objective-agnostic" sample collection exploration strategy responsible for generating the prescribed samples as fast as possible.
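To make the decoupled design concrete, the sketch below separates the two roles: a prescription (a mapping from state-action pairs to required sample counts, as an objective-specific algorithm would emit) and an objective-agnostic collector that runs an exploration policy until the prescription is met. The environment interface and all names are illustrative assumptions, not the paper's algorithm.

```python
# A minimal sketch of the decoupled interface described above: an
# objective-specific module prescribes per-(state, action) sample counts,
# and an objective-agnostic collector gathers them online. All names and
# the env interface (reset/step) are illustrative assumptions.
from collections import Counter

def collect_until_satisfied(env, explore_policy, prescription,
                            max_steps=100_000):
    """Run the exploration policy until every (state, action) pair has at
    least its prescribed sample count, or the step budget is exhausted."""
    counts, samples = Counter(), []
    state = env.reset()
    for _ in range(max_steps):
        action = explore_policy(state, counts, prescription)
        next_state, reward = env.step(action)
        counts[(state, action)] += 1
        samples.append((state, action, reward, next_state))
        state = next_state
        if all(counts[sa] >= n for sa, n in prescription.items()):
            break  # prescription fulfilled; hand the samples back
    return samples
```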


Interview with Filippos Gouidis: Object state classification

AIHub

Filippos's PhD dissertation focuses on developing a method for recognizing object states without visual training data. His approach leverages semantic knowledge from online sources and Large Language Models, structured as Knowledge Graphs, from which Graph Neural Networks learn representations for accurate state classification. In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In this latest interview, we met with Filippos Gouidis, who has recently completed his PhD, and found out more about his research on object state classification.


The secret to AI: most people are using it wrong

Popular Science

AI is supposed to save time, boost your output, and even help kickstart your creativity. But if you find yourself constantly rewriting prompts and begging the AI to edit bad responses, there's a hard truth you have to accept: it's not ChatGPT; it's how you're using it. But getting your skills up to snuff is simple if you enroll in our best-selling e-degree program. It doesn't matter if you're a complete beginner, an aspiring master, or somewhere in between; you'll learn how to use ChatGPT like an expert for just $19.97. Don't worry about fitting time into your schedule, because these courses are completely self-paced.


You can try Microsoft's free AI skills training for two more weeks, and I recommend you do

ZDNet

I know you've heard of gamification, but have you ever heard of festification? That's what Microsoft did last month, and it's continuing until May 28 with the Microsoft AI Skills Fest. It's a little odd, but it also looks like it might be a heck of a lot of fun. And you still have three full weeks to participate. Microsoft's AI Skills Fest offers courses that are open to all skill levels.