calculus
Multi-Method Analysis of Mathematics Placement Assessments: Classical, Machine Learning, and Clustering Approaches
Allagan, Julian D., Singleton, Dasia A., Perry, Shanae N., Morgan, Gabrielle C., Morgan, Essence A.
This study evaluates a 40-item mathematics placement examination administered to 198 students using a multi-method framework combining Classical Test Theory, machine learning, and unsupervised clustering. Classical Test Theory analysis reveals that 55\% of items achieve excellent discrimination ($D \geq 0.40$), while 30\% demonstrate poor discrimination ($D < 0.20$) and require replacement. Question 6 (Graph Interpretation) emerges as the examination's most powerful discriminator, achieving perfect discrimination ($D = 1.000$), the highest ANOVA F-statistic ($F = 4609.1$), and the maximum Random Forest feature importance (0.206), accounting for 20.6\% of predictive power. Machine learning algorithms demonstrate exceptional performance, with Random Forest and Gradient Boosting achieving 97.5\% and 96.0\% cross-validation accuracy, respectively. K-means clustering identifies a natural binary competency structure with a boundary at 42.5\%, diverging from the institutional threshold of 55\% and suggesting potential overclassification into remedial categories. The two-cluster solution exhibits exceptional stability (bootstrap ARI = 0.855) with perfect lower-cluster purity. Convergent evidence across methods supports specific refinements: replacing poorly discriminating items, implementing a two-stage assessment, and integrating Random Forest predictions with transparency mechanisms. These findings demonstrate that multi-method integration provides a robust empirical foundation for evidence-based mathematics placement optimization.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Bergen County > Mahwah (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (7 more...)
- Research Report > New Finding (1.00)
- Instructional Material > Course Syllabus & Notes (0.93)
- Education > Curriculum > Subject-Specific Education (0.94)
- Education > Assessment & Standards (0.68)
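The discrimination index $D$ reported in the abstract above has a standard classical-test-theory form: the proportion correct in a top-scoring group minus the proportion correct in a bottom-scoring group. A minimal sketch on made-up data (not the authors' code or dataset; the usual convention is a 27% grouping fraction, enlarged here so a six-student toy example splits cleanly):

```python
def discrimination_index(responses, group_frac=0.27):
    """responses: one list of 0/1 item scores per student."""
    n = len(responses)
    k = max(1, round(n * group_frac))
    # Rank students by total score; compare the top-k and bottom-k groups.
    ranked = sorted(responses, key=sum, reverse=True)
    upper, lower = ranked[:k], ranked[-k:]
    d = []
    for item in range(len(responses[0])):
        p_upper = sum(s[item] for s in upper) / k
        p_lower = sum(s[item] for s in lower) / k
        d.append(p_upper - p_lower)  # D >= 0.40 is conventionally "excellent"
    return d

# Illustrative data: 6 students x 3 items (1 = correct answer).
students = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
D = discrimination_index(students, group_frac=0.34)  # top/bottom 2 students
print(D)  # -> [1.0, 1.0, 0.0]: items 1-2 discriminate perfectly, item 3 not at all
```

An item with $D$ near zero, like the third column above, is answered equally often by strong and weak students, which is the pattern the study flags for replacement.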
Interpretability Framework for LLMs in Undergraduate Calculus
Dakshit, Sagnik, Roy, Sushmita Sinha
Large Language Models (LLMs) are increasingly being used in education, yet correctness alone does not capture the quality, reliability, or pedagogical validity of their problem-solving behavior, especially in mathematics, where multistep logic, symbolic reasoning, and conceptual clarity are critical. Conventional evaluation methods largely focus on final-answer accuracy and overlook the reasoning process. To address this gap, we introduce a novel interpretability framework for analyzing LLM-generated solutions, using undergraduate calculus problems as a representative domain. Our approach combines reasoning-flow extraction, which decomposes solutions into semantically labeled operations and concepts, with prompt ablation analysis to assess input salience and output stability. Using structured metrics such as reasoning complexity, phrase sensitivity, and robustness, we evaluated model behavior on real Calculus I through III university exams. Our findings reveal that LLMs often produce syntactically fluent yet conceptually flawed solutions, with reasoning patterns sensitive to prompt phrasing and input variation. This framework enables fine-grained diagnosis of reasoning failures, supports curriculum alignment, and informs the design of interpretable AI-assisted feedback tools. This is the first study to offer a structured, quantitative, and pedagogically grounded framework for interpreting LLM reasoning in mathematics education, laying the foundation for the transparent and responsible deployment of AI in STEM learning environments.
- Research Report > New Finding (1.00)
- Instructional Material > Course Syllabus & Notes (0.93)
- Education > Curriculum > Subject-Specific Education (1.00)
- Education > Educational Setting > Higher Education (0.93)
- Education > Educational Technology > Educational Software > Computer Based Training (0.46)
On Condorcet's Jury Theorem with Abstention
The well-known Condorcet Jury Theorem states that, under majority rule, the better of two alternatives is chosen with probability approaching one as the population grows. We study an asymmetric setting where voters face varying participation costs and share a possibly heuristic belief about their pivotality (ability to influence the outcome). In a costly voting setup where voters abstain if their participation cost exceeds their pivotality estimate, we identify a single property of the heuristic belief -- weakly vanishing pivotality -- that gives rise to multiple stable equilibria in which elections are nearly tied. In contrast, strongly vanishing pivotality (as in the standard Calculus of Voting model) yields a unique, trivial equilibrium in which only zero-cost voters participate as the population grows. We then characterize when nontrivial equilibria satisfy a version of the Jury Theorem: below a sharp threshold, the majority-preferred candidate wins with probability approaching one; above it, each candidate wins with equal probability.
- North America > United States (0.14)
- Asia > Middle East > Israel (0.04)
- Asia > India (0.04)
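The abstention rule described above (vote only when the participation cost falls below the pivotality estimate) can be illustrated with a toy Monte Carlo. The uniform cost distribution, the 55% majority share, and the two specific pivotality heuristics below are our own illustrative assumptions, not the paper's model:

```python
import random

def majority_candidate_wins(n, majority_share=0.55, pivotality=0.5, rng=random):
    """One election: each voter votes iff cost ~ U(0,1) < pivotality."""
    votes_a = votes_b = 0
    for _ in range(n):
        if rng.random() < pivotality:          # abstain otherwise
            if rng.random() < majority_share:
                votes_a += 1
            else:
                votes_b += 1
    return votes_a > votes_b                   # did majority-preferred A win?

def win_rate(n, pivotality, trials=2000, seed=0):
    rng = random.Random(seed)
    wins = sum(majority_candidate_wins(n, pivotality=pivotality, rng=rng)
               for _ in range(trials))
    return wins / trials

# Strongly vanishing pivotality (p ~ 1/n) leaves almost no voters, so A's
# edge is frequently lost to noise; a more slowly vanishing estimate
# (p ~ 1/sqrt(n)) keeps turnout growing with n, and A wins far more often.
strong = win_rate(1000, 1.0 / 1000)
weak = win_rate(1000, 1000 ** -0.5)
print(strong, weak)
```

Under the strongly vanishing heuristic the expected turnout stays around one voter regardless of population size, which is the sense in which that equilibrium is "trivial."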
U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs
Chernyshev, Konstantin, Polshkov, Vitaliy, Artemova, Ekaterina, Myasnikov, Alex, Stepanov, Vlad, Miasnikov, Alexei, Tilga, Sergei
The current evaluation of mathematical skills in LLMs is limited, as existing benchmarks are either relatively small, primarily focused on elementary and high-school problems, or lacking in topic diversity. Additionally, the inclusion of visual elements in tasks remains largely under-explored. To address these gaps, we introduce U-MATH, a novel benchmark of 1,100 unpublished open-ended university-level problems sourced from teaching materials. It is balanced across six core subjects, with 20% of the problems being multimodal. Given the open-ended nature of U-MATH problems, we employ an LLM to judge the correctness of generated solutions. To this end, we release µ-MATH, a dataset for evaluating LLMs' capabilities in judging solutions. The evaluation of general-domain, math-specific, and multimodal LLMs highlights the challenges presented by U-MATH. Our findings reveal that LLMs achieve a maximum accuracy of only 63% on text-based tasks, and an even lower 45% on visual problems. Solution assessment proves challenging for LLMs, with the best LLM judge reaching an F1-score of 80% on µ-MATH. Mathematical reasoning is a fundamental domain for assessing the true capabilities of Large Language Models (LLMs) to reason (Ahn et al., 2024). While existing benchmarks like GSM8K (Cobbe et al., 2021) and MATH (Hendrycks et al., 2021) provide valuable insights, they primarily focus on school-level mathematics. This leaves a significant gap in understanding how LLMs perform on more advanced, university-level problems.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > Middle East > Jordan (0.04)
- (7 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Answer Set Programming for Flexible Payroll Management
Callewaert, Benjamin, Vennekens, Joost
Payroll management is a critical business task that is subject to a large number of rules, which vary widely between companies, sectors, and countries. Moreover, the rules are often complex and change regularly. Therefore, payroll management systems must be flexible in design. In this paper, we suggest an approach based on a flexible Answer Set Programming (ASP) model and an easy-to-read tabular representation based on the Decision Model and Notation (DMN) standard. It allows HR consultants to represent complex rules without the need for a software engineer, and to ultimately design payroll systems for a variety of different scenarios. We show how the multi-shot solving capabilities of the clingo ASP system can be used to reach the performance that is necessary to handle real-world instances.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- (2 more...)
SIGHT: A Large Annotated Dataset on Student Insights Gathered from Higher Education Transcripts
Wang, Rose E., Wirawarn, Pawan, Goodman, Noah, Demszky, Dorottya
Lectures are a learning experience for both students and teachers. Students learn from teachers about the subject material, while teachers learn from students about how to refine their instruction. However, online student feedback is unstructured and abundant, making it challenging for teachers to learn from and improve. We take a step towards tackling this challenge. First, we contribute a dataset for studying this problem: SIGHT is a large dataset of 288 math lecture transcripts and 15,784 comments collected from the Massachusetts Institute of Technology OpenCourseWare (MIT OCW) YouTube channel. Second, we develop a rubric for categorizing feedback types using qualitative analysis. Qualitative analysis methods are powerful for uncovering domain-specific insights; however, they are costly to apply to large data sources. To overcome this challenge, we propose a set of best practices for using large language models (LLMs) to cheaply classify the comments at scale. We observe a striking correlation between the model's and humans' annotations: categories with consistent human annotations (>$0.9$ inter-rater reliability, IRR) also display higher human-model agreement (>$0.7$), while categories with less consistent human annotations ($0.7$-$0.8$ IRR) correspondingly demonstrate lower human-model agreement ($0.3$-$0.5$). These techniques uncover useful student feedback from thousands of comments, at a cost of around $\$0.002$ per comment. We conclude by discussing exciting future directions for using online student feedback and improving automated annotation techniques for qualitative research.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (3 more...)
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (1.00)
- Education > Curriculum > Subject-Specific Education (0.93)
- Education > Educational Setting > Online (0.67)
- Education > Educational Setting > Higher Education (0.64)
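The inter-rater reliability and human-model agreement figures above can be computed with standard chance-corrected agreement statistics. As an illustration, here is Cohen's kappa on made-up labels; the category names and data are hypothetical, and SIGHT's actual IRR metric may differ:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance overlap given each rater's label frequencies.
    ca, cb = Counter(labels_a), Counter(labels_b)
    expected = sum((ca[c] / n) * (cb[c] / n) for c in set(ca) | set(cb))
    return (observed - expected) / (1 - expected)

# Hypothetical feedback categories from a human rater and an LLM.
human = ["praise", "question", "praise", "other", "question", "praise"]
model = ["praise", "question", "praise", "question", "question", "praise"]
print(round(cohens_kappa(human, model), 3))  # -> 0.714
```

Kappa near 1 indicates agreement well beyond chance, while raw percent agreement alone would overstate consistency whenever one category dominates.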
Math 0-1: Calculus for Data Science & Machine Learning - Coupons ME
Created by Lazy Programmer Inc., Lazy Programmer Team. 13.5 hours of on-demand video. This Math 0-1: Calculus for Data Science & Machine Learning course covers Calculus 1 (limits, derivatives, and the most important derivative rules), Calculus 2 (integration), and Calculus 3 (vector calculus). It even includes machine-learning-focused material you wouldn't normally see in a regular college course, and many of the concepts are demonstrated using the Python programming language (don't worry, you don't need to know Python for this course). In other words, instead of the dry old college version of calculus, this course takes just the most practical and impactful topics and provides you with skills directly applicable to machine learning and data science, so you can start applying them today.
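In the spirit of the course's Python demonstrations, here is a small sketch (our own example, not course material) that checks the power rule d/dx x^3 = 3x^2 numerically with a central finite difference:

```python
def central_diff(f, x, h=1e-6):
    """Second-order accurate numerical derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: x ** 3  # power rule predicts f'(x) = 3x^2

for x in (0.5, 1.0, 2.0):
    print(x, central_diff(f, x), 3 * x ** 2)  # numeric vs exact derivative
```

The central difference agrees with the analytic derivative to several decimal places, which is exactly the kind of sanity check that makes symbolic rules concrete in code.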
ChatGPT And The Changing Art Of Personalization
[Photo illustration: the welcome screen of the OpenAI "ChatGPT" app displayed on a laptop screen on February 03, 2023 in London, England.] OpenAI, whose online chatbot ChatGPT made waves when it debuted in December, announced this week that a commercial version of the service, called ChatGPT Plus, would soon be available to users in the United States. AI-driven personalization is not new: in every service from rideshare pricing to product recommendations, what companies put in front of consumers has been AI-optimized to serve each customer as well as possible (or as profitably as possible). Even so, I believe the new generation of large language models brings this personalization to a whole new level.
- Europe > United Kingdom > England > Greater London > London (0.46)
- North America > United States (0.25)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)
"The Ghost of Calculus in Deep Learning" and how to overcome it!
While a strong foundation in calculus is important for deep learning, it is not strictly necessary to learn calculus in order to learn deep learning. It is possible to learn the basics of deep learning and build simple neural networks without a deep understanding of calculus. However, as you progress in your deep learning studies and start working on more complex tasks, a solid understanding of calculus will become increasingly important. Calculus is used extensively in deep learning to optimize the performance of machine learning algorithms, particularly in the training of neural networks. For example, gradient descent, which is a commonly used optimization algorithm in deep learning, relies on the derivative of the loss function to update the model's parameters.
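The gradient-descent point above can be made concrete in a few lines of Python; the quadratic loss and the learning rate below are illustrative choices:

```python
# Loss L(w) = (w - 3)^2 has derivative dL/dw = 2(w - 3), minimized at w = 3.
def grad(w):
    return 2 * (w - 3)

w = 0.0    # initial parameter
lr = 0.1   # learning rate

for _ in range(100):
    w -= lr * grad(w)  # step opposite the derivative of the loss

print(round(w, 4))  # -> 3.0
```

The sign of the derivative tells the update which direction reduces the loss; that is the calculus doing the work inside every neural-network training loop.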
Everything You Need To Know About Mathematics for Machine Learning
This Edureka video on 'Mathematics for Machine Learning' teaches you all the math needed to get started with mastering Machine Learning. It covers all the necessary topics and concepts of Linear Algebra, Multivariate Calculus, Statistics, and Probability, and also dives into the actual implementation of these topics. Are you an aspiring data scientist who is fascinated by how things work in the world of data science and machine learning? Well, congrats on choosing the right career path. However, did you know that you need to ace mathematics for machine learning and data science?