majumdar
A Large Language Model for Corporate Credit Scoring
Majumdar, Chitro, Scandizzo, Sergio, Mahanta, Ratanlal, Mandal, Avradip, Bhattacharjee, Swarnendu
We introduce Omega^2, a Large Language Model-driven framework for corporate credit scoring that combines structured financial data with advanced machine learning to improve predictive reliability and interpretability. Our study evaluates Omega^2 on a multi-agency dataset of 7,800 corporate credit ratings drawn from Moody's, Standard & Poor's, Fitch, and Egan-Jones, each containing detailed firm-level financial indicators such as leverage, profitability, and liquidity ratios. The system integrates CatBoost, LightGBM, and XGBoost models optimized through Bayesian search under temporal validation to ensure forward-looking and reproducible results. Omega^2 achieved a mean test AUC above 0.93 across agencies, confirming its ability to generalize across rating systems and maintain temporal consistency. These results show that combining language-based reasoning with quantitative learning creates a transparent and institution-grade foundation for reliable corporate credit-risk assessment.
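The abstract does not spell out how its temporal validation is set up. As a minimal sketch, assuming an expanding-window scheme in which each test fold is a single later year and training uses only earlier years, the split and the AUC metric could look like the following. The function names `temporal_splits` and `roc_auc`, and the toy years and scores, are hypothetical illustrations and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def temporal_splits(years: Sequence[int], n_folds: int) -> List[Tuple[List[int], List[int]]]:
    """Expanding-window splits (an assumed scheme, not the paper's):
    train on every row dated before the test year, test on that year."""
    test_years = sorted(set(years))[-n_folds:]
    splits = []
    for test_year in test_years:
        train_idx = [i for i, y in enumerate(years) if y < test_year]
        test_idx = [i for i, y in enumerate(years) if y == test_year]
        if train_idx and test_idx:  # skip folds with no history to train on
            splits.append((train_idx, test_idx))
    return splits


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: one rating year per row (hypothetical values).
years = [2015, 2015, 2016, 2016, 2017, 2017, 2018]
folds = temporal_splits(years, n_folds=2)
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because no test row ever predates a training row, a score computed this way reflects forward-looking performance rather than interpolation within a shuffled sample.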
Learning and Evidence Analytics Framework Bridges Research and Practice for Educational Data Science
Learning analytics (LA) as a research discipline focuses on multiple perspectives of understanding and supporting educational activities using collected log data. To do so at a national and even international level, educational technology platforms that gather users' interaction traces and digitally generated artifacts must store data in a standardized format. In Japan, the government initiated the GIGA School project in 2020, which installed more than nine million tablet PCs and high-speed Internet access at compulsory education institutions (elementary and middle schools). Such infrastructure enables the collection and analysis of educational data with the aim of improving educational practices in each school. With standardized data logging, it is possible to aggregate data from all schools and to generate educational Big Data that can support evidence-based policy-making and research at a national level.
Human Language Accelerates Robotic Learning
A team of researchers at Princeton has found that human-language descriptions of tools can accelerate the learning of a simulated robotic arm that can lift and use various tools. The new research supports the idea that AI training can make autonomous robots more adaptive in new situations, which in turn improves their effectiveness and safety. By adding descriptions of a tool's form and function to the robot's training process, the robot's ability to manipulate new tools was improved. The new method is called Accelerated Learning of Tool Manipulation with Language, or ATLA. Anirudha Majumdar is an assistant professor of mechanical and aerospace engineering at Princeton and head of the Intelligent Robot Motion Lab.
Transformed K-means Clustering
Goel, Anurag, Majumdar, Angshul
In this work we propose a clustering framework based on the paradigm of transform learning. In simple terms the representation from transform learning is used for K-means clustering; however, the problem is not solved in such a naïve piecemeal fashion. The K-means clustering loss is embedded into the transform learning framework and the joint problem is solved using the alternating direction method of multipliers. Results on document clustering show that our proposed approach improves over the state-of-the-art.
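The abstract does not give the joint objective. One plausible way to write it, assuming the standard transform learning penalty and the usual matrix form of the K-means loss, is sketched below. The symbols are assumptions for illustration: X holds the data samples as columns, T is the learned transform, Z the representation, C the cluster centroids, H the cluster assignment matrix, and \lambda, \mu are trade-off weights. The paper's exact formulation may differ.

\[
\min_{T,\,Z,\,C,\,H}\;
\|TX - Z\|_F^2
+ \lambda\left(\|T\|_F^2 - \log\det T\right)
+ \mu\,\|Z - CH\|_F^2
\quad \text{s.t. each column of } H \text{ is a one-hot cluster indicator.}
\]

The first two terms learn the representation and keep T well conditioned; the third is the K-means loss evaluated on that representation. Solving for all variables together, rather than fixing Z first and clustering afterwards, is what separates the joint approach from the piecemeal one.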
Machine learning guarantees robots' performance in unknown territory
This experiment is a proving ground for a pivotal challenge in modern robotics: the ability to guarantee the safety and success of automated robots operating in novel environments. As engineers increasingly turn to machine learning methods to develop adaptable robots, new work by Princeton University researchers makes progress on such guarantees for robots in contexts with diverse types of obstacles and constraints. "Over the last decade or so, there's been a tremendous amount of excitement and progress around machine learning in the context of robotics, primarily because it allows you to handle rich sensory inputs," like those from a robot's camera, and map these complex inputs to actions, said Anirudha Majumdar, an assistant professor of mechanical and aerospace engineering at Princeton. However, robot control algorithms based on machine learning run the risk of overfitting to their training data, which can make algorithms less effective when they encounter inputs that differ from those they were trained on. Majumdar's Intelligent Robot Motion Lab addressed this challenge by expanding the suite of available tools for training robot control policies, and quantifying the likely success and safety of robots performing in novel environments.
UCSF Launches Center for Intelligent Imaging to Accelerate AI Adoption in Radiology
UC San Francisco (UCSF) announced the launch of the Center for Intelligent Imaging, or ci2, which focuses on accelerating the adoption of artificial intelligence (AI) technology in radiology, leveraging advanced computational techniques and industry collaborations to improve patient diagnoses and care. As part of the center, investigators in ci2 will collaborate with NVIDIA Corp to build infrastructure and tools focused on enabling the translation of AI into clinical practice. The center comprises clinical radiologists, imaging scientists, engineers, machine learning scientists, data engineers, clinicians, post-doctoral fellows, and students collaborating to develop and deploy artificial intelligence that will solve critical clinical problems by advancing the way in which healthcare professionals are able to use and deliver imaging. Tools under development include organ and tissue segmentation, automated volumetry and morphological quantification, and disease visualization. NVIDIA engineers and data scientists will work alongside UCSF investigators to develop clinical AI tools, applying powerful computational resources that are available in a few medical institutions, with the goal of accelerating the AI development cycle and integrating it seamlessly in the clinic.
HVAC-Aware Occupancy Scheduling (Extended Abstract)
Lim, Boon-Ping (NICTA and Australian National University)
My research focuses on developing innovative ways to control Heating, Ventilation, and Air Conditioning (HVAC) and schedule occupancy flows in smart buildings to reduce our ecological footprint (and energy bills). We look at the potential for integrating building operations with room booking and meeting scheduling. Specifically, we improve on the effectiveness of energy-aware room-booking and occupancy scheduling approaches by allowing the scheduling decisions to rely on an explicit model of the building's occupancy-based HVAC control. From a computational standpoint, this is a challenging topic, as HVAC models are inherently non-linear and non-convex, and occupancy scheduling models additionally introduce discrete variables capturing the time slot and location at which each activity is scheduled. The mechanism needs to trade off minimizing energy cost against addressing occupant thermal comfort and control feasibility in a highly dynamic and uncertain system.
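To make the discrete side of the problem concrete, here is a toy sketch that assigns each meeting a room and a time slot by exhaustive search. It assumes a hypothetical per-occupant energy cost for every (room, slot) pair, which stands in for the non-linear, non-convex HVAC model described above and is far simpler than it. The function `schedule` and all names and numbers are illustrative assumptions, not the author's method.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Placement = Tuple[str, str]  # (room, slot)


def schedule(
    meetings: Dict[str, int],            # meeting name -> number of attendees
    rooms: Dict[str, int],               # room name -> capacity
    slots: List[str],                    # available time slots
    cooling_cost: Dict[Placement, float],  # assumed energy cost per occupant
) -> Tuple[float, Optional[Dict[str, Placement]]]:
    """Brute-force search for the cheapest assignment of meetings to
    distinct (room, slot) pairs that respects room capacity."""
    names = list(meetings)
    options = [(r, s) for r in rooms for s in slots]
    best_cost, best_plan = float("inf"), None
    for combo in product(options, repeat=len(names)):
        if len(set(combo)) < len(combo):
            continue  # two meetings share the same room and slot
        if any(meetings[m] > rooms[r] for m, (r, _) in zip(names, combo)):
            continue  # a meeting does not fit in its room
        cost = sum(meetings[m] * cooling_cost[(r, s)] for m, (r, s) in zip(names, combo))
        if cost < best_cost:
            best_cost, best_plan = cost, dict(zip(names, combo))
    return best_cost, best_plan


# Hypothetical building: afternoon slots cost more to cool.
meetings = {"standup": 4, "review": 10}
rooms = {"small": 5, "large": 12}
slots = ["am", "pm"]
cooling_cost = {
    ("small", "am"): 1.0, ("small", "pm"): 2.0,
    ("large", "am"): 1.5, ("large", "pm"): 3.0,
}
best_cost, best_plan = schedule(meetings, rooms, slots, cooling_cost)
```

Exhaustive search grows exponentially with the number of meetings, which is one reason realistic instances call for dedicated optimization techniques rather than enumeration.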