 virtual model


LaaJMeter: A Framework for LaaJ Evaluation

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly used as evaluators in natural language processing tasks, a paradigm known as LLM-as-a-Judge (LaaJ). The analysis of LaaJ software, commonly referred to as meta-evaluation, poses significant challenges in domain-specific contexts. In such domains, in contrast to general domains, annotated data is scarce and expert evaluation is costly. As a result, meta-evaluation is often performed using metrics that have not been validated for the specific domain in which they are applied. It therefore becomes difficult to determine which metrics effectively identify LaaJ quality and, further, what threshold indicates sufficient evaluator performance. In this work, we introduce LaaJMeter, a simulation-based framework for controlled meta-evaluation of LaaJs. LaaJMeter enables engineers to generate synthetic data representing virtual models and judges, allowing systematic analysis of evaluation metrics under realistic conditions. This helps practitioners validate LaaJs for specific tasks: they can test whether their metrics correctly distinguish between high- and low-quality (virtual) LaaJs, and estimate appropriate thresholds for evaluator adequacy. We demonstrate the utility of LaaJMeter in a code translation task involving a legacy programming language, showing how different metrics vary in sensitivity to evaluator quality. Our results highlight the limitations of common metrics and the importance of principled metric selection. LaaJMeter provides a scalable and extensible solution for assessing LaaJs in low-resource settings, contributing to the broader effort to ensure trustworthy and reproducible evaluation in NLP.
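The idea of checking whether a metric separates good virtual judges from bad ones can be illustrated with a minimal sketch. This is not the LaaJMeter implementation: the uniform "true quality" scores, the Gaussian noise model for a virtual judge, the function names, and the pairwise-agreement metric are all assumptions chosen for illustration.

```python
import random


def simulate_true_scores(n, seed=0):
    """Hypothetical ground-truth quality of n virtual model outputs, in [0, 1]."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 1.0) for _ in range(n)]


def virtual_judge(true_scores, noise_sd, seed=1):
    """Assumed judge model: true score plus Gaussian noise, clipped to [0, 1].

    A larger noise_sd stands for a lower-quality virtual judge.
    """
    rng = random.Random(seed)
    return [min(1.0, max(0.0, s + rng.gauss(0.0, noise_sd))) for s in true_scores]


def pairwise_agreement(true_scores, judged_scores):
    """Fraction of output pairs that the judge orders the same way as the truth."""
    agree = 0
    total = 0
    n = len(true_scores)
    for i in range(n):
        for j in range(i + 1, n):
            d_true = true_scores[i] - true_scores[j]
            d_judge = judged_scores[i] - judged_scores[j]
            if d_true == 0:
                continue  # skip pairs with no true preference
            total += 1
            if d_true * d_judge > 0:
                agree += 1
    return agree / total


true_scores = simulate_true_scores(200)
# Evaluate the metric on judges of decreasing quality (increasing noise).
results = {
    sd: pairwise_agreement(true_scores, virtual_judge(true_scores, sd))
    for sd in (0.0, 0.1, 0.5)
}
for sd, score in results.items():
    print(f"noise_sd={sd:.1f}  pairwise agreement={score:.3f}")
```

If a candidate metric barely changes as the noise level grows, it is a poor indicator of judge quality under these assumptions; the noise level at which the metric drops below an acceptable value suggests where an adequacy threshold might sit.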


A Vehicle-in-the-Loop Simulator with AI-Powered Digital Twins for Testing Automated Driving Controllers

arXiv.org Artificial Intelligence

Simulators are useful tools for testing automated driving controllers. Vehicle-in-the-loop (ViL) tests and digital twins (DTs) are widely used simulation technologies to facilitate the smooth deployment of controllers to physical vehicles. However, conventional ViL tests rely on full-size vehicles, requiring large space and high expenses. In addition, physical-model-based DTs suffer from the reality gap caused by modeling imprecision. This paper develops a comprehensive and practical simulator for testing automated driving controllers, enhanced by scaled physical cars and AI-powered DT models. The scaled cars reduce the space and expense of simulation tests. The AI-powered DT models ensure superior simulation fidelity. Moreover, the simulator integrates well with off-the-shelf software and control algorithms, making it easy to extend. We use a filtered control benchmark with formal safety guarantees to showcase the capability of the simulator in validating automated driving controllers. Experimental studies demonstrate the efficacy of the simulator, implying its great potential in validating control solutions for autonomous vehicles and intelligent traffic.


Virtual Cells: Predict, Explain, Discover

arXiv.org Artificial Intelligence

Drug discovery is fundamentally a process of inferring the effects of treatments on patients, and would therefore benefit immensely from computational models that can reliably simulate patient responses, enabling researchers to generate and test large numbers of therapeutic hypotheses safely and economically before initiating costly clinical trials. Even a more specific model that predicts the functional response of cells to a wide range of perturbations would be tremendously valuable for discovering safe and effective treatments that successfully translate to the clinic. Creating such virtual cells has long been a goal of the computational research community that unfortunately remains unachieved given the daunting complexity and scale of cellular biology. Nevertheless, recent advances in AI, computing power, lab automation, and high-throughput cellular profiling provide new opportunities for reaching this goal. In this perspective, we present a vision for developing and evaluating virtual cells that builds on our experience at Recursion. We argue that in order to be a useful tool to discover novel biology, virtual cells must accurately predict the functional response of a cell to perturbations and explain how the predicted response is a consequence of modifications to key biomolecular interactions. We then introduce key principles for designing therapeutically-relevant virtual cells, describe a lab-in-the-loop approach for generating novel insights with them, and advocate for biologically-grounded benchmarks to guide virtual cell development. Finally, we make the case that our approach to virtual cells provides a useful framework for building other models at higher levels of organization, including virtual patients. We hope that these directions prove useful to the research community in developing virtual models optimized for positive impact on drug discovery outcomes.


Toward a digital twin of U.S. Congress

arXiv.org Artificial Intelligence

In this paper we provide evidence that a virtual model of U.S. congresspersons based on a collection of language models satisfies the definition of a digital twin. In particular, we introduce and provide high-level descriptions of a daily-updated dataset that contains every Tweet from every U.S. congressperson during their respective terms. We demonstrate that a modern language model equipped with congressperson-specific subsets of this data is capable of producing Tweets that are largely indistinguishable from actual Tweets posted by their physical counterparts. We illustrate how generated Tweets can be used to predict roll-call vote behaviors and to quantify the likelihood of congresspersons crossing party lines, thereby assisting stakeholders in allocating resources and potentially impacting real-world legislative dynamics. We conclude with a discussion of the limitations and important extensions of our analysis.


Web-based Augmented Reality with Auto-Scaling and Real-Time Head Tracking towards Markerless Neurointerventional Preoperative Planning and Training of Head-mounted Robotic Needle Insertion

arXiv.org Artificial Intelligence

Neurosurgery requires exceptional precision and comprehensive preoperative planning to ensure optimal patient outcomes. Despite technological advancements, there remains a need for intuitive, accessible tools to enhance surgical preparation and medical education in this field. Traditional methods often lack the immersive experience necessary for surgeons to visualize complex procedures and critical neurovascular structures, while existing advanced solutions may be cost-prohibitive or require specialized hardware. This research presents a novel markerless web-based augmented reality (AR) application designed to address these challenges in neurointerventional preoperative planning and education. Utilizing MediaPipe for precise facial localization and segmentation, and React Three Fiber for immersive 3D visualization, the application offers an intuitive platform for complex preoperative procedures. A virtual 2-RPS parallel positioner or Skull-Bot model is projected onto the user's face in real-time, simulating surgical tool control with high precision. Key features include the ability to import and auto-scale head anatomy to the user's dimensions and real-time auto-tracking of head movements once aligned. The web-based nature enables simultaneous access by multiple users, facilitating collaboration during surgeries and allowing medical students to observe live procedures. A pilot study involving three participants evaluated the application's auto-scaling and auto-tracking capabilities through various head rotation exercises. This research contributes to the field by offering a cost-effective, accessible, and collaborative tool for improving neurosurgical planning and education, potentially leading to better surgical outcomes and more comprehensive training for medical professionals. The source code of our application is publicly available at https://github.com/Hillllllllton/skullbot_web_ar.


Real-Time Interactions Between Human Controllers and Remote Devices in Metaverse

arXiv.org Artificial Intelligence

Supporting real-time interactions between human controllers and remote devices remains a challenging goal in the Metaverse due to the stringent requirements on computing workload, communication throughput, and round-trip latency. In this paper, we establish a novel framework for real-time interactions through the virtual models in the Metaverse. Specifically, we jointly predict the motion of the human controller for 1) proactive rendering in the Metaverse and 2) generating control commands to the real-world remote device in advance. The virtual model is decoupled into two components for rendering and control, respectively. To dynamically adjust the prediction horizons for rendering and control, we develop a two-step human-in-the-loop continuous reinforcement learning approach and use an expert policy to improve the training efficiency. An experimental prototype is built to verify our algorithm with different communication latencies. Compared with the baseline policy without prediction, our proposed method can reduce 1) the Motion-To-Photon (MTP) latency between human motion and rendering feedback and 2) the root mean squared error (RMSE) between human motion and real-world remote devices significantly.
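The benefit of predicting motion ahead of a communication delay, measured by RMSE, can be shown with a small sketch. It does not reproduce the paper's reinforcement learning approach: the sinusoidal "human motion", the fixed delay of 5 steps, and the naive linear extrapolation are stand-in assumptions, and all function names are hypothetical.

```python
import math


def rmse(a, b):
    """Root mean squared error between two equal-length trajectories."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def delayed(signal, delay):
    """Baseline without prediction: the device replays motion `delay` steps late."""
    return [signal[max(0, t - delay)] for t in range(len(signal))]


def extrapolated(signal, delay):
    """Naive predictor (assumption): linear extrapolation over the delay.

    At time t only samples up to t - delay are available, so the last known
    velocity is projected `delay` steps forward.
    """
    out = []
    for t in range(len(signal)):
        k = t - delay  # index of the newest sample available at time t
        if k < 1:
            out.append(signal[max(0, k)])  # not enough history yet
        else:
            velocity = signal[k] - signal[k - 1]
            out.append(signal[k] + delay * velocity)
    return out


# Hypothetical smooth human motion and a fixed communication delay in steps.
human = [math.sin(0.05 * t) for t in range(400)]
delay = 5

err_no_pred = rmse(human, delayed(human, delay))
err_pred = rmse(human, extrapolated(human, delay))
print(f"RMSE without prediction: {err_no_pred:.4f}")
print(f"RMSE with linear prediction: {err_pred:.4f}")
```

Even this crude predictor lowers the tracking error for smooth motion; a longer prediction horizon compensates for more delay but amplifies prediction error, which is the trade-off that motivates adjusting the horizon dynamically.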


Inside the tech-savvy lives of AI model creators earning thousands of dollars generating 'dream girls'

Daily Mail - Science & tech

Steven Jones' porn business, which once netted him half a million in revenue per month, fell apart in 2013 amid stiff competition from free streamers like Pornhub. But the self-described science-fiction nerd now says he's getting back into the adult content game, thanks to artificial intelligence (AI), which he's leveraging to help customers make bespoke pornography depicting their AI-generated 'dream girls.' More chaste AI-generated models, created by a marketing team in Barcelona, Spain, are already minting between 11,000 and 3,200 a month in advertising deals. Welcome to the brave new world of post-human modelling, where 'posing' joins the ranks of big box store retail and rocket manufacturing among the industries where managers profit off a labor force of unpaid machines and virtual employees. Right now, these AI-crafted models are already realistic enough to fool people who see models up close much more than the rest of us.


Twin-S: A Digital Twin for Skull-base Surgery

arXiv.org Artificial Intelligence

Purpose: Digital twins are virtual interactive models of the real world, exhibiting identical behavior and properties. In surgical applications, computational analysis from digital twins can be used, for example, to enhance situational awareness. Methods: We present a digital twin framework for skull-base surgeries, named Twin-S, which can be integrated within various image-guided interventions seamlessly. Twin-S combines high-precision optical tracking and real-time simulation. We rely on rigorous calibration routines to ensure that the digital twin representation precisely mimics all real-world processes. Twin-S models and tracks the critical components of skull-base surgery, including the surgical tool, patient anatomy, and surgical camera. Significantly, Twin-S updates and reflects real-world drilling of the anatomical model at frame rate. Results: We extensively evaluate the accuracy of Twin-S, which achieves an average error of 1.39 mm during the drilling process. We further illustrate how segmentation masks derived from the continuously updated digital twin can augment the surgical microscope view in a mixed reality setting, where bone requiring ablation is highlighted to provide surgeons with additional situational awareness. Conclusion: We present Twin-S, a digital twin environment for skull-base surgery. Twin-S tracks and updates the virtual model in real time given measurements from modern tracking technologies. Future research on complementing optical tracking with higher-precision vision-based approaches may further increase the accuracy of Twin-S.


How virtual models of the brain could transform epilepsy surgery

#artificialintelligence

Virtual models representing the brains of people with epilepsy could help to enable more-effective treatments of the disease by showing neurosurgeons precisely which zones are responsible for seizures. The models, created using a computational system known as the Virtual Epileptic Patient (VEP), have been developed as part of the Human Brain Project (HBP), a 10-year European initiative focused on digital brain research. The approach is being tested in a clinical trial called EPINOV, to evaluate whether it improves the success rate of epilepsy surgeries. "It's an example of personalized medicine," says Aswin Chari, a neurosurgeon at University College London. VEP uses "the patient's own brain scans [and] the patient's own brainwave-recording data to build a model and improve our understanding of where their seizures are coming from".


Meet Noah, Joy, Theo and Nina… the AI models set to revolutionise how you order clothes online

Daily Mail - Science & tech

Digital humans could end up replacing supermodels as artificial intelligence is set to revolutionise the way we order clothes online. A German company has launched a system that sees state-of-the-art 'digital humans' pose for campaigns. The digi-models, who come with names including Joy, Nina, Noah and Theo, can be customised and individualised to a brand's preference. There is also the option for real models, such as Kate Moss or Kaia Gerber, to license their likenesses. They can either be recreated completely digitally or have a scan done of their entire body.