mobile phone


MobileFineTuner: A Unified End-to-End Framework for Fine-Tuning LLMs on Mobile Phones

Geng, Jiaxiang, Zhao, Lunyu, Lu, Yiyi, Luo, Bing

arXiv.org Artificial Intelligence

Mobile phones are the most ubiquitous end devices, generating vast amounts of human-authored data and serving as the primary platform for end-side applications. As high-quality public data for large language models (LLMs) approaches exhaustion, on-device fine-tuning provides an opportunity to leverage private user data while preserving privacy. However, existing approaches are predominantly simulation-based or rely on IoT devices and PCs, leaving commodity mobile phones largely unexplored. A key gap is the absence of an open-source framework that enables practical LLM fine-tuning on mobile phones. We present MobileFineTuner, a unified open-source framework that enables end-to-end LLM fine-tuning directly on commodity mobile phones. MobileFineTuner is designed for efficiency, scalability, and usability, supporting full-parameter fine-tuning (Full-FT) and parameter-efficient fine-tuning (PEFT). To address the memory and energy limitations inherent to mobile phones, we introduce system-level optimizations including parameter sharding, gradient accumulation, and energy-aware computation scheduling. We demonstrate the practicality of MobileFineTuner by fine-tuning GPT-2, Gemma 3, and Qwen 2.5 on real mobile phones. Extensive experiments and ablation studies validate the effectiveness of the proposed optimizations and establish MobileFineTuner as a viable foundation for future research on on-device LLM training.
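Gradient accumulation, one of the optimizations named in the abstract above, can be illustrated with a minimal sketch. This is not MobileFineTuner's code: the 1-D linear model, the toy data, and the function names `mse_grad` and `accumulated_grad` are assumptions chosen for illustration. It shows that summing suitably weighted micro-batch gradients reproduces the full-batch gradient, so only one micro-batch needs to be processed at a time.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair; hypothetical toy data format


def mse_grad(w: float, batch: List[Sample]) -> float:
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w: float, data: List[Sample], micro_batch_size: int) -> float:
    """Full-batch gradient computed one micro-batch at a time."""
    total = 0.0
    for start in range(0, len(data), micro_batch_size):
        micro = data[start:start + micro_batch_size]
        # Weight each micro-batch by its share of the full batch so the
        # accumulated value equals the full-batch mean gradient.
        total += mse_grad(w, micro) * len(micro) / len(data)
    return total


data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
full = mse_grad(0.5, data)
accum = accumulated_grad(0.5, data, micro_batch_size=2)
print(abs(full - accum) < 1e-9)
```

In a real training loop the optimizer step would be taken only after the last micro-batch, which trades extra passes for a lower peak memory footprint.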


Beware the eye in the sky! AI traffic cop catches thousands of drivers texting behind the wheel

Daily Mail - Science & tech

Whether it's sending a quick text or casting an eye over your emails, those tempted to look at their phone while driving are finally being caught out. UK trials of an AI 'traffic cop' have successfully detected thousands of drivers using their phone behind the wheel.


Elon Musk's Lawyers Claim He 'Does Not Use a Computer'

WIRED

Elon Musk's lawyers claimed that he "does not use a computer" in a Sunday court filing related to his lawsuit against Sam Altman and OpenAI. However, Musk has posted pictures or referred to his laptop on X several times in recent months, and public evidence suggests that he owns and appears to use at least one computer. Musk and his artificial intelligence startup xAI sued OpenAI in February 2024, alleging the company committed breach of contract by abandoning its founding agreement to develop AI "for the benefit of humanity," choosing instead "to maximize profits for Microsoft." The Sunday court filing was submitted in opposition to a Friday filing from OpenAI, which accused Musk and xAI of failing to fully comply with the discovery process. OpenAI alleges that Musk's counsel does not plan to collect any documents from him.


Patient Domain Supervised Contrastive Learning for Lung Sound Classification Using Mobile Phone

Jeong, Seung Gyu, Kim, Seong Eun

arXiv.org Artificial Intelligence

Auscultation is crucial for diagnosing lung diseases. The COVID-19 pandemic has revealed the limitations of traditional, in-person lung sound assessments. To overcome these issues, advancements in digital stethoscopes and artificial intelligence (AI) have led to the development of new diagnostic methods. In this context, our study aims to use smartphone microphones to record and analyze lung sounds. We faced two major challenges: the difference in audio style between electronic stethoscopes and smartphone microphones, and the variability among patients. To address these challenges, we developed a method called Patient Domain Supervised Contrastive Learning (PD-SCL). By integrating this method with the Audio Spectrogram Transformer (AST) model, we significantly improved its performance by 2.4% compared to the original AST model. This progress demonstrates that smartphones can effectively diagnose lung sounds, addressing inconsistencies in patient data and showing potential for broad use beyond traditional clinical settings. Our research contributes to making lung disease detection more accessible in the post-COVID-19 world.
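The abstract does not spell out how PD-SCL uses patient identity, so the following is only a sketch of a generic supervised contrastive loss, the family of objectives the method's name points to. The function names, the temperature value, and the toy embeddings are assumptions. Samples that share a label are pulled together and all other samples are pushed apart.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings: Sequence[Sequence[float]],
                labels: Sequence[int],
                temperature: float = 0.1) -> float:
    """Generic supervised contrastive loss (assumed form, not PD-SCL itself)."""
    z = [normalize(e) for e in embeddings]
    n = len(z)
    losses = []
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # anchors without a same-label partner contribute nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        losses.append(-sum(sims[p] - log_denom for p in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(embeddings, [0, 0, 1, 1])   # labels match the clusters
shuffled = supcon_loss(embeddings, [0, 1, 0, 1])  # labels cut across clusters
print(aligned < shuffled)
```

The loss is small when labels agree with the geometry of the embeddings and large when they cut across it, which is the signal a contrastive objective trains on.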


Differentiable Mobile Display Photometric Stereo

Ban, Gawoon, Kim, Hyeongjun, Choi, Seokjun, Yoon, Seungwoo, Baek, Seung-Hwan

arXiv.org Artificial Intelligence

Display photometric stereo uses a display as a programmable light source to illuminate a scene with diverse illumination conditions. Recently, differentiable display photometric stereo (DDPS) [1] demonstrated improved normal reconstruction accuracy by using learned display patterns. However, DDPS faced limitations in practicality, requiring a fixed desktop imaging setup using a polarization camera and a desktop-scale monitor. In this paper, we propose a more practical physics-based photometric stereo, differentiable mobile display photometric stereo (DMDPS), that leverages a mobile phone consisting of a display and a camera. We overcome the limitations of using a mobile device by developing a mobile app and method that simultaneously displays patterns and captures high-quality HDR images. Using this technique, we capture real-world 3D-printed objects and learn display patterns via a differentiable learning process. We demonstrate the effectiveness of DMDPS on both a 3D printed dataset and a first dataset of fallen leaves. The leaf dataset contains reconstructed surface normals and albedos of fallen leaves that may enable future research beyond computer graphics and vision. We believe that DMDPS takes a step forward for practical physics-based photometric stereo.
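For readers unfamiliar with the underlying technique, here is a minimal sketch of classical Lambertian photometric stereo with three known light directions. It is not the DMDPS method, which learns display patterns through a differentiable pipeline; the light directions, the pixel values, and the helper names are illustrative assumptions. Each pixel intensity is modelled as albedo times the dot product of light direction and surface normal, so three measurements give a 3x3 linear system.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def det3(m: Matrix) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a: Matrix, b: List[float]) -> List[float]:
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(a)
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det3(m) / d)
    return solution


def normal_and_albedo(lights: Matrix, intensities: List[float]) -> Tuple[List[float], float]:
    """Recover the unit normal and albedo of one pixel from three intensities."""
    g = solve3(lights, intensities)  # g = albedo * normal
    albedo = math.sqrt(sum(v * v for v in g))
    return [v / albedo for v in g], albedo


# Three unit light directions (rows) and the intensities they produce for a
# surface facing the camera (normal (0, 0, 1)) with albedo 0.5.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
intensities = [0.5, 0.4, 0.4]
normal, albedo = normal_and_albedo(lights, intensities)
print(normal, albedo)
```

With more than three illumination patterns the same model is usually solved by least squares per pixel rather than by an exact 3x3 solve.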


How Google's AI service Gemini works

PCWorld

ChatGPT is not the only AI service in town. Google Gemini is a similar service where you can ask questions and get answers in plain text, with no commands required. You can "converse" just as if the AI robot were a real person. If you're familiar with ChatGPT, you'll recognize it because the layout is similar. You're greeted by a stripped-down screen with a text input field at the bottom.


Do We Need iPhone Moment or Xiaomi Moment for Robots? Design of Affordable Home Robots for Health Monitoring

Wei, Bo, Bian, Yaya, Gao, Mingcen

arXiv.org Artificial Intelligence

In this paper, we study cost-effective home robot solutions designed for home health monitoring. Recent advancements in Artificial Intelligence (AI) have significantly advanced the capabilities of robots, enabling them to understand and interact with their surroundings better and more efficiently. The most common robots currently used in homes are toy robots and cleaning robots. While these are relatively affordable, their functionalities are very limited. On the other hand, humanoid and quadruped robots offer more sophisticated features and capabilities, albeit at a much higher cost. Another category is educational robots, which provide educators with the flexibility to attach various sensors and integrate different design methods with the integrated operating systems. However, the challenge still exists in bridging the gap between affordability and functionality. Our research aims to address this by exploring the potential of developing advanced yet affordable and accessible home robots for health monitoring, using edge computing techniques and taking advantage of existing computing resources in the home, such as mobile phones.


5G Virtual Reality Manipulator Teleoperation using a Mobile Phone

Werner, Alexander, Melek, William

arXiv.org Artificial Intelligence

This paper presents an approach to teleoperate a manipulator using a mobile phone as a leader device. Using its IMU and camera, the phone estimates its Cartesian pose, which is then used to control the Cartesian pose of the robot's tool. The user receives visual feedback in the form of multi-view video: a point cloud rendered in a virtual reality environment. This enables the user to observe the scene from any position. To increase immersion, the robot's estimate of external forces is relayed using the phone's haptic actuator. Leader and follower are connected through wireless networks such as 5G or Wi-Fi. The paper describes the setup and analyzes its performance.


Prompt Public Large Language Models to Synthesize Data for Private On-device Applications

Wu, Shanshan, Xu, Zheng, Zhang, Yanxiang, Zhang, Yuanbo, Ramage, Daniel

arXiv.org Artificial Intelligence

Pre-training on public data is an effective method to improve the performance for federated learning (FL) with differential privacy (DP). This paper investigates how large language models (LLMs) trained on public data can improve the quality of pre-training data for the on-device language models trained with DP and FL. We carefully design LLM prompts to filter and transform existing public data, and generate new data to resemble the real user data distribution. The model pre-trained on our synthetic dataset achieves relative improvement of 19.0% and 22.8% in next word prediction accuracy compared to the baseline model pre-trained on a standard public dataset, when evaluated over the real user data in Gboard (Google Keyboard, a production mobile keyboard application). Furthermore, our method achieves evaluation accuracy better than or comparable to the baseline during the DP FL fine-tuning over millions of mobile devices, and our final model outperforms the baseline in production A/B testing. Our experiments demonstrate the strengths of LLMs in synthesizing data close to the private distribution even without accessing the private data, and also suggest future research directions to further reduce the distribution gap.


From Graph to Word Bag: Introducing Domain Knowledge to Confusing Charge Prediction

Li, Ang, Chen, Qiangchao, Wu, Yiquan, Cai, Ming, Zhou, Xiang, Wu, Fei, Kuang, Kun

arXiv.org Artificial Intelligence

Confusing charge prediction is a challenging task in legal AI, which involves predicting confusing charges based on fact descriptions. While existing charge prediction methods have shown impressive performance, they face significant challenges when dealing with confusing charges, such as Snatch and Robbery. In the legal domain, constituent elements play a pivotal role in distinguishing confusing charges. Constituent elements are fundamental behaviors underlying criminal punishment and have subtle distinctions among charges. In this paper, we introduce a novel From Graph to Word Bag (FWGB) approach, which introduces domain knowledge regarding constituent elements to guide the model in making judgments on confusing charges, much like a judge's reasoning process. Specifically, we first construct a legal knowledge graph containing constituent elements to help select keywords for each charge, forming a word bag. Subsequently, to guide the model's attention towards the differentiating information for each charge within the context, we expand the attention mechanism and introduce a new loss function with attention supervision through words in the word bag. We construct the confusing charges dataset from real-world judicial documents. Experiments demonstrate the effectiveness of our method, especially in maintaining exceptional performance in imbalanced label distributions.