Radio
GenHPE: Generative Counterfactuals for 3D Human Pose Estimation with Radio Frequency Signals
Huang, Shuokang, McCann, Julie A.
Human pose estimation (HPE) detects the positions of human body joints for various applications. Compared to using cameras, HPE using radio frequency (RF) signals is non-intrusive and more robust to adverse conditions, exploiting the signal variations caused by human interference. However, existing studies focus on single-domain HPE confined by domain-specific confounders, which cannot generalize to new domains and result in diminished HPE performance. Specifically, the signal variations caused by different human body parts are entangled, containing subject-specific confounders. RF signals are also intertwined with environmental noise, involving environment-specific confounders. In this paper, we propose GenHPE, a 3D HPE approach that generates counterfactual RF signals to eliminate domain-specific confounders. GenHPE trains generative models conditioned on human skeleton labels, learning how human body parts and confounders interfere with RF signals. We manipulate skeleton labels (i.e., removing body parts) as counterfactual conditions for generative models to synthesize counterfactual RF signals. The differences between counterfactual signals approximately eliminate domain-specific confounders and regularize an encoder-decoder model to learn domain-independent representations. Such representations help GenHPE generalize to new subjects/environments for cross-domain 3D HPE. We evaluate GenHPE on three public datasets from WiFi, ultra-wideband, and millimeter wave. Experimental results show that GenHPE outperforms state-of-the-art methods and reduces estimation errors by up to 52.2mm for cross-subject HPE and 10.6mm for cross-environment HPE.
Noise-Robust Radio Frequency Fingerprint Identification Using Denoise Diffusion Model
Yin, Guolin, Zhang, Junqing, Ding, Yuan, Cotton, Simon
Securing Internet of Things (IoT) devices presents increasing challenges due to their limited computational and energy resources. Radio Frequency Fingerprint Identification (RFFI) emerges as a promising authentication technique to identify wireless devices through hardware impairments. RFFI performance under low signal-to-noise ratio (SNR) scenarios is significantly degraded because the minute hardware features can be easily swamped in noise. In this paper, we leveraged the diffusion model to effectively restore the RFF under low SNR scenarios. Specifically, we trained a powerful noise predictor and tailored a noise removal algorithm to effectively reduce the noise level in the received signal and restore the device fingerprints. We used Wi-Fi as a case study and created a testbed involving 6 commercial off-the-shelf Wi-Fi dongles and a USRP N210 software-defined radio (SDR) platform. We conducted experimental evaluations on various SNR scenarios. The experimental results show that the proposed algorithm can improve the classification accuracy by up to 34.9%.
iPhone designer still asks: 'I wonder what Steve Jobs would do?' โ despite being told not to
Sir Jony Ive, the innovative designer of Apple's iMac, iPhone and Apple Watch, and a close friend and collaborator of the late Steve Jobs, says he still often asks himself: "I wonder what Steve would do?" Ive told BBC Radio 4's Desert Island Discs on Sunday that he does so despite the fact that Jobs had specifically told him not to before his death in 2011, aged 56. "He used to say I really don't want you to be thinking'Well, what would Steve do?'," Ives said. The designer, who was born in Chingford, Essex, and moved to San Francisco to work at Apple in 1992, worked alongside the company's co-founder and CEO five years later, when Jobs was called back in to help the struggling company after a period working elsewhere. Jobs' return marked an immediate improvement for Ive, he recalled. "It was remarkable that, despite the limitations of my ability to communicate, Steve understood what I thought and how I felt," Ive said.
DiSCo: Device-Server Collaborative LLM-Based Text Streaming Services
Sun, Ting, Wang, Penghan, Lai, Fan
The rapid rise of large language models (LLMs) in text streaming services has introduced significant cost and Quality of Experience (QoE) challenges in serving millions of daily requests, especially in meeting Time-To-First-Token (TTFT) and Time-Between-Token (TBT) requirements for real-time interactions. Our real-world measurements show that both server-based and on-device deployments struggle to meet diverse QoE demands: server deployments face high costs and last-hop issues (e.g., Internet latency and dynamics), while on-device LLM inference is constrained by resources. We introduce DiSCo, a device-server cooperative scheduler designed to optimize users' QoE by adaptively routing requests and migrating response generation between endpoints while maintaining cost constraints. DiSCo employs cost-aware scheduling, leveraging the predictable speed of on-device LLM inference with the flexible capacity of server-based inference to dispatch requests on the fly, while introducing a token-level migration mechanism to ensure consistent token delivery during migration. Evaluations on real-world workloads -- including commercial services like OpenAI GPT and DeepSeek, and open-source deployments such as LLaMA3 -- show that DiSCo can improve users' QoE by reducing tail TTFT (11-52\%) and mean TTFT (6-78\%) across different model-device configurations, while dramatically reducing serving costs by up to 84\% through its migration mechanism while maintaining comparable QoE levels.
User Profile with Large Language Models: Construction, Updating, and Benchmarking
Prottasha, Nusrat Jahan, Kowsher, Md, Raman, Hafijur, Anny, Israt Jahan, Bhat, Prakash, Garibay, Ivan, Garibay, Ozlem
User profile modeling plays a key role in personalized systems, as it requires building accurate profiles and updating them with new information. In this paper, we present two high-quality open-source user profile datasets: one for profile construction and another for profile updating. These datasets offer a strong basis for evaluating user profile modeling techniques in dynamic settings. We also show a methodology that uses large language models (LLMs) to tackle both profile construction and updating. Our method uses a probabilistic framework to predict user profiles from input text, allowing for precise and context-aware profile generation. Our experiments demonstrate that models like Mistral-7b and Llama2-7b perform strongly in both tasks. LLMs improve the precision and recall of the generated profiles, and high evaluation scores confirm the effectiveness of our approach.
Rogue states could use AI to do 'real harm', warns ex-Google CEO
Google's former chief executive has warned that artificial intelligence could be used by rogue states such as North Korea, Iran and Russia to "harm innocent people". Eric Schmidt, who held senior posts at Google from 2001 to 2017, told BBC Radio 4's Today programme that those countries and terrorists could adopt and misuse the technology to develop weapons to create "a bad biological attack from some evil person". The tech billionaire said: "The real fears that I have are not the ones that most people talk about AI โ I talk about extreme risk. "Think about North Korea, or Iran, or even Russia, who have some evil goal. This technology is fast enough for them to adopt that they could misuse it and do real harm."
Copyright and Competition: Estimating Supply and Demand with Unstructured Data
Copyright policies play a pivotal role in protecting the intellectual property of creators and companies in creative industries. The advent of cost-reducing technologies, such as generative AI, in these industries calls for renewed attention to the role of these policies. This paper studies product positioning and competition in a market of creatively differentiated products and the competitive and welfare effects of copyright protection. A common feature of products with creative elements is that their key attributes (e.g., images and text) are unstructured and thus high-dimensional. We focus on a stylized design product, fonts, and use data from the world's largest online marketplace for fonts. We use neural network embeddings to quantify unstructured attributes and measure the visual similarity. We show that this measure closely aligns with actual human perception. Based on this measure, we empirically find that competitions occur locally in the visual characteristics space. We then develop a structural model for supply and demand that integrate the embeddings. Through counterfactual analyses, we find that local copyright protection can enhance consumer welfare when products are relocated, and the interplay between copyright and cost-reducing technologies is essential in determining an optimal policy for social welfare. We believe that the embedding analysis and empirical models introduced in this paper can be applicable to a range of industries where unstructured data captures essential features of products and markets.
Pre-training a Transformer-Based Generative Model Using a Small Sepedi Dataset
Ramalepe, Simon P., Modipa, Thipe I., Davel, Marelie H.
Due to the scarcity of data in low-resourced languages, the development of language models for these languages has been very slow. Currently, pre-trained language models have gained popularity in natural language processing, especially, in developing domain-specific models for low-resourced languages. In this study, we experiment with the impact of using occlusion-based techniques when training a language model for a text generation task. We curate 2 new datasets, the Sepedi monolingual (SepMono) dataset from several South African resources and the Sepedi radio news (SepNews) dataset from the radio news domain. We use the SepMono dataset to pre-train transformer-based models using the occlusion and non-occlusion pre-training techniques and compare performance. The SepNews dataset is specifically used for fine-tuning. Our results show that the non-occlusion models perform better compared to the occlusion-based models when measuring validation loss and perplexity. However, analysis of the generated text using the BLEU score metric, which measures the quality of the generated text, shows a slightly higher BLEU score for the occlusion-based models compared to the non-occlusion models.
Digital Operating Mode Classification of Real-World Amateur Radio Transmissions
Bundscherer, Maximilian, Schmitt, Thomas H., Baumann, Ilja, Bocklet, Tobias
This study presents an ML approach for classifying digital radio operating modes evaluated on real-world transmissions. We generated 98 different parameterized radio signals from 17 digital operating modes, transmitted each of them on the 70 cm (UHF) amateur radio band, and recorded our transmissions with two different architectures of SDR receivers. Three lightweight ML models were trained exclusively on spectrograms of limited non-transmitted signals with random characters as payloads. This training involved an online data augmentation pipeline to simulate various radio channel impairments. Our best model, EfficientNetB0, achieved an accuracy of 93.80% across the 17 operating modes and 85.47% across all 98 parameterized radio signals, evaluated on our real-world transmissions with Wikipedia articles as payloads. Furthermore, we analyzed the impact of varying signal durations & the number of FFT bins on classification, assessed the effectiveness of our simulated channel impairments, and tested our models across multiple simulated SNRs.
Neural Reflectance Fields for Radio-Frequency Ray Tracing
Jia, Haifeng, Chen, Xinyi, Wei, Yichen, Sun, Yifei, Pi, Yibo
Ray tracing is widely employed to model the propagation of radio-frequency (RF) signal in complex environment. The modelling performance greatly depends on how accurately the target scene can be depicted, including the scene geometry and surface material properties. The advances in computer vision and LiDAR make scene geometry estimation increasingly accurate, but there still lacks scalable and efficient approaches to estimate the material reflectivity in real-world environment. In this work, we tackle this problem by learning the material reflectivity efficiently from the path loss of the RF signal from the transmitters to receivers. Specifically, we want the learned material reflection coefficients to minimize the gap between the predicted and measured powers of the receivers. We achieve this by translating the neural reflectance field from optics to RF domain by modelling both the amplitude and phase of RF signals to account for the multipath effects. We further propose a differentiable RF ray tracing framework that optimizes the neural reflectance field to match the signal strength measurements. We simulate a complex real-world environment for experiments and our simulation results show that the neural reflectance field can successfully learn the reflection coefficients for all incident angles. As a result, our approach achieves better accuracy in predicting the powers of receivers with significantly less training data compared to existing approaches.