price
An Uncertainty Principle is a Price of Privacy-Preserving Microdata
Privacy-protected microdata are often the desired output of a differentially private algorithm since microdata are familiar and convenient for downstream users. However, there is a statistical price for this kind of convenience. We show that an uncertainty principle governs the trade-off between accuracy for a population of interest ("sum query") vs. accuracy for its component sub-populations ("point queries"). Compared to differentially private query answering systems that are not required to produce microdata, accuracy can degrade by a logarithmic factor. For example, in the case of pure differential privacy, without the microdata requirement, one can provide noisy answers to the sum query and all point queries while guaranteeing that each answer has squared error $O(1/\epsilon^2)$. With the microdata requirement, one must choose between allowing an additional $\log^2(d)$ factor ($d$ is the number of point queries) for some point queries or allowing an extra $O(d^2)$ factor for the sum query. We present lower bounds for pure, approximate, and concentrated differential privacy. We propose mitigation strategies and create a collection of benchmark datasets that can be used for public study of this problem.
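The tension between the sum query and the point queries can be seen in a small simulation. The sketch below is only an illustration under stated assumptions, not the paper's construction or its lower bound: it uses the Laplace mechanism, assumes neighboring datasets differ by adding or removing one record (so the histogram of point queries and the sum query each have sensitivity 1), and compares (a) answering the sum query directly with (b) deriving it by adding up noisy point answers, as a consistent microdata-style release would force. The function names, the cell counts, and the even budget split are hypothetical, and the rounding to nonnegative integers that real microdata would need is omitted.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def simulate(d=64, epsilon=1.0, trials=1000, seed=0):
    """Compare mean squared errors of a directly answered sum query
    and a sum derived from noisy point queries (illustrative only)."""
    rng = random.Random(seed)
    counts = [rng.randint(0, 50) for _ in range(d)]  # hypothetical true cell counts
    true_sum = sum(counts)

    sq_point = sq_direct = sq_derived = 0.0
    for _ in range(trials):
        # (a) No microdata requirement: split the budget evenly, epsilon/2 for
        # the histogram of point queries and epsilon/2 for the sum query.
        scale_split = 2.0 / epsilon
        noisy_point = counts[0] + laplace(scale_split, rng)
        direct_sum = true_sum + laplace(scale_split, rng)

        # (b) Consistency requirement: the whole budget goes to the point
        # queries, and the sum must equal the total of the released cells.
        cells = [c + laplace(1.0 / epsilon, rng) for c in counts]
        derived_sum = sum(cells)

        sq_point += (noisy_point - counts[0]) ** 2
        sq_direct += (direct_sum - true_sum) ** 2
        sq_derived += (derived_sum - true_sum) ** 2

    return {
        "point_mse": sq_point / trials,
        "direct_sum_mse": sq_direct / trials,
        "derived_sum_mse": sq_derived / trials,
    }


if __name__ == "__main__":
    for d in (16, 64, 256):
        print(d, simulate(d=d))
```

In this naive baseline the error of the derived sum grows with the number of point queries $d$, while the directly answered sum stays at a constant multiple of $1/\epsilon^2$. The bounds quoted in the abstract concern what any microdata-producing mechanism can achieve and are not reproduced by this sketch.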
Optimality and Stability in Federated Learning: A Game-theoretic Approach
Federated learning is a distributed learning paradigm where multiple agents, each only with access to local data, jointly learn a global model. There has recently been an explosion of research aiming not only to improve the accuracy rates of federated learning, but also to provide certain guarantees around social good properties such as total error. One branch of this research has taken a game-theoretic approach, and in particular, prior work has viewed federated learning as a hedonic game, where error-minimizing players arrange themselves into federating coalitions. This past work proves the existence of stable coalition partitions, but leaves open a wide range of questions, including how far from optimal these stable solutions are. In this work, we motivate and define a notion of optimality given by the average error rates among federating agents (players).
M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis
Wu, Chengyan, Ma, Bolei, Liu, Yihong, Zhang, Zheyu, Deng, Ningyuan, Li, Yanshu, Chen, Baolan, Zhang, Yi, Plank, Barbara, Xue, Yun
Aspect-based sentiment analysis (ABSA) is a crucial task in information extraction and sentiment analysis, aiming to identify aspects with associated sentiment elements in text. However, existing ABSA datasets are predominantly English-centric, limiting the scope for multilingual evaluation and research. To bridge this gap, we present M-ABSA, a comprehensive dataset spanning 7 domains and 21 languages, making it the most extensive multilingual parallel dataset for ABSA to date. Our primary focus is on triplet extraction, which involves identifying aspect terms, aspect categories, and sentiment polarities. The dataset is constructed through an automatic translation process with human review to ensure quality. We perform extensive experiments using various baselines to assess performance and compatibility on M-ABSA. Our empirical findings highlight that the dataset enables diverse evaluation tasks, such as multilingual and multi-domain transfer learning, and large language model evaluation, underscoring its inclusivity and its potential to drive advancements in multilingual ABSA research.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Research Report (0.81)
- Instructional Material > Course Syllabus & Notes (0.45)
- Education (0.50)
- Consumer Products & Services (0.46)
- Energy (0.45)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Reviews: The Price of Fair PCA: One Extra Dimension
The manuscript proposes a dimensionality reduction method called "fair PCA". The study is based on the observation that, in a data model containing more than one data category (a "population", in the authors' terms), the projection learnt by PCA may yield different reconstruction errors for different populations. This may impair the performance of machine learning algorithms that have access to dimensionality-reduced data obtained via PCA. To address this problem, the authors propose a variant of the PCA algorithm that minimizes the total deviation between the error of the learnt projection and the error of the optimal projection for each population. Quality: The paper is based on an interesting idea with an interesting motivation. The technical content of the paper is of satisfactory depth.
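The quantity the review describes, the gap between a population's error under a shared projection and its error under its own optimal projection, can be measured on toy data. The sketch below is a minimal illustration under assumptions: it works with 2-D points projected onto a single direction through the origin (data assumed already centered), the two synthetic groups and all function names are hypothetical, and it only measures the disparity produced by ordinary PCA. It does not implement the fair PCA algorithm itself.

```python
import math
import random


def top_direction(points):
    """Unit vector spanning the best-fit 1-D subspace (through the origin)
    for a list of 2-D points, i.e. the top principal direction."""
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    # Angle maximizing the projected second moment of the 2x2 scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def recon_error(points, direction):
    """Mean squared distance between each point and its projection."""
    ux, uy = direction
    total = 0.0
    for x, y in points:
        t = x * ux + y * uy  # coordinate along the direction
        total += (x - t * ux) ** 2 + (y - t * uy) ** 2
    return total / len(points)


def excess_loss(group, direction):
    """Error under the shared direction minus the group's own optimal error."""
    return recon_error(group, direction) - recon_error(group, top_direction(group))


rng = random.Random(0)
# Hypothetical populations: a large group spread along x, a small one along y.
group_a = [(rng.gauss(0, 3.0), rng.gauss(0, 0.3)) for _ in range(500)]
group_b = [(rng.gauss(0, 0.3), rng.gauss(0, 3.0)) for _ in range(50)]

shared = top_direction(group_a + group_b)  # ordinary PCA on the pooled data
loss_a = excess_loss(group_a, shared)
loss_b = excess_loss(group_b, shared)

if __name__ == "__main__":
    print("excess loss, group A:", loss_a)
    print("excess loss, group B:", loss_b)
```

Because the pooled direction is dominated by the larger group, the smaller group bears almost all of the excess reconstruction error in this toy setup, which is the kind of imbalance the reviewed method is described as targeting.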
Measuring Sales Performance Using Simple Statistical Models
Measuring sales performance is a crucial aspect of running a successful business. Accurately tracking and analyzing sales data helps companies understand their strengths and weaknesses, perform forecasts, identify trends, and make informed decisions that drive growth. In this article, I will show how some simple statistical models can be used for measuring sales performance. Whether the sales team is small or enterprise-sized, simple quantitative techniques can provide valuable sales insights or draw attention to areas of need. After reading this article, you will have seen various examples of how simple models are applied in real-life scenarios. Note: All the images in the article were generated by Artificial Intelligence using Stable Diffusion 2.x.
Inside Africa's first humanoid
Somewhere in Mabushi, a crossroad area between the inner and outer districts of Abuja, Nigeria, Uniccon Group, a two-year-old Nigerian technology firm, has built a humanoid: a 6-foot-tall multilingual human-like robot called Omeife. From an idea that was conceptualised in 2020 to a back-and-forth construction--slow wins and quick-succession learning--that stretched across two years, Omeife, built as a female Igbo character that understands and speaks eight different languages, is now a product ready to meet the world. Powered by sophisticated artificial intelligence algorithms developed in-house by the company's team of scientists, Omeife has a deep understanding of African culture and behavioural patterns. Speaking to TechCabal about the project over a call, Chuks Ekwueme, who founded the company in 2020 and serves as its CEO, revealed that the humanoid also has a real-time understanding of its environment, including active listening and the ability to focus on a specific conversation thread as it's happening. "It's not just multilingual, it has the ability to switch languages and interact with specific gestures--hand illustrations, smile and other bodily gestures--that match the tone of the conversation," said Ekwueme.
- Africa > Nigeria > Federal Capital Territory > Abuja (0.25)
- Asia > Middle East > UAE > Dubai Emirate > Dubai (0.06)
- Africa > South Africa (0.05)
Pornhub hasn't been actively enforcing its deepfake ban
Remember how Pornhub said it was banning AI-generated deepfake videos? BuzzFeed News "easily" found over 100 of the non-consensual videos by searching for obvious keywords like "deepfake" and "fake deep," nearly all of them explicitly mentioning their deepfake status in the title or the uploader's username. The clips had been around for a while, too, as they had hundreds of thousands or even millions of views. Some videos surfaced in home page recommendations. The site had removed some of the videos after BuzzFeed got in touch, but others were still listed (again, with hundreds of thousands of views) after searching for other keywords.
Thinking Machines – Jason Allen Ashlock – Medium
Like a film critic asked if the Oscars got it right this year, one has to feel a sense of standing too close to the frame, the field of vision too narrow to provide the context necessary for proper judgment. After spending an afternoon among the various installations that comprise "Thinking Machines: Art and Design in the Computer Age: 1959–1989," I wonder if this anxiety applied to the team tasked with creating this exhibit. In this case, I think not. Here, closeness to the frame is a virtue, not a vice. The thread running through the exhibit's diverse elements is both lucid and enjoyable: celebrating the collaborative possibilities between human artists and computational machines, and the energies unleashed when the humanistic meet the systematic.
- Media > Film (0.56)
- Leisure & Entertainment (0.56)
- Information Technology > Artificial Intelligence > Issues > Turing's Test (0.64)
- Information Technology > Artificial Intelligence > Issues > Philosophy (0.64)
ODSC East 2018 Open Data Science Conference
ODSC East 2018 is one of the largest applied data science conferences in the world. Our speakers include some of the core contributors to many open source tools, libraries, and languages. Attend ODSC East 2018 and learn the latest AI & data science topics, tools, and languages from some of the best and brightest minds in the field. See the schedule for many more. The largest applied data science conference is now 4 days, including 2 full training days, for even more talks, trainings, and workshops vested in 8 focused courses.
- Information Technology > Data Science > Data Mining > Big Data (1.00)
- Information Technology > Artificial Intelligence (1.00)