ChatPD: An LLM-driven Paper-Dataset Networking System
Xu, Anjie, Ding, Ruiqing, Wang, Leye
Scientific research heavily depends on suitable datasets for method validation, but existing academic platforms with dataset management, such as PapersWithCode, suffer from inefficiencies in their manual workflows. To overcome this bottleneck, we present ChatPD, a system that uses Large Language Models (LLMs) to automate dataset information extraction from academic papers and to construct a structured paper-dataset network. The system consists of three key modules: paper collection, dataset information extraction, and dataset entity resolution. Specifically, we propose a Graph Completion and Inference strategy to map dataset descriptions to their corresponding entities. Through extensive experiments, we demonstrate that ChatPD not only outperforms the existing platform PapersWithCode in dataset usage extraction but also achieves about 90% precision and recall in entity resolution. Moreover, we have deployed ChatPD to continuously extract which datasets are used in papers and to provide dataset discovery services, such as task-specific dataset queries and similar-dataset recommendations. We open-source ChatPD and the current paper-dataset network at https://github.com/ChatPD-web/ChatPD.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Canada > Ontario > Toronto (0.05)
- Asia > China > Beijing > Beijing (0.04)
- (10 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Are all models wrong? Fundamental limits in distribution-free empirical model falsification
Müller, Manuel M., Luo, Yuetian, Barber, Rina Foygel
In statistics and machine learning, when we train a fitted model on available data, we typically want to ensure that we are searching within a model class that contains at least one accurate model -- that is, we would like to ensure an upper bound on the model class risk (the lowest possible risk that can be attained by any model in the class). However, it is also of interest to establish lower bounds on the model class risk, for instance so that we can determine whether our fitted model is at least approximately optimal within the class, or so that we can decide whether the model class is unsuitable for the particular task at hand. Particularly in the setting of interpolation learning, where machine learning models are trained to reach zero error on the training data, we might ask if, at the very least, a positive lower bound on the model class risk is possible -- or are we unable to detect that "all models are wrong"? In this work, we answer these questions in a distribution-free setting by establishing a model-agnostic, fundamental hardness result for the problem of constructing a lower bound on the best test error achievable over a model class, and examine its implications on specific model classes such as tree-based methods and linear regression.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > Middle East > Jordan (0.04)
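In standard notation (ours, not necessarily the paper's), the model class risk the abstract refers to is the best risk attainable by any model in a class $\mathcal{F}$, under loss $\ell$ and data distribution $P$:

```latex
R(\mathcal{F}) \;=\; \inf_{f \in \mathcal{F}} \; \mathbb{E}_{(X,Y)\sim P}\!\left[\ell\big(f(X),\, Y\big)\right]
```

An upper bound on $R(\mathcal{F})$ certifies that the class contains an accurate model; the paper asks whether a positive, distribution-free lower bound on $R(\mathcal{F})$ can be certified from data at all.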
Productivity soars in sectors of global economy most exposed to AI, says report
The sectors of the global economy most heavily exposed to artificial intelligence (AI) are witnessing a marked productivity increase and command a significant wage premium, according to a report. Boosting hopes that AI might help lift the global economy out of a 15-year, low-growth trough, a PwC study found productivity growth was almost five times as rapid in parts of the economy where AI penetration was highest as in less exposed sectors. PwC said that in the UK, one of the 15 countries covered by the report, job postings that require AI skills were growing 3.6 times faster than job listings overall. On average, UK employers were willing to pay a 14% wage premium for jobs that require AI skills, with the legal and information technology sectors experiencing the highest premiums. The uptick in productivity in sectors more exposed to AI, such as financial services, information technology, and professional services, was marginally higher in the UK than the global average.
PwC's 4,000 legal staffers get AI assistant as law chatbots gain steam
PwC said it partnered with AI startup Harvey for an initial 12-month contract, which the accounting and consulting firm said will help lawyers with contract analysis, regulatory compliance work, due diligence and other legal advisory and consulting services. PwC said it will also determine ways for tax professionals to use the technology. It said its access to Harvey's technology is exclusive among the Big Four professional services firms. Harvey is built on technology from OpenAI, the Microsoft Corp-backed startup that on Tuesday released an upgraded version of its AI sensation ChatGPT. Harvey received a $5 million investment last year in a funding round led by the OpenAI Startup Fund.
- Professional Services (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (0.54)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.88)
Fulltime Cloud Architect openings in California on September 25, 2022
The Cloud and Big Data Software Architect is responsible for leading technical efforts related to modern data engineering based on cloud computing technologies. The candidate is expected to have demonstrated experience in the domain, with a proven record of architecting data-centric systems at cloud scale that support complex use cases. Experience with public cloud environments, with a focus on the various data services and cloud-native software architecture, is important for the position. Responsibilities include participating effectively in defining, designing, architecting, and implementing cloud-based systems; mastering new and emerging technologies and applying them effectively in new systems; and staying at the forefront of developments in the data space.
- North America > United States > California > San Francisco County > San Francisco (0.05)
- North America > United States > Colorado (0.05)
- North America > United States > California > Santa Clara County > Sunnyvale (0.04)
- (2 more...)
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
The Future of the Metaverse + AI and Data Looks Bright
We're on the cusp of major breakthroughs in the metaverse, that collective world of new augmented reality (AR) and virtual reality (VR) experiences that is gaining steam. The new meta modality will not only be a huge new source of user and behavioral data for enterprises, but it will be a fertile ground for deploying rapidly maturing AI technologies like NLP and computer vision. AR and VR technologies have been simmering on the backburner for years. Up to this point, VR has primarily been used for immersive video games, although a few enterprise software companies have dabbled in AR and VR interfaces, including Looker before it was snapped up by Google Cloud. But Facebook CEO Mark Zuckerberg put the collective worlds of AR and VR, i.e. the metaverse, firmly on the front burner last month when he announced Facebook was changing its name to Meta.
- Information Technology > Services (0.55)
- Leisure & Entertainment > Games > Computer Games (0.50)
Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research
Koch, Bernard, Denton, Emily, Hanna, Alex, Foster, Jacob G.
Benchmark datasets play a central role in the organization of machine learning research. They coordinate researchers around shared research problems and serve as a measure of progress towards shared goals. Despite the foundational role of benchmarking practices in this field, relatively little attention has been paid to the dynamics of benchmark dataset use and reuse, within or across machine learning subcommunities. In this paper, we dig into these dynamics. We study how dataset usage patterns differ across machine learning subcommunities and across time from 2015-2020. We find increasing concentration on fewer and fewer datasets within task communities, significant adoption of datasets from other tasks, and concentration across the field on datasets that have been introduced by researchers situated within a small number of elite institutions. Our results have implications for scientific evaluation, AI ethics, and equity/access within the field.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- (7 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
PwC rated as a Leader in Artificial Intelligence Consultancies by Independent Research Firm
PwC announced that it was cited as a Leader in The Forrester Wave: AI Consultancies, Q1 2021. In the report, Forrester outlines what AI consultancy customers should look for in providers. Commenting on PwC, the report states: "The PwC backstory has two facets -- client transformations and its own. PwC helps transform client businesses, but its own transformation is part of its story. PwC doubled down on its own upskilling and IP-building platform and then launched this for clients. One-off simulation projects are now scaled offerings for strategic planning, operations, and continuous scaling of business models. Even strategic innovation partnerships are points of excellence; one client specifically selected PwC because of the consultancy's relationship with Carnegie Mellon."
As AI takes over more jobs, optimism remains in gaming and other creative industries
While many people may have been too busy staring at screens from 9am to 6pm every day to notice, a quiet revolution has started in workplaces around the world. Automation aided by artificial intelligence has increasingly been taking over more aspects of workers' jobs in recent years, and pundits have warned for years that this shift could lead to social upheaval. But not everyone is so pessimistic. Just as ancient Greek and Roman elites once relied on slave labour while spending their time on more leisurely pursuits, many workers today could soon see a rise in their own free time as repetitive work is outsourced to machines, according to Witman Hung, a technology industry veteran and Hong Kong deputy to China's 13th National People's Congress. Get the latest insights and analysis from our Global Impact newsletter on the big stories originating in China. "Maybe we will be working 15 hours a week and spending our time on better things such as playing games and thinking about philosophy," Hung said on Wednesday during a panel discussion about automation and the future of job opportunities organised by the South China Morning Post.
- Asia > China > Hong Kong (0.25)
- North America > United States > Massachusetts (0.05)
- North America > United States > California (0.05)
Don't Fear AI. It Will Lead To Long-Term Job Growth.
The COVID-19 pandemic has accelerated technological advances and the automation of many routine tasks – from contactless cashiers to robots delivering packages. In this environment, many are concerned that artificial intelligence (AI) will drive significant automation and destroy jobs in the coming decades. Just a few decades ago, the internet created similar concerns as it grew. Despite skepticism, the technology created millions of jobs and now comprises 10% of US GDP. Today, AI is poised to create even greater growth in the US and global economies.
- North America > United States (0.25)
- Asia > India (0.05)
- Asia > China (0.05)