 pivot table




ACCIO: Table Understanding Enhanced via Contrastive Learning with Aggregations

arXiv.org Artificial Intelligence

The attention to table understanding using recent natural language models has been growing. However, most related works tend to focus on learning the structure of the table directly. Just as humans improve their understanding of sentences by comparing them, they can also enhance their understanding by comparing tables. With this idea, in this paper, we introduce ACCIO, tAble understanding enhanCed via Contrastive learnIng with aggregatiOns, a novel approach to enhancing table understanding by contrasting original tables with their pivot summaries through contrastive learning. ACCIO trains an encoder to bring these table pairs closer together. Through validation via column type annotation, ACCIO achieves competitive performance with a macro F1 score of 91.1 compared to state-of-the-art methods. This work represents the first attempt to utilize pairs of tables for table embedding, promising significant advancements in table comprehension. Our code is available at https://github.com/whnhch/ACCIO/.
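The idea of pulling a table's embedding toward the embedding of its pivot summary can be illustrated with a generic contrastive objective. The sketch below uses an InfoNCE-style loss with cosine similarity; this is an assumption for illustration, since the abstract does not say which loss or encoder ACCIO uses. The function names, the temperature, and the toy two-dimensional embeddings are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def info_nce(table_embs, pivot_embs, temperature=0.1):
    """Average InfoNCE-style loss over a batch of (table, pivot summary) pairs.

    Table i is the positive for pivot i; every other pivot in the batch
    acts as a negative. This is a generic objective, not ACCIO's confirmed loss.
    """
    losses = []
    for i, t in enumerate(table_embs):
        logits = [cosine(t, p) / temperature for p in pivot_embs]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of the matching pivot.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)

# Hypothetical embeddings: two tables and their pivot summaries.
tables = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each pivot sits near its own table
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each pivot sits near the wrong table

loss_aligned = info_nce(tables, aligned)
loss_swapped = info_nce(tables, swapped)
```

With these toy vectors the aligned pairs give a much lower loss than the swapped pairs, which is the signal an encoder would follow during training.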


SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models

arXiv.org Artificial Intelligence

Computer end users have spent billions of hours completing daily tasks like tabular data processing and project timeline scheduling. Most of these tasks are repetitive and error-prone, yet most end users lack the skill to automate this burdensome work. With the advent of large language models (LLMs), directing software with natural language user requests becomes a reachable goal. In this work, we propose a SheetCopilot agent that takes natural language tasks and controls spreadsheets to fulfill the requirements. We propose a set of atomic actions as an abstraction of spreadsheet software functionalities. We further design a state machine-based task planning framework for LLMs to robustly interact with spreadsheets. We curate a representative dataset containing 221 spreadsheet control tasks and establish a fully automated evaluation pipeline for rigorously benchmarking the ability of LLMs in software control tasks. Our SheetCopilot correctly completes 44.3% of tasks for a single generation, outperforming the strong code generation baseline by a wide margin. Our project page: https://sheetcopilot.github.io/.


Markov Algorithm For Time Series. Table of Contents

#artificialintelligence

To go from continuous data to a discretized dataset that can be used for State-Transition modeling, I need a mapping function. Here I will use the percentile function, which Excel provides. I create a rolling 8-day window and take the percentile of the current value within that window.
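The same rolling-percentile mapping can be sketched outside Excel. The version below is a minimal illustration under stated assumptions: the percentile rank is defined as the fraction of window values less than or equal to the current value (Excel's own percentile functions may use a different convention), and the choice of four equal-width states, the function names, and the sample prices are all hypothetical.

```python
from collections import Counter

def rolling_percentile(values, window=8):
    """Percentile rank (0..1) of each value within its trailing window.

    Positions without a full window of history get None.
    """
    ranks = []
    for i, current in enumerate(values):
        if i + 1 < window:
            ranks.append(None)
            continue
        recent = values[i + 1 - window : i + 1]
        # Fraction of window values that are <= the current value.
        ranks.append(sum(v <= current for v in recent) / window)
    return ranks

def to_state(rank, n_states=4):
    """Bucket a percentile rank into one of n_states equal-width states."""
    if rank is None:
        return None
    return min(int(rank * n_states), n_states - 1)

def transition_counts(states):
    """Count transitions between consecutive discretized states."""
    pairs = zip(states, states[1:])
    return Counter((a, b) for a, b in pairs if a is not None and b is not None)

# Hypothetical daily values, for illustration only.
prices = [10, 11, 12, 11, 13, 14, 13, 15, 9, 16]
ranks = rolling_percentile(prices)
states = [to_state(r) for r in ranks]
counts = transition_counts(states)
```

Dividing each row of the transition counts by its row total would give the estimated state-transition probabilities.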


How I Built a Movie Recommendation System

#artificialintelligence

We calculate the number of ratings using the count method of a data frame. The count method counts the non-empty values in each column and returns the result per column. Sorting by number of ratings, we now see some results. "Star Wars," which is a very famous movie, has a mean rating of 4.35 from 583 users. We create a pivot table just to quickly summarize the amount of data we have.
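The counting, sorting, and pivot-table steps described above might look like the following in pandas. This is a sketch on hypothetical toy data, not the article's code: the column names (`user_id`, `title`, `rating`) and the ratings themselves are assumptions, and the real dataset is not shown in this excerpt.

```python
import pandas as pd

# Hypothetical toy ratings; the article's real dataset is not shown here.
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3],
    "title": ["Star Wars", "Fargo", "Star Wars", "Fargo", "Star Wars", "Contact"],
    "rating": [5, 4, 4, 3, 5, 4],
})

# Per-title mean rating and number of non-empty ratings.
summary = ratings.groupby("title")["rating"].agg(["mean", "count"])
summary = summary.rename(columns={"mean": "mean_rating", "count": "num_ratings"})

# Most-rated titles first.
summary = summary.sort_values("num_ratings", ascending=False)

# Pivot table: one row per user, one column per title.
# Cells for movies a user has not rated are NaN.
user_movie = ratings.pivot_table(index="user_id", columns="title", values="rating")
```

Calling `user_movie.count()` then reports how many users rated each title, since `count` skips the NaN cells.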


Top 50 Free Udemy Courses

#artificialintelligence

Description: Understand the theory of how chatbots work and implement them in Python and PyTorch! Description: This course is for all those who want a brief introduction to TensorFlow.js in 2020.


6 AI features Microsoft added to Office in 2019

#artificialintelligence

Microsoft has added so many AI-driven features to its Office 365 productivity suite this year that we wanted to pull together a comprehensive list -- but it's not as straightforward as it might seem. Features like PowerPoint Designer, OneNote's ink to text, and Word's grammar and style suggestions got notable improvements, but they didn't first show up in 2019. The team is constantly trying to figure out what makes users more productive and what doesn't. Many of the features also rely on machine learning models that adapt based on usage. "You get this amazing signal about how it's making them more productive, how often are they using it and engaging with it, how often are they keeping the results of what you suggest to them," Microsoft 365 general manager Rob Howard told VentureBeat.


Datameer X: Data Prep For Machine Learning

#artificialintelligence

We are excited to reveal new features in Datameer. Some were long-time requests from our most loyal customers, and others are on the cutting edge of data science. Customers depend on Datameer to transform their raw datasets by formatting, structuring and enriching them for analytic processing and reporting. In addition to data prep, Datameer X is designed for data science. The new features empower users of all levels of technical expertise to speed development of machine learning models and generate trusted, business-ready data insights.


Citizen Data Science: Analyze Nature Without Programming

#artificialintelligence

I recently gave an informal talk to a class of botany students at Gavilan College. The original topic was nature photography, but I also talked about the data science techniques that I used to create my recently completed photo book, Portraits of Birds: Shoreline Park. The concept for the book was to personally photograph all of the bird species in a particular area, in this case Shoreline at Mountain View Park in Mountain View, California, which I later expanded to include the Palo Alto Baylands. To enumerate the species that have been seen in this area, I turned to two citizen science sites, iNaturalist and eBird, both of which have application programming interfaces (APIs). Note that while eBird is specific to birds, iNaturalist contains data on plants and other animals as well.