thumbtack
Beyond the Hype: Embeddings vs. Prompting for Multiclass Classification Tasks
Marios Kokkodis, Richard Demsyn-Jones, Vijay Raghavan
Are traditional classification approaches irrelevant in this era of AI hype? We show that there are multiclass classification problems where predictive models holistically outperform LLM prompt-based frameworks. Given text and images from home-service project descriptions provided by Thumbtack customers, we build embeddings-based softmax models that predict the professional category (e.g., handyman, bathroom remodeling) associated with each problem description. We then compare against prompts that ask state-of-the-art LLMs to solve the same problem. We find that the embeddings approach outperforms the best LLM prompts in terms of accuracy, calibration, latency, and financial cost. In particular, the embeddings approach has 49.5% higher accuracy than the prompting approach, and its superiority is consistent across text-only, image-only, and text-image problem descriptions. Furthermore, it yields well-calibrated probabilities, which we later use as confidence signals to provide a contextualized user experience during deployment. In contrast, prompting scores are overly uninformative. Finally, the embeddings approach is 14 and 81 times faster than prompting in processing images and text respectively, while under realistic deployment assumptions, it can be up to 10 times cheaper. Based on these results, we deployed a variation of the embeddings approach, and through A/B testing we observed performance consistent with our offline analysis. Our study shows that for multiclass classification problems that can leverage proprietary datasets, an embeddings-based approach may yield unequivocally better results. Hence, scientists, practitioners, engineers, and business leaders can use our study to go beyond the hype and consider appropriate predictive models for their classification use cases.
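To make the abstract's setup concrete, here is a minimal sketch of a softmax classifier trained on fixed embedding vectors, together with an expected calibration error (ECE) check. It is not the authors' implementation: the synthetic "embeddings", the three classes, the learning rate, and all function names are assumptions chosen for illustration, and real embeddings would come from a pretrained text or image encoder.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(W, b, x):
    """Class probabilities for one embedding x (W holds one weight vector per class)."""
    logits = [sum(wj * xj for wj, xj in zip(w, x)) + bk for w, bk in zip(W, b)]
    return softmax(logits)


def train_softmax(X, y, n_classes, lr=0.05, epochs=20):
    """Multinomial logistic regression fitted with plain SGD on cross-entropy."""
    dim = len(X[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, label in zip(X, y):
            p = predict_proba(W, b, x)
            for k in range(n_classes):
                grad = p[k] - (1.0 if k == label else 0.0)
                b[k] -= lr * grad
                for j in range(dim):
                    W[k][j] -= lr * grad * x[j]
    return W, b


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between confidence and accuracy across confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for p, label in zip(probs, labels):
        conf = max(p)
        hit = 1.0 if p.index(conf) == label else 0.0
        bins[min(int(conf * n_bins), n_bins - 1)].append((conf, hit))
    ece = 0.0
    for items in bins:
        if items:
            avg_conf = sum(c for c, _ in items) / len(items)
            avg_acc = sum(h for _, h in items) / len(items)
            ece += len(items) / len(labels) * abs(avg_acc - avg_conf)
    return ece


# Synthetic stand-in for embeddings (assumption): one Gaussian cluster per category.
random.seed(0)
n_classes, dim = 3, 8
centers = [[random.gauss(0, 2) for _ in range(dim)] for _ in range(n_classes)]
labels = [random.randrange(n_classes) for _ in range(400)]
data = [[c + random.gauss(0, 1) for c in centers[k]] for k in labels]

W, b = train_softmax(data[:300], labels[:300], n_classes)
probs = [predict_proba(W, b, x) for x in data[300:]]
held_out = labels[300:]
accuracy = sum(p.index(max(p)) == t for p, t in zip(probs, held_out)) / len(held_out)
ece = expected_calibration_error(probs, held_out)
print(f"accuracy={accuracy:.3f}  ECE={ece:.3f}")
```

A low ECE means the top predicted probability can be read as a confidence signal, which is the property the abstract says was used during deployment.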
Data Scientist
A home is the biggest investment most people make, and yet, it doesn't come with a manual. That's why we're building the only app homeowners need to effortlessly manage their homes -- knowing what to do, when to do it, and who to hire. With Thumbtack, millions of people care for what matters most, and pros earn billions of dollars through our platform. And as one of the fastest-growing companies in a $500B industry -- we must be doing something right. We are driven by a common goal and the deep satisfaction that comes from knowing our work supports local economies, helps small businesses grow, and brings homeowners peace of mind.
Senior Software Engineer, Machine Learning Infrastructure
Measurement and applications of position bias in a marketplace search engine
Search engines intentionally influence user behavior by picking and ranking the list of results. Users engage with the highest results both because of their prominent placement and because they are typically the most relevant documents. Search engine ranking algorithms need to identify relevance while incorporating the influence of the search engine itself. This paper describes our efforts at Thumbtack to understand the impact of ranking, including the empirical results of a randomization program. In the context of a consumer marketplace we discuss practical details of model choice, experiment design, bias calculation, and machine learning model adaptation. We include a novel discussion of how ranking bias may not only affect labels, but also model features. The randomization program led to improved models, motivated internal scenario analysis, and enabled user-facing scenario tooling.
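One common way to turn a randomization program into a position-bias estimate is sketched below. It is an illustration under stated assumptions, not the method from the paper: it assumes results were shown in random order so that relevance is independent of position, and it treats the click-through rate at each position, relative to the top position, as the bias. The log format and function names are hypothetical.

```python
from collections import defaultdict


def estimate_position_bias(impressions):
    """Relative click propensity per position from randomized impressions.

    impressions: iterable of (position, clicked) pairs, where position is a
    1-based rank and clicked is a bool. Randomized ordering is assumed, so
    CTR differences across positions reflect placement, not relevance.
    """
    shown = defaultdict(int)
    clicks = defaultdict(int)
    for position, clicked in impressions:
        shown[position] += 1
        clicks[position] += int(clicked)
    ctr = {pos: clicks[pos] / shown[pos] for pos in shown}
    top_ctr = ctr[min(ctr)]
    if top_ctr == 0:
        raise ValueError("no clicks at the top position; cannot normalize")
    return {pos: ctr[pos] / top_ctr for pos in sorted(ctr)}


def inverse_propensity_weights(bias):
    """Weights that up-weight clicks observed at low-visibility positions."""
    return {pos: 1.0 / value for pos, value in bias.items() if value > 0}


# Hypothetical randomized log: 100 impressions per position.
log = (
    [(1, True)] * 40 + [(1, False)] * 60
    + [(2, True)] * 20 + [(2, False)] * 80
    + [(3, True)] * 10 + [(3, False)] * 90
)
bias = estimate_position_bias(log)
weights = inverse_propensity_weights(bias)
print(bias)
print(weights)
```

Weights like these can be applied to click labels when training a ranking model, so that a click at a low position counts for more than a click at the top.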
Senior Data Analyst
Today, millions of people use Thumbtack to effortlessly manage their homes. We help them confidently know what to do, when to do it and who to hire. Our goal is simple: to be the only platform homeowners need to fix, maintain and improve their homes. As a long-term partner for homeowners, our promise is to turn what was once confusing and intimidating into something straightforward -- and a lot less stressful. Each day, we connect local professionals across America with busy homeowners so they can grow their businesses.
Newton & Kepler: Effect & Cause?
I frequently make a point about science faith, and one way I have found to illustrate it is Kepler's First Law: According to Kepler's First Law of Planetary Motion, planetary orbits are ellipses with the Sun at one focus of the ellipse. This means that ellipses of the same size but different shapes, all sharing the Sun as a focus, do not have the same center. My question is: what is the source of gravity at focus 2? Most people have no clue, which illustrates one aspect of science faith: most people believe something that they don't understand and can't explain. They have faith that someone understands it, that it is understandable. They believe that it is fact, proven, and can be dismissed as unnecessary knowledge.
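The geometric claim about centers can be written out with the standard ellipse relations. The symbols are assumptions introduced here, not the author's notation: a is the semi-major axis (taken as the "size"), e is the eccentricity (the "shape"), b is the semi-minor axis, c is the distance from the center to either focus, and r is the distance from the Sun at angle theta measured from perihelion.

```latex
\[
r(\theta) = \frac{a\,(1 - e^{2})}{1 + e\cos\theta}, \qquad
c = a\,e, \qquad
b = a\sqrt{1 - e^{2}}
\]
```

For two orbits with the same a = 1 and the Sun at a shared focus, e = 0.2 puts the center 0.2 from the Sun, while e = 0.6 puts it 0.6 away. The second focus lies a further distance c beyond the center, so it also moves with the shape.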