Collaborating Authors

Multivariate modeling vs. univariate modeling along human intuition: predicting taste of wine


I wrote a blog post inspired by Jamie Goode's book "Wine Science: The Application of Science in Winemaking". In this book, Goode argued that reductionistic approach cannot explain relationship between chemical ingredients and taste of wine. Indeed, we know not all high (alcohol) wines are excellent, although in general high wines are believed to be good. Usually taste of wine is affected by a complicated balance of many components such as sweetness, acid, tannin, density or others that are given by corresponding chemical entities. However, I think (and probably many other data science experts agree) that it is not a limitation of reductionistic approach, but a limitation of univariate modeling.

Alternatives to algebraic modeling for complex data: topological modeling via Gunnar Carlsson


For many, mathematical modeling is exclusively about algebraic models, based on one form or another of regression or on differential equation modeling in the case of dynamical systems. However, this is too restrictive a point of view. For example, a clustering algorithm can be regarded as a modeling mechanism applicable to data where linear regression simply isn't applicable. Hierarchical clustering can also be regarded as a modeling mechanism, where the output is a dendrogram and contains information about the behavior of clusters at different levels of resolution. Kohonen self-organizing maps can similarly be regarded in this way.

Analyzing and Modeling Special Offer Campaigns in Location-Based Social Networks

AAAI Conferences

The proliferation of mobile handheld devices in combination with the technological advancements in mobile computing has led to a number of innovative services that make use of the location information available on such devices. Traditional yellow pages websites have now moved to mobile platforms, giving the opportunity to local businesses and potential, near-by, customers to connect. These platforms can offer an affordable advertisement channel to local businesses. One of the mechanisms offered by location-based social networks (LBSNs) allows businesses to provide special offers to their customers that connect through the platform. We collect a large time-series dataset from approximately 14 million venues on Foursquare and analyze the performance of such campaigns using randomization techniquesand (non-parametric) hypothesis testing with statistical bootstrapping. Our main finding indicates that this type of promotions are not as effective as anecdote success stories might suggest. Finally, we design classifiers by extracting three different types of features that are able to provide an educated decision on whether a special offer campaign for a local business will succeed or not both in short and long term.