Yandex recently announced its new search algorithm, Palekh, which improves how Yandex understands the meaning behind search queries by using deep neural networks as one ranking factor among others. Ultimately, the new algorithm helps Yandex improve its search results across the board, but especially for long-tail search queries. As most State of Digital readers know, long-tail search queries are those that the search engine processes very rarely. There is a correlation between the rarity of a query and its length: typically, the shorter the query, the more common it is, and the longer the query, the rarer it is.
Search engines today present results that are often oblivious to recent shifts in intent. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc. account for half of all search queries. This paper shows that the signals a search engine receives can be used both to determine that a shift in intent has happened and to find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable.
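To make the bandit half of this idea concrete, the following is a minimal sketch using the standard UCB1 rule, whose regret is known to grow logarithmically in the number of rounds for a fixed reward distribution. It is not the paper's meta-algorithm: the class name `UCB1`, the helper `run`, and the `shift_detected` callable (a stand-in for the classifier that flags an intent shift) are all hypothetical names introduced for illustration, and the choice to simply reset statistics on a detected shift is an assumption.

```python
import math


class UCB1:
    """Standard UCB1 bandit over a fixed set of candidate results (arms)."""

    def __init__(self, n_arms):
        self.n_arms = n_arms
        self.reset()

    def reset(self):
        # Forget all click statistics, e.g. after a suspected intent shift.
        self.counts = [0] * self.n_arms
        self.totals = [0.0] * self.n_arms

    def select(self):
        # Show each result once before trusting the confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        t = sum(self.counts)

        def index(arm):
            mean = self.totals[arm] / self.counts[arm]
            bonus = math.sqrt(2.0 * math.log(t) / self.counts[arm])
            return mean + bonus

        return max(range(self.n_arms), key=index)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.totals[arm] += reward


def run(bandit, reward_fn, rounds, shift_detected=lambda t: False):
    """Serve `rounds` impressions; `shift_detected` is a hypothetical classifier hook."""
    for t in range(rounds):
        if shift_detected(t):
            bandit.reset()  # assumption: restart learning when intent shifts
        arm = bandit.select()
        bandit.update(arm, reward_fn(arm, t))
    return bandit.counts
```

Here `reward_fn(arm, t)` would return something like 1.0 for a click on the shown result and 0.0 otherwise. Resetting on a detected shift keeps stale click statistics from anchoring the ranking to the old intent, at the cost of re-exploring every result.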
Many web-search queries serve as the beginning of an exploration of an unknown space of information, rather than looking for a specific web page. To answer such queries effectively, the search engine should attempt to organize the space of relevant information in a way that facilitates exploration. We describe the Aspector system that computes aspects for a given query. Each aspect is a set of search queries that together represent a distinct information need relevant to the original search query. To serve as an effective means to explore the space, Aspector computes aspects that are orthogonal to each other and have high combined coverage. Aspector combines two sources of information to compute aspects. We discover candidate aspects by analyzing query logs, and cluster them to eliminate redundancies. We then use a mass-collaboration knowledge base (e.g., Wikipedia) to compute candidate aspects for queries that occur less frequently and to group together aspects that are likely to be semantically related. We present a user study that indicates that the aspects we compute are rated favorably against three competing alternatives: related searches proposed by Google, cluster labels assigned by the Clusty search engine, and navigational searches proposed by Bing.
Gmail dominates the email landscape. It provides gigabytes of storage, works at super-fast speeds in any browser, and automatically sorts messages into specialized sections depending on their content and importance. But perhaps its greatest strength--as you might expect from a Google product--is its search abilities. You're probably comfortable typing a few words or contact names into the search box at the top of the page, but some less obvious terms can open up whole new inbox-sifting possibilities. We've collected nine invaluable search tools to help you organize your inbox, find lost emails, dig up oversize attachments, unsubscribe from spam, and more.
At their core, data scientists have a math and statistics background. Out of this math background, they're creating advanced analytics. Just like their software engineering counterparts, data scientists will have to interact with the business side. This includes understanding the domain well enough to draw insights. Data scientists are often tasked with analyzing data to help the business, and this requires a level of business acumen. Finally, their results need to be given to the business in an understandable fashion. This requires the ability to communicate complex results and observations, verbally and visually, in a way that the business can understand and act on. Thus, it'll be extremely valuable for any aspiring data scientist to learn data mining: the process of structuring raw data and formulating or recognizing patterns in it through mathematical and computational algorithms. This helps generate new information and unlock various insights. Here is a simple list of reasons why you should study data mining.