Information Retrieval

Ahana Cloud for Presto review: Fast SQL queries against data lakes


Hope springs eternal in the database business. While we're still hearing about data warehouses (fast analysis databases, typically featuring in-memory columnar storage) and tools that improve the ETL step (extract, transform, and load), we're also hearing about improvements in data lakes (which store data in its native format) and data federation (on-demand data integration of heterogeneous data stores). Presto keeps coming up as a fast way to perform SQL queries on big data that resides in data lake files. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes. Presto allows querying data where it lives, including Hive, Cassandra, relational databases, and proprietary data stores.

Weaviate is an open-source search engine powered by ML, vectors, graphs, and GraphQL


Bob van Luijt's career in technology started at age 15, building websites to help people sell toothbrushes online. Not many 15-year-olds do that. Apparently, this gave van Luijt enough of a head start to arrive at the confluence of technology trends today. Van Luijt went on to study arts but ended up working full time in technology anyway. In 2015, when Google introduced its RankBrain algorithm, the quality of search results jumped up.

Effective Search Engine Optimization Requires a Commitment, Not a Campaign


A successful product and brand strategy develops brand awareness and identity that distinguishes a product from countless others based on just the brand name. A well-crafted strategy repeatedly reminds prospective and existing customers why they should buy a particular product over others with similar characteristics. A brand is just a perception, and perception will match reality over time. A brand is any trademark through which a product is correctly identified and described by consumers. Therefore, the brand includes any action and remedy by which the product is identified.

Attract a larger audience with this stacked SEO training bundle


TL;DR: The SEO Blueprint for Ranking on Google Bundle is on sale for £29.15 as of March 26, saving you 92% on list price. Digital marketing trends come and go, but there's one thing that never changes: You won't be as relevant if you're not on the first page of Google search, and there's data to back it up. Reports show that 75% of people never bother to scroll past the first page of search results, so regardless of how stellar your content is, barely anyone will see it unless you know what you're doing. We're talking about search engine optimisation (SEO), which is a set of processes that help your site and content skyrocket to the first page and gain relevance. It takes time, energy, and patience to get your SEO in a good place, but the SEO Blueprint Course Bundle can certainly help.

Database Systems Research in the Arab World

Communications of the ACM

From Hammurabi's stone tablets to papyrus rolls and leather-bound books, the Arab region has a rich history of recordkeeping and transactional systems that closely matches the evolution of data storage mediums. Even modern-day data management concepts like data provenance and lineage have historic roots in the Arab world; generations of scribes meticulously tracked Islamic prophetic narrations from one narrator to the next, forming lineage chains that originated from central Arabia. Database systems research has been part of the academic culture in the Arab world since the 1970s. High-quality computer science and database education was always available at several universities within the Arab region, such as Alexandria University in Egypt. Many students who went through these programs were drawn to database systems research and became globally prominent, such as Ramez Elmasri (professor at University of Texas, Arlington), Amr El Abbadi (professor at University of California, Santa Barbara), and Walid Aref (professor at Purdue University).

Top 60 Artificial Intelligence Interview Questions & Answers


A month ago, India's first driverless metro train was launched in the national capital, Delhi. Yes! Like it or not, automation is happening and will continue to happen in places you could not have imagined before. Artificial Intelligence has swept across the world around us, leading to a natural rise in demand for skilled professionals in the job market. It is one field that will never become outdated and will continue to grow. Wondering how to leverage this opportunity? How can you prepare yourself for such a league of jobs that make the world go around? We have got a repository of questions to help you get ready for your next interview! This article will cover the artificial intelligence interview questions and help you with the much-needed tips and tricks to crack the interview. The article is divided into three parts: basic artificial intelligence questions, intermediate level, and advanced AI questions. AnalytixLabs is India's top-ranked AI & Data Science Institute and is in its tenth year.

AI, blockchain, and new ways for everyone to monetize their data - Dataconomy


Breakthroughs in AI and innovations in applying blockchain for personal data control and monetization enable new ways to make money off of personal information that most people currently give away for free. Here we highlight three data science and business model innovations, starting with breakthrough ML technology that learns on the fly. There's an emergent machine learning technology out there that offers a clever new way of finding and classifying unstructured content. In geek-speak, the technology is a vertical, personalized search engine that doesn't require expensive knowledge graphs. In human speak, it's a context-sensitive, human-in-the-loop search engine that uses search criteria and implicit user feedback to recommend high-quality results.

Individually Fair Ranking

Machine Learning

We develop an algorithm to train individually fair learning-to-rank (LTR) models. The proposed approach ensures items from minority groups appear alongside similar items from majority groups. This notion of fair ranking is based on the definition of individual fairness from supervised learning and is more nuanced than prior fair LTR approaches that simply ensure the ranking model provides underrepresented items with a basic level of exposure. The crux of our method is an optimal transport-based regularizer that enforces individual fairness and an efficient algorithm for optimizing the regularizer. We show that our approach leads to certifiably individually fair LTR models and demonstrate the efficacy of our method on ranking tasks subject to demographic biases. Information retrieval (IR) systems are everywhere in today's digital world, and ranking models are integral parts of many IR systems. In light of their ubiquity, issues of algorithmic bias and unfairness in ranking models have come to the fore of the public's attention. In many applications, the items to be ranked are individuals, so algorithmic biases in the output of ranking models directly affect people's lives. For example, gender bias in job search engines directly affects the career success of job applicants (Dastin, 2018).

Automated Fact-Checking for Assisting Human Fact-Checkers

Artificial Intelligence

The reporting and analysis of current events around the globe has expanded from professional, editor-led journalism all the way to citizen journalism. Politicians and other key players enjoy direct access to their audiences through social media, bypassing the filters of official cables or traditional media. However, the multiple advantages of free speech and direct communication are dimmed by the misuse of the media to spread inaccurate or misleading claims. These phenomena have led to the modern incarnation of the fact-checker -- a professional whose main aim is to examine claims using available evidence to assess their veracity. As in other text forensics tasks, the amount of information available makes the work of the fact-checker more difficult. With this in mind, starting from the perspective of the professional fact-checker, we survey the available intelligent technologies that can support the human expert in the different steps of her fact-checking endeavor. These include identifying claims worth fact-checking; detecting relevant previously fact-checked claims; retrieving relevant evidence to fact-check a claim; and actually verifying a claim. In each case, we pay attention to the challenges in future work and the potential impact on real-world fact-checking.

Push-down query capabilities: Five questions to ask your cloud BI provider


Software-as-a-service (SaaS) offers many benefits, including but not limited to elasticity: the ability to shrink and grow storage and compute resources on demand. Clients of most leading enterprise business intelligence (BI) platforms enjoy this cloud elasticity benefit, but at a cost. Ultimately, elasticity requires both application and data components (compute and store) to be elastic, and therefore, cloud-native BI platforms require that on-premises data be ingested into the cloud platform before it can be analyzed. But not all organizations are ready to let go of their data from inside their firewalls, and they are not ready to commit to a single cloud provider -- most are opting for a hybrid on-premises and multicloud environment. Here's a look at how the cloud leaders stack up, the hybrid market, and the SaaS players that run your company as well as their latest strategic moves.