Foreword

Over the last decade, search has become ubiquitous—the keyword search box has evolved to become the de facto UI for exploring data and for navigating most websites and applications. At the same time, delivering a truly relevant search experience has been elusive, if not a critical blind spot for most organizations.

Powerful open source technologies have arisen to deliver fast, feature-rich search (Apache Lucene) in a distributed, highly scalable way with little-to-no coding required (Apache Solr and later Elasticsearch). This has provided the necessary infrastructure for almost any developer to build a “generally relevant” real-time search engine for the big data era. As more of the hard search infrastructure problems have been solved and their solutions commoditized, the competitive differentiators have moved away from providing fast, scalable search and more toward delivering the most relevant matches for a user’s information need. In other words, delivering “generally relevant” results is no longer sufficient—Google and other top search engines have now trained users to expect search applications to almost read their minds. This book is about how to move more aggressively in that direction of understanding user intent.

Doug Turnbull and John Berryman are two highly experienced search and relevancy experts whom I’ve known for years, typically running into each other at search conferences where we’ve all presented. I fondly recall times spent with them discussing ideas to solve some of the world’s hardest problems in search relevancy, recommendations, and personalization. No one is more excited than I to see their unique expertise codified in this book—one of the best and most engaging technical books I’ve ever read.

Relevancy tuning is a hard problem—it’s usually misunderstood, and it’s often not immediately obvious when something is wrong. It usually requires seeing many bad examples to identify problematic patterns, and it’s often challenging to know what better results would look like without actually seeing them show up. Unfortunately, it’s often not until well after a search system is deployed into production that organizations begin to realize the gap between out-of-the-box relevancy defaults and true domain-driven, personalized matching.

Not only that, but the skillsets needed to think about relevancy (domain expertise, feature engineering, machine learning, ontologies, user testing, natural language processing) are very different from those needed to build and maintain scalable infrastructure (distributed systems, data structures, performance and concurrency, hardware utilization, network calls and communication). The role of a relevance engineer is almost entirely lacking in many organizations, leaving so much potential untapped for building a search experience that truly delights users and significantly moves a company forward.

The spectrum of personalization between manually entered keyword searches and completely automated recommendations is also rich with opportunities to deliver relevant matches crafted for each specific user’s needs. The authors do a great job of explaining some of the more nuanced ways that search features/signals can be modeled to take full advantage of this spectrum. With the techniques in this book, you will be well-equipped to take on the role of a relevance engineer and solve many of the most challenging problems inherent in creating a truly personalized, relevant search experience.

TREY GRAINGER

AUTHOR, SOLR IN ACTION

SENIOR VICE PRESIDENT OF ENGINEERING AT LUCIDWORKS
