Software at Scale 25 - Rajesh Venkataraman: Senior Staff Software Engineer at Google

52:16
 
By Utsav Shah.

Rajesh Venkataraman is a Senior Staff Engineer at Google, where he works on Privacy and Personalization at Google Pay. He has spent a large part of his career building and maintaining search systems. He worked on natural language processing at Microsoft, the cloud inference team at Google, and released parts of the search infrastructure at Dropbox.

In this episode, we discuss the nuances and technology behind search systems. We go over search infrastructure (data storage and retrieval) as well as search quality (tokenization, ranking, and more). I was especially curious about how image search and other advanced search systems work internally under constraints of low latency, high search quality, and cost-efficiency.

Highlights

08:00 - Getting started building a search system - where to begin? Some history.

13:30 - Why we should use different hardware for different parts of a high-throughput search system

17:00 - What goes on behind the scenes in a search system when it has to incorporate a picture or a PDF? The rise of transformers, not the Optimus Prime kind.

We go on to discuss how transformers work at a very high level.

27:00 - The key idea for non-text search is being able to store, index, and search for vectors efficiently. Queries often reduce to nearest-neighbor lookups. Indexing can use techniques as simple as storing only the first few bits of each vector dimension in hashmaps.

34:00 - How search systems efficiently rebuild their inverted indices based on changing data; internationalization for search systems; search user interface design and research.

42:00 - How should a student interested in building a search system learn the best practices and techniques to do so?
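
To make the 27:00 highlight concrete, here is a minimal sketch of the "first few bits per dimension" idea. It is an illustration under stated assumptions, not the system described in the episode: the names `bucket_key` and `CoarseVectorIndex` are hypothetical, vector values are assumed to lie in [0, 1), and only the single matching bucket is searched, so a true nearest neighbor that falls in an adjacent bucket would be missed.

```python
import math
from collections import defaultdict


def bucket_key(vector, bits=2):
    """Quantize each dimension to its top `bits` bits.

    Assumes every value lies in [0, 1); out-of-range values are clamped.
    """
    levels = 1 << bits
    return tuple(max(0, min(int(x * levels), levels - 1)) for x in vector)


class CoarseVectorIndex:
    """Hypothetical hashmap index keyed by coarsely quantized vectors."""

    def __init__(self, bits=2):
        self.bits = bits
        self.buckets = defaultdict(list)  # bucket key -> list of document ids
        self.vectors = {}                 # document id -> full vector

    def add(self, doc_id, vector):
        self.vectors[doc_id] = vector
        self.buckets[bucket_key(vector, self.bits)].append(doc_id)

    def search(self, query, k=1):
        # Only the query's own bucket is inspected (approximate search).
        candidates = self.buckets.get(bucket_key(query, self.bits), [])
        # Rank the few candidates by exact Euclidean distance.
        ranked = sorted(candidates, key=lambda d: math.dist(query, self.vectors[d]))
        return ranked[:k]


index = CoarseVectorIndex(bits=2)
index.add("cat", [0.10, 0.20])
index.add("kitten", [0.12, 0.22])
index.add("car", [0.90, 0.80])

nearest = index.search([0.10, 0.19], k=2)
```

The hashmap lookup narrows the search to a handful of candidates, and the exact distance computation runs only on those. Using more bits gives smaller buckets and faster lookups at the cost of missing more near neighbors that land just across a bucket boundary.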


This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.softwareatscale.dev
