Search Engineering Reading Material

2006 just called and wants its #RecSys models based solely on user and item IDs back! 🙃

Semantic IDs might be as revolutionary as I keep hearing (or they might not), but…I’m already tired of comparisons to a straw man of ID-based recommendation modeling back in “the bad old days.”

If you’ve spent the past decade or two fighting cold-start, sparsity, and generalization issues while wrangling enormous embedding tables learned from lowest-common-denominator features like lists of interacted items, you missed a lot of practical industry knowledge about how to make recommenders that live up to people’s natural expectations by combining general-purpose deep model architectures (e.g., DLRM, DCN) with specific domain knowledge (e.g., ontologies and taxonomies).

For example: how to reflect features across the user/item boundary, how to enrich user and item representations with distributional features, and how to transfer learning between implicit and explicit preferences. We’ve been out here doing this stuff for a while now, and I don’t expect it to go away any time soon.

via Carl H. on LinkedIn
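
To make one of those moves concrete, here’s a minimal sketch (not from the post; the items, categories, and function names are all hypothetical) of reflecting an item-side taxonomy across the user/item boundary: summarize a user’s interaction history as a category distribution, then feed that dense feature alongside the usual ID embeddings into a DLRM/DCN-style model.

    # Hypothetical taxonomy feature: turn a raw interaction list into a
    # fixed-size category distribution (helps with cold-start and sparsity).
    from collections import Counter

    ITEM_CATEGORY = {          # item ID -> taxonomy category (assumed data)
        "sku_1": "running_shoes",
        "sku_2": "running_shoes",
        "sku_3": "trail_shoes",
        "sku_4": "socks",
    }
    CATEGORIES = sorted(set(ITEM_CATEGORY.values()))

    def user_category_distribution(interacted_items):
        """Dense user-side feature derived from item-side domain knowledge;
        concatenate it with ID embeddings in a DLRM/DCN-style tower."""
        counts = Counter(ITEM_CATEGORY[i] for i in interacted_items if i in ITEM_CATEGORY)
        total = sum(counts.values()) or 1
        return [counts.get(c, 0) / total for c in CATEGORIES]

    if __name__ == "__main__":
        history = ["sku_1", "sku_2", "sku_2", "sku_4"]
        print(dict(zip(CATEGORIES, user_category_distribution(history))))
        # {'running_shoes': 0.75, 'socks': 0.25, 'trail_shoes': 0.0}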

Greg Brockman’s advice for AI Engineers

Build Mindset

  • "Forget about that 100yr time horizon, I just want to build." — Greg
  • Why coding vs. math: in math you write a proof and maybe three people care; in coding you write a program and everyone benefits.

Speed via First Principles

  • Identify which steps are truly required vs. legacy process.
  • Collapse feedback loops (e.g., live-iterate on a call vs. waiting 9 months).
  • Focus on the few hard constraints; bulldoze through irrelevant ones.
  • Speed compounds because it unlocks more cycles of learning.

Independent Study that Works (lessons, backed by his examples)

  • Go deep; push through the “boredom walls.”
  • Learn by building end-to-end and shipping (Kaggle competitions, a GPU rig).
  • Create your own learning environment & tools.
  • If you keep getting introduced to the same smart people, it’s real. Use that as a signal to double down.
  • Turing (1950): you’ll never write all the rules—build a child machine that learns via rewards & punishments.
  • Mix “talk to people” with “do the work.” Conversations (e.g. Geoff Hinton) inform direction; mastery comes from doing.

Do you still believe great engineers can contribute as much as researchers?

  • “Absolutely—if not even more true today.”
  • Early example: fast conv kernels (Alex Krizhevsky) + applying them to ImageNet (Ilya Sutskever) → AlexNet.
  • Today: engineering for ~100k-GPU clusters and complex RL orchestration.
  • If you don’t have the idea, there’s nothing to do; without engineering, the idea doesn’t live. You need both.
  • Tells every new OpenAI engineer to have technical humility — leave traditional intuitions at the door. Assume you’re missing context; deeply understand why; then change the abstraction.
  • Working pattern that scaled: bring 5 ideas; teammate says 4 are bad → “great—that’s all I wanted.”

Coding with Models

  • “How you structure your codebase determines how much you get out of Codex.”
  • Structure repos for models: small, well-documented modules, crisp entry points, and fast tests so models can fill in details and run tests repeatedly (see the sketch after this list).
  • Design for models, not just humans: humans can hold big abstractions and skip tests; models can’t — so make it easy for them to succeed.
  • Pattern is likely durable as models improve and also great for human maintainability.
  • Vibe coding = empowerment and an early glimpse; next step is agents as coworkers — off-laptop, cloud-resident, tool-using, still working while you sleep.
  • Biggest ROI: transforming existing applications (migrations, library upgrades, legacy refactors), not just flashy greenfield demos.
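
As an illustration only (the talk doesn’t prescribe a layout; every file path and name below is hypothetical), a repo following the “small modules, crisp entry points, fast tests” advice might give a coding model something like this to work against:

    # Hypothetical single-file condensation of a model-friendly repo.
    import re

    # search_utils/tokenize.py -- a small, documented module with one
    # crisp entry point a model can hold in context and extend.
    def tokenize(text: str) -> list[str]:
        """Lowercase a query and split it into alphanumeric tokens."""
        return re.findall(r"[a-z0-9]+", text.lower())

    # tests/test_tokenize.py -- runs in milliseconds, so an agent can
    # fill in details (stemming, stopwords) and re-check its work repeatedly.
    def test_tokenize_basic():
        assert tokenize("Fast Tests, Crisp Entry Points!") == [
            "fast", "tests", "crisp", "entry", "points"]

    if __name__ == "__main__":
        test_tokenize_basic()
        print("ok")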

Machine Learning Engineer - Search

Job Application for Sr. Applied Machine Learning Engineer - Search at Legion Intelligence via Joe Burgum

Sr. Applied Machine Learning Engineer - Search

San Francisco (Remote)

Company Overview:

Let’s be real, AI isn’t magic; Legion was built to move beyond AI hype—delivering secure, reliable systems that work alongside the people tackling the world’s most critical challenges. Born from a Department of Defense partnership and trusted by leaders across government and enterprise, Legion embeds intelligence inside complex systems, unlocking data, accelerating human workflows, and strengthening mission-critical systems. We don’t replace workflows—we optimize them, ensuring quality, efficiency, and reliability inside the platforms our partners already use. With world-class collaborators like Nvidia, HPE, and Oracle, we’re building intelligent infrastructure that enhances human capability and drives impact at the edge and across a range of enterprises. We’re looking for bold thinkers and doers to join us in shaping the future of AI that’s secure, grounded, and built to work.

Job Summary:

As a Senior Applied ML Engineer - Search, you will be designing, implementing, and optimizing search capabilities for our enterprise-level AI platform. Your responsibilities will include measuring search performance, deploying efficient enterprise search solutions for various document types, and deeply leveraging LLMs and AI agents to enhance document enrichment, ranking, and retrieval techniques. Additionally, you will collaborate closely with our infrastructure and platform team to integrate search and ML techniques.

Responsibilities:

  • Design and build scalable enterprise search solutions, focusing on efficient indexing and retrieval.
  • Develop and maintain search ranking algorithms using machine learning, vector search, and LLMs to enhance search results and reduce latency.
  • Contribute to our large-scale machine learning codebase, leading the search-related components and ensuring code quality, scalability, and maintainability.
  • Apply your production experience with search platforms (e.g., Elasticsearch, Solr) and machine learning systems to ensure the scalability and reliability of our search infrastructure.
  • Stay updated with the latest advancements in search and ML technologies, vector search, and LLMs, and apply them to improve our applied machine learning search solutions.
  • Work with a collaborative team of other ML engineers building AI solutions.
  • Contribute to a culture of continuous learning.

Required Skills and Qualifications:

  • Bachelor's or Master's degree in Computer Science, Machine Learning, or a related field, or equivalent experience.
  • 3+ years of experience in ML and search.
  • Familiarity with document enrichment techniques, with experience in leveraging LLMs and other machine learning models for improved enrichment.
  • Knowledge of search ranking algorithms, index optimizations, search query optimization, and experience applying machine learning to improve search relevance.
  • Excitement and enthusiasm for exploring and leveraging advanced search techniques.
  • Growth mindset and low ego – you’re eager to pick up new tools and technologies, learn from others, and be open to changing course when it’s right.

Compensation Information: $205,000 - $260,000 USD