LX Mini entry · 2025-11-06-tweet-1986459364777926962 | LX Mini

Nov 6, 2025, 3:43 PM

What if embedding retrieval and semantic search are only worth it for code?

Code is a nice domain for semantic search because there’s less of a 1:1 mapping between words and semantics.

Code is of course more structured, but there’s a subtle layer of indirection between the string of tokens and the meaning. The meaning is also more distributed across lines. A line of code isn’t always like a sentence. Natural language and code target different recipients.
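
A toy sketch of that indirection, under loud assumptions: the two snippets and the two sentences below are hand-picked for illustration, and `tokens` and `jaccard` are hypothetical helpers, not part of any search library. It only shows that two pieces of code with the same meaning can share almost no words, while two paraphrased sentences often share most of theirs.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercased word-like tokens (identifiers, keywords, plain words)."""
    return set(re.findall(r"[A-Za-z_]\w*", text.lower()))

def jaccard(a: str, b: str) -> float:
    """Token-set overlap: |A intersect B| / |A union B|."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)

# Two illustrative snippets that both sum the even numbers in a list.
loop_version = """
total = 0
for n in numbers:
    if n % 2 == 0:
        total += n
"""
functional_version = "result = sum(x for x in values if not x & 1)"

# Two illustrative sentences describing the same task.
sentence_a = "add up the even numbers in the list"
sentence_b = "sum the even numbers in the list"

code_overlap = jaccard(loop_version, functional_version)
prose_overlap = jaccard(sentence_a, sentence_b)

# The only tokens the two snippets share are language keywords.
shared_code_tokens = tokens(loop_version) & tokens(functional_version)

print(f"code overlap:  {code_overlap:.2f}")
print(f"prose overlap: {prose_overlap:.2f}")
print(f"shared code tokens: {sorted(shared_code_tokens)}")
```

In this one hand-built case, keyword matching would connect the two sentences but has little to hold on to for the two snippets, which is the gap an embedding is meant to cover. It is an illustration of the intuition, not evidence for it.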

I suspect this is part of why we’re seeing more success for semantic search with code than with natural language documents.