# Embeddings & Semantic Search
LIP v1.3 adds semantic search: compute dense embedding vectors for your source files and find the most semantically similar files to any query — either another file or a free-text description.
## How it works
- The daemon sends file source text to an OpenAI-compatible `/v1/embeddings` HTTP endpoint.
- The returned vector is stored alongside the file in the daemon’s in-memory graph.
- When you run a nearest-neighbour query, the daemon computes cosine similarity against all stored vectors and returns the top-K results.
- If a file is upserted (edited), its cached embedding is automatically invalidated.
The daemon never embeds files automatically — you trigger embedding explicitly, either via CLI or MCP tool.
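The nearest-neighbour step can be sketched in a few lines (illustrative Python, not LIP’s actual implementation; `embeddings` is a hypothetical URI-to-vector map):

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest(query_vec, embeddings, top_k=5):
    """Score every stored vector against the query and keep the top K."""
    scored = [(cosine(query_vec, vec), uri) for uri, vec in embeddings.items()]
    scored.sort(reverse=True)
    return scored[:top_k]
```

With a few hundred files this brute-force scan is cheap; no approximate-nearest-neighbour index is needed at that scale.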
## Setup
### 1. Start an embedding endpoint

Any OpenAI-compatible `/v1/embeddings` endpoint works:
**Ollama** (local, free):

```shell
ollama pull nomic-embed-text
ollama serve   # listens on localhost:11434
```

**OpenAI:** use `https://api.openai.com/v1/embeddings` with an API key.

**LM Studio / vLLM / other:** set the URL to wherever your server listens.
### 2. Configure the daemon
Set environment variables before starting `lip daemon`:

```shell
export LIP_EMBEDDING_URL=http://localhost:11434/v1/embeddings
export LIP_EMBEDDING_MODEL=nomic-embed-text   # optional; defaults to text-embedding-3-small
lip daemon --socket /tmp/lip.sock
```
Without `LIP_EMBEDDING_URL`, embedding commands return an error and the rest of LIP works normally.
## Embed your files
Compute embeddings for one or more files:

```shell
lip query embedding-batch file:///src/auth.rs file:///src/session.rs
```
To embed an entire directory, gather file URIs (for example from `lip query symbols`) and pass them to `embedding-batch`:

```shell
# Check which indexed files still lack embeddings
lip query index-status
# Then embed the ones with pending_embeddings > 0
```
Already-cached embeddings are returned without a network call. Only new or modified files trigger HTTP requests.
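That cache-or-fetch behaviour might look roughly like this (a sketch; keying the cache on a content hash is an assumption about how invalidation works, and `fetch_embedding` stands in for the HTTP call):

```python
import hashlib

cache = {}  # uri -> (content_hash, vector)

def embed(uri, source_text, fetch_embedding):
    """Return a cached vector if the file is unchanged, else fetch and cache."""
    digest = hashlib.sha256(source_text.encode()).hexdigest()
    hit = cache.get(uri)
    if hit and hit[0] == digest:
        return hit[1]                          # unchanged file: no network call
    vector = fetch_embedding(source_text)      # the HTTP round-trip
    cache[uri] = (digest, vector)
    return vector
```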
## Search

### By file

Find the files most semantically similar to a given file:

```shell
lip query nearest file:///src/auth.rs
lip query nearest file:///src/auth.rs --top-k 10
```
Output:

```
score=0.9412 file:///src/session.rs
score=0.8871 file:///src/middleware/auth_guard.rs
score=0.8204 file:///src/handlers/login.rs
```
The query file must already have an embedding — run `embedding-batch` first.
### By text

Find files most related to a concept or description:

```shell
lip query nearest-by-text "authentication token validation"
lip query nearest-by-text "payment processing fraud detection" --top-k 10
```
The text is embedded on the fly using the configured model (or a per-request override via `--model`).
## Check status

See how many files are indexed vs. embedded:

```shell
lip query index-status
# indexed=142 pending_embeddings=38 last_updated=1744400123000ms embedding_model=nomic-embed-text
```
Check a specific file:

```shell
lip query file-status file:///src/auth.rs
# file:///src/auth.rs indexed=true has_embedding=true age=42s
```

`age` is the number of seconds since the file was last indexed.
## Advanced: similarity and clustering

### Pairwise similarity
Check how similar two specific files or symbols are:

```shell
# Two files
lip query similarity file:///src/auth.rs file:///src/session.rs
# score=0.9214

# Two symbols (lip:// URIs)
lip query similarity \
  lip://local/src/auth.rs#verifyToken \
  lip://local/src/session.rs#validateSession
# score=0.8847
```
Returns `null` when either URI has no cached embedding.
### Query expansion

Turn a short query into related symbol names before running `lip query symbols`:
```shell
lip query query-expansion "token validation" --top-k 5
# verifyToken
# validateSession
# checkJwt
# parseBearer
# refreshToken
```
This helps when you’re not sure of the exact symbol name — expand first, then search.
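Conceptually, expansion ranks known symbol names by how close their embeddings sit to the query’s embedding. A toy sketch (the symbol names and vectors here are made up, and LIP’s actual method may differ):

```python
def expand_query(query_vec, symbol_vecs, top_k=5):
    """Return the symbol names whose embeddings are closest to the query."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = lambda v: sum(x * x for x in v) ** 0.5
        return dot / (norm(a) * norm(b))
    ranked = sorted(symbol_vecs,
                    key=lambda s: cos(query_vec, symbol_vecs[s]),
                    reverse=True)
    return ranked[:top_k]
```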
### Clustering

Group a set of files by topic:

```shell
lip query cluster --radius 0.85 \
  file:///src/auth.rs \
  file:///src/session.rs \
  file:///src/payments.rs \
  file:///src/invoices.rs
# Group 1: file:///src/auth.rs file:///src/session.rs
# Group 2: file:///src/payments.rs file:///src/invoices.rs
```
`--radius` is the cosine-similarity threshold (default: 0.8). This is useful before planning a refactor: identify which files are topically related so you can batch changes.
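One way such grouping could work is single-linkage clustering at the given threshold: two files land in the same group if a chain of sufficiently similar pairs connects them. A sketch under that assumption (not necessarily LIP’s algorithm):

```python
def cluster(vectors, radius=0.8):
    """Group URIs whose embeddings are connected by pairs with
    cosine similarity >= radius (union-find over all pairs)."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = lambda v: sum(x * x for x in v) ** 0.5
        return dot / (norm(a) * norm(b))
    uris = list(vectors)
    parent = {u: u for u in uris}
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u
    for i, a in enumerate(uris):
        for b in uris[i + 1:]:
            if cos(vectors[a], vectors[b]) >= radius:
                parent[find(a)] = find(b)  # merge the two groups
    groups = {}
    for u in uris:
        groups.setdefault(find(u), []).append(u)
    return list(groups.values())
```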
## Exporting raw vectors

For external pipelines (re-ranking, custom visualization, integration with a vector DB):

```shell
lip query export-embeddings \
  file:///src/auth.rs \
  file:///src/session.rs \
  --output vectors.json
```
URIs with no cached embedding are omitted from the output map.
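Downstream code can then read the file back as a URI-to-vector map (the exact JSON shape, a flat object of URI keys to number arrays, is an assumption based on the description above):

```python
import json

def load_vectors(path):
    """Read the exported uri -> vector map; URIs that had no cached
    embedding simply won't appear as keys."""
    with open(path) as f:
        return json.load(f)
```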
## MCP tools
When using LIP through an AI agent, the same capabilities are available as MCP tools:
| Tool | Description |
|---|---|
| `lip_embedding_batch` | Compute and cache embeddings for a list of URIs |
| `lip_nearest` | Top-K files most similar to a given file |
| `lip_nearest_by_text` | Top-K files most similar to a free-text query |
| `lip_index_status` | Daemon health and embedding coverage |
| `lip_file_status` | Per-file indexing and embedding status |
| `lip_similarity` | Pairwise cosine similarity of two stored embeddings |
| `lip_query_expansion` | Expand a query into related symbol names |
| `lip_cluster` | Group URIs by embedding proximity within a radius |
| `lip_export_embeddings` | Return raw stored vectors for external pipelines |
| `lip_reindex_files` | Force re-index of specific file URIs from disk |
Typical agent workflow:
```
# 1. Bootstrap embeddings for the whole workspace (once per session)
lip_embedding_batch(uris=[all_file_uris])

# 2. Check coverage
lip_index_status() → pending_embeddings=0

# 3. Find files related to a concept
lip_nearest_by_text("payment processing and fraud detection", top_k=10)

# 4. Drill in on a specific file
lip_nearest(uri="file:///src/payments.rs", top_k=5)

# 5. Expand a query to discover related symbol names
lip_query_expansion(query="payment processing", top_k=8)

# 6. Group a candidate set before planning a refactor
lip_cluster(uris=[...candidate_uris...], radius=0.85)
```
## Model recommendations

| Model | Host | Dims | Notes |
|---|---|---|---|
| `nomic-embed-text` | Ollama | 768 | Best local option; fast, good code coverage |
| `text-embedding-3-small` | OpenAI | 1536 | Solid all-round; requires API key |
| `text-embedding-3-large` | OpenAI | 3072 | Highest quality; 5× more expensive than small |
| `mxbai-embed-large` | Ollama | 1024 | Strong code + natural-language retrieval |
You can mix models across requests with `--model`, but vectors from different models are not comparable. Use one model consistently per workspace.
## Troubleshooting

- **`LIP_EMBEDDING_URL` not set** — embedding commands return this error if the env var is missing. Set it before starting the daemon (or in the daemon’s service unit file).
- **connection refused** — the embedding endpoint is not running. Start Ollama (`ollama serve`) or your other server.
- **Low scores / wrong results** — make sure you’re using the same model for all embeddings. If you switch models, re-embed all files by restarting the daemon (which clears the in-memory cache).
- **`pending_embeddings` never reaches 0** — embeddings are not computed automatically; you must call `embedding-batch` explicitly. Consider calling it in a startup script or CI job.