Memory

Overview

Memory configuration controls where uploaded files are stored, how much session context is kept, when long conversation history is compacted, how many relevant memories are retrieved, and how background memory work is executed.

Most installations can use the defaults. Tune these settings when you need to change storage paths, adjust how much context the agent receives, or scale background embedding work.

Storage

Variable	Default	Description
`LIGHTFLARE_MEMORY_UPLOAD_DIR`	`/tmp/lightflare/memory-uploads`	Directory used to store uploaded memory files. Use a persistent volume in production.

Conversation Context

Variable	Default	Description
`LIGHTFLARE_MEMORY_SESSION_MEMORY_LIMIT`	`50`	Maximum number of active session memory entries loaded from the current chat.
`LIGHTFLARE_MEMORY_SIMILARITY_RESULT_LIMIT`	`10`	Maximum number of relevant saved memories or document sections retrieved for an agent run.

Higher values can give the agent more context, but also increase prompt size and token usage. Lower values keep runs smaller and faster, but may omit useful background information.
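As a minimal sketch of how an override interacts with a default, the helper below reads one of these variables from the environment and falls back to the documented value. The function name `read_int_setting` and the positive-integer check are illustrative assumptions; how Lightflare itself parses and validates these variables is not described here.

```python
import os

# Defaults copied from the Conversation Context table.
DEFAULTS = {
    "LIGHTFLARE_MEMORY_SESSION_MEMORY_LIMIT": 50,
    "LIGHTFLARE_MEMORY_SIMILARITY_RESULT_LIMIT": 10,
}


def read_int_setting(name, env=None):
    """Return the integer value of `name`, or the documented default if unset."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return DEFAULTS[name]
    value = int(raw)
    # Assumption: a limit below 1 is treated as a configuration error.
    if value < 1:
        raise ValueError(f"{name} must be a positive integer, got {value}")
    return value
```

Passing a plain dictionary as `env` makes it easy to try a value before changing the real environment.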

Compaction

Long chats can produce many session memory entries. Compaction summarizes older compactable entries so the conversation remains useful without replaying every message.

Variable	Default	Description
`LIGHTFLARE_MEMORY_COMPACTION_TOKEN_THRESHOLD`	`3000`	Approximate token threshold that can trigger session memory compaction.
`LIGHTFLARE_MEMORY_COMPACTION_BATCH_SIZE`	`5`	Number of compactable memory entries processed in one compaction pass.

Raise the threshold if you want longer raw conversation history to remain active. Lower it if you want long sessions summarized sooner.
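The interaction between the two settings can be sketched roughly as follows. This is an illustration under stated assumptions, not Lightflare's implementation: the four-characters-per-token estimate, the strict "greater than" trigger, the oldest-first ordering, and the names `estimate_tokens` and `plan_compaction` are all hypothetical.

```python
def estimate_tokens(text):
    # Assumption: a rough heuristic of about four characters per token.
    return max(1, len(text) // 4)


def plan_compaction(entries, token_threshold=3000, batch_size=5):
    """Pick the entries to summarize in one compaction pass.

    `entries` is a list of (text, compactable) tuples ordered oldest first.
    Returns an empty list while the session stays at or under the threshold.
    """
    total = sum(estimate_tokens(text) for text, _ in entries)
    if total <= token_threshold:
        return []
    # Only compactable entries are eligible; take the oldest ones first.
    compactable = [text for text, ok in entries if ok]
    return compactable[:batch_size]
```

With the defaults, a session of ten compactable entries at roughly 500 tokens each exceeds the threshold, and one pass would select the five oldest entries for summarization.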

Context Search

Context search decides which saved memories and document sections are most relevant to a request.

Variable	Default	Description
`LIGHTFLARE_CONTEXT_SEARCH_MIN_CANDIDATE_LIMIT`	`20`	Minimum number of candidate rows considered per search source before final ranking.
`LIGHTFLARE_CONTEXT_SEARCH_CANDIDATE_LIMIT_MULTIPLIER`	`4`	Multiplier applied to the requested result limit when loading candidates.
`LIGHTFLARE_CONTEXT_SEARCH_MAX_CHUNKS_PER_DOCUMENT`	`2`	Maximum number of sections from one uploaded document returned after ranking.
`LIGHTFLARE_CONTEXT_SEARCH_MEMORY_PAGE_SEARCH_LIMIT`	`500`	Maximum number of relevance-ranked memory IDs considered when searching from the Memories page.

The per-document chunk cap (`LIGHTFLARE_CONTEXT_SEARCH_MAX_CHUNKS_PER_DOCUMENT`) helps prevent one long document from crowding out every other result.
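One plausible reading of these settings is sketched below. It assumes the candidate count is the larger of the minimum and the multiplied result limit, and that the per-document cap is applied while walking results in ranked order. Both function names and the exact combination rule are assumptions made for illustration.

```python
def candidate_limit(result_limit, min_candidates=20, multiplier=4):
    # Assumed rule: scale the requested limit, but never go below the minimum.
    return max(min_candidates, result_limit * multiplier)


def cap_chunks_per_document(ranked, max_per_document=2, result_limit=10):
    """Keep at most `max_per_document` sections from any one document.

    `ranked` is a list of (document_id, chunk_id) tuples ordered best first.
    """
    kept = []
    per_document = {}
    for doc_id, chunk_id in ranked:
        if len(kept) >= result_limit:
            break
        if per_document.get(doc_id, 0) >= max_per_document:
            continue  # this document already contributed its share
        per_document[doc_id] = per_document.get(doc_id, 0) + 1
        kept.append((doc_id, chunk_id))
    return kept
```

Under these assumptions, a request for 10 results loads 40 candidates per source, while a request for 3 results still loads the minimum of 20.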

Ranking Weights

Ranking weights control how different relevance signals are balanced.

Variable	Default	Description
`LIGHTFLARE_CONTEXT_SEARCH_VECTOR_WEIGHT`	`0.55`	Weight for semantic similarity.
`LIGHTFLARE_CONTEXT_SEARCH_TEXT_WEIGHT`	`0.30`	Weight for exact text search.
`LIGHTFLARE_CONTEXT_SEARCH_RECENCY_WEIGHT`	`0.10`	Weight for newer memories.
`LIGHTFLARE_CONTEXT_SEARCH_SCOPE_WEIGHT`	`0.05`	Weight for closer memory scopes, such as session context.

Semantic similarity helps with related concepts. Text search helps with exact names, IDs, acronyms, filenames, and other literal terms.
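To make the balance concrete, the sketch below combines the four signals as a weighted sum using the default weights, which add up to 1.0. It assumes each signal is already normalized to the range 0 to 1; the actual scoring formula, the signal names, and the function `combined_score` are assumptions for illustration.

```python
# Default weights from the Ranking Weights table.
WEIGHTS = {"vector": 0.55, "text": 0.30, "recency": 0.10, "scope": 0.05}


def combined_score(signals, weights=WEIGHTS):
    """Weighted sum of per-signal scores; a missing signal counts as 0.0."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)


# A strong semantic match with no literal term overlap.
semantic_hit = combined_score({"vector": 0.9, "text": 0.0, "recency": 0.5, "scope": 1.0})

# A weaker semantic match that contains the exact term searched for.
literal_hit = combined_score({"vector": 0.4, "text": 1.0, "recency": 0.5, "scope": 1.0})
```

In this example the literal match scores slightly higher (about 0.62 versus 0.595), which shows how the text weight lets exact names and IDs outrank loosely related content. Raising the vector weight relative to the text weight would reverse that ordering.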

Background Embedding Work

Memory and document content can be prepared for semantic search in the background.

Variable	Default	Description
`LIGHTFLARE_MEMORY_EMBEDDING_EXECUTOR_CORE_POOL_SIZE`	`2`	Core thread count for background embedding work.
`LIGHTFLARE_MEMORY_EMBEDDING_EXECUTOR_MAX_POOL_SIZE`	`4`	Maximum thread count for background embedding work.
`LIGHTFLARE_MEMORY_EMBEDDING_EXECUTOR_QUEUE_CAPACITY`	`100`	Queue capacity for pending embedding work.

Increase these values only when your LLM provider and database can handle more concurrent embedding work.
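The setting names resemble the core pool size, maximum pool size, and queue capacity of a Java-style thread pool executor. Assuming those semantics apply here (this page does not confirm it), extra threads beyond the core count start only after the queue is full. The simulation below is a hypothetical model of a burst of tasks arriving before any of them finishes; `simulate_burst` is not part of Lightflare.

```python
def simulate_burst(tasks, core=2, max_pool=4, queue_capacity=100):
    """Model where each task in a burst lands, assuming none finish meanwhile.

    Assumed order, as in Java's ThreadPoolExecutor: fill core threads,
    then the queue, then extra threads up to the maximum, then overflow.
    """
    running = queued = overflow = 0
    for _ in range(tasks):
        if running < core:
            running += 1
        elif queued < queue_capacity:
            queued += 1
        elif running < max_pool:
            running += 1
        else:
            # What happens here depends on the configured overflow policy.
            overflow += 1
    return {"running": running, "queued": queued, "overflow": overflow}
```

Under this model the defaults absorb a burst of up to 104 tasks (4 running plus 100 queued), but a burst of 50 still runs on only 2 threads because the queue has not filled. If that matches the observed behavior, raising the core pool size is what increases concurrency for moderate bursts.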
