⚡ Performance: Optimize Agent Spawning & Execution Pipeline
Migrated from llm/agent-buildkit#73 on 2025-10-12T20:14:28.985Z Original author: @thomas.scola | Created: 2025-10-12T02:15:10.477Z
Problem
Agent spawning currently takes 2-5 seconds per agent, limiting scalability:
Current Bottlenecks:
- Sequential agent initialization
- No spawn caching or warm pools
- Redundant validation on every spawn
- Cold start penalties for each agent
Evidence from Codebase:
```typescript
// agent-mesh/backend/src/services/domain/agent-manager.service.ts:58
async spawnAgent(type: string, options: Partial<AgentConfig> = {}): Promise<string> {
  // Full initialization on every spawn - SLOW!
  const agentProcess = spawn('node', [...], { stdio: ['ignore', 'pipe', 'pipe'] });
}
```

```typescript
// parallel-agent-executor.service.ts:265
private async spawnAgentProcess(agentId: string, tasks: string[], executionId: string) {
  // No pooling, no warm agents
  const childProcess = spawn('node', [this.agentScriptPath, ...]);
}
```
Impact
Current State:
- Spawning 10 agents: 20-50 seconds
- Spawning 100 agents: 3-8 minutes
- High CPU spikes during spawn waves
Target State (90% improvement):
- Spawning 10 agents: < 2 seconds
- Spawning 100 agents: < 20 seconds
- Smooth resource utilization
Solution: Multi-Tier Performance Strategy
1. Agent Pool Management
Warm Agent Pool:
```typescript
// New: src/services/agent-pool.service.ts
class AgentPoolService {
  private warmPools: Map<AgentType, Agent[]>;

  async initialize() {
    // Pre-spawn 5 agents of each common type
    await this.prewarmAgents(['code-reviewer', 'documentation', 'testing']);
  }

  async getAgent(type: AgentType): Promise<Agent> {
    // Return warm agent instantly (< 100ms); fall back to a cold spawn
    return this.warmPools.get(type)?.pop() || await this.spawnFresh(type);
  }
}
```
Benefits:
- 10-50x faster for cached types
- Predictable latency
- Resource pre-allocation
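The pool sketch above omits replenishment. A minimal, self-contained version of the pop-and-replenish flow might look like the following; the `WarmPool` class, the `spawnFresh` stub, and the size thresholds are illustrative stand-ins, not the real `AgentPoolService`:

```typescript
// Minimal warm-pool sketch with auto-replenish (all names illustrative).
type Agent = { id: number; type: string };

class WarmPool {
  private pool: Agent[] = [];
  private nextId = 0;

  constructor(
    private type: string,
    private minSize = 2,   // replenish when the pool drops below this
    private targetSize = 5 // pre-spawn up to this many agents
  ) {}

  // Stand-in for an expensive cold spawn.
  private async spawnFresh(): Promise<Agent> {
    return { id: this.nextId++, type: this.type };
  }

  async prewarm(): Promise<void> {
    while (this.pool.length < this.targetSize) {
      this.pool.push(await this.spawnFresh());
    }
  }

  async getAgent(): Promise<Agent> {
    const agent = this.pool.pop() ?? (await this.spawnFresh());
    // Replenish in the background instead of blocking the caller.
    if (this.pool.length < this.minSize) void this.replenish();
    return agent;
  }

  private async replenish(): Promise<void> {
    while (this.pool.length < this.targetSize) {
      this.pool.push(await this.spawnFresh());
    }
  }

  size(): number {
    return this.pool.length;
  }
}
```

Replenishing in the background is what keeps `getAgent` on the fast path: the caller pays only a `pop()`, never the spawn cost, as long as the pool stays warm.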
2. Lazy Initialization
Current: Load everything on spawn
New: Load on-demand
```typescript
class LazyAgent {
  private _llmClient?: LLMClient;

  get llmClient() {
    // Initialize only when actually needed
    if (!this._llmClient) {
      this._llmClient = createLLMClient(this.config);
    }
    return this._llmClient;
  }
}
```
3. Parallel Spawn Optimization
Current: Sequential spawn
New: Batched parallel spawn with coordination
```typescript
// New: Batch spawn with progress tracking
async spawnAgentBatch(requests: SpawnRequest[]): Promise<Agent[]> {
  const agents: Agent[] = [];
  // Await each batch of 10 before starting the next, so concurrency
  // stays bounded instead of every request launching at once
  for (const batch of chunk(requests, 10)) {
    agents.push(...(await Promise.all(batch.map(req => this.spawnAgent(req)))));
  }
  return agents;
}
```
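The snippet assumes a `chunk` helper (lodash-style). Here is a minimal version, plus a `Promise.allSettled` variant that gives the failure isolation called for in the Tasks below, so one rejected spawn no longer sinks the whole batch; `spawnBatchSettled` is a hypothetical name, not existing API:

```typescript
// Minimal lodash-style chunk helper.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Failure-isolated batch spawn: collect successes and failures separately.
async function spawnBatchSettled<R, A>(
  requests: R[],
  spawn: (req: R) => Promise<A>,
  batchSize = 10
): Promise<{ agents: A[]; failures: unknown[] }> {
  const agents: A[] = [];
  const failures: unknown[] = [];
  for (const batch of chunk(requests, batchSize)) {
    // allSettled never rejects; each result carries its own status.
    const results = await Promise.allSettled(batch.map(spawn));
    for (const r of results) {
      if (r.status === 'fulfilled') agents.push(r.value);
      else failures.push(r.reason);
    }
  }
  return { agents, failures };
}
```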
4. Spawn Caching
Cache validated agent configurations:
```typescript
// Cache expensive validation results
private configCache = new LRUCache<string, ValidatedConfig>({ max: 100 });

async validateConfig(config: AgentConfig): Promise<ValidatedConfig> {
  const key = hashConfig(config);
  const cached = this.configCache.get(key);
  if (cached) return cached;
  // Store the result so later spawns with the same config skip validation
  const validated = await this.performValidation(config);
  this.configCache.set(key, validated);
  return validated;
}
```
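`hashConfig` is referenced but not shown. A plausible implementation hashes a key-sorted serialization, so that property order alone doesn't fragment the cache; this exact function is an assumption for illustration, not the project's code:

```typescript
import { createHash } from 'node:crypto';

// Stable serialization: sort object keys so {a, b} and {b, a} hash identically.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`;
  const entries = Object.entries(value as Record<string, unknown>)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${JSON.stringify(k)}:${stableStringify(v)}`);
  return `{${entries.join(',')}}`;
}

function hashConfig(config: object): string {
  return createHash('sha256').update(stableStringify(config)).digest('hex');
}
```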
5. Resource Prediction
Use Phoenix KG to predict resource needs:
```typescript
// Predict optimal agent count based on workload
const prediction = await phoenixKG.predictOptimalAgentCount({
  taskQueue: currentTasks,
  historicalData: last7Days,
  constraints: { maxCost: 0, maxLatency: 5000 }
});
```
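Phoenix KG's actual prediction API isn't shown in this issue. Purely for illustration, a naive heuristic that sizes the fleet from queue depth and a latency budget could look like this; every name and the formula itself are hypothetical:

```typescript
// Hypothetical heuristic, NOT the Phoenix KG API: size the fleet so the
// queued work drains within the latency budget.
interface Workload {
  queuedTasks: number;   // tasks currently waiting
  avgTaskMs: number;     // historical mean task duration
  maxLatencyMs: number;  // budget from constraints
}

function predictOptimalAgentCount(w: Workload, maxAgents = 100): number {
  // Total work divided by the time budget gives the parallelism needed.
  const needed = Math.ceil((w.queuedTasks * w.avgTaskMs) / w.maxLatencyMs);
  // Clamp to at least one agent and at most the fleet ceiling.
  return Math.min(Math.max(needed, 1), maxAgents);
}
```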
Tasks
- [ ] Implement AgentPoolService with warm pools
  - Pre-spawn 5 agents per common type
  - Auto-replenish when pool drops below 2
  - Configurable pool sizes per agent type
- [ ] Add lazy initialization to all agents
  - Defer LLM client creation
  - Defer tool loading
  - Load on first use, not on spawn
- [ ] Optimize parallel spawning
  - Batch spawn API (10 agents at once)
  - Progress tracking per batch
  - Failure isolation (one fails, others continue)
- [ ] Add spawn caching
  - Cache validated configs (LRU 100)
  - Cache tool definitions
  - Invalidate on config changes only
- [ ] Integrate Phoenix KG predictions
  - Query historical spawn patterns
  - Predict optimal pool sizes
  - Auto-adjust based on workload
- [ ] Add performance telemetry
  - Track spawn duration (p50, p95, p99)
  - Track pool hit rates
  - Alert on performance degradation
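For the telemetry task, tracking p50/p95/p99 spawn duration can be sketched with a nearest-rank percentile over recorded samples; the class and method names below are illustrative, not existing code:

```typescript
// Sketch of spawn-duration telemetry: record samples, report percentiles.
class SpawnTelemetry {
  private samples: number[] = [];

  record(durationMs: number): void {
    this.samples.push(durationMs);
  }

  percentile(p: number): number {
    if (this.samples.length === 0) return 0;
    const sorted = [...this.samples].sort((a, b) => a - b);
    // Nearest-rank method: take the ceil(p/100 * N)-th sample (1-indexed).
    const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
    return sorted[rank - 1];
  }
}
```

A production version would likely use a streaming estimator (e.g. a t-digest or HDR histogram) rather than sorting all samples, but the interface stays the same.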
Acceptance Criteria
Metrics Dashboard
Add Grafana panels:
- Agent spawn latency (p50, p95, p99)
- Pool utilization per agent type
- Cache hit rates
- Resource consumption trends
Related
- Epic: #55
- Agent Spawning Code
- Phoenix KG Predictions
Priority: P1 | Labels: performance, spawning, optimization, scalability