Initial release: MCP server enforcing Worker-Reviewer loop

Diligence prevents AI agents from shipping quick fixes that break things by enforcing a research-propose-verify loop before any code changes.

Key features:
- Worker sub-agent researches and proposes with file:line citations
- Reviewer sub-agent independently verifies claims by searching the codebase
- Iterates until approved (max 5 rounds)
- Loads project-specific context from .claude/CODEBASE_CONTEXT.md
- State persisted across sessions

Validated on a production codebase: caught an architectural mistake (broker subscriptions in client-side code) that a naive agent would have shipped.

255
test/compare-approaches.mjs
Normal file
@@ -0,0 +1,255 @@
#!/usr/bin/env node
/**
 * Comparison Test: Naive vs Diligence Approach
 *
 * This script coordinates testing of both approaches:
 * 1. Naive: A single agent analyzes and proposes a fix
 * 2. Diligence: Worker-Reviewer loop with separate agents
 *
 * The test uses a real bug from the nexus codebase.
 *
 * Usage:
 *   node test/compare-approaches.mjs
 */

import { writeFileSync, mkdirSync, existsSync } from 'fs';
import { dirname, join } from 'path';
import { fileURLToPath } from 'url';

const __dirname = dirname(fileURLToPath(import.meta.url));
const RESULTS_DIR = join(__dirname, 'results');

// Ensure results directory exists
if (!existsSync(RESULTS_DIR)) {
  mkdirSync(RESULTS_DIR, { recursive: true });
}

const TEST_BUG = {
  id: 'B1',
  name: 'Blocked users can join/answer DM voice calls',
  task: `Fix bug B1: Blocked users can join DM voice calls.

When user A blocks user B, user B should NOT be able to:
1. Answer incoming DM calls from user A
2. Start new calls to user A (already works)
3. Join DM voice channel with user A (already works in joinVoiceChannel)

The bug is that answerDmCall() has no blocking check.

Analyze the codebase and propose a COMPLETE fix.`,

  // What naive agents typically miss
  naive_misses: [
    'declineDmCall() also needs blocking check for consistency',
    'notifyDmCall() should filter blocked users from notifications',
    'blockUser() should clean up existing voice calls',
    'Need to subscribe to BusUserBlockChange for mid-call kick',
    'Should follow the pattern from chat.service.ts where permission=visibility, actions have separate checks',
  ],

  // Required elements for a complete fix
  required_elements: [
    'answerDmCall blocking check',
    'declineDmCall blocking check',
    'notification filtering',
    'voice cleanup in blockUser()',
    'BusUserBlockChange subscription',
    'chat.service.ts pattern reference',
  ],
};

// Prompts for the test
const NAIVE_PROMPT = `You are analyzing a bug in the nexus codebase.

BUG: ${TEST_BUG.task}

Your job is to:
1. Search the codebase to understand the current implementation
2. Identify all files that need changes
3. Propose a complete fix

DO NOT use any diligence MCP tools. Just analyze and propose.

Be thorough - check for:
- Similar patterns in the codebase
- Broker events that might be relevant
- All places where blocking should be enforced
- Edge cases (what if block happens mid-call?)

Output your analysis and proposed fix.`;

const WORKER_PROMPT = `You are a Worker agent in the diligence workflow.

Your brief has been loaded with:
- The task description
- Codebase context (architecture, patterns)
- Any previous feedback

Your job:
1. Research the codebase thoroughly
2. Trace data flow from origin to all consumers
3. Find existing patterns for similar functionality
4. Identify ALL files that need changes
5. Propose a fix with file:line citations for every claim

IMPORTANT:
- Cite specific file:line for every claim
- Search for similar patterns (how does chat handle blocking?)
- Don't miss broker events
- Consider edge cases (mid-call blocking)

Submit your proposal via mcp__diligence__propose when ready.`;

const REVIEWER_PROMPT = `You are a Reviewer agent in the diligence workflow.

Your brief has been loaded with:
- The Worker's proposal
- The task description
- Codebase context

Your job:
1. VERIFY every claim by searching the codebase yourself
2. Check if the proposal follows existing patterns
3. Look for missing broker events or edge cases
4. Do NOT trust the Worker's citations - verify them

For each claim in the proposal:
- Search for the file/line cited
- Verify it says what the Worker claims
- Check if there are related issues the Worker missed

Submit your review via mcp__diligence__review:
- APPROVED if all checks pass
- NEEDS_WORK with specific issues if not

Be strict - missing one broker event subscription can cause production bugs.`;

function log(msg) {
  const timestamp = new Date().toISOString().slice(11, 19);
  console.log(`[${timestamp}] ${msg}`);
}

function saveResult(name, content) {
  const timestamp = new Date().toISOString().slice(0, 10);
  const filename = `${timestamp}-${name}.md`;
  const path = join(RESULTS_DIR, filename);
  writeFileSync(path, content);
  log(`Saved: ${path}`);
  return path;
}

// Generate the test instructions
function generateTestInstructions() {
  const instructions = `# Diligence Comparison Test

## Test Bug
**ID:** ${TEST_BUG.id}
**Name:** ${TEST_BUG.name}

## Task
${TEST_BUG.task}

---

## Phase 1: Naive Approach (WITHOUT Diligence)

In a Claude Code session, paste this prompt:

\`\`\`
${NAIVE_PROMPT}
\`\`\`

Save the output as the "naive proposal".

---

## Phase 2: Diligence Approach (WITH Worker-Reviewer Loop)

### Step 1: Start the workflow
\`\`\`
mcp__diligence__start with task: "${TEST_BUG.task.split('\n')[0]}"
\`\`\`

### Step 2: Spawn Worker Agent
\`\`\`
1. Call mcp__diligence__get_worker_brief
2. Use Task tool with subagent_type="Explore" and this prompt:
   "${WORKER_PROMPT.replace(/\n/g, ' ').slice(0, 200)}..."
3. Worker should research and call mcp__diligence__propose
\`\`\`

### Step 3: Spawn Reviewer Agent
\`\`\`
1. Call mcp__diligence__get_reviewer_brief
2. Use Task tool with subagent_type="Explore" and this prompt:
   "${REVIEWER_PROMPT.replace(/\n/g, ' ').slice(0, 200)}..."
3. Reviewer should verify and call mcp__diligence__review
\`\`\`

### Step 4: Loop or Complete
- If NEEDS_WORK: spawn new Worker with updated brief
- If APPROVED: call mcp__diligence__implement

Save the final approved proposal as the "diligence proposal".

---

## Phase 3: Compare Results

### Checklist - What Naive Typically Misses
${TEST_BUG.naive_misses.map(m => `- [ ] ${m}`).join('\n')}

### Required Elements for Complete Fix
${TEST_BUG.required_elements.map(e => `- [ ] ${e}`).join('\n')}

### Scoring
- Naive proposal: Count how many required elements it includes
- Diligence proposal: Count how many required elements it includes
- Did diligence catch issues that naive missed?

---

## Expected Outcome

The naive approach will likely:
- Add blocking check to answerDmCall() only
- Miss the other 5 required elements

The diligence approach should:
- Catch missing elements during review
- Iterate until all elements are addressed
- Produce a more complete proposal

`;

  return instructions;
}

// Main
async function main() {
  log('Generating comparison test instructions...');

  const instructions = generateTestInstructions();
  const path = saveResult('comparison-test-instructions', instructions);

  console.log('\n' + '='.repeat(60));
  console.log('COMPARISON TEST READY');
  console.log('='.repeat(60));
  console.log(`\nInstructions saved to: ${path}`);
  console.log('\nTo run the test:');
  console.log('1. Open the instructions file');
  console.log('2. Start a Claude Code session in ~/bude/codecharm/nexus');
  console.log('3. Run Phase 1 (naive) and save the output');
  console.log('4. Run Phase 2 (diligence) and save the output');
  console.log('5. Compare using the checklist in Phase 3');
  console.log('\n');

  // Also print the naive prompt for immediate use
  console.log('='.repeat(60));
  console.log('NAIVE PROMPT (for quick testing):');
  console.log('='.repeat(60));
  console.log(NAIVE_PROMPT);
  console.log('\n');
}

main().catch(console.error);

305
test/dry-run.mjs
Normal file
@@ -0,0 +1,305 @@
#!/usr/bin/env node
/**
 * Dry Run Test Against Real Project
 *
 * Runs the diligence MCP server against a real project (e.g., nexus) in dry-run mode.
 * This tests the full workflow without making any code changes.
 *
 * Usage:
 *   node test/dry-run.mjs --project=/path/to/nexus --task="Fix permission cache"
 *   node test/dry-run.mjs --project=~/bude/codecharm/nexus --scenario=blocking-voice
 *
 * Options:
 *   --project=PATH      Path to the project to test against
 *   --task=TEXT         Task description to start the workflow with
 *   --scenario=ID       Use a predefined scenario from test/scenarios/
 *   --interactive, -i   Run in interactive mode (prompts for input)
 */

import { spawn } from 'child_process';
import { createInterface } from 'readline';
import { dirname, join, resolve } from 'path';
import { fileURLToPath } from 'url';
import { existsSync, readFileSync } from 'fs';
import { homedir } from 'os';

const __dirname = dirname(fileURLToPath(import.meta.url));

// Parse CLI args
const args = process.argv.slice(2);
const projectArg = args.find(a => a.startsWith('--project='));
const taskArg = args.find(a => a.startsWith('--task='));
const scenarioArg = args.find(a => a.startsWith('--scenario='));
const interactive = args.includes('--interactive') || args.includes('-i');

// Resolve project path
// Take everything after the first '=' so paths containing '=' are not truncated
let projectPath = projectArg ? projectArg.slice(projectArg.indexOf('=') + 1) : null;
if (projectPath) {
  // The shell does not expand '~' after '=', so expand a leading '~' here.
  // homedir() also works when the HOME environment variable is unset.
  projectPath = projectPath.replace(/^~(?=$|[\\/])/, homedir());
  projectPath = resolve(projectPath);
}

// Colors
const colors = {
  reset: '\x1b[0m',
  green: '\x1b[32m',
  red: '\x1b[31m',
  yellow: '\x1b[33m',
  blue: '\x1b[34m',
  cyan: '\x1b[36m',
  dim: '\x1b[2m',
  bold: '\x1b[1m',
};

function log(msg, color = 'reset') {
  console.log(`${colors[color]}${msg}${colors.reset}`);
}

function logSection(title) {
  console.log(`\n${colors.cyan}${colors.bold}═══ ${title} ═══${colors.reset}\n`);
}

// Load scenario
function loadScenario(id) {
  const path = join(__dirname, 'scenarios', `${id}.json`);
  if (!existsSync(path)) {
    throw new Error(`Scenario not found: ${id}`);
  }
  return JSON.parse(readFileSync(path, 'utf-8'));
}

// Simple MCP client for dry run
class DryRunClient {
  constructor(projectPath) {
    this.projectPath = projectPath;
    this.serverPath = join(__dirname, '..', 'index.mjs');
    this.process = null;
    this.requestId = 0;
    this.pendingRequests = new Map();
    this.readline = null;
  }

  async connect() {
    return new Promise((resolve, reject) => {
      this.process = spawn('node', [this.serverPath], {
        stdio: ['pipe', 'pipe', 'pipe'],
        cwd: this.projectPath,
      });

      this.readline = createInterface({
        input: this.process.stdout,
        crlfDelay: Infinity,
      });

      this.readline.on('line', (line) => {
        try {
          const message = JSON.parse(line);
          if (message.id !== undefined && this.pendingRequests.has(message.id)) {
            // Settle the matching request and cancel its timeout timer,
            // otherwise the pending timer keeps the process alive after the run
            const pending = this.pendingRequests.get(message.id);
            this.pendingRequests.delete(message.id);
            clearTimeout(pending.timer);
            if (message.error) {
              pending.reject(new Error(message.error.message || JSON.stringify(message.error)));
            } else {
              pending.resolve(message.result);
            }
          }
        } catch (e) {
          // Ignore non-JSON lines
        }
      });

      this.process.stderr.on('data', (data) => {
        // Show server stderr in debug mode
        if (process.env.DEBUG) {
          console.error(colors.dim + '[server] ' + data.toString() + colors.reset);
        }
      });

      this.process.on('error', reject);

      // Initialize
      this._send({
        jsonrpc: '2.0',
        id: this.requestId++,
        method: 'initialize',
        params: {
          protocolVersion: '0.1.0',
          clientInfo: { name: 'dry-run-client', version: '1.0.0' },
          capabilities: {},
        },
      }).then(() => {
        this._sendNotification('notifications/initialized', {});
        resolve();
      }).catch(reject);
    });
  }

  async disconnect() {
    if (this.process) {
      this.process.kill('SIGTERM');
      this.process = null;
    }
  }

  _send(message) {
    return new Promise((resolve, reject) => {
      // Reject if the server does not answer within 30 seconds
      const timer = setTimeout(() => {
        if (this.pendingRequests.has(message.id)) {
          this.pendingRequests.delete(message.id);
          reject(new Error('Request timeout'));
        }
      }, 30000);
      this.pendingRequests.set(message.id, { resolve, reject, timer });
      this.process.stdin.write(JSON.stringify(message) + '\n');
    });
  }

  _sendNotification(method, params) {
    this.process.stdin.write(JSON.stringify({ jsonrpc: '2.0', method, params }) + '\n');
  }

  async callTool(name, args = {}) {
    const result = await this._send({
      jsonrpc: '2.0',
      id: this.requestId++,
      method: 'tools/call',
      params: { name, arguments: args },
    });
    if (result.content?.[0]?.text) {
      return { text: result.content[0].text, isError: result.isError || false };
    }
    return result;
  }
}

// Interactive prompt
function prompt(question) {
  const rl = createInterface({
    input: process.stdin,
    output: process.stdout,
  });
  return new Promise(resolve => {
    rl.question(question, answer => {
      rl.close();
      resolve(answer);
    });
  });
}

async function main() {
  log('\n🔍 Diligence Dry Run\n', 'cyan');

  // Validate project path
  if (!projectPath) {
    log('Error: --project=PATH required', 'red');
    log('\nUsage:', 'dim');
    log('  node test/dry-run.mjs --project=~/bude/codecharm/nexus --task="Fix bug"', 'dim');
    process.exit(1);
  }

  if (!existsSync(projectPath)) {
    log(`Error: Project path not found: ${projectPath}`, 'red');
    process.exit(1);
  }

  // Check for CODEBASE_CONTEXT.md
  const contextPath = join(projectPath, '.claude', 'CODEBASE_CONTEXT.md');
  if (!existsSync(contextPath)) {
    log(`Warning: No .claude/CODEBASE_CONTEXT.md found in ${projectPath}`, 'yellow');
    log('The Worker and Reviewer will have limited context.', 'dim');
  } else {
    log(`Found: ${contextPath}`, 'green');
  }

  // Determine task
  // Values are taken after the first '=' so the text may itself contain '='
  let task;
  if (scenarioArg) {
    const scenarioId = scenarioArg.slice(scenarioArg.indexOf('=') + 1);
    const scenario = loadScenario(scenarioId);
    task = scenario.task;
    log(`Using scenario: ${scenario.name}`, 'blue');
  } else if (taskArg) {
    task = taskArg.slice(taskArg.indexOf('=') + 1);
  } else if (interactive) {
    task = await prompt('Enter task: ');
  } else {
    log('Error: Either --task=TEXT or --scenario=ID required', 'red');
    process.exit(1);
  }

  log(`\nProject: ${projectPath}`, 'dim');
  log(`Task: ${task}\n`, 'dim');

  // Connect to MCP server
  const client = new DryRunClient(projectPath);

  try {
    log('Connecting to MCP server...', 'dim');
    await client.connect();
    log('Connected!', 'green');

    // Check initial status
    logSection('Status');
    const status = await client.callTool('status');
    log(status.text, 'dim');

    // Start the workflow
    logSection('Starting Workflow');
    const startResult = await client.callTool('start', { task });
    log(startResult.text, startResult.isError ? 'red' : 'green');

    if (startResult.isError) {
      // Try to abort and restart
      log('\nAborting existing workflow...', 'yellow');
      await client.callTool('abort', { reason: 'Dry run restart' });
      const retryResult = await client.callTool('start', { task });
      log(retryResult.text, retryResult.isError ? 'red' : 'green');
    }

    // Get worker brief
    logSection('Worker Brief');
    const workerBrief = await client.callTool('get_worker_brief');

    // Show truncated brief
    const briefLines = workerBrief.text.split('\n');
    const truncatedBrief = briefLines.slice(0, 50).join('\n');
    log(truncatedBrief, 'dim');
    if (briefLines.length > 50) {
      log(`\n... (${briefLines.length - 50} more lines)`, 'dim');
    }

    logSection('Dry Run Complete');

    log(`
${colors.yellow}What happens next in a real session:${colors.reset}

1. ${colors.bold}Worker Agent${colors.reset} (fresh sub-agent) receives the brief above
   - Researches the codebase using Glob, Grep, Read tools
   - Proposes a fix with file:line citations
   - Submits via ${colors.cyan}diligence.propose${colors.reset}

2. ${colors.bold}Reviewer Agent${colors.reset} (fresh sub-agent) verifies the proposal
   - Searches codebase to verify Worker's claims
   - Checks against patterns in CODEBASE_CONTEXT.md
   - Submits ${colors.green}APPROVED${colors.reset} or ${colors.yellow}NEEDS_WORK${colors.reset} via ${colors.cyan}diligence.review${colors.reset}

3. If ${colors.yellow}NEEDS_WORK${colors.reset}: Worker revises, Reviewer re-checks (up to 5 rounds)

4. If ${colors.green}APPROVED${colors.reset}: ${colors.cyan}diligence.implement${colors.reset} → code changes → ${colors.cyan}diligence.complete${colors.reset}

${colors.dim}This was a dry run - no code changes were made.${colors.reset}
`, 'reset');

    // Cleanup - abort the workflow
    await client.callTool('abort', { reason: 'Dry run completed' });
    log('Workflow aborted (dry run cleanup)', 'dim');

  } finally {
    await client.disconnect();
    log('\nDisconnected from MCP server.', 'dim');
  }
}

main().catch(err => {
  console.error('Error:', err.message);
  process.exit(1);
});

150
test/fixture/.claude/CODEBASE_CONTEXT.md
Normal file
@@ -0,0 +1,150 @@
# Codebase Context: Test Fixture

This is a simplified test codebase that mirrors real-world patterns. Use this context to understand the architecture before making changes.

## Architecture Overview

```
src/
├── broker/
│   └── events.ts                 # Broker event bus (Subject-based pub/sub)
├── services/
│   ├── user-block.service.ts     # Blocking logic
│   ├── voice-channel.service.ts  # Voice channels and DM calls
│   ├── chat.service.ts           # Chat channels and messages
│   └── team.service.ts           # Team state and permission caching
└── controllers/
    └── roles.controller.ts       # REST API for roles
```

## Critical Pattern: Broker Events

**All state changes that affect multiple services MUST emit broker events.**

### Available Events

| Event | Emitted When | Expected Subscribers |
|-------|--------------|---------------------|
| `BusUserBlockChange` | User blocks/unblocks another | Voice services, DM services |
| `BusTeamRoleChange` | Role created/updated/deleted | Permission caches |
| `BusTeamMemberRoleChange` | User role assigned/removed | Permission caches |
| `BusVoiceParticipant` | User joins/leaves voice | Voice UI components |
| `BusDmCall` | DM call state changes | Call observers |

### Pattern Example

```typescript
// CORRECT: Emit event after state change
async updateRole(teamId, roleId, updates) {
  const role = await db.update(roleId, updates);

  BusTeamRoleChange.next({  // ← MUST emit event
    teamId,
    roleId,
    action: 'updated',
    timestamp: new Date(),
  });

  return role;
}
```

## Critical Pattern: Permission vs Action Checks

**Permission = Visibility. Action = Separate Check.**

### Why This Matters

For DM channels, blocking creates a `'read'` permission, NOT `'denied'`. The user can still SEE the DM channel, but cannot SEND messages.

```typescript
// Permission check (for visibility)
if (isBlocked) {
  return { permission: 'read', reason: 'blocked' };  // ← 'read', not 'denied'
}

// Action check (separate from permission)
async sendMessage(userId, channel, content) {
  // The two DM participants come from the channel itself
  if (await isBlockingEitherWay(channel.userA, channel.userB)) {
    throw new Error('Cannot send messages');  // ← Separate check
  }
}
```

### Voice Permission Pattern

For DM channels, voice permissions are **always true**:

```typescript
return {
  permission: 'read',
  voiceListen: true,      // Always true for DM
  voiceTalk: true,        // Blocking checked on JOIN, not here
  voiceWebcam: true,
  voiceScreenshare: true,
};
```

Blocking is enforced by **action checks** in:
- `joinVoiceChannel()` - line 33
- `startDmCall()` - line 56
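
The same action check could plausibly be applied when a call is answered. The following is a minimal sketch under stated assumptions: `BlockStore`, `DmCall`, and this standalone `answerDmCall()` are hypothetical stand-ins written for illustration and are not the fixture's actual `voice-channel.service.ts` code.

```typescript
// Hypothetical stand-ins for illustration; the real services are not shown here.
type DmCallState = 'ringing' | 'active' | 'ended' | 'declined';

interface DmCall {
  callId: string;
  callerId: string;
  calleeId: string;
  state: DmCallState;
}

class BlockStore {
  private blocks = new Set<string>();

  block(sourceUserId: string, targetUserId: string): void {
    this.blocks.add(`${sourceUserId}->${targetUserId}`);
  }

  isBlocking(sourceUserId: string, targetUserId: string): boolean {
    return this.blocks.has(`${sourceUserId}->${targetUserId}`);
  }

  // True if either user has blocked the other.
  isBlockingEitherWay(a: string, b: string): boolean {
    return this.isBlocking(a, b) || this.isBlocking(b, a);
  }
}

// Action check: performed when the call is answered, independent of the
// channel permission (which stays 'read' with the voice flags set to true).
function answerDmCall(call: DmCall, userId: string, blocks: BlockStore): DmCall {
  if (call.calleeId !== userId) {
    throw new Error('Only the callee can answer this call');
  }
  if (call.state !== 'ringing') {
    throw new Error('Call is not ringing');
  }
  if (blocks.isBlockingEitherWay(call.callerId, call.calleeId)) {
    throw new Error('Cannot answer call: user is blocked');
  }
  return { ...call, state: 'active' };
}
```

The permission object is left untouched in this sketch; only the action itself refuses to proceed, which matches the "Permission = Visibility" rule above.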

## Critical Pattern: Cache Invalidation

**Caches MUST subscribe to relevant broker events.**

### Current Bug Pattern

```typescript
// BAD: Only clears on team switch
constructor() {
  teamChange$.subscribe(() => {
    this.memoizedPermissions.clear();
  });
}

// GOOD: Also clear on role changes
constructor() {
  teamChange$.subscribe(() => this.clearCache());
  BusTeamRoleChange.subscribe(() => this.clearCache());        // ← ADD THIS
  BusTeamMemberRoleChange.subscribe(() => this.clearCache());  // ← AND THIS
}
```
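
To see why the extra subscriptions matter, here is a small self-contained sketch. It assumes a hypothetical `PermissionCache` and uses `MiniSubject`, a dependency-free stand-in for an rxjs `Subject`; neither name comes from the fixture's `team.service.ts`.

```typescript
// Minimal stand-in for an rxjs Subject so the sketch runs without dependencies.
class MiniSubject<T> {
  private listeners: Array<(value: T) => void> = [];

  subscribe(listener: (value: T) => void): void {
    this.listeners.push(listener);
  }

  next(value: T): void {
    for (const listener of this.listeners) listener(value);
  }
}

interface RoleEvent {
  teamId: string;
  roleId: string;
}

// Hypothetical cache; the names are illustrative only.
class PermissionCache {
  private memoized = new Map<string, number>();
  private loadFlags: (userId: string) => number;

  constructor(
    roleChange: MiniSubject<RoleEvent>,
    memberRoleChange: MiniSubject<RoleEvent>,
    loadFlags: (userId: string) => number,
  ) {
    this.loadFlags = loadFlags;
    // Any role event may change effective permissions, so drop everything.
    roleChange.subscribe(() => this.clearCache());
    memberRoleChange.subscribe(() => this.clearCache());
  }

  // Returns the memoized flags, loading them only on a cache miss.
  getPermission(userId: string): number {
    const cached = this.memoized.get(userId);
    if (cached !== undefined) return cached;
    const flags = this.loadFlags(userId);
    this.memoized.set(userId, flags);
    return flags;
  }

  clearCache(): void {
    this.memoized.clear();
  }
}
```

Without the two subscriptions, `getPermission()` would keep returning the memoized value after a role change until the next team switch.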

## Checklist: Before Making Changes

### For ANY state change:

1. [ ] Does this change affect other services?
2. [ ] Is there a broker event for this? If not, should there be?
3. [ ] Are all relevant services subscribed to the event?

### For blocking-related changes:

1. [ ] Is blocking checked on all relevant ACTIONS (not just permissions)?
2. [ ] What happens if block is created DURING an action (e.g., mid-call)?
3. [ ] Are broker events emitted for blocking changes?
4. [ ] Do voice services subscribe to `BusUserBlockChange`?
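
Items 2 and 4 of the blocking checklist can be illustrated with a rough sketch of a voice-side subscriber that ends an active DM call when a block arrives mid-call. `EventBus` is a dependency-free stand-in for the broker `Subject`, and `ActiveCallRegistry` is a hypothetical class invented for this example; only the event fields mirror `UserBlockEvent`.

```typescript
// Loosely mirrors the fields of UserBlockEvent (timestamp omitted for brevity).
interface BlockEvent {
  sourceUserId: string;
  targetUserId: string;
  blocked: boolean; // true = blocked, false = unblocked
}

// Stand-in for the broker Subject so the sketch has no dependencies.
class EventBus<T> {
  private handlers: Array<(event: T) => void> = [];

  subscribe(handler: (event: T) => void): void {
    this.handlers.push(handler);
  }

  next(event: T): void {
    this.handlers.forEach((handler) => handler(event));
  }
}

interface ActiveCall {
  callId: string;
  callerId: string;
  calleeId: string;
}

// Hypothetical voice-side subscriber: ends any active DM call between two
// users as soon as one of them blocks the other.
class ActiveCallRegistry {
  private calls = new Map<string, ActiveCall>();
  ended: string[] = [];

  constructor(blockChanges: EventBus<BlockEvent>) {
    blockChanges.subscribe((event) => {
      // Unblock events need no cleanup; only react to new blocks.
      if (event.blocked) {
        this.endCallsBetween(event.sourceUserId, event.targetUserId);
      }
    });
  }

  add(call: ActiveCall): void {
    this.calls.set(call.callId, call);
  }

  isActive(callId: string): boolean {
    return this.calls.has(callId);
  }

  private endCallsBetween(a: string, b: string): void {
    for (const call of Array.from(this.calls.values())) {
      const samePair =
        (call.callerId === a && call.calleeId === b) ||
        (call.callerId === b && call.calleeId === a);
      if (samePair) {
        this.calls.delete(call.callId);
        this.ended.push(call.callId);
      }
    }
  }
}
```

The pair is matched in both directions because the block may come from either participant of the call.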

### For permission/cache changes:

1. [ ] What events should invalidate this cache?
2. [ ] Is the cache subscribed to all relevant broker events?
3. [ ] What's the TTL? Is stale data acceptable?

## Files Quick Reference

| File | Key Functions | Known Issues |
|------|---------------|--------------|
| `user-block.service.ts` | `blockUser()`, `unblockUser()` | Missing voice cleanup on block |
| `voice-channel.service.ts` | `answerDmCall()`, `startDmCall()` | Missing blocking check on answer |
| `team.service.ts` | `getPermission()`, `clearCache()` | Cache doesn't subscribe to role events |
| `roles.controller.ts` | `createRole()`, `deleteRole()` | Missing broker events |
| `chat.service.ts` | `getChannelPermission()` | **Reference implementation** (correct) |

## Anti-Patterns to Avoid

1. **Fixing in ONE place** - If blocking is checked in `startDmCall()`, it should also be in `answerDmCall()`
2. **Changing permissions** - Don't change `voiceListen: true` to `voiceListen: !isBlocked`. Use action checks instead.
3. **Forgetting broker events** - Every CRUD operation on roles/permissions should emit an event
4. **Assuming cache is fresh** - If an operation can change state, subscribe to its event

79
test/fixture/src/broker/events.ts
Normal file
@@ -0,0 +1,79 @@
/**
 * Broker Event Bus System
 *
 * All state changes that affect multiple services should emit broker events.
 * Services subscribe to these events to maintain consistency.
 */

import { Subject } from 'rxjs';

// ============================================================================
// User Events
// ============================================================================

export interface UserBlockEvent {
  sourceUserId: string;
  targetUserId: string;
  blocked: boolean; // true = blocked, false = unblocked
  timestamp: Date;
}

export const BusUserBlockChange = new Subject<UserBlockEvent>();

// ============================================================================
// Team Events
// ============================================================================

export interface TeamRoleEvent {
  teamId: string;
  roleId: string;
  action: 'created' | 'updated' | 'deleted';
  flags?: number;
  timestamp: Date;
}

export interface TeamMemberRoleEvent {
  teamId: string;
  userId: string;
  roleId: string;
  action: 'assigned' | 'removed';
  timestamp: Date;
}

export const BusTeamRoleChange = new Subject<TeamRoleEvent>();
export const BusTeamMemberRoleChange = new Subject<TeamMemberRoleEvent>();

// ============================================================================
// Voice Events
// ============================================================================

export interface VoiceParticipantEvent {
  channelId: string;
  userId: string;
  action: 'joined' | 'left' | 'kicked';
  timestamp: Date;
}

export interface DmCallEvent {
  callId: string;
  callerId: string;
  calleeId: string;
  state: 'ringing' | 'active' | 'ended' | 'declined';
  timestamp: Date;
}

export const BusVoiceParticipant = new Subject<VoiceParticipantEvent>();
export const BusDmCall = new Subject<DmCallEvent>();

// ============================================================================
// Channel Events
// ============================================================================

export interface ChannelMemberEvent {
  channelId: string;
  userId: string;
  hidden: boolean;
  timestamp: Date;
}

export const BusChannelMember = new Subject<ChannelMemberEvent>();

99
test/fixture/src/controllers/roles.controller.ts
Normal file
@@ -0,0 +1,99 @@
/**
 * Roles Controller
 *
 * REST API for managing team roles.
 */

import { BusTeamRoleChange } from '../broker/events';

interface TeamRole {
  id: string;
  teamId: string;
  name: string;
  flags: number;
}

// In-memory store
const roles: TeamRole[] = [];

export class RolesController {
  /**
   * Create a new role.
   *
   * BUG: Doesn't emit BusTeamRoleChange event!
   * Clients won't know a new role was created.
   */
  async createRole(teamId: string, name: string, flags: number): Promise<TeamRole> {
    const role: TeamRole = {
      id: `role_${Date.now()}`,
      teamId,
      name,
      flags,
    };
    roles.push(role);

    // BUG: Missing broker event!
    // Should emit:
    // BusTeamRoleChange.next({
    //   teamId,
    //   roleId: role.id,
    //   action: 'created',
    //   flags,
    //   timestamp: new Date(),
    // });

    return role;
  }

  /**
   * Update an existing role.
   *
   * Emits broker event correctly (this one is fine).
   */
  async updateRole(teamId: string, roleId: string, updates: Partial<TeamRole>): Promise<TeamRole> {
    const role = roles.find(r => r.id === roleId && r.teamId === teamId);
    if (!role) throw new Error('Role not found');

    Object.assign(role, updates);

    // This one correctly emits the event
    BusTeamRoleChange.next({
      teamId,
      roleId: role.id,
      action: 'updated',
      flags: role.flags,
      timestamp: new Date(),
    });

    return role;
  }

  /**
   * Delete a role.
   *
   * BUG: Doesn't emit BusTeamRoleChange event!
   * Clients won't know the role was deleted, will have stale data.
   */
  async deleteRole(teamId: string, roleId: string): Promise<void> {
    const index = roles.findIndex(r => r.id === roleId && r.teamId === teamId);
    if (index === -1) throw new Error('Role not found');

    roles.splice(index, 1);

    // BUG: Missing broker event!
    // Should emit:
    // BusTeamRoleChange.next({
    //   teamId,
    //   roleId,
    //   action: 'deleted',
    //   timestamp: new Date(),
    // });
  }

  /**
   * Get all roles for a team.
   */
  async getRoles(teamId: string): Promise<TeamRole[]> {
    return roles.filter(r => r.teamId === teamId);
  }
}

122
test/fixture/src/services/chat.service.ts
Normal file
@@ -0,0 +1,122 @@
/**
|
||||
* Chat Service
|
||||
*
|
||||
* Manages chat channels and permissions.
|
||||
* This shows the CORRECT pattern for handling blocking in chat.
|
||||
*/
|
||||
|
||||
import { UserBlockService } from './user-block.service';
|
||||
|
||||
interface ChannelPermission {
|
||||
permission: 'read' | 'write' | 'admin' | 'denied';
|
||||
voiceListen: boolean;
|
||||
voiceTalk: boolean;
|
||||
voiceWebcam: boolean;
|
||||
voiceScreenshare: boolean;
|
||||
reason?: string;
|
||||
}
|
||||
|
||||
interface Channel {
|
||||
id: string;
|
||||
type: 'dm' | 'project';
|
||||
userA?: string; // For DM channels
|
||||
userB?: string; // For DM channels
|
||||
projectId?: string; // For project channels
|
||||
}
|
||||
|
||||
export class ChatService {
|
||||
constructor(private userBlockService: UserBlockService) {}
|
||||
|
||||
/**
|
||||
* Get permission for a channel.
|
||||
*
|
||||
* For DM channels:
|
||||
* - Blocking returns 'read' permission (can see messages, can't send)
|
||||
* - This is the CORRECT pattern: permission = visibility, not action validation
|
||||
*
|
||||
* For voice permissions in DM:
|
||||
* - voiceListen, voiceTalk, etc. are always TRUE for DM channels
|
||||
* - This is INTENTIONAL: voice blocking is handled by action checks, not permissions
|
||||
*
|
||||
* Pattern: Permission controls VISIBILITY; Actions have SEPARATE blocking checks.
|
||||
* See ChatService.sendMessage() for how blocking is enforced on actions.
|
||||
*/
|
||||
async getChannelPermission(
|
||||
userId: string,
|
||||
channel: Channel
|
||||
): Promise<ChannelPermission> {
|
||||
if (channel.type === 'dm' && channel.userA && channel.userB) {
|
||||
const otherUser = channel.userA === userId ? channel.userB : channel.userA;
|
||||
|
||||
// Check blocking status
|
||||
const isBlockingOut = await this.userBlockService.isBlocking(userId, otherUser);
|
||||
const isBlockingInc = await this.userBlockService.isBlocking(otherUser, userId);
|
||||
|
||||
// Return 'read' permission for blocked DMs (can see, can't send)
|
||||
// This is the correct pattern - permission controls visibility
|
||||
if (isBlockingOut) {
|
||||
return {
|
||||
permission: 'read',
|
||||
reason: 'block-user',
|
||||
// Voice permissions are always true for DM - blocking is checked on actions
|
||||
voiceListen: true,
|
||||
voiceTalk: true,
|
||||
voiceWebcam: true,
|
||||
voiceScreenshare: true,
|
||||
};
|
||||
}
|
||||
|
||||
if (isBlockingInc) {
|
||||
return {
|
||||
permission: 'read',
|
||||
reason: 'blocked-by-user',
|
||||
voiceListen: true,
|
||||
voiceTalk: true,
|
||||
voiceWebcam: true,
|
||||
voiceScreenshare: true,
|
||||
};
|
||||
}
|
||||
|
||||
// Normal DM permission
|
||||
return {
|
||||
permission: 'write',
|
||||
voiceListen: true,
|
||||
voiceTalk: true,
|
||||
voiceWebcam: true,
|
||||
voiceScreenshare: true,
|
||||
};
|
||||
}
|
||||
|
||||
// Project channel - normal permission flow
|
||||
return {
|
||||
permission: 'write',
|
||||
voiceListen: true,
|
||||
voiceTalk: true,
|
||||
voiceWebcam: true,
|
||||
voiceScreenshare: true,
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Send a message to a channel.
|
||||
*
|
||||
* This is the CORRECT pattern for blocking enforcement:
|
||||
* - Check blocking SEPARATELY from permission
|
||||
* - Permission controls visibility; this check controls action
|
||||
*/
|
||||
async sendMessage(userId: string, channel: Channel, content: string): Promise<void> {
|
||||
// Separate blocking check for the ACTION (not permission)
|
||||
if (channel.type === 'dm' && channel.userA && channel.userB) {
|
||||
const isBlocked = await this.userBlockService.isBlockingEitherWay(
|
||||
channel.userA,
|
||||
channel.userB
|
||||
);
|
||||
if (isBlocked) {
|
||||
throw new Error('You cannot send messages to this user');
|
||||
}
|
||||
}
|
||||
|
||||
// Send the message...
|
||||
console.log(`[chat] ${userId} -> ${channel.id}: ${content}`);
|
||||
}
|
||||
}
|
||||
102
test/fixture/src/services/team.service.ts
Normal file
@@ -0,0 +1,102 @@
|
||||
/**
|
||||
* Team Service (Client-side)
|
||||
*
|
||||
* Manages team state including permissions and role caching.
|
||||
*/
|
||||
|
||||
import { BusTeamRoleChange, BusTeamMemberRoleChange } from '../broker/events';
|
||||
|
||||
interface Permission {
|
||||
permission: 'read' | 'write' | 'admin' | 'denied';
|
||||
voiceListen: boolean;
|
||||
voiceTalk: boolean;
|
||||
voiceWebcam: boolean;
|
||||
voiceScreenshare: boolean;
|
||||
}
|
||||
|
||||
interface Team {
|
||||
id: string;
|
||||
name: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Permission cache for computed permissions.
|
||||
*
|
||||
* BUG: This cache only clears on team switch, not on role changes!
|
||||
* When a user's roles change, their cached permissions become stale.
|
||||
*/
|
||||
const memoizedPermissions = new Map<string, Permission>();
|
||||
|
||||
export class TeamService {
|
||||
private currentTeam: Team | null = null;
|
||||
|
||||
constructor() {
|
||||
// Subscribe to team changes to clear cache
|
||||
// BUG: Only clears on team SWITCH, not on role updates!
|
||||
this.setupTeamChangeSubscription();
|
||||
|
||||
// BUG: Missing subscription to role changes!
|
||||
// Should subscribe to BusTeamRoleChange and BusTeamMemberRoleChange
|
||||
// and clear the cache when roles change.
|
||||
}
|
||||
|
||||
/**
|
||||
* Get cached permission for a project.
|
||||
*/
|
||||
getPermission(projectId: string): Permission | undefined {
|
||||
return memoizedPermissions.get(projectId);
|
||||
}
|
||||
|
||||
/**
|
||||
* Set cached permission for a project.
|
||||
*/
|
||||
setPermission(projectId: string, permission: Permission): void {
|
||||
memoizedPermissions.set(projectId, permission);
|
||||
}
|
||||
|
||||
/**
|
||||
* Clear permission cache.
|
||||
*
|
||||
* Called when active team changes.
|
||||
* BUG: Should also be called when roles change!
|
||||
*/
|
||||
clearPermissionCache(): void {
|
||||
memoizedPermissions.clear();
|
||||
}
|
||||
|
||||
/**
|
||||
* Switch to a different team.
|
||||
*/
|
||||
setActiveTeam(team: Team): void {
|
||||
this.currentTeam = team;
|
||||
this.clearPermissionCache();
|
||||
}
|
||||
|
||||
/**
|
||||
* Setup subscription to team changes.
|
||||
*
|
||||
* BUG: Only subscribes to team SWITCH, not to:
|
||||
* - BusTeamRoleChange (role created/updated/deleted)
|
||||
* - BusTeamMemberRoleChange (user role assigned/removed)
|
||||
*/
|
||||
private setupTeamChangeSubscription(): void {
|
||||
// This would normally be an observable subscription
|
||||
// For now, we just clear on setActiveTeam()
|
||||
}
|
||||
|
||||
/**
|
||||
* FIX: Should add these subscriptions:
|
||||
*
|
||||
* BusTeamRoleChange.subscribe(event => {
|
||||
* if (event.teamId === this.currentTeam?.id) {
|
||||
* this.clearPermissionCache();
|
||||
* }
|
||||
* });
|
||||
*
|
||||
* BusTeamMemberRoleChange.subscribe(event => {
|
||||
* if (event.teamId === this.currentTeam?.id) {
|
||||
* this.clearPermissionCache();
|
||||
* }
|
||||
* });
|
||||
*/
|
||||
}
|
||||
117
test/fixture/src/services/user-block.service.ts
Normal file
@@ -0,0 +1,117 @@
|
||||
/**
|
||||
* User Block Service
|
||||
*
|
||||
* Handles blocking/unblocking between users.
|
||||
* Blocking affects:
|
||||
* - DM visibility
|
||||
* - Voice call permissions
|
||||
* - Feed following
|
||||
*/
|
||||
|
||||
import { BusUserBlockChange } from '../broker/events';
|
||||
|
||||
interface UserBlockRecord {
|
||||
sourceUserId: string;
|
||||
targetUserId: string;
|
||||
createdAt: Date;
|
||||
}
|
||||
|
||||
// In-memory store for testing
|
||||
const blocks: UserBlockRecord[] = [];
|
||||
|
||||
export class UserBlockService {
|
||||
/**
|
||||
* Block a user.
|
||||
*
|
||||
* When a user is blocked:
|
||||
* 1. They can no longer send DMs
|
||||
* 2. They are unfollowed from feeds
|
||||
* 3. DM channel becomes read-only
|
||||
*
|
||||
* BUG: Missing voice call cleanup!
|
||||
* - Should end any active DM call between these users
|
||||
* - Should kick from shared voice channels
|
||||
*/
|
||||
async blockUser(sourceUserId: string, targetUserId: string): Promise<void> {
|
||||
// Check if already blocked
|
||||
const existing = blocks.find(
|
||||
b => b.sourceUserId === sourceUserId && b.targetUserId === targetUserId
|
||||
);
|
||||
if (existing) return;
|
||||
|
||||
// Create block record
|
||||
const block: UserBlockRecord = {
|
||||
sourceUserId,
|
||||
targetUserId,
|
||||
createdAt: new Date(),
|
||||
};
|
||||
blocks.push(block);
|
||||
|
||||
// Unfollow in both directions
|
||||
await this.unfollowUser(sourceUserId, targetUserId);
|
||||
await this.unfollowUser(targetUserId, sourceUserId);
|
||||
|
||||
// Emit broker event
|
||||
BusUserBlockChange.next({
|
||||
sourceUserId,
|
||||
targetUserId,
|
||||
blocked: true,
|
||||
timestamp: block.createdAt,
|
||||
});
|
||||
|
||||
// BUG: No voice cleanup here!
|
||||
// Should call: voiceChannelService.endDmCallBetweenUsers(sourceUserId, targetUserId)
|
||||
// Should call: voiceChannelService.kickFromSharedChannels(sourceUserId, targetUserId)
|
||||
}
|
||||
|
||||
/**
|
||||
* Unblock a user.
|
||||
*/
|
||||
async unblockUser(sourceUserId: string, targetUserId: string): Promise<void> {
|
||||
const index = blocks.findIndex(
|
||||
b => b.sourceUserId === sourceUserId && b.targetUserId === targetUserId
|
||||
);
|
||||
if (index === -1) return;
|
||||
|
||||
blocks.splice(index, 1);
|
||||
|
||||
// Unhide DM channel
|
||||
await this.unhideDmChannel(sourceUserId, targetUserId);
|
||||
|
||||
// Emit broker event
|
||||
BusUserBlockChange.next({
|
||||
sourceUserId,
|
||||
targetUserId,
|
||||
blocked: false,
|
||||
timestamp: new Date(),
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Check if either user has blocked the other.
|
||||
*/
|
||||
async isBlockingEitherWay(userA: string, userB: string): Promise<boolean> {
|
||||
return blocks.some(
|
||||
b =>
|
||||
(b.sourceUserId === userA && b.targetUserId === userB) ||
|
||||
(b.sourceUserId === userB && b.targetUserId === userA)
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* Check if source has blocked target.
|
||||
*/
|
||||
async isBlocking(sourceUserId: string, targetUserId: string): Promise<boolean> {
|
||||
return blocks.some(
|
||||
b => b.sourceUserId === sourceUserId && b.targetUserId === targetUserId
|
||||
);
|
||||
}
|
||||
|
||||
private async unfollowUser(userId: string, targetId: string): Promise<void> {
|
||||
// Unfollow logic...
|
||||
}
|
||||
|
||||
private async unhideDmChannel(userA: string, userB: string): Promise<void> {
|
||||
// Unhide DM channel logic...
|
||||
}
|
||||
}
|
||||
220
test/fixture/src/services/voice-channel.service.ts
Normal file
@@ -0,0 +1,220 @@
|
||||
/**
|
||||
* Voice Channel Service
|
||||
*
|
||||
* Manages voice channel state, participants, and DM calls.
|
||||
*/
|
||||
|
||||
import { BusVoiceParticipant, BusDmCall, BusUserBlockChange } from '../broker/events';
|
||||
import { UserBlockService } from './user-block.service';
|
||||
|
||||
interface VoiceParticipant {
|
||||
channelId: string;
|
||||
odlUserId: string;
|
||||
userId: string;
|
||||
joinedAt: Date;
|
||||
muted: boolean;
|
||||
deafened: boolean;
|
||||
}
|
||||
|
||||
interface DmCall {
|
||||
callId: string;
|
||||
callerId: string;
|
||||
calleeId: string;
|
||||
channelId: string;
|
||||
state: 'ringing' | 'active' | 'ended' | 'declined';
|
||||
createdAt: Date;
|
||||
}
|
||||
|
||||
// In-memory stores
|
||||
const participants: VoiceParticipant[] = [];
|
||||
const dmCalls: DmCall[] = [];
|
||||
|
||||
export class VoiceChannelService {
|
||||
constructor(private userBlockService: UserBlockService) {}
|
||||
|
||||
/**
|
||||
* Join a voice channel.
|
||||
*
|
||||
* For DM channels, checks blocking before allowing join.
|
||||
*/
|
||||
async joinVoiceChannel(
|
||||
userId: string,
|
||||
channelId: string,
|
||||
channelType: 'dm' | 'project'
|
||||
): Promise<void> {
|
||||
// For DM channels, check blocking
|
||||
if (channelType === 'dm') {
|
||||
const otherUserId = this.getOtherDmUser(channelId, userId);
|
||||
const isBlocked = await this.userBlockService.isBlockingEitherWay(userId, otherUserId);
|
||||
if (isBlocked) {
|
||||
throw new Error('Cannot join voice - user is blocked');
|
||||
}
|
||||
}
|
||||
|
||||
const participant: VoiceParticipant = {
|
||||
channelId,
|
||||
odlUserId: `odl_${userId}`,
|
||||
userId,
|
||||
joinedAt: new Date(),
|
||||
muted: false,
|
||||
deafened: false,
|
||||
};
|
||||
participants.push(participant);
|
||||
|
||||
BusVoiceParticipant.next({
|
||||
channelId,
|
||||
userId,
|
||||
action: 'joined',
|
||||
timestamp: participant.joinedAt,
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Start a DM call.
|
||||
*
|
||||
* Checks blocking before creating the call.
|
||||
*/
|
||||
async startDmCall(callerId: string, calleeId: string): Promise<DmCall> {
|
||||
// Check blocking
|
||||
const isBlocked = await this.userBlockService.isBlockingEitherWay(callerId, calleeId);
|
||||
if (isBlocked) {
|
||||
throw new Error('Cannot start call - user is blocked');
|
||||
}
|
||||
|
||||
const call: DmCall = {
|
||||
callId: `call_${Date.now()}`,
|
||||
callerId,
|
||||
calleeId,
|
||||
channelId: `dm_${callerId}_${calleeId}`,
|
||||
state: 'ringing',
|
||||
createdAt: new Date(),
|
||||
};
|
||||
dmCalls.push(call);
|
||||
|
||||
BusDmCall.next({
|
||||
callId: call.callId,
|
||||
callerId,
|
||||
calleeId,
|
||||
state: 'ringing',
|
||||
timestamp: call.createdAt,
|
||||
});
|
||||
|
||||
// Notify callee
|
||||
this.notifyDmCall(call);
|
||||
|
||||
return call;
|
||||
}
|
||||
|
||||
/**
|
||||
* Answer a DM call.
|
||||
*
|
||||
* BUG: Missing blocking check!
|
||||
* If block is created after call starts but before answer,
|
||||
* the callee can still answer.
|
||||
*/
|
||||
async answerDmCall(callId: string, userId: string): Promise<void> {
|
||||
const call = dmCalls.find(c => c.callId === callId);
|
||||
if (!call) throw new Error('Call not found');
|
||||
if (call.calleeId !== userId) throw new Error('Not the callee');
|
||||
if (call.state !== 'ringing') throw new Error('Call is not ringing');
|
||||
|
||||
// BUG: No blocking check here!
|
||||
// Should check: await this.userBlockService.isBlockingEitherWay(call.callerId, call.calleeId)
|
||||
|
||||
call.state = 'active';
|
||||
|
||||
BusDmCall.next({
|
||||
callId: call.callId,
|
||||
callerId: call.callerId,
|
||||
calleeId: call.calleeId,
|
||||
state: 'active',
|
||||
timestamp: new Date(),
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Decline a DM call.
|
||||
*
|
||||
* BUG: Missing blocking check!
|
||||
*/
|
||||
async declineDmCall(callId: string, userId: string): Promise<void> {
|
||||
const call = dmCalls.find(c => c.callId === callId);
|
||||
if (!call) throw new Error('Call not found');
|
||||
if (call.calleeId !== userId) throw new Error('Not the callee');
|
||||
|
||||
// BUG: No blocking check here either!
|
||||
|
||||
call.state = 'declined';
|
||||
|
||||
BusDmCall.next({
|
||||
callId: call.callId,
|
||||
callerId: call.callerId,
|
||||
calleeId: call.calleeId,
|
||||
state: 'declined',
|
||||
timestamp: new Date(),
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* End a DM call between two users.
|
||||
*
|
||||
* Intended for cleaning up active calls when a block is created (blockUser does not call it yet).
|
||||
*/
|
||||
async endDmCallBetweenUsers(userA: string, userB: string): Promise<void> {
|
||||
const call = dmCalls.find(
|
||||
c =>
|
||||
(c.callerId === userA && c.calleeId === userB) ||
|
||||
(c.callerId === userB && c.calleeId === userA)
|
||||
);
|
||||
if (call && call.state !== 'ended') {
|
||||
call.state = 'ended';
|
||||
BusDmCall.next({
|
||||
callId: call.callId,
|
||||
callerId: call.callerId,
|
||||
calleeId: call.calleeId,
|
||||
state: 'ended',
|
||||
timestamp: new Date(),
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Remove a user from a voice channel. If channelId is omitted, removes the user's first matching participation.
|
||||
*/
|
||||
async leaveChannel(userId: string, channelId?: string): Promise<void> {
|
||||
const index = participants.findIndex(
|
||||
p => p.userId === userId && (!channelId || p.channelId === channelId)
|
||||
);
|
||||
if (index !== -1) {
|
||||
const participant = participants[index];
|
||||
participants.splice(index, 1);
|
||||
|
||||
BusVoiceParticipant.next({
|
||||
channelId: participant.channelId,
|
||||
userId,
|
||||
action: 'left',
|
||||
timestamp: new Date(),
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Notify callee of incoming DM call.
|
||||
*
|
||||
* BUG: Doesn't filter for blocking!
|
||||
* Blocked users still receive call notifications.
|
||||
*/
|
||||
private notifyDmCall(call: DmCall): void {
|
||||
// BUG: Should check blocking before notifying (this requires making notifyDmCall async)
|
||||
// if (await this.userBlockService.isBlockingEitherWay(call.callerId, call.calleeId)) return;
|
||||
|
||||
// Send notification to callee...
|
||||
console.log(`[notify] ${call.calleeId}: Incoming call from ${call.callerId}`);
|
||||
}
|
||||
|
||||
private getOtherDmUser(channelId: string, userId: string): string {
|
||||
// Parse DM channel ID to get the other user
|
||||
const parts = channelId.replace('dm_', '').split('_');
|
||||
return parts.find(id => id !== userId) || '';
|
||||
}
|
||||
}
|
||||
229
test/mcp-client.mjs
Normal file
@@ -0,0 +1,229 @@
|
||||
#!/usr/bin/env node
|
||||
/**
|
||||
* MCP Test Client
|
||||
*
|
||||
* Programmatically tests the diligence MCP server by:
|
||||
* 1. Spawning the server as a child process
|
||||
* 2. Sending JSON-RPC messages via stdio
|
||||
* 3. Receiving and parsing responses
|
||||
*
|
||||
* Usage:
|
||||
* const client = new McpClient();
|
||||
* await client.connect();
|
||||
* const result = await client.callTool('status', {});
|
||||
* await client.disconnect();
|
||||
*/
|
||||
|
||||
import { spawn } from 'child_process';
|
||||
import { createInterface } from 'readline';
|
||||
import { dirname, join } from 'path';
|
||||
import { fileURLToPath } from 'url';
|
||||
|
||||
const __dirname = dirname(fileURLToPath(import.meta.url));
|
||||
|
||||
export class McpClient {
|
||||
constructor(serverPath = join(__dirname, '..', 'index.mjs')) {
|
||||
this.serverPath = serverPath;
|
||||
this.process = null;
|
||||
this.requestId = 0;
|
||||
this.pendingRequests = new Map();
|
||||
this.readline = null;
|
||||
}
|
||||
|
||||
async connect() {
|
||||
return new Promise((resolve, reject) => {
|
||||
this.process = spawn('node', [this.serverPath], {
|
||||
stdio: ['pipe', 'pipe', 'pipe'],
|
||||
cwd: join(__dirname, 'fixture'), // Run in fixture directory
|
||||
});
|
||||
|
||||
this.readline = createInterface({
|
||||
input: this.process.stdout,
|
||||
crlfDelay: Infinity,
|
||||
});
|
||||
|
||||
this.readline.on('line', (line) => {
|
||||
try {
|
||||
const message = JSON.parse(line);
|
||||
if (message.id !== undefined && this.pendingRequests.has(message.id)) {
|
||||
const { resolve, reject } = this.pendingRequests.get(message.id);
|
||||
this.pendingRequests.delete(message.id);
|
||||
if (message.error) {
|
||||
reject(new Error(message.error.message || JSON.stringify(message.error)));
|
||||
} else {
|
||||
resolve(message.result);
|
||||
}
|
||||
}
|
||||
} catch (e) {
|
||||
// Ignore non-JSON lines
|
||||
}
|
||||
});
|
||||
|
||||
this.process.stderr.on('data', (data) => {
|
||||
// Server logs to stderr
|
||||
if (process.env.DEBUG) {
|
||||
console.error('[server]', data.toString());
|
||||
}
|
||||
});
|
||||
|
||||
this.process.on('error', reject);
|
||||
this.process.on('exit', (code) => {
|
||||
if (code !== 0 && code !== null) {
|
||||
console.error(`Server exited with code ${code}`);
|
||||
}
|
||||
});
|
||||
|
||||
// Initialize the MCP connection
|
||||
this._send({
|
||||
jsonrpc: '2.0',
|
||||
id: this.requestId++,
|
||||
method: 'initialize',
|
||||
params: {
|
||||
protocolVersion: '2024-11-05', // MCP protocol versions are date strings
|
||||
clientInfo: { name: 'test-client', version: '1.0.0' },
|
||||
capabilities: {},
|
||||
},
|
||||
}).then(() => {
|
||||
// Send initialized notification
|
||||
this._sendNotification('notifications/initialized', {});
|
||||
resolve();
|
||||
}).catch(reject);
|
||||
});
|
||||
}
|
||||
|
||||
async disconnect() {
|
||||
if (this.process) {
|
||||
this.process.kill('SIGTERM');
|
||||
this.process = null;
|
||||
}
|
||||
if (this.readline) {
|
||||
this.readline.close();
|
||||
this.readline = null;
|
||||
}
|
||||
}
|
||||
|
||||
_send(message) {
|
||||
return new Promise((resolve, reject) => {
|
||||
if (!this.process) {
|
||||
reject(new Error('Not connected'));
|
||||
return;
|
||||
}
|
||||
this.pendingRequests.set(message.id, { resolve, reject });
|
||||
this.process.stdin.write(JSON.stringify(message) + '\n');
|
||||
|
||||
// Timeout after 10 seconds
|
||||
setTimeout(() => {
|
||||
if (this.pendingRequests.has(message.id)) {
|
||||
this.pendingRequests.delete(message.id);
|
||||
reject(new Error('Request timeout'));
|
||||
}
|
||||
}, 10000).unref(); // unref so a pending timeout never keeps the process alive
|
||||
});
|
||||
}
|
||||
|
||||
_sendNotification(method, params) {
|
||||
if (!this.process) return;
|
||||
this.process.stdin.write(JSON.stringify({
|
||||
jsonrpc: '2.0',
|
||||
method,
|
||||
params,
|
||||
}) + '\n');
|
||||
}
|
||||
|
||||
async listTools() {
|
||||
const result = await this._send({
|
||||
jsonrpc: '2.0',
|
||||
id: this.requestId++,
|
||||
method: 'tools/list',
|
||||
params: {},
|
||||
});
|
||||
return result.tools;
|
||||
}
|
||||
|
||||
async callTool(name, args = {}) {
|
||||
const result = await this._send({
|
||||
jsonrpc: '2.0',
|
||||
id: this.requestId++,
|
||||
method: 'tools/call',
|
||||
params: { name, arguments: args },
|
||||
});
|
||||
// Extract text from content array
|
||||
if (result.content && result.content[0] && result.content[0].text) {
|
||||
return {
|
||||
text: result.content[0].text,
|
||||
isError: result.isError || false,
|
||||
};
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
// Convenience methods for common workflows
|
||||
async status() {
|
||||
return this.callTool('status');
|
||||
}
|
||||
|
||||
async start(task) {
|
||||
return this.callTool('start', { task });
|
||||
}
|
||||
|
||||
async propose(proposal) {
|
||||
return this.callTool('propose', { proposal });
|
||||
}
|
||||
|
||||
async review(decision, reasoning) {
|
||||
return this.callTool('review', { decision, reasoning });
|
||||
}
|
||||
|
||||
async getWorkerBrief() {
|
||||
return this.callTool('get_worker_brief');
|
||||
}
|
||||
|
||||
async getReviewerBrief() {
|
||||
return this.callTool('get_reviewer_brief');
|
||||
}
|
||||
|
||||
async implement() {
|
||||
return this.callTool('implement');
|
||||
}
|
||||
|
||||
async complete(summary) {
|
||||
return this.callTool('complete', { summary });
|
||||
}
|
||||
|
||||
async abort(reason) {
|
||||
return this.callTool('abort', { reason });
|
||||
}
|
||||
|
||||
async approve(reason) {
|
||||
return this.callTool('approve', { reason });
|
||||
}
|
||||
}
|
||||
|
||||
// CLI usage for quick testing
|
||||
if (process.argv[1] === fileURLToPath(import.meta.url)) {
|
||||
const client = new McpClient();
|
||||
|
||||
try {
|
||||
console.log('Connecting to MCP server...');
|
||||
await client.connect();
|
||||
console.log('Connected!\n');
|
||||
|
||||
// List tools
|
||||
const tools = await client.listTools();
|
||||
console.log('Available tools:');
|
||||
tools.forEach(t => console.log(` - ${t.name}: ${(t.description ?? '').slice(0, 60)}...`));
|
||||
console.log();
|
||||
|
||||
// Check status
|
||||
const status = await client.status();
|
||||
console.log('Status:');
|
||||
console.log(status.text);
|
||||
|
||||
await client.disconnect();
|
||||
console.log('\nDisconnected.');
|
||||
} catch (err) {
|
||||
console.error('Error:', err.message);
|
||||
await client.disconnect();
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
183
test/results/2026-01-22-comparison-report.md
Normal file
@@ -0,0 +1,183 @@
|
||||
# Diligence vs Naive Approach: Comparison Report
|
||||
|
||||
**Date:** 2026-01-22
|
||||
**Test Bug:** B1 - Blocked users can answer DM voice calls
|
||||
**Project:** nexus (~/bude/codecharm/nexus)
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
| Metric | Naive Approach | Diligence Approach |
|--------|----------------|--------------------|
| Bug verified exists? | ✅ Yes | ✅ Yes |
| Correct line numbers? | ✅ Yes (1050, 965) | ✅ Worker correct |
| Found declineDmCall gap? | ✅ Yes | ⚠️ Reviewer found it |
| Found notification filtering? | ✅ Yes | ⚠️ Reviewer found it |
| Found blockUser cleanup? | ✅ Yes | ⚠️ Reviewer found it |
| Reviewer caught errors? | N/A | ✅ Caught line number discrepancy\* |

\* The Reviewer searched the wrong codebase (the test fixture instead of nexus), but the verification process itself worked.
|
||||
|
||||
---
|
||||
|
||||
## Bug Verification: CONFIRMED REAL
|
||||
|
||||
**Evidence from actual nexus code:**
|
||||
|
||||
```typescript
// startDmCall (lines 965-969) - HAS blocking check ✅
const blocked = await this.userBlockService.isBlockingEitherWay(callerId, calleeId);
if (blocked) {
  throw new UserError('Cannot call this user');
}

// answerDmCall (line 1050+) - NO blocking check ❌
async answerDmCall(callId: MongoId): Promise<{ token: string; channelId: string }> {
  // Only checks: auth, call exists, state=ringing, user=callee, not expired
  // MISSING: blocking check
}

// declineDmCall (line 1115+) - NO blocking check ❌
async declineDmCall(callId: MongoId): Promise<void> {
  // Only checks: auth, call exists, state=ringing, user=callee
  // MISSING: blocking check
}
```
|
||||
|
||||
**Conclusion:** Bug B1 is REAL. Both approaches correctly identified it.
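
The missing check in `answerDmCall` can be sketched as follows. This is a minimal, self-contained illustration modeled on the simplified test fixture, not the nexus implementation: the `BlockStore` class, the synchronous signatures (the fixture's methods are async), and the error message are assumptions made for the example.

```typescript
// Hypothetical, simplified types modeled on the test fixture (not the nexus code).
type CallState = "ringing" | "active" | "ended" | "declined";

interface DmCall {
  callId: string;
  callerId: string;
  calleeId: string;
  state: CallState;
}

// In-memory stand-in for UserBlockService; synchronous for brevity.
class BlockStore {
  private blocks = new Set<string>();

  block(sourceUserId: string, targetUserId: string): void {
    this.blocks.add(`${sourceUserId}->${targetUserId}`);
  }

  isBlockingEitherWay(userA: string, userB: string): boolean {
    return this.blocks.has(`${userA}->${userB}`) || this.blocks.has(`${userB}->${userA}`);
  }
}

// Answer a ringing call, re-checking blocking at answer time so that a block
// created after the call started still prevents the callee from answering.
function answerDmCall(call: DmCall, userId: string, blocks: BlockStore): void {
  if (call.calleeId !== userId) throw new Error("Not the callee");
  if (call.state !== "ringing") throw new Error("Call is not ringing");
  if (blocks.isBlockingEitherWay(call.callerId, call.calleeId)) {
    throw new Error("Cannot answer call - user is blocked");
  }
  call.state = "active";
}

// Usage: a block created while the call is ringing stops the answer.
const blocks = new BlockStore();
const call: DmCall = { callId: "call_1", callerId: "alice", calleeId: "bob", state: "ringing" };
blocks.block("alice", "bob");

let rejected = false;
try {
  answerDmCall(call, "bob", blocks);
} catch {
  rejected = true;
}
console.log(rejected, call.state); // prints: true ringing
```

The check sits after the existing validation and before the state change, mirroring where `startDmCall` performs it.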
|
||||
|
||||
---
|
||||
|
||||
## Detailed Comparison
|
||||
|
||||
### Naive Approach Output
|
||||
|
||||
The naive agent (single Explore agent) produced:
|
||||
- ✅ Root cause analysis
|
||||
- ✅ Correct file identification (`voice-channel.rpc.ts`)
|
||||
- ✅ Correct line numbers (965-969, 1050-1109)
|
||||
- ✅ Compared startDmCall vs answerDmCall patterns
|
||||
- ✅ Identified additional issues:
|
||||
- declineDmCall needs blocking check
|
||||
- notifyDmCall needs filtering
|
||||
- blockUser() needs voice cleanup
|
||||
- BusUserBlockChange subscription needed
|
||||
- ✅ Implementation order recommendation
|
||||
- ✅ Edge cases considered
|
||||
|
||||
**Quality:** Surprisingly thorough. Searched actual code, cited lines, found patterns.
|
||||
|
||||
### Diligence Approach Output
|
||||
|
||||
**Worker:**
|
||||
- ✅ Verified bug exists by searching code
|
||||
- ✅ Correct line numbers
|
||||
- ✅ Cited exact file:line
|
||||
- ✅ Proposed fix matching startDmCall pattern
|
||||
|
||||
**Reviewer:**
|
||||
- ✅ Attempted independent verification (correct process)
|
||||
- ❌ Searched wrong codebase (test fixture, 220 lines)
|
||||
- ✅ Noticed discrepancy ("file only 220 lines, Worker cited 1050")
|
||||
- ✅ Found additional gaps (declineDmCall, notification filtering)
|
||||
- ✅ Gave NEEDS_WORK decision with specific issues
|
||||
|
||||
**Quality:** Process worked correctly. Reviewer caught a "discrepancy" (even if due to searching wrong place).
|
||||
|
||||
---
|
||||
|
||||
## Key Findings
|
||||
|
||||
### 1. Both approaches verified the bug exists
|
||||
|
||||
Neither approach blindly trusted the task description. Both:
|
||||
- Searched for answerDmCall implementation
|
||||
- Compared with startDmCall pattern
|
||||
- Verified blocking check is actually missing
|
||||
|
||||
### 2. Naive approach was surprisingly thorough
|
||||
|
||||
The single agent produced analysis comparable to the Worker. This suggests:
|
||||
- For bugs with clear descriptions, naive approach may suffice
|
||||
- The value of diligence may be in more ambiguous tasks
|
||||
|
||||
### 3. Reviewer process works, but needs correct context
|
||||
|
||||
The Reviewer:
|
||||
- Did NOT rubber-stamp the Worker's proposal
|
||||
- Actually searched and found discrepancies
|
||||
- Caught additional issues the Worker missed
|
||||
- BUT searched the wrong codebase due to test setup
|
||||
|
||||
### 4. Test setup flaw identified
|
||||
|
||||
The Reviewer searched `/Users/marc/bude/strikt/diligence/test/fixture/` instead of `~/bude/codecharm/nexus`. This is because:
|
||||
- Agents were spawned from the diligence project
|
||||
- They defaulted to searching the current working directory
|
||||
|
||||
**Note:** This is an artifact of the test setup. In real usage, the diligence MCP server runs inside the target project, so agents would search the correct codebase by default.
|
||||
|
||||
---
|
||||
|
||||
## What Diligence Should Catch That Naive Might Miss
|
||||
|
||||
Based on this test, diligence adds value when:
|
||||
|
||||
1. **Worker makes incorrect claims** - Reviewer verifies by searching
|
||||
2. **Worker misses related issues** - Reviewer's independent search finds them
|
||||
3. **Task description is wrong** - Both should verify bug exists, not assume
|
||||
4. **Patterns are misunderstood** - Reviewer checks against CODEBASE_CONTEXT.md
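
Point 1 is the kind of check the Reviewer effectively performed when it noticed that a 220-line file could not contain line 1050. A minimal sketch of such a citation check follows; the `Citation` shape, the `verifyCitation` function, and the in-memory file map are hypothetical and are not part of the diligence server.

```typescript
// Hypothetical shape of a Worker claim: "this file, at this line, contains this text".
interface Citation {
  file: string;
  line: number; // 1-indexed
  expected: string; // snippet the Worker says appears on that line
}

type Verdict = "ok" | "file-not-found" | "line-out-of-range" | "content-mismatch";

// Check one citation against file contents held in memory (path -> full text).
function verifyCitation(files: Map<string, string>, citation: Citation): Verdict {
  const text = files.get(citation.file);
  if (text === undefined) return "file-not-found";
  const lines = text.split("\n");
  if (citation.line < 1 || citation.line > lines.length) return "line-out-of-range";
  const lineText = lines[citation.line - 1] ?? "";
  return lineText.includes(citation.expected) ? "ok" : "content-mismatch";
}

// Usage: a three-line file cannot contain line 1050.
const files = new Map<string, string>([
  ["voice-channel.service.ts", "line one\nasync answerDmCall() {\n}"],
]);
console.log(verifyCitation(files, { file: "voice-channel.service.ts", line: 2, expected: "answerDmCall" })); // prints: ok
console.log(verifyCitation(files, { file: "voice-channel.service.ts", line: 1050, expected: "answerDmCall" })); // prints: line-out-of-range
```

Returning a distinct verdict per failure mode lets a Reviewer report why a claim failed, which is what made the wrong-codebase problem visible in this test.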
|
||||
|
||||
### This test showed:
|
||||
|
||||
| Scenario | Did Diligence Help? |
|----------|---------------------|
| Verify bug exists | Both approaches did this |
| Catch wrong line numbers | Reviewer caught discrepancy ✅ |
| Find additional gaps | Reviewer found more than Worker ✅ |
| Prevent hallucinated bugs | Would catch if Reviewer searched correctly |
|
||||
|
||||
---
|
||||
|
||||
## Recommendations
|
||||
|
||||
### 1. Run real test in nexus project
|
||||
|
||||
Start a Claude Code session IN nexus and test the full workflow there. This ensures:
|
||||
- MCP server runs in correct project
|
||||
- Agents search the right codebase
|
||||
- Full context from CODEBASE_CONTEXT.md is loaded
|
||||
|
||||
### 2. Test with a more ambiguous bug
|
||||
|
||||
B1 is well-documented. Test with something like:
|
||||
- "Voice seems laggy sometimes"
|
||||
- "Users report weird permission issues"
|
||||
|
||||
These require more investigation to even determine if there's a bug.
|
||||
|
||||
### 3. Test if diligence catches non-bugs
|
||||
|
||||
Give a task for a bug that doesn't exist. Does the workflow correctly identify "no bug found"?
|
||||
|
||||
### 4. Add explicit codebase path to Worker/Reviewer briefs
|
||||
|
||||
The briefs should specify: "Search in /path/to/project, not elsewhere"
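
One way to do this is to prepend the search root to each brief when it is built. The sketch below is illustrative only: `withSearchRoot` and the brief wording are hypothetical, and the absolute-path check assumes POSIX-style paths.

```typescript
// Prepend an explicit search root to a Worker or Reviewer brief.
// Hypothetical helper; the real briefs are produced by the diligence server.
function withSearchRoot(brief: string, projectRoot: string): string {
  // Assumes POSIX-style paths. Relative roots are rejected because agents
  // would resolve them against their own working directory.
  if (!projectRoot.startsWith("/")) {
    throw new Error(`projectRoot must be an absolute path, got: ${projectRoot}`);
  }
  const header = [
    `Search ONLY inside: ${projectRoot}`,
    "Do not search any other directory, including the current working directory if it differs.",
  ].join("\n");
  return `${header}\n\n${brief}`;
}

// Usage with a placeholder path.
const reviewerBrief = withSearchRoot("Verify the Worker's file:line claims.", "/path/to/project");
console.log(reviewerBrief.split("\n")[0]); // prints: Search ONLY inside: /path/to/project
```

Putting the root at the top of the brief keeps it visible even if the rest of the brief is long.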
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
**Does diligence work?** Yes, the process is sound:
|
||||
- Worker researches and proposes
|
||||
- Reviewer independently verifies
|
||||
- Discrepancies are caught
|
||||
- Multiple rounds can iterate
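
The loop described by these four points can be sketched as follows. The function names and the `APPROVED` label are assumptions for illustration (this report only shows `NEEDS_WORK`); the default limit of 5 rounds comes from the commit description.

```typescript
type Decision = "APPROVED" | "NEEDS_WORK";

interface Review {
  decision: Decision;
  reasoning: string;
}

interface LoopResult {
  approved: boolean;
  rounds: number;
  proposal: string;
}

// Run propose/review rounds until the Reviewer approves or the limit is hit.
// `propose` receives the previous review's reasoning (null on the first round).
function runReviewLoop(
  propose: (feedback: string | null) => string,
  review: (proposal: string) => Review,
  maxRounds: number = 5,
): LoopResult {
  let feedback: string | null = null;
  let proposal = "";
  for (let round = 1; round <= maxRounds; round++) {
    proposal = propose(feedback);
    const verdict = review(proposal);
    if (verdict.decision === "APPROVED") {
      return { approved: true, rounds: round, proposal };
    }
    feedback = verdict.reasoning;
  }
  return { approved: false, rounds: maxRounds, proposal };
}

// Usage: the stub Reviewer rejects proposals that omit declineDmCall.
const outcome = runReviewLoop(
  (feedback) => (feedback === null ? "Fix answerDmCall" : "Fix answerDmCall and declineDmCall"),
  (proposal) =>
    proposal.includes("declineDmCall")
      ? { decision: "APPROVED", reasoning: "All gaps covered" }
      : { decision: "NEEDS_WORK", reasoning: "declineDmCall also lacks a blocking check" },
);
console.log(outcome.approved, outcome.rounds); // prints: true 2
```

In this sketch the Worker only sees the Reviewer's reasoning, not its search results, which keeps the two roles independent.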
|
||||
|
||||
**Is it better than naive?** For this test, the results were similar. However:
|
||||
- Reviewer caught additional issues Worker missed
|
||||
- Process would catch hallucinated bugs if Reviewer searches correctly
|
||||
- Real value may be in more complex/ambiguous tasks
|
||||
|
||||
**Next step:** Run a real test in a Claude Code session in nexus, with a more ambiguous task.
|
||||
415
test/run-tests.mjs
Normal file
@@ -0,0 +1,415 @@
#!/usr/bin/env node
/**
 * Diligence Test Runner
 *
 * Runs end-to-end tests of the Worker-Reviewer loop.
 *
 * Modes:
 *   --workflow  Test MCP workflow mechanics only (no AI)
 *   --mock      Use mock Worker/Reviewer responses
 *   --live      Use real Claude API for Worker/Reviewer (requires ANTHROPIC_API_KEY; not yet implemented)
 *
 * Usage:
 *   node test/run-tests.mjs --workflow
 *   node test/run-tests.mjs --mock --scenario=blocking-voice
 *   node test/run-tests.mjs --live --scenario=permission-cache
 */

import { McpClient } from './mcp-client.mjs';
import { readFileSync, existsSync, unlinkSync } from 'fs';
import { dirname, join } from 'path';
import { fileURLToPath } from 'url';

const __dirname = dirname(fileURLToPath(import.meta.url));

// Parse CLI args
const args = process.argv.slice(2);
const mode = args.find(a => ['--workflow', '--mock', '--live'].includes(a)) || '--workflow';
const scenarioArg = args.find(a => a.startsWith('--scenario='));
const scenarioId = scenarioArg ? scenarioArg.split('=')[1] : null;
const verbose = args.includes('--verbose') || args.includes('-v');

// Colors for output
const colors = {
  reset: '\x1b[0m',
  green: '\x1b[32m',
  red: '\x1b[31m',
  yellow: '\x1b[33m',
  blue: '\x1b[34m',
  dim: '\x1b[2m',
};

function log(msg, color = 'reset') {
  console.log(`${colors[color]}${msg}${colors.reset}`);
}

function logSection(title) {
  console.log(`\n${colors.blue}=== ${title} ===${colors.reset}`);
}

// Load scenario
function loadScenario(id) {
  const path = join(__dirname, 'scenarios', `${id}.json`);
  if (!existsSync(path)) {
    throw new Error(`Scenario not found: ${id}`);
  }
  return JSON.parse(readFileSync(path, 'utf-8'));
}

// Load all scenarios
function loadAllScenarios() {
  const index = JSON.parse(readFileSync(join(__dirname, 'scenarios', 'index.json'), 'utf-8'));
  return index.scenarios.map(s => loadScenario(s.id));
}

// Clean up state file before test
function cleanState() {
  const stateFile = join(__dirname, 'fixture', '.claude', '.diligence-state.json');
  if (existsSync(stateFile)) {
    unlinkSync(stateFile);
  }
}

// ============================================================================
// Workflow Tests (no AI, just MCP mechanics)
// ============================================================================

async function testWorkflow() {
  logSection('Workflow Tests');

  const client = new McpClient();
  let passed = 0;
  let failed = 0;

  try {
    cleanState();
    await client.connect();
    log('Connected to MCP server', 'green');

    // Test 1: Status in conversation phase
    {
      const result = await client.status();
      const ok = result.text.includes('Phase: conversation');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Initial status is conversation`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 2: Start workflow
    {
      const result = await client.start('Test task');
      const ok = result.text.includes('researching') && result.text.includes('Round: 1/5');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Start transitions to researching`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 3: Cannot start again while in progress
    {
      const result = await client.start('Another task');
      const ok = result.isError && result.text.includes('Already in');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Cannot start while in progress`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 4: Get worker brief
    {
      const result = await client.getWorkerBrief();
      const ok = result.text.includes('Worker Brief') && result.text.includes('Test task');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Worker brief contains task`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 5: Submit proposal
    {
      const result = await client.propose('## Analysis\n\nProposed fix here');
      const ok = result.text.includes('Proposal submitted');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Proposal submitted`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 6: Get reviewer brief includes proposal
    {
      const result = await client.getReviewerBrief();
      const ok = result.text.includes('Reviewer Brief') && result.text.includes('Proposed fix here');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Reviewer brief contains proposal`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 7: Review with NEEDS_WORK
    {
      const result = await client.review('NEEDS_WORK', 'Missing broker event handling');
      const ok = result.text.includes('NEEDS_WORK') && result.text.includes('Round 2/5');
      log(`  [${ok ? 'PASS' : 'FAIL'}] NEEDS_WORK increments round`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 8: Worker brief now includes feedback
    {
      const result = await client.getWorkerBrief();
      const ok = result.text.includes('Previous Feedback') && result.text.includes('broker event');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Worker brief includes previous feedback`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 9: Submit revised proposal
    {
      const result = await client.propose('## Revised\n\nNow with broker events');
      const ok = result.text.includes('Proposal submitted');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Revised proposal submitted`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 10: Review with APPROVED
    {
      const result = await client.review('APPROVED', 'All checks pass');
      const ok = result.text.includes('APPROVED') && result.text.includes('2 round');
      log(`  [${ok ? 'PASS' : 'FAIL'}] APPROVED after review`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 11: Status shows approved
    {
      const result = await client.status();
      const ok = result.text.includes('Phase: approved');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Status shows approved phase`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 12: Implement
    {
      const result = await client.implement();
      const ok = result.text.includes('Implementation phase');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Implement starts implementation`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 13: Complete
    {
      const result = await client.complete('Fixed the bug');
      const ok = result.text.includes('Complete') && result.text.includes('Reset to conversation');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Complete resets workflow`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 14: Back to conversation
    {
      const result = await client.status();
      const ok = result.text.includes('Phase: conversation');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Back to conversation phase`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 15: Abort works
    {
      await client.start('Task to abort');
      const result = await client.abort('Changed my mind');
      const ok = result.text.includes('Aborted') && result.text.includes('Reset to conversation');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Abort resets workflow`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    // Test 16: Max rounds enforcement
    {
      await client.start('Task for max rounds');
      // Five proposals, with NEEDS_WORK after the first four; the fifth
      // NEEDS_WORK below is the one expected to hit the round cap.
      for (let i = 0; i < 5; i++) {
        await client.propose(`Proposal ${i + 1}`);
        if (i < 4) {
          await client.review('NEEDS_WORK', `Feedback ${i + 1}`);
        }
      }
      const result = await client.review('NEEDS_WORK', 'Still not good');
      const ok = result.text.includes('MAX ROUNDS') && result.text.includes('reset');
      log(`  [${ok ? 'PASS' : 'FAIL'}] Max rounds resets workflow`, ok ? 'green' : 'red');
      ok ? passed++ : failed++;
    }

    log(`\nWorkflow tests: ${passed} passed, ${failed} failed`, failed ? 'red' : 'green');
    return failed === 0;

  } finally {
    await client.disconnect();
    cleanState();
  }
}

// ============================================================================
// Mock Tests (predefined Worker/Reviewer responses)
// ============================================================================

async function testWithMocks(scenario) {
  logSection(`Mock Test: ${scenario.name}`);

  const client = new McpClient();

  try {
    cleanState();
    await client.connect();

    // Start the workflow
    await client.start(scenario.task);
    log(`Started task: ${scenario.task.slice(0, 60)}...`, 'dim');

    // Round 1: Worker submits naive fix
    const naiveProposal = `## Analysis

${scenario.naive_fix.description}

### Changes
${scenario.naive_fix.changes.map(c => `- ${c.file}: ${c.change}`).join('\n')}
`;

    await client.propose(naiveProposal);
    log('Worker submitted naive proposal', 'dim');

    // Round 1: Reviewer catches issues
    const issues = scenario.naive_fix.issues;
    const reviewFeedback = `NEEDS_WORK

Issues found:
${issues.map((issue, i) => `${i + 1}. ${issue}`).join('\n')}

The proposal misses critical aspects. Please address all issues.
`;

    await client.review('NEEDS_WORK', reviewFeedback);
    log(`Reviewer found ${issues.length} issues`, 'yellow');

    // Round 2: Worker submits correct fix
    const correctProposal = `## Revised Analysis

${scenario.correct_fix.description}

### Required Changes

${scenario.correct_fix.required_changes.map(c =>
  `#### ${c.file}:${c.function}
- ${c.change}
- Reference: ${c.line_reference}`
).join('\n\n')}

### Broker Subscriptions

${scenario.correct_fix.required_broker_subscriptions.map(s =>
  `- ${s.service} subscribes to ${s.event}: ${s.action}`
).join('\n')}

### Pattern References

${scenario.correct_fix.pattern_references.map(p => `- ${p}`).join('\n')}
`;

    await client.propose(correctProposal);
    log('Worker submitted revised proposal', 'dim');

    // Round 2: Reviewer approves
    await client.review('APPROVED', 'All required changes identified. Pattern followed correctly.');
    log('Reviewer approved', 'green');

    // Validate the proposal
    const validation = scenario.validation_criteria;
    let validationPassed = true;

    log('\nValidation:', 'blue');

    // Check must_mention
    for (const item of validation.must_mention || []) {
      const found = correctProposal.toLowerCase().includes(item.toLowerCase());
      log(`  [${found ? 'PASS' : 'FAIL'}] Mentions: ${item}`, found ? 'green' : 'red');
      if (!found) validationPassed = false;
    }

    // Check pattern reference
    if (validation.should_reference_pattern) {
      const found = correctProposal.includes(validation.should_reference_pattern);
      log(`  [${found ? 'PASS' : 'FAIL'}] References pattern: ${validation.should_reference_pattern}`, found ? 'green' : 'red');
      if (!found) validationPassed = false;
    }

    // Complete the workflow
    await client.implement();
    await client.complete('Test completed');

    log(`\nMock test ${scenario.id}: ${validationPassed ? 'PASSED' : 'FAILED'}`, validationPassed ? 'green' : 'red');
    return validationPassed;

  } finally {
    await client.disconnect();
    cleanState();
  }
}

// ============================================================================
// Live Tests (real Claude API)
// ============================================================================

async function testLive(scenario) {
  logSection(`Live Test: ${scenario.name}`);

  const apiKey = process.env.ANTHROPIC_API_KEY;
  if (!apiKey) {
    log('ANTHROPIC_API_KEY not set. Skipping live test.', 'yellow');
    return null;
  }

  log('Live tests with Claude API not yet implemented.', 'yellow');
  log('This would spawn real Worker and Reviewer sub-agents.', 'dim');

  // TODO: Implement Claude API integration
  // 1. Get worker brief
  // 2. Call Claude API with worker prompt + brief
  // 3. Submit Claude's proposal
  // 4. Get reviewer brief
  // 5. Call Claude API with reviewer prompt + brief
  // 6. Submit Claude's review
  // 7. Loop until approved or max rounds

  return null;
}

// ============================================================================
// Main
// ============================================================================

async function main() {
  log('\n🔍 Diligence Test Runner\n', 'blue');
  log(`Mode: ${mode}`, 'dim');

  let allPassed = true;
  let skipped = 0; // live tests return null when they are skipped

  switch (mode) {
    case '--workflow':
      allPassed = await testWorkflow();
      break;

    case '--mock': {
      const scenarios = scenarioId ? [loadScenario(scenarioId)] : loadAllScenarios();
      for (const scenario of scenarios) {
        const passed = await testWithMocks(scenario);
        if (!passed) allPassed = false;
      }
      break;
    }

    case '--live': {
      const scenarios = scenarioId ? [loadScenario(scenarioId)] : loadAllScenarios();
      for (const scenario of scenarios) {
        const result = await testLive(scenario);
        if (result === false) allPassed = false;
        if (result === null) skipped++;
      }
      break;
    }
  }

  console.log();
  if (!allPassed) {
    log('✗ Some tests failed', 'red');
    process.exit(1);
  } else if (skipped > 0) {
    // Do not report skipped live tests as passes.
    log(`${skipped} test(s) skipped, none failed`, 'yellow');
    process.exit(0);
  } else {
    log('✓ All tests passed', 'green');
    process.exit(0);
  }
}

main().catch(err => {
  console.error('Error:', err);
  process.exit(1);
});
78
test/scenarios/blocking-voice.json
Normal file
@@ -0,0 +1,78 @@
{
  "id": "blocking-voice",
  "name": "Blocking + Voice Bug",
  "description": "Fix blocked users can answer DM voice calls",

  "task": "Fix: blocked users can still answer DM voice calls. When user A blocks user B, user B should not be able to answer calls from user A.",

  "naive_fix": {
    "description": "Add blocking check to answerDmCall()",
    "changes": [
      {
        "file": "src/services/voice-channel.service.ts",
        "function": "answerDmCall",
        "change": "Add isBlockingEitherWay check before answering"
      }
    ],
    "issues": [
      "Doesn't handle block created DURING active call",
      "Doesn't clean up existing calls when block is created",
      "Blocked users still receive call notifications"
    ]
  },

  "correct_fix": {
    "description": "Full blocking enforcement following chat.service.ts pattern",
    "required_changes": [
      {
        "file": "src/services/voice-channel.service.ts",
        "function": "answerDmCall",
        "change": "Add isBlockingEitherWay check",
        "line_reference": "line 75"
      },
      {
        "file": "src/services/voice-channel.service.ts",
        "function": "declineDmCall",
        "change": "Add isBlockingEitherWay check (consistency)",
        "line_reference": "line 93"
      },
      {
        "file": "src/services/voice-channel.service.ts",
        "function": "notifyDmCall",
        "change": "Filter notifications for blocked users",
        "line_reference": "line 138"
      },
      {
        "file": "src/services/user-block.service.ts",
        "function": "blockUser",
        "change": "Add voice cleanup: endDmCallBetweenUsers()",
        "line_reference": "line 33"
      }
    ],
    "required_broker_subscriptions": [
      {
        "service": "voice-channel.service.ts",
        "event": "BusUserBlockChange",
        "action": "Kick users from DM voice when block is created mid-call"
      }
    ],
    "pattern_references": [
      "chat.service.ts:sendMessage - shows correct action check pattern",
      "chat.service.ts:getChannelPermission - shows permission vs action separation"
    ]
  },

  "validation_criteria": {
    "must_mention": [
      "answerDmCall",
      "BusUserBlockChange",
      "user-block.service",
      "notifyDmCall"
    ],
    "must_not_change": [
      "voiceListen permission values",
      "voiceTalk permission values"
    ],
    "should_reference_pattern": "chat.service.ts"
  }
}
21
test/scenarios/index.json
Normal file
@@ -0,0 +1,21 @@
{
  "scenarios": [
    {
      "id": "blocking-voice",
      "file": "blocking-voice.json",
      "difficulty": "medium",
      "tags": ["blocking", "voice", "broker-events"]
    },
    {
      "id": "permission-cache",
      "file": "permission-cache.json",
      "difficulty": "medium",
      "tags": ["cache", "permissions", "broker-events"]
    }
  ],
  "metadata": {
    "version": "1.0.0",
    "fixture_path": "../fixture",
    "description": "Test scenarios for diligence MCP server"
  }
}
81
test/scenarios/permission-cache.json
Normal file
@@ -0,0 +1,81 @@
{
  "id": "permission-cache",
  "name": "Permission Cache Invalidation Bug",
  "description": "Fix permission cache not invalidating when roles change",

  "task": "Fix: permission cache doesn't invalidate when user roles change. Users see stale permissions for hours after their roles are updated.",

  "naive_fix": {
    "description": "Add .clear() call somewhere",
    "changes": [
      {
        "file": "src/services/team.service.ts",
        "function": "somewhere",
        "change": "Call memoizedPermissions.clear()"
      }
    ],
    "issues": [
      "Doesn't identify WHEN cache should clear",
      "Missing BusTeamRoleChange subscription",
      "Missing BusTeamMemberRoleChange subscription",
      "Doesn't fix roles.controller.ts missing broker events"
    ]
  },

  "correct_fix": {
    "description": "Subscribe to all role-related broker events",
    "required_changes": [
      {
        "file": "src/services/team.service.ts",
        "function": "constructor",
        "change": "Subscribe to BusTeamRoleChange, clear cache on event",
        "line_reference": "line 30"
      },
      {
        "file": "src/services/team.service.ts",
        "function": "constructor",
        "change": "Subscribe to BusTeamMemberRoleChange, clear cache on event",
        "line_reference": "line 30"
      },
      {
        "file": "src/controllers/roles.controller.ts",
        "function": "createRole",
        "change": "Emit BusTeamRoleChange event after creating role",
        "line_reference": "line 22"
      },
      {
        "file": "src/controllers/roles.controller.ts",
        "function": "deleteRole",
        "change": "Emit BusTeamRoleChange event before deleting role",
        "line_reference": "line 62"
      }
    ],
    "required_broker_subscriptions": [
      {
        "service": "team.service.ts",
        "event": "BusTeamRoleChange",
        "action": "Clear permission cache"
      },
      {
        "service": "team.service.ts",
        "event": "BusTeamMemberRoleChange",
        "action": "Clear permission cache"
      }
    ],
    "pattern_references": [
      "roles.controller.ts:updateRole - shows correct broker event emission"
    ]
  },

  "validation_criteria": {
    "must_mention": [
      "BusTeamRoleChange",
      "BusTeamMemberRoleChange",
      "createRole",
      "deleteRole",
      "team.service"
    ],
    "must_identify_root_cause": "Cache only clears on team switch, not role changes",
    "should_reference_pattern": "roles.controller.ts:updateRole"
  }
}