Clawdbot Scanner: Multi‑Agent Architecture for Honeypot Detection

Project Overview

The Clawdbot Scanner is a TypeScript‑based tool that discovers, probes, and fingerprints Clawdbot Gateway instances on the internet or a local network.
Its primary goal is to differentiate genuine Clawdbot services from honeypots, decoy systems that mimic real gateways to lure attackers. By automating this detection, the scanner helps security researchers and network operators avoid false‑positive alerts and focus on real threats.

Key Features

Full protocol implementation – WebSocket client that performs the Clawdbot handshake, covers all 32 protocol methods (20 fully functional, 12 stubbed), and captures real‑time events such as tick, presence, and health.
Discovery system – Integrated Shodan API for internet‑wide scans and mDNS for local‑network discovery, both with rate‑limiting and exponential‑backoff retry logic.
Analysis engine – Multi‑dimensional fingerprinting across eight feature categories, behavioral pattern matching, and confidence scoring to flag likely honeypots.
Reporting subsystem – Persistent SQLite storage, HTML dashboards (light & dark themes), and export options (JSON, Markdown, CSV).
Resilience – Timeout handling, automatic reconnection with backoff, strict input validation, and graceful shutdown.
Concurrent probing – Configurable parallel method execution to reduce overall scan time.
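
The exponential‑backoff retry mentioned for discovery and reconnection can be illustrated with a short helper. This is a minimal sketch, not the scanner's actual code: the names `withBackoff` and `backoffDelay`, and the option fields, are assumptions made for illustration. The delay doubles on every retry and is capped at `maxDelayMs`.

```typescript
// Hypothetical helper: names and options are illustrative, not from the repository.
interface BackoffOptions {
  retries: number;     // maximum number of retries after the first attempt
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number;  // upper bound on any single delay
}

/** Delay before retry number `attempt` (0-based): base * 2^attempt, capped. */
function backoffDelay(attempt: number, opts: BackoffOptions): number {
  return Math.min(opts.baseDelayMs * 2 ** attempt, opts.maxDelayMs);
}

/** Runs `task`, retrying with exponential backoff until it succeeds or retries run out. */
async function withBackoff<T>(task: () => Promise<T>, opts: BackoffOptions): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= opts.retries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt === opts.retries) break; // no retries left
      const delay = backoffDelay(attempt, opts);
      await new Promise<void>(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A real implementation would probably add random jitter to the delay so that many clients do not retry in lockstep.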

Architecture Overview

The scanner is built around a four‑agent parallel architecture. Each agent runs independently and communicates through well‑defined TypeScript interfaces.

TEXT

+----------------+   +----------------+   +----------------+   +----------------+
| Discovery Agent| → | Protocol Agent | → | Analysis Agent | → | Report Agent   |
| (Shodan / mDNS)|   | (WebSocket)    |   | (Fingerprint)  |   | (SQLite/HTML)  |
+----------------+   +----------------+   +----------------+   +----------------+

Discovery Agent gathers target addresses.
Protocol Agent establishes WebSocket connections, authenticates, and issues protocol calls.
Analysis Agent consumes the responses, builds fingerprints, and scores each target.
Report Agent stores results and renders dashboards.
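
The hand‑off between the four agents can be pictured as a chain of small interfaces. The sketch below is only an illustration of the data flow described above: every type and method name in it (`ScanTarget`, `ProbeResult`, `discover`, `probe`, `score`, `save`, `runPipeline`) is hypothetical, and the real interfaces in the repository may look different.

```typescript
// Hypothetical types: an illustration of the data flow, not the project's real interfaces.
interface ScanTarget { host: string; port: number; }
interface ProbeResult { target: ScanTarget; responses: Record<string, unknown>; }
interface ScoredTarget { target: ScanTarget; confidence: number; }

interface DiscoveryAgent { discover(): Promise<ScanTarget[]>; }
interface ProtocolAgent { probe(target: ScanTarget): Promise<ProbeResult>; }
interface AnalysisAgent { score(result: ProbeResult): ScoredTarget; }
interface ReportAgent { save(scored: ScoredTarget): void; }

/** Wires the four stages together and returns the number of targets processed. */
async function runPipeline(
  discovery: DiscoveryAgent,
  protocol: ProtocolAgent,
  analysis: AnalysisAgent,
  report: ReportAgent
): Promise<number> {
  const targets = await discovery.discover();
  // Targets are probed in parallel; each result then flows through analysis and reporting.
  const results = await Promise.all(targets.map(t => protocol.probe(t)));
  for (const result of results) {
    report.save(analysis.score(result));
  }
  return results.length;
}
```

Because each stage depends only on an interface, a stub agent can stand in for Shodan or for a live gateway during tests.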

How It Works

1. Establishing the WebSocket Connection

The Protocol Agent creates a WebSocket connection to a Clawdbot Gateway. Below is a minimal example that mirrors the real client implementation (src/protocol/client.ts).

TYPESCRIPT

import WebSocket from 'ws';

/**
 * Opens a WebSocket connection to a Clawdbot Gateway.
 * Returns the connected socket ready for the handshake.
 */
export async function openGateway(host: string, port: number): Promise<WebSocket> {
  const ws = new WebSocket(`ws://${host}:${port}`);

  return new Promise((resolve, reject) => {
    ws.once('open', () => resolve(ws));
    ws.once('error', err => reject(err));
  });
}
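
The example above waits indefinitely if the remote host never completes the connection. One way to bound the wait is a generic timeout wrapper around the promise. This is a sketch under assumptions: `withTimeout` is a hypothetical helper, not a function from the scanner or from the `ws` package.

```typescript
/**
 * Hypothetical helper: rejects if `promise` has not settled within `ms` milliseconds.
 * It does not cancel the underlying operation; the caller must clean that up.
 */
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
    promise.then(
      value => { clearTimeout(timer); resolve(value); },
      err => { clearTimeout(timer); reject(err); }
    );
  });
}
```

With this helper, a bounded connection attempt could be written as `await withTimeout(openGateway(host, port), 5000, 'connect')`. Note that the sketch leaves the half-open socket in place when the timeout fires, so a fuller version would also close it.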

2. Performing the Handshake

After the socket is open, the client sends an authentication payload defined in src/protocol/handshake.ts. The payload includes a static client identifier and a timestamp; a real deployment would replace these with credentials supplied via environment variables.

TYPESCRIPT

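import WebSocket from 'ws';

interface HandshakePayload {
  type: 'handshake';
  clientId: string;
  timestamp: number;
  token?: string; // optional, used when an API key is configured
}

/**
 * Sends the handshake message and waits for the server's acknowledgement.
 * Rejects on timeout, on malformed JSON, or on any reply other than 'handshake-ack'.
 */
export async function doHandshake(ws: WebSocket, clientId: string, token?: string): Promise<void> {
  const payload: HandshakePayload = {
    type: 'handshake',
    clientId,
    timestamp: Date.now(),
    token,
  };

  await new Promise<void>((resolve, reject) => {
    const onMessage = (data: WebSocket.Data) => {
      clearTimeout(timeout);
      try {
        const msg = JSON.parse(data.toString());
        if (msg.type === 'handshake-ack') resolve();
        else reject(new Error('Unexpected handshake response'));
      } catch {
        // A non-JSON reply would otherwise throw inside the event handler.
        reject(new Error('Malformed handshake response'));
      }
    };
    const timeout = setTimeout(() => {
      ws.off('message', onMessage); // do not leave a stale listener behind
      reject(new Error('Handshake timeout'));
    }, 5000);

    // Register the listener before sending so a fast reply cannot be missed.
    ws.once('message', onMessage);
    ws.send(JSON.stringify(payload));
  });
}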
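
Supplying the handshake credentials from environment variables, as suggested above, might look like the following. This is a sketch under assumptions: the variable names `CLAWDBOT_CLIENT_ID` and `CLAWDBOT_TOKEN` and the default client identifier are invented for illustration and are not confirmed by the project.

```typescript
interface HandshakeCredentials {
  clientId: string;
  token?: string;
}

/**
 * Builds handshake credentials from an environment map.
 * The variable names and the default client ID are hypothetical placeholders.
 */
function credentialsFromEnv(env: Record<string, string | undefined>): HandshakeCredentials {
  const clientId = env['CLAWDBOT_CLIENT_ID'] ?? 'clawdbot-scanner';
  const token = env['CLAWDBOT_TOKEN'];
  // Omit the token entirely when it is unset or empty.
  return token ? { clientId, token } : { clientId };
}
```

In Node.js the function would be called as `credentialsFromEnv(process.env)`, and the resulting `clientId` and `token` passed to `doHandshake`. Taking the environment as a parameter keeps the function easy to test.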

3. Probing Protocol Methods

Once authenticated, the Protocol Agent can invoke any of the 32 methods. Each request carries a unique incremental requestId so that responses can be correlated.

TYPESCRIPT

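import WebSocket from 'ws';

let nextId = 1;

/**
 * Sends a method call and returns a promise that resolves with the response.
 * Rejects if the server reports an error or no matching response arrives in time.
 */
export function callMethod(
  ws: WebSocket,
  method: string,
  params: unknown = {},
  timeoutMs = 5000
): Promise<unknown> {
  const requestId = nextId++;
  const request = { type: 'request', requestId, method, params };

  return new Promise((resolve, reject) => {
    // Stops the timer and detaches the listener once the call has settled.
    const cleanup = () => {
      clearTimeout(timer);
      ws.off('message', handler);
    };
    const handler = (data: WebSocket.Data) => {
      let resp;
      try {
        resp = JSON.parse(data.toString());
      } catch {
        return; // ignore frames that are not valid JSON
      }
      // Events and responses to other requests share the socket; skip them.
      if (resp.type !== 'response' || resp.requestId !== requestId) return;
      cleanup();
      if (resp.error) reject(new Error(String(resp.error)));
      else resolve(resp.result);
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error(`Method ${method} timed out`));
    }, timeoutMs);

    // Attach the listener before sending so a fast response cannot be missed.
    ws.on('message', handler);
    ws.send(JSON.stringify(request));
  });
}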

A typical scan runs several of these calls concurrently (the degree of parallelism is configurable in the Protocol Agent) and feeds the raw responses into the Analysis Agent.
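
Bounded concurrency of this kind is often implemented as a small worker pool. The following is a minimal sketch of that idea and not the scanner's own scheduler: `mapWithConcurrency` is a hypothetical helper that runs at most `limit` calls at once and returns the results in input order.

```typescript
/**
 * Hypothetical helper: applies `worker` to every item with at most `limit`
 * calls in flight, preserving the input order in the returned array.
 */
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item, shared by all lanes

  // Each lane repeatedly claims the next item until none are left.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

Used with the `callMethod` function above, a probe of several methods could be written as `mapWithConcurrency(methods, 4, m => callMethod(ws, m))`. In this sketch the first rejected call rejects the whole batch, so a scanner that wants partial results would catch errors inside the worker.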

4. Building the Fingerprint

The Analysis Agent receives the method results, timing data, and any event streams (e.g., health events). It aggregates them into a fingerprint object:

TYPESCRIPT

interface Fingerprint {
  latencyMs: number[];
  supportedMethods: string[];
  healthEvents: Record<string, unknown>[];
  errorCodes: number[];
}

/**
 * Updates a fingerprint with a new method response.
 */
export function updateFingerprint(
  fp: Fingerprint,
  method: string,
  response: unknown,
  latency: number
): Fingerprint {
  fp.latencyMs.push(latency);
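  // Record each responding method once, even if it is probed repeatedly.
  if (typeof response === 'object' && response !== null && !fp.supportedMethods.includes(method)) {
    fp.supportedMethods.push(method);
  }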
  // Additional logic for health events and error codes would go here
  return fp;
}

After all probes finish, the fingerprint is compared against a built‑in signature database. A confidence score (0‑100) is produced; scores below a configurable threshold are flagged as potential honeypots.
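
The text does not spell out how the comparison against the signature database is computed, so the following is purely an illustrative sketch. It assumes a signature made of expected method names and a plausible latency range, and it combines the two with arbitrary 70/30 weights. The names `GatewaySignature`, `confidenceScore`, and `isLikelyHoneypot`, the weights, and the default threshold of 50 are all hypothetical.

```typescript
interface ProbeFingerprint {
  latencyMs: number[];
  supportedMethods: string[];
}

// Hypothetical signature shape; the real signature database format is not documented here.
interface GatewaySignature {
  expectedMethods: string[];
  minLatencyMs: number;
  maxLatencyMs: number;
}

/** Returns a 0-100 score: 70% method coverage, 30% latencies inside the expected range. */
function confidenceScore(fp: ProbeFingerprint, sig: GatewaySignature): number {
  const supported = new Set(fp.supportedMethods);
  const matched = sig.expectedMethods.filter(m => supported.has(m)).length;
  const methodRatio = sig.expectedMethods.length === 0 ? 0 : matched / sig.expectedMethods.length;

  const inRange = fp.latencyMs.filter(l => l >= sig.minLatencyMs && l <= sig.maxLatencyMs).length;
  const latencyRatio = fp.latencyMs.length === 0 ? 0 : inRange / fp.latencyMs.length;

  return Math.round(70 * methodRatio + 30 * latencyRatio);
}

/** Scores below the threshold are flagged as potential honeypots. */
function isLikelyHoneypot(score: number, threshold = 50): boolean {
  return score < threshold;
}
```

For example, a target that answers half of the expected methods and has two of three latencies in range would score 55 under these weights, which sits just above the default threshold.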

5. Storing and Visualising Results

The Report Agent writes each target’s fingerprint and confidence score to a SQLite database (src/report/database.ts). A simple query can retrieve the latest scans for dashboard generation.

TYPESCRIPT

import Database from 'better-sqlite3';

const db = new Database('scans.db');

export function initSchema(): void {
  db.exec(`
    CREATE TABLE IF NOT EXISTS scans (
      id INTEGER PRIMARY KEY AUTOINCREMENT,
      host TEXT NOT NULL,
      port INTEGER NOT NULL,
      timestamp INTEGER NOT NULL,
      confidence INTEGER NOT NULL,
      fingerprint TEXT NOT NULL
    );
  `);
}

/**
 * Persists a single scan result.
 */
export function saveScan(
  host: string,
  port: number,
  confidence: number,
  fingerprint: object
): void {
  const stmt = db.prepare(`
    INSERT INTO scans (host, port, timestamp, confidence, fingerprint)
    VALUES (?, ?, ?, ?, ?)
  `);
  stmt.run(host, port, Date.now(), confidence, JSON.stringify(fingerprint));
}

The HTML dashboard reads this table, renders charts, and allows filtering by confidence score, making it easy to spot suspicious gateways at a glance.
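
Filtering by confidence and exporting to CSV can be sketched with two pure functions over rows shaped like the `scans` table. This is an assumption-laden illustration rather than the project's exporter: `ScanRow`, `filterByConfidence`, `csvField`, and `toCsv` are hypothetical names, and the `fingerprint` column is left out to keep the output short.

```typescript
// Mirrors the columns of the scans table, minus the fingerprint JSON.
interface ScanRow {
  id: number;
  host: string;
  port: number;
  timestamp: number;
  confidence: number;
}

/** Keeps rows at or below `maxConfidence`, most suspicious (lowest score) first. */
function filterByConfidence(rows: ScanRow[], maxConfidence: number): ScanRow[] {
  return rows
    .filter(r => r.confidence <= maxConfidence)
    .sort((a, b) => a.confidence - b.confidence);
}

/** Quotes a CSV field when it contains a comma, quote, or line break. */
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

/** Serialises rows to CSV with a header line. */
function toCsv(rows: ScanRow[]): string {
  const header = 'id,host,port,timestamp,confidence';
  const lines = rows.map(r =>
    [r.id, r.host, r.port, r.timestamp, r.confidence].map(csvField).join(',')
  );
  return [header, ...lines].join('\n');
}
```

In a real deployment the filtering would more likely be pushed into the SQL query itself; the in-memory version is used here only to keep the example self-contained.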

Getting Started

Clone the repository
BASH
```
git clone https://github.com/xxdesmus/clawdbot-scanner.git
cd clawdbot-scanner
```
Install dependencies
BASH
```
npm install
```
Build the TypeScript sources
BASH
```
npm run build
```
Optionally configure a Shodan API key (needed only for internet‑wide discovery)
BASH
```
cp .env.example .env
# edit .env and add your SHODAN_API_KEY
```
Run a basic scan
BASH
```
node dist/index.js --host 178.62.226.116 --port 18789
```
Add --probe-all to execute all 32 methods, or --listen 60 to keep the connection open for 60 seconds and capture live events.

Recent Developments

Security & performance boost (commit fb99c98) – added stricter input validation, hardened reconnection logic, and reduced memory churn during concurrent probing.
Documentation for AI‑assisted contributions (commit 3f6a933) – introduced AGENTS.md guidelines, now superseded by the detailed ARCHITECTURE.md.
Architecture documentation moved (commit 94b3b9a) – a dedicated ARCHITECTURE.md file now contains the full multi‑agent diagram and component responsibilities.
Workflow file removal (commit cf7b3e5) – eliminated a CI workflow that caused permission issues in certain environments.

These updates refine the scanner’s reliability while keeping the core functionality unchanged.

Conclusion

The Clawdbot Scanner offers a production‑ready, multi‑agent solution for detecting honeypot implementations of the Clawdbot Gateway protocol. Its protocol coverage, multi‑dimensional fingerprinting, and flexible reporting make it a practical tool for anyone who needs accurate visibility into Clawdbot deployments. By separating discovery, communication, analysis, and reporting into independent agents, the project stays maintainable and extensible, ready for future protocol extensions or integration with other threat‑intel platforms.

Explore the project, contribute, or run your own scans at the GitHub repository: https://github.com/xxdesmus/clawdbot-scanner