#For AI agents

A primer for LLMs and automation tools — how to read these docs, how to emit valid editor documents, and how to think about programmatic design.

If you're an LLM or an automation pipeline driving the Image Editor, this page is for you.

#Mental model

The editor is a structured document model with a UI on top — not a UI with state under it.

This matters because it means you can produce a complete, valid project without ever simulating clicks. Build a document.json, package it with a manifest.json into a ZIP, and the editor will load it identically to any human-saved file.
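That packaging step is plain ZIP assembly. A minimal sketch in Python, assuming only the two entry names given above (`manifest.json`, `document.json`); a real manifest needs the fields documented on the .img format page:

```python
import json
import zipfile

def package_img(path, document, manifest):
    """Write a manifest and document into a ZIP container.

    Sketch only: entry names follow the .img format page; the
    manifest/document contents here are placeholders."""
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.json", json.dumps(manifest))
        zf.writestr("document.json", json.dumps(document))

# Illustrative minimal inputs -- see the .img format page for real fields.
document = {
    "schemaVersion": 2,
    "canvas": {"width": 1080, "height": 1080},
    "layers": [],
    "groups": [],
}
manifest = {"resources": []}
package_img("poster.img", document, manifest)
```

The resulting file loads the same way as a human-saved project, because the editor only ever sees the container contents.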

Three places to ground yourself:

  1. The .img format — container, manifest, document JSON shape.
  2. Layers and Groups — the document model.
  3. Per-tool reference — each layer kind has its own page (shape / text / icon / asset / draw).

#What you can produce directly

| Layer | Effort | Notes |
| --- | --- | --- |
| shape | Low | Pure JSON. No external resources. |
| text | Low | Pure JSON. Font must be in the editor's bundled list (or web-loaded by the user). |
| icon | Low | Reference an iconId in the bundled library. |
| group | Low | Points to children by id. |
| asset | Medium | Needs a PNG in `assets/`. |
| bitmap (base) | Medium | Same as asset — a PNG / JPEG in `bitmaps/`. |
| draw | High | The draw layer is a packed mask; emitting one from scratch is rarely worth it. Prefer shape + asset for programmatic output. |

#A canonical workflow

For an LLM producing a poster from a prompt:

  1. Plan layers. Decompose the design into background, hero shape(s), title, subtitle, optional accent icon. Each becomes one entry in layers[].
  2. Pick a canvas. 1080×1080 (square), 1080×1920 (story), 1920×1080 (HD), or print.
  3. Emit the JSON. Use the type definitions on each tool page. Validate that every layer has id, type, transform, opacity, and the kind-specific fields.
  4. Set z-order. layers is z-ordered: index 0 paints first (bottom), last paints last (top).
  5. Optional: emit effects — drop shadow on titles is a near-universal upgrade.
  6. Package the ZIP. manifest.json + document.json + any asset blobs. See .img format for a complete code example.
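Steps 1–4 reduce to building `layers[]` in paint order and wrapping it in the document envelope. A sketch, with illustrative ids, geometry, and colors:

```python
def make_document():
    """Assemble a two-layer poster document (sketch).

    Every id, coordinate, and color here is made up for illustration;
    the field shapes follow the minimum-viable document on this page."""
    layers = []
    # index 0 paints first, so the background goes in before the title
    layers.append({
        "id": "lyr_bg", "type": "shape", "shapeType": "rect",
        "x": 0, "y": 0, "width": 1080, "height": 1080,
        "rotation": 0,
        "fill": {"type": "solid", "color": "#0a0a0a"},
        "opacity": 1,
    })
    # last entry paints last, i.e. on top
    layers.append({
        "id": "lyr_title", "type": "text",
        "x": 120, "y": 220, "width": 840, "height": 200, "rotation": 0,
        "text": "Hello", "fontFamily": "Inter", "fontSize": 96,
        "fontWeight": 800, "italic": False, "align": "left",
        "color": "#f59e0b", "opacity": 1,
    })
    return {
        "schemaVersion": 2,
        "canvas": {
            "width": 1080, "height": 1080,
            "background": {"type": "color", "color": "#0a0a0a"},
        },
        "layers": layers,
        "groups": [],
    }
```

From here, step 6 is the packaging described under the mental model: serialize this dict to `document.json` and zip it with a manifest.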

#Constraints to respect

  • Required fields. schemaVersion, canvas, layers, groups are all required. Missing any of them → migration error or load failure.
  • groupId integrity. If you reference a groupId on a layer, that group id must exist in groups.
  • Resource refs. If a layer references assetId: 'foo', the manifest's resources array must contain an entry with that id, and the corresponding file must exist in the ZIP.
  • Coordinate system. Origin is top-left of the canvas. Positive y is down. Rotation in degrees, clockwise.
  • Color format. Hex strings (#RRGGBB or #RRGGBBAA) or rgb()/rgba(). Avoid HSL serialized strings — they're parsed but inconsistently across renderers.
  • Schema version. Always emit schemaVersion: 2 for new documents. The migration runner will accept v1 too, but emitting current saves a round-trip.
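These constraints are cheap to pre-check before packaging. A sketch validator, assuming the field names used on this page (`groupId`, `assetId`, the manifest's `resources` array):

```python
def validate(document, manifest):
    """Pre-flight checks mirroring the constraints above (sketch).

    Returns a list of error strings; empty means the checks passed."""
    errors = []
    # Required top-level fields
    for field in ("schemaVersion", "canvas", "layers", "groups"):
        if field not in document:
            errors.append(f"missing required field: {field}")
            return errors
    group_ids = {g["id"] for g in document["groups"]}
    resource_ids = {r["id"] for r in manifest.get("resources", [])}
    for layer in document["layers"]:
        # groupId integrity: referenced group must exist in groups
        gid = layer.get("groupId")
        if gid is not None and gid not in group_ids:
            errors.append(f"layer {layer['id']}: unknown groupId {gid}")
        # resource refs: assetId must have a manifest entry
        aid = layer.get("assetId")
        if aid is not None and aid not in resource_ids:
            errors.append(f"layer {layer['id']}: unknown assetId {aid}")
    return errors
```

A full validator would also check that referenced files exist in the ZIP; that needs the container, not just the JSON, so it belongs in the packaging step.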

#Don't fight the editor

Things that look possible but aren't (yet):

  • Custom fonts can't be embedded in .img. Reference a font in the bundled list, or accept that the user may need to load the font themselves.
  • Filters are presets — name them by id, don't try to compose new presets in JSON.
  • Curves / Levels / HSL can be serialized, but the on-disk shape is internal — copy from a project the editor produced, don't hand-author from scratch.
  • Brush strokes can't be reasonably hand-authored as JSON. Use shapes for any "drawing" output.

#A minimum-viable document

Copy this and fill in the blanks. It loads in the editor as-is.

{
  "schemaVersion": 2,
  "canvas": {
    "width": 1080,
    "height": 1080,
    "background": { "type": "color", "color": "#0a0a0a" }
  },
  "layers": [
    {
      "id": "lyr_bg_pad",
      "type": "shape",
      "shapeType": "rect",
      "x": 60, "y": 60, "width": 960, "height": 960,
      "rotation": 0, "cornerRadius": 24,
      "fill": { "type": "solid", "color": "#1f1f24" },
      "opacity": 1
    },
    {
      "id": "lyr_title",
      "type": "text",
      "x": 120, "y": 220, "width": 840, "height": 200, "rotation": 0,
      "text": "Hello\\nfrom an agent",
      "fontFamily": "Inter", "fontSize": 96, "fontWeight": 800,
      "italic": false, "align": "left", "color": "#f59e0b",
      "opacity": 1
    }
  ],
  "groups": [],
  "history": { "past": [], "future": [], "current": {} }
}

#Building higher-level ops

Eventually the editor will expose a JSON-RPC-style edit operations API — "add layer", "set fill", "group", "apply effect" — so an agent can drive a live session instead of re-emitting the whole document. The shapes of those operations are already settled in the document model; only the wire surface is still planned.

For now: emit whole documents, or read-and-rewrite an existing .img to make targeted edits.
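Read-and-rewrite is a round trip through the ZIP: copy every entry through unchanged, re-serialize only `document.json`. A sketch; the `retitle` edit is purely illustrative:

```python
import json
import zipfile

def rewrite_img(src, dst, edit):
    """Load document.json from an .img, apply edit(document) in place,
    and write a new .img, copying all other entries through (sketch)."""
    with zipfile.ZipFile(src) as zin, \
         zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for name in zin.namelist():
            data = zin.read(name)
            if name == "document.json":
                document = json.loads(data)
                edit(document)
                data = json.dumps(document).encode()
            zout.writestr(name, data)

# Illustrative targeted edit: retitle every text layer.
def retitle(doc):
    for layer in doc["layers"]:
        if layer["type"] == "text":
            layer["text"] = "New title"
```

Because untouched entries are copied byte-for-byte, asset blobs and the manifest survive the rewrite without re-validation.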

#Where to ask

If something in these docs is unclear or contradicts the live editor's behavior, the source is the source — src/routes/tools/image-editor/document/ is the canonical contract.