feat: expand AI workflow support and refresh docs

2026-03-12 13:47:42 +08:00
parent 8688422578
commit 4ecab597f4
28 changed files with 4806 additions and 1907 deletions
--- a/docs/ai/ark-seedream-5-update.md
+++ b/docs/ai/ark-seedream-5-update.md
@@ -0,0 +1,44 @@
+# Ark Seedream 5.0 Update
+
+This project now aligns its generic image-generation flow with the current Ark Seedream 5.0 usage pattern.
+
+## What changed
+
+- Single-image text-to-image generation is supported through the generic design-image endpoint.
+- Reference-image generation is supported by passing the latest uploaded chat image into the image model.
+- Multi-image generation is supported with Ark sequential image generation.
+- When the requested image count is greater than 1, the backend collects stream events and returns a final image list to the chat UI.
+
+## Runtime mapping
+
+- Backend image model entry:
+  - `Server/app/api/v1/ai_llm.py`
+  - Uses `client.images.generate(...)`
+  - Supports:
+    - `sequential_image_generation="disabled"` for single image
+    - `sequential_image_generation="auto"` for multi-image
+    - `SequentialImageGenerationOptions(max_images=...)`
+
+- Generic API endpoint:
+  - `POST /ai/generate-design-images`
+  - File: `Server/app/api/v1/ai_pattern.py`
+
+- Chat tool:
+  - `generate_design_images`
+  - File: `Server/app/api/v1/ai_tools.py`
+
+- Frontend executor:
+  - `Designer/src/utils/aiToolExecutor.ts`
+
+## Supported use cases
+
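+- "Generate a premium-looking poster background image"
+- "Produce 4 drafts in different directions based on this reference image"
+- "Generate a set of coherent illustrations"
+- "Make 3 KV drafts for me to choose from"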
+
+## Notes
+
+- The current chat UI renders the returned image URLs directly inside the assistant conversation.
+- The current implementation caps multi-image generation at 4 images per request.
+- The project still uses the configured `AI_IMAGE_EDIT_MODEL` as the unified Ark image-generation model slot.
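
As a reading aid for `docs/ai/ark-seedream-5-update.md` above, here is a minimal sketch of how the single-image versus multi-image switch and the 4-image cap could be expressed. It is not the code in `Server/app/api/v1/ai_llm.py`. The helper names `build_image_request` and `collect_image_urls`, the `image`, `stream`, and `sequential_image_generation_options` keyword names, and the `url` key on stream events are all assumptions made for illustration. A plain dict stands in for `SequentialImageGenerationOptions(max_images=...)` so the sketch runs without the Ark SDK.

```python
from typing import Any, Dict, Iterable, List, Optional

# Cap stated in the Notes section of the doc above.
MAX_IMAGES_PER_REQUEST = 4


def build_image_request(
    prompt: str,
    count: int = 1,
    reference_image_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for an image-generation call (hypothetical helper)."""
    # Clamp the requested count into the range 1..4.
    count = max(1, min(count, MAX_IMAGES_PER_REQUEST))
    kwargs: Dict[str, Any] = {"prompt": prompt}
    if reference_image_url is not None:
        # Assumed parameter name for the latest uploaded chat image.
        kwargs["image"] = reference_image_url
    if count == 1:
        kwargs["sequential_image_generation"] = "disabled"
    else:
        kwargs["sequential_image_generation"] = "auto"
        # Plain dict standing in for SequentialImageGenerationOptions(max_images=...).
        kwargs["sequential_image_generation_options"] = {"max_images": count}
        # Assumed flag: multi-image results are read from a stream of events.
        kwargs["stream"] = True
    return kwargs


def collect_image_urls(events: Iterable[Dict[str, Any]]) -> List[str]:
    """Gather image URLs from stream events (hypothetical event shape)."""
    urls: List[str] = []
    for event in events:
        url = event.get("url")
        if url:
            urls.append(url)
    return urls
```

Under these assumptions, `build_image_request("poster background", count=9)` yields a request with `max_images` clamped to 4, and the collected URL list is what the chat UI would render.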
--- a/docs/ai/openclaw-skill-adaptation.md
+++ b/docs/ai/openclaw-skill-adaptation.md
@@ -0,0 +1,53 @@
+# OpenClaw Skill Adaptation
+
+This project borrows ideas from selected skills in the [OpenClaw skills repo](https://github.com/openclaw/skills) and adapts them to DesignerCEP's Photoshop-first workflow.
+
+## Upstream skills used
+
+- `photoshop-automator`
+  - Source: `skills/abdul-karim-mia/photoshop-automator`
+  - URL: https://github.com/openclaw/skills/tree/main/skills/abdul-karim-mia/photoshop-automator
+  - What we adopted:
+    - Treat the active document as the primary execution context
+    - Warn about modal dialogs blocking Photoshop automation
+    - Require exact layer targeting instead of guessing
+    - Prefer short, reversible automation chains
+
+- `ui-ux-pro-max-0-1-0`
+  - Source: `skills/15349185792/ui-ux-pro-max-0-1-0`
+  - URL: https://github.com/openclaw/skills/tree/main/skills/15349185792/ui-ux-pro-max-0-1-0
+  - What we adopted:
+    - Triage before execution
+    - Structured deliverables instead of vague design advice
+    - Design-system-style thinking about layout, hierarchy, spacing, and goals
+
+- `ui-designer-skill`
+  - Source: `skills/1999azzar/ui-designer-skill`
+  - URL: https://github.com/openclaw/skills/tree/main/skills/1999azzar/ui-designer-skill
+  - What we adopted:
+    - Stronger visual-language analysis
+    - Style and palette vocabulary
+    - Better phrasing for aesthetic direction and accessibility-minded critique
+
+## Internal mapping in DesignerCEP
+
+- `ps-generalist`
+  - Strengthened with Photoshop automation guardrails from `photoshop-automator`
+
+- `ps-layout-designer`
+  - Strengthened with triage and deliverables from `ui-ux-pro-max-0-1-0`
+  - Enhanced with visual-language cues from `ui-designer-skill`
+
+- `creative-direction-strategist`
+  - New internal skill
+  - Uses image-aware design-direction analysis when the user provides a reference image
+  - Translates "design thinking" into concrete Photoshop actions
+
+- `visual-analysis-advisor`
+  - Enriched with more designer-facing visual analysis language
+
+## Notes
+
+- We intentionally did not copy upstream skills verbatim into the product.
+- The goal is to preserve DesignerCEP's current Vue + FastAPI + Photoshop-tool architecture while borrowing useful planning and safety patterns.
+- Upstream skills are references; the runtime behavior is driven by `Server/app/api/v1/ai_skills.py` and `Server/app/api/v1/ai_llm.py`.