Compare commits

...

13 Commits

Author SHA1 Message Date
P0luz
3aae7582fb feat: redirect / to /dashboard
2026-04-21 20:53:34 +08:00
P0luz
529c9cc172 chore: dead code cleanup (pyflakes-clean)
- Remove unused imports: time/asyncio/Path/Optional/re/Counter/jieba
  across backfill_embeddings, bucket_manager, embedding_engine, import_memory
- Drop unused local var `old_primary` in reclassify_domains
- Fix 3 placeholder-less f-strings in backfill_embeddings, migrate_to_domains,
  reclassify_domains
- Delete legacy quick-check scripts test_smoke.py / test_tools.py (replaced
  by tests/ pytest suite, never collected by pytest)
- Delete backup_20260405_2124/ (stale local snapshot, no code references)

Verified: pyflakes *.py → 0 warnings; pytest tests/ → 237 passed, 7 skipped.
2026-04-21 20:11:47 +08:00
P0luz
71154d905f refactor: doc/code consistency, OMBRE_PORT, webhook push, host-vault dashboard
Doc-code consistency (per BEHAVIOR_SPEC.md ground truth):
- INTERNALS.md, dehydrator.py, README.md, config.example.yaml: drop the
  outdated "API 不可用自动降级到本地关键词提取" claims; align with the
  "RuntimeError on API outage, no silent fallback" design decision
- INTERNALS.md & BEHAVIOR_SPEC.md narrative: activation_count=1 → 0 (B-04)
- server.py header: 5 MCP tools → 6 (add dream)

OMBRE_PORT (T5/T6):
- Replace hardcoded 8000 in FastMCP / uvicorn / keepalive URL
  with int(os.environ.get("OMBRE_PORT", "8000"))
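The replacement pattern can be sketched as a tiny helper (a hypothetical `resolve_port`; the actual commit inlines the expression at each call site):

```python
import os

def resolve_port(env=None) -> int:
    """Resolve the listen port, preferring OMBRE_PORT over the default 8000."""
    env = os.environ if env is None else env
    return int(env.get("OMBRE_PORT", "8000"))
```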

OMBRE_HOOK_URL / OMBRE_HOOK_SKIP webhook (T7):
- Implement _fire_webhook() helper: fire-and-forget POST with 5s timeout,
  failures logged at WARNING but never propagated
- Wired into breath / dream MCP tools and /breath-hook + /dream-hook routes
- Push payload: {event, timestamp, payload:{...}}; documented in ENV_VARS.md

Dashboard host-vault input (T12, per user request):
- New /api/host-vault GET/POST endpoints persist OMBRE_HOST_VAULT_DIR
  to project-root .env (idempotent upsert, preserves other entries,
  rejects quotes/newlines)
- Settings tab gains a "宿主机记忆桶目录 (Docker)" panel with
  load/save buttons and a clear "需要 docker compose down/up 生效" notice
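The idempotent upsert could look roughly like this (a hypothetical standalone function; the actual endpoint logic is in server.py):

```python
import re
from pathlib import Path

def upsert_env_var(env_path: Path, key: str, value: str) -> None:
    """Idempotently set key=value in a .env file, preserving other entries."""
    if any(ch in value for ch in "\"'\n"):
        raise ValueError("quotes/newlines are rejected")
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    # drop any existing assignment of the key, keep everything else
    lines = [ln for ln in lines if not re.match(rf"^{re.escape(key)}=", ln)]
    lines.append(f"{key}={value}")
    env_path.write_text("\n".join(lines) + "\n")
```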
2026-04-21 20:08:52 +08:00
P0luz
38be7610f4 fix: replace personal filesystem paths with env vars / config
- docker-compose.yml: hardcoded iCloud Obsidian vault volume → ${OMBRE_HOST_VAULT_DIR:-./buckets}
- write_memory.py / migrate_to_domains.py / reclassify_domains.py / reclassify_api.py:
  hardcoded ~/Documents/Obsidian Vault/Ombre Brain → OMBRE_BUCKETS_DIR > load_config() > ./buckets
- write_memory.py: also fix B-04 regression (activation_count: 1 → 0 in frontmatter template)
- reclassify_api.py: model + base_url now read from config (was hardcoded SiliconFlow / DeepSeek-V3)
- tests/dataset.py + test_feel_flow.py: anonymize fixture identifiers (P酱/P0lar1s/北极星 → TestUser/北方)
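The precedence OMBRE_BUCKETS_DIR > load_config() > ./buckets can be sketched as (illustrative helper, not the project's actual function name):

```python
import os

def resolve_buckets_dir(env=None, config=None) -> str:
    """Env var wins, then the config file value, then the ./buckets default."""
    env = os.environ if env is None else env
    config = config or {}
    return env.get("OMBRE_BUCKETS_DIR") or config.get("buckets_dir") or "./buckets"
```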

Project identifiers (git.p0lar1s.uk, p0luz/ombre-brain, P0luz/Ombre-Brain GitHub) intentionally retained as project branding per user decision.
2026-04-21 19:53:24 +08:00
P0luz
b869a111c7 feat: add base_url env vars, iCloud conflict detector, user compose guidance
- utils.py: support OMBRE_DEHYDRATION_BASE_URL and OMBRE_EMBEDDING_BASE_URL
  so Gemini/non-DeepSeek users can configure without mounting a custom config
- docker-compose.user.yml: pass all 4 model/url env vars from .env;
  add commented Gemini example + optional config.yaml mount hint
- ENV_VARS.md: document OMBRE_DEHYDRATION_BASE_URL and OMBRE_EMBEDDING_BASE_URL
- check_icloud_conflicts.py: scan bucket dir for iCloud conflict artefacts
  and duplicate bucket IDs (report-only, no file modifications)
2026-04-21 19:18:32 +08:00
P0luz
cddc809f02 chore(gitignore): exclude private memory data and dev test suites
- data/ : local user memory buckets (privacy)
- tests/integration/, tests/regression/, tests/unit/ :
  developer-only test suites kept out of upstream
2026-04-21 19:05:22 +08:00
P0luz
2646f8f7d0 docs: refresh INTERNALS / README / dashboard after B-fix series 2026-04-21 19:05:18 +08:00
P0luz
b318e557b0 fix: complete B-03/B-08/B-09 and add OMBRE_*_MODEL env vars
- decay_engine: keep activation_count as float (B-03);
  refresh local meta after auto_resolve so resolved_factor
  applies in the same cycle (B-08)
- server.hold(): user-supplied valence/arousal now takes
  priority over analyze() output (B-09)
- utils.load_config: support OMBRE_DEHYDRATION_MODEL
  (with OMBRE_MODEL alias) and OMBRE_EMBEDDING_MODEL
- ENV_VARS.md: document new model env vars
- tests/conftest.py: align fixture with spec-correct weights
  (time_proximity=1.5, content_weight=1.0) and feel subdir layout
2026-04-21 19:05:08 +08:00
P0luz
d2d4b89715 fix(search): keep resolved buckets reachable by keyword
Apply ×0.3 resolved penalty *after* fuzzy_threshold filter so
resolved buckets that genuinely match the query still surface
in search results (penalty only affects ranking order).
Update BEHAVIOR_SPEC.md scoring section to document new order.
2026-04-21 18:46:04 +08:00
P0luz
ccdffdb626 spec: add BEHAVIOR_SPEC and fix B-01~B-10 (resolved/decay/scoring)
- Add BEHAVIOR_SPEC.md as full system behaviour reference
- B-01: stop auto-archiving resolved buckets in update()
- B-03: keep activation_count as float in calculate_score
- B-04: initialise activation_count=0 on create
- B-05: time score coefficient 0.1 -> 0.02
- B-06: w_time default 2.5 -> 1.5
- B-07: content_weight default 3.0 -> 1.0
- B-08: refresh local meta after auto_resolve
- B-09: user-supplied valence/arousal takes priority over analyze()
- B-10: allow empty domain for feel buckets
- Refresh INTERNALS/README/dashboard accordingly
2026-04-21 18:45:52 +08:00
P0luz
c7ddfd46ad Merge pull request #3 from msz136/main 2026-04-21 13:29:18 +08:00
mousongzhe
2d2de45d5a Configure the embedding model separately 2026-04-21 13:21:23 +08:00
P0luz
e9d61b5d9d fix: remove outdated description of local-fallback dehydration (README + dehydrator comments)
2026-04-19 18:19:04 +08:00
32 changed files with 2124 additions and 2158 deletions

4
.gitignore vendored

@@ -15,3 +15,7 @@ scarp_paper
backup_*/
*.db
import_state.json
data/
tests/integration/
tests/regression/
tests/unit/

632
BEHAVIOR_SPEC.md Normal file

@@ -0,0 +1,632 @@
# Ombre Brain 用户全流程行为规格书
> 版本:基于 server.py / bucket_manager.py / decay_engine.py / dehydrator.py / embedding_engine.py / CLAUDE_PROMPT.md / config.example.yaml
---
## 一、系统角色说明
### 1.1 参与方总览
| 角色 | 实体 | 职责边界 |
|------|------|---------|
| **用户** | 人类 | 发起对话,提供原始内容;可直接访问 Dashboard Web UI |
| **Claude模型端** | LLM(如 Claude 3.x)| 理解语义、决策何时调用工具、用自然语言回应用户;不直接操作文件 |
| **OB 服务端** | `server.py` + 各模块 | 接收 MCP 工具调用,执行持久化、搜索、衰减;对 Claude 不透明 |
### 1.2 Claude 端职责边界
- **必须做**:每次新对话第一步无参调用 `breath()`;对话内容有记忆价值时主动调用 `hold` / `grow`
- **不做**:不直接读写 `.md` 文件;不执行衰减计算;不操作 SQLite
- **决策权**:Claude 决定是否存、存哪些、何时 resolve;OB 决定如何存(合并/新建)
### 1.3 OB 服务端内部模块职责
| 模块 | 核心职责 |
|------|---------|
| `server.py` | 注册 MCP 工具(`breath/hold/grow/trace/pulse/dream`);路由 Dashboard HTTP 请求;`_merge_or_create()` 合并逻辑中枢 |
| `bucket_manager.py` | 桶 CRUD多维搜索fuzzy + embedding 双通道);`touch()` 激活刷新;`_time_ripple()` 时间波纹 |
| `dehydrator.py` | `analyze()` 自动打标;`merge()` 内容融合;`digest()` 日记拆分;`dehydrate()` 内容压缩 |
| `embedding_engine.py` | `generate_and_store()` 生成向量并存 SQLite`search_similar()` 余弦相似度检索 |
| `decay_engine.py` | `calculate_score()` 衰减分计算;`run_decay_cycle()` 周期扫描归档;后台定时循环 |
| `utils.py` | 配置加载路径安全校验ID 生成token 估算 |
---
## 二、场景全流程
---
### 场景 1新对话开始冷启动无历史记忆
**用户操作**:打开新对话窗口,说第一句话
**Claude 行为**:在任何回复之前,先调用 `breath()`(无参)
**OB 工具调用**
```
breath(query="", max_tokens=10000, domain="", valence=-1, arousal=-1, max_results=20, importance_min=-1)
```
**系统内部发生什么**
1. `decay_engine.ensure_started()` — 懒加载启动后台衰减循环(若未运行)
2. 进入"浮现模式"(`not query or not query.strip()`)
3. `bucket_mgr.list_all(include_archive=False)` — 遍历 `permanent/` + `dynamic/` + `feel/` 目录,加载所有 `.md` 文件的 frontmatter + 正文
4. 筛选钉选桶(`pinned=True` 或 `protected=True`)
5. 筛选未解决桶(`resolved=False`,排除 `permanent/feel/pinned`)
6. **冷启动检测**:找 `activation_count==0 && importance>=8` 的桶,最多取 2 个插入排序最前(**决策:`create()` 初始化应为 0,区分"创建"与"被主动召回",见 B-04**)
7. 按 `decay_engine.calculate_score(metadata)` 降序排列剩余未解决桶
8. 对 top-20 以外随机洗牌(top-1 固定,2~20 随机)
9. 截断到 `max_results`
10. 对每个桶调用 `dehydrator.dehydrate(strip_wikilinks(content), clean_meta)` 压缩摘要
11. 按 `max_tokens` 预算截断输出
**返回结果**
- 无记忆时:`"权重池平静,没有需要处理的记忆。"`
- 有记忆时:`"=== 核心准则 ===\n📌 ...\n\n=== 浮现记忆 ===\n[权重:X.XX] [bucket_id:xxx] ..."`
**注意**:浮现模式**不调用** `touch()`,不重置衰减计时器
---
### 场景 2新对话开始有历史记忆breath 自动浮现)
(与场景 1 相同流程,区别在于桶文件已存在)
**Claude 行为(完整对话启动序列,来自 CLAUDE_PROMPT.md)**:
```
1. breath() — 浮现未解决记忆
2. dream() — 消化最近记忆,有沉淀写 feel
3. breath(domain="feel") — 读取之前的 feel
4. 开始和用户说话
```
**`breath(domain="feel")` 内部流程**
1. 检测到 `domain.strip().lower() == "feel"` → 进入 feel 专用通道
2. `bucket_mgr.list_all()` 过滤 `type=="feel"` 的桶
3. 按 `created` 降序排列
4. 按 `max_tokens` 截断,不压缩(直接展示原文)
5. 返回:`"=== 你留下的 feel ===\n[时间] [bucket_id:xxx]\n内容..."`
---
### 场景 3用户说了一件事Claude 决定存入记忆hold
**用户操作**:例如"我刚刚拿到了实习 offer有点激动"
**Claude 行为**:判断值得记忆,调用:
```python
hold(content="用户拿到实习 offer情绪激动", importance=7)
```
**OB 工具调用**`hold(content, tags="", importance=7, pinned=False, feel=False, source_bucket="", valence=-1, arousal=-1)`
**系统内部发生什么**
1. `decay_engine.ensure_started()`
2. 输入校验:`content.strip()` 非空
3. `importance = max(1, min(10, 7))` = 7
4. `extra_tags = []`(未传 tags)
5. **自动打标**:`dehydrator.analyze(content)` → 调用 `_api_analyze()` → LLM 返回 JSON
- 返回示例:`{"domain": ["成长", "求职"], "valence": 0.8, "arousal": 0.7, "tags": ["实习", "offer", "激动", ...], "suggested_name": "实习offer获得"}`
- 失败时降级:`{"domain": ["未分类"], "valence": 0.5, "arousal": 0.3, "tags": [], "suggested_name": ""}`
6. 合并 `auto_tags + extra_tags` 去重
7. **合并检测**:`_merge_or_create(content, tags, importance=7, domain, valence, arousal, name)`
- `bucket_mgr.search(content, limit=1, domain_filter=domain)` — 搜索最相似的桶
- 若最高分 > `config["merge_threshold"]`(默认 75)且该桶非 pinned/protected:
- `dehydrator.merge(old_content, new_content)``_api_merge()` → LLM 融合
- `bucket_mgr.update(bucket_id, content=merged, tags=union, importance=max, domain=union, valence=avg, arousal=avg)`
- `embedding_engine.generate_and_store(bucket_id, merged_content)` 更新向量
- 返回 `(bucket_name, True)`
- 否则:
- `bucket_mgr.create(content, tags, importance=7, domain, valence, arousal, name)` → 写 `.md` 文件到 `dynamic/<主题域>/` 目录
- `embedding_engine.generate_and_store(bucket_id, content)` 生成并存储向量
- 返回 `(bucket_id, False)`
**返回结果**
- 新建:`"新建→实习offer获得 成长,求职"`
- 合并:`"合并→求职经历 成长,求职"`
**bucket_mgr.create() 详情**
- `generate_bucket_id()``uuid4().hex[:12]`
- `sanitize_name(name)` → 正则清洗,最长 80 字符
- 写 YAML frontmatter + 正文到 `safe_path(domain_dir, f"{name}_{id}.md")`
- frontmatter 字段:`id, name, tags, domain, valence, arousal, importance, type, created, last_active, activation_count=0`**决策:初始为 0`touch()` 首次被召回后变为 1**
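上述合并/新建的决策逻辑可以用如下草图示意(假设性简化:`search`/`merge`/`create` 调用以返回值代替,函数名为示意,并非 server.py 原样实现):

```python
MERGE_THRESHOLD = 75  # 对应 config["merge_threshold"] 默认值

def decide_merge_or_create(best_match, threshold=MERGE_THRESHOLD):
    """best_match: search(content, limit=1) 的最高分结果 (bucket_meta, score);无结果时为 None。"""
    if best_match is not None:
        bucket, score = best_match
        # 仅当相似度超过阈值、且目标桶未被钉选/保护时才合并
        if score > threshold and not (bucket.get("pinned") or bucket.get("protected")):
            return ("merge", bucket["id"])
    return ("create", None)
```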
---
### 场景 4用户说了一段长日记Claude 整理存入grow
**用户操作**:发送一大段混合内容,如"今天去医院体检,结果还好;晚上和朋友吃饭聊了很多;最近有点焦虑..."
**Claude 行为**
```python
grow(content="今天去医院体检,结果还好;晚上和朋友吃饭聊了很多;最近有点焦虑...")
```
**系统内部发生什么**
1. `decay_engine.ensure_started()`
2. 内容长度检查:`len(content.strip()) < 30` → 若短于 30 字符走**快速路径**`dehydrator.analyze()` + `_merge_or_create()`,跳过 digest
3. **日记拆分**(正常路径):`dehydrator.digest(content)``_api_digest()` → LLM 调用 `DIGEST_PROMPT`
- LLM 返回 JSON 数组,每项含:`name, content, domain, valence, arousal, tags, importance`
- `_parse_digest()` 安全解析,校验 valence/arousal 范围
4. 对每个 `item` 调用 `_merge_or_create(item["content"], item["tags"], item["importance"], item["domain"], item["valence"], item["arousal"], item["name"])`
- 每项独立走合并或新建逻辑(同场景 3
- 单条失败不影响其他条(`try/except` 隔离)
**返回结果**
```
3条|新2合1
📝体检结果
📌朋友聚餐
📎近期焦虑情绪
```
---
### 场景 5用户想找某段记忆breath 带 query 检索)
**用户操作**:例如"还记得我之前说过关于实习的事吗"
**Claude 行为**
```python
breath(query="实习", domain="成长", valence=0.7, arousal=0.5)
```
**系统内部发生什么**
1. `decay_engine.ensure_started()`
2. 检测到 `query` 非空,进入**检索模式**
3. 解析 `domain_filter = ["成长"]``q_valence=0.7``q_arousal=0.5`
4. **关键词检索**`bucket_mgr.search(query, limit=20, domain_filter, q_valence, q_arousal)`
- **Layer 1**domain 预筛 → 仅保留 domain 包含"成长"的桶;若为空则回退全量
- **Layer 1.5**embedding 已开启时):`embedding_engine.search_similar(query, top_k=50)` → 用 embedding 候选集替换/缩小精排范围
- **Layer 2**:多维加权精排:
- `_calc_topic_score()`: `fuzz.partial_ratio(query, name)×3 + domain×2.5 + tags×2 + body×1`,归一化 0~1
- `_calc_emotion_score()`: `1 - √((v差²+a差²)/2)`(0~1)
- `_calc_time_score()`: `e^(-0.02×days_since_last_active)`(0~1)
- `importance_score`: `importance / 10`
- `total = topic×4 + emotion×2 + time×1.5 + importance×1`,归一化到 0~100
- 过滤 `score >= fuzzy_threshold`(默认 50)
- 通过阈值后,`resolved` 桶仅在排序时降权 ×0.3(不影响是否被检出)
- 返回最多 `limit`
5. 排除 pinned/protected 桶(它们在浮现模式展示)
6. **向量补充通道**(server.py 额外层):`embedding_engine.search_similar(query, top_k=20)` → 相似度 > 0.5 的桶补充到结果集(标记 `vector_match=True`)
7. 对每个结果:
- 记忆重构:若传了 `q_valence`,展示层 valence 做微调:`shift = (q_valence - 0.5) × 0.2`,最大 ±0.1
- `dehydrator.dehydrate(strip_wikilinks(content), clean_meta)` 压缩摘要
- `bucket_mgr.touch(bucket_id)` — 刷新 `last_active` + `activation_count += 1` + 触发 `_time_ripple()`(对 48h 内创建的邻近桶 activation_count + 0.3,最多 5 个桶)
8. **随机漂流**:若检索结果 < 3 且 `random.random() < 0.4`,随机从 `decay_score < 2.0` 的旧桶里取 1~3 条,标注 `[surface_type: random]`
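上述 Layer 2 四维评分可以用如下草图复现(权重与公式取自本节描述;函数名为示意,并非 bucket_manager.py 原样代码):

```python
import math

# 权重对应 topic×4 + emotion×2 + time×1.5 + importance×1
W_TOPIC, W_EMOTION, W_TIME, W_IMPORTANCE = 4.0, 2.0, 1.5, 1.0

def emotion_score(v, a, q_valence, q_arousal):
    """1 - √((Δv² + Δa²)/2),结果落在 0~1。"""
    return 1.0 - math.sqrt(((v - q_valence) ** 2 + (a - q_arousal) ** 2) / 2.0)

def time_score(days_since_last_active):
    """e^(-0.02×days),结果落在 0~1。"""
    return math.exp(-0.02 * days_since_last_active)

def total_score(topic, emotion, time_s, importance):
    """四维加权求和后归一化到 0~100。"""
    raw = (W_TOPIC * topic + W_EMOTION * emotion
           + W_TIME * time_s + W_IMPORTANCE * importance / 10.0)
    return raw / (W_TOPIC + W_EMOTION + W_TIME + W_IMPORTANCE) * 100.0
```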
**返回结果**
```
[bucket_id:abc123] [重要度:7] [主题:成长] 实习offer获得...
[语义关联] [bucket_id:def456] 求职经历...
--- 忽然想起来 ---
[surface_type: random] 某段旧记忆...
```
---
### 场景 6用户想查看所有记忆状态pulse
**用户操作**"帮我看看你现在都记得什么"
**Claude 行为**
```python
pulse(include_archive=False)
```
**系统内部发生什么**
1. `bucket_mgr.get_stats()` — 遍历三个目录,统计文件数量和 KB 大小
2. `bucket_mgr.list_all(include_archive=False)` — 加载全部桶
3. 对每个桶:`decay_engine.calculate_score(metadata)` 计算当前权重分
4. 按类型/状态分配图标:📌钉选 / 📦permanent / 🫧feel / 🗄archived / ✅resolved / 💭普通
5. 拼接每桶摘要行:`名称 bucket_id 主题 情感坐标 重要度 权重 标签`
**返回结果**
```
=== Ombre Brain 记忆系统 ===
固化记忆桶: 2 个
动态记忆桶: 15 个
归档记忆桶: 3 个
总存储大小: 48.3 KB
衰减引擎: 运行中
=== 记忆列表 ===
📌 [核心原则] bucket_id:abc123 主题:内心 情感:V0.8/A0.5 ...
💭 [实习offer获得] bucket_id:def456 主题:成长 情感:V0.8/A0.7 ...
```
---
### 场景 7用户想修改/标记已解决/删除某条记忆trace
#### 7a 标记已解决
**Claude 行为**
```python
trace(bucket_id="abc123", resolved=1)
```
**系统内部**
1. `resolved in (0, 1)``updates["resolved"] = True`
2. `bucket_mgr.update("abc123", resolved=True)` → 读取 `.md` 文件,更新 frontmatter 中 `resolved=True`,写回,**桶留在原 `dynamic/` 目录,不移动**
3. 后续 `breath()` 浮现时:该桶 `decay_engine.calculate_score()` 乘以 `resolved_factor=0.05`(若同时 `digested=True` 则 ×0.02),自然降权,最终由 decay 引擎在得分 < threshold 时归档
4. `bucket_mgr.search()` 中该桶得分乘以 0.3 降权,但仍可被关键词激活
> ⚠️ **代码 Bug B-01**:当前实现中 `update(resolved=True)` 会将桶**立即移入 `archive/`**,导致桶完全消失于所有搜索路径,与上述规格不符。需移除 `bucket_manager.py` `update()` 中 resolved → `_move_bucket(archive_dir)` 的自动归档逻辑。
**返回**`"已修改记忆桶 abc123: resolved=True → 已沉底,只在关键词触发时重新浮现"`
#### 7b 修改元数据
```python
trace(bucket_id="abc123", name="新名字", importance=8, tags="焦虑,成长")
```
**系统内部**:收集非默认值字段 → `bucket_mgr.update()` 批量更新 frontmatter
#### 7c 删除
```python
trace(bucket_id="abc123", delete=True)
```
**系统内部**
1. `bucket_mgr.delete("abc123")``_find_bucket_file()` 定位文件 → `os.remove(file_path)`
2. `embedding_engine.delete_embedding("abc123")` → SQLite `DELETE WHERE bucket_id=?`
3. 返回:`"已遗忘记忆桶: abc123"`
---
### 场景 8记忆长期未被激活自动衰减归档后台 decay
**触发方式**:服务启动后,`decay_engine.start()` 创建后台 asyncio Task,每 `check_interval_hours`(默认 24h)执行一次 `run_decay_cycle()`
**系统内部发生什么**
1. `bucket_mgr.list_all(include_archive=False)` — 获取所有活跃桶
2. 跳过 `type in ("permanent","feel")``pinned=True``protected=True` 的桶
3. **自动 resolve**:若 `importance <= 4` 且距上次激活 > 30 天且 `resolved=False``bucket_mgr.update(bucket_id, resolved=True)`
4. 对每桶调用 `calculate_score(metadata)`
**短期days_since ≤ 3**
```
time_weight = 1.0 + e^(-hours/36) (t=0→×2.0, t=36h→≈×1.37)
emotion_weight = base(1.0) + arousal × arousal_boost(0.8)
combined = time_weight×0.7 + emotion_weight×0.3
base_score = importance × activation_count^0.3 × e^(-λ×days) × combined
```
**长期days_since > 3**
```
combined = emotion_weight×0.7 + time_weight×0.3
```
**修正因子**
- `resolved=True` → ×0.05
- `resolved=True && digested=True` → ×0.02
- `arousal > 0.7 && resolved=False` → ×1.5(高唤醒紧迫加成)
- `pinned/protected/permanent` → 返回 999.0(永不衰减)
- `type=="feel"` → 返回 50.0(固定)
5. `score < threshold`(默认 0.3)→ `bucket_mgr.archive(bucket_id)` → `_move_bucket()` 将文件从 `dynamic/` 移动到 `archive/` 目录,更新 frontmatter `type="archived"`
**返回 stats**`{"checked": N, "archived": N, "auto_resolved": N, "lowest_score": X}`
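按本节公式,`calculate_score()` 的骨架大致如下(示意实现;λ 等未在文中给出的参数以 `lambda_` 形参假设,具体取值以 decay_engine.py / config 为准):

```python
import math

def calculate_score(meta, days_since, hours_since, lambda_=0.1):
    """按规格复现的衰减分草图;含 B-03 修复(activation_count 保持 float)。"""
    if meta.get("pinned") or meta.get("protected") or meta.get("type") == "permanent":
        return 999.0  # 永不衰减
    if meta.get("type") == "feel":
        return 50.0   # feel 桶固定分
    time_weight = 1.0 + math.exp(-hours_since / 36.0)
    emotion_weight = 1.0 + meta.get("arousal", 0.5) * 0.8
    if days_since <= 3:   # 短期:时间权重主导
        combined = time_weight * 0.7 + emotion_weight * 0.3
    else:                 # 长期:情绪权重主导
        combined = emotion_weight * 0.7 + time_weight * 0.3
    activation = max(1.0, float(meta.get("activation_count", 1)))  # B-03
    score = (meta["importance"] * activation ** 0.3
             * math.exp(-lambda_ * days_since) * combined)
    if meta.get("resolved"):
        score *= 0.02 if meta.get("digested") else 0.05
    elif meta.get("arousal", 0.0) > 0.7:
        score *= 1.5  # 高唤醒紧迫加成(仅限未解决桶)
    return score
```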
---
### 场景 9用户使用 dream 工具进行记忆沉淀
**触发**Claude 在对话启动时,`breath()` 之后调用 `dream()`
**OB 工具调用**`dream()`(无参数)
**系统内部发生什么**
1. `bucket_mgr.list_all()` → 过滤非 `permanent/feel/pinned/protected` 桶
2. 按 `created` 降序取前 10 条(最近新增的记忆)
3. 对每条拼接名称、resolved 状态、domain、V/A、创建时间、正文前 500 字符
4. **连接提示**(embedding 已开启 && 桶数 >= 2):
- 取每个最近桶的 embedding`embedding_engine.get_embedding(bucket_id)`
- 两两计算 `_cosine_similarity()`,找相似度最高的对
- 若 `best_sim > 0.5` → 输出提示:`"[名A] 和 [名B] 似乎有关联 (相似度:X.XX)"`
5. **feel 结晶提示**(embedding 已开启 && feel 数 >= 3):
- 对所有 feel 桶两两计算相似度
- 若某 feel 与 >= 2 个其他 feel 相似度 > 0.7 → 提示升级为 pinned 桶
6. 返回标准 header 说明(引导 Claude 自省)+ 记忆列表 + 连接提示 + 结晶提示
**Claude 后续行为**(根据 CLAUDE_PROMPT 引导):
- `trace(bucket_id, resolved=1)` 放下可以放下的
- `hold(content="...", feel=True, source_bucket="xxx", valence=0.6)` 写感受
- 无沉淀则不操作
---
### 场景 10用户使用 feel 工具记录 Claude 的感受
**触发**Claude 在 dream 后决定记录某段记忆带来的感受
**OB 工具调用**
```python
hold(content="她问起了警校的事,我感觉她在用问题保护自己,问是为了不去碰那个真实的恐惧。", feel=True, source_bucket="abc123", valence=0.45, arousal=0.4)
```
**系统内部发生什么**
1. `feel=True` → 进入 feel 专用路径,跳过自动打标和合并检测
2. `feel_valence = valence`Claude 自身视角的情绪,非事件情绪)
3. `bucket_mgr.create(content, tags=[], importance=5, domain=[], valence=feel_valence, arousal=feel_arousal, bucket_type="feel")` → 写入 `feel/` 目录
4. `embedding_engine.generate_and_store(bucket_id, content)` — feel 桶同样有向量(供 dream 结晶检测使用)
5. 若 `source_bucket` 非空:`bucket_mgr.update(source_bucket, digested=True, model_valence=feel_valence)` → 标记源记忆已消化
- 此后该源桶 `calculate_score()` 中 `resolved_factor = 0.02`accelerated fade
**衰减特性**feel 桶 `type=="feel"` → `calculate_score()` 固定返回 50.0,永不归档
**检索特性**:不参与普通 `breath()` 浮现;只通过 `breath(domain="feel")` 读取
**返回**`"🫧feel→<bucket_id>"`
---
### 场景 11用户带 importance_min 参数批量拉取重要记忆
**Claude 行为**
```python
breath(importance_min=8)
```
**系统内部发生什么**
1. `importance_min >= 1` → 进入**批量拉取模式**,完全跳过语义搜索
2. `bucket_mgr.list_all(include_archive=False)` 全量加载
3. 过滤 `importance >= 8` 且 `type != "feel"` 的桶
4. 按 `importance` 降序排列,截断到最多 20 条
5. 对每条调用 `dehydrator.dehydrate()` 压缩,按 `max_tokens`(默认 10000)预算截断
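批量拉取模式的筛选/排序逻辑可示意为(简化草图,省略脱水与 token 预算):

```python
def importance_batch(buckets, importance_min, limit=20):
    """跳过语义搜索:按 importance 过滤(排除 feel 桶)并降序截断。"""
    hits = [b for b in buckets
            if b["importance"] >= importance_min and b.get("type") != "feel"]
    return sorted(hits, key=lambda b: b["importance"], reverse=True)[:limit]
```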
**返回**
```
[importance:10] [bucket_id:xxx] ...(核心原则)
---
[importance:9] [bucket_id:yyy] ...
---
[importance:8] [bucket_id:zzz] ...
```
---
### 场景 12embedding 向量化检索场景(开启 embedding 时)
**前提**`config.yaml` 中 `embedding.enabled: true` 且 `OMBRE_API_KEY` 已配置
**embedding 介入的两个层次**
#### 层次 ABucketManager.search() 内的 Layer 1.5 预筛
- 调用点:`bucket_mgr.search()` → Layer 1.5
- 函数:`embedding_engine.search_similar(query, top_k=50)` → 生成查询 embedding → SQLite 全量余弦计算 → 返回 `[(bucket_id, similarity)]` 按相似度降序
- 作用:将精排候选集从所有桶缩小到向量最近邻的 50 个,加速后续多维精排
#### 层次 Bserver.py breath 的额外向量通道
- 调用点:`breath()` 检索模式中keyword 搜索完成后
- 函数:`embedding_engine.search_similar(query, top_k=20)` → 相似度 > 0.5 的桶补充到结果集
- 标注:补充桶带 `[语义关联]` 前缀
**向量存储路径**
- 新建桶后:`embedding_engine.generate_and_store(bucket_id, content)` → `_generate_embedding(text[:2000])` → API 调用 → `_store_embedding()` → SQLite `INSERT OR REPLACE`
- 合并更新后:同上,用 merged content 重新生成
- 删除桶时:`embedding_engine.delete_embedding(bucket_id)` → `DELETE FROM embeddings`
**SQLite 结构**
```sql
CREATE TABLE embeddings (
bucket_id TEXT PRIMARY KEY,
embedding TEXT NOT NULL, -- JSON 序列化的 float 数组
updated_at TEXT NOT NULL
)
```
**相似度计算**`_cosine_similarity(a, b)` = dot(a,b) / (|a| × |b|)
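`_cosine_similarity` 的等价纯 Python 草图(零向量返回 0.0 为本文假设的边界处理,实际实现以 embedding_engine.py 为准):

```python
import math

def cosine_similarity(a, b):
    """dot(a, b) / (|a| × |b|);任一向量为零向量时返回 0.0。"""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```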
---
## 三、边界与降级行为
| 场景 | 异常情况 | 降级行为 |
|------|---------|---------|
| `breath()` 浮现 | 桶目录为空 | 返回 `"权重池平静,没有需要处理的记忆。"` |
| `breath()` 浮现 | `list_all()` 异常 | 返回 `"记忆系统暂时无法访问。"` |
| `breath()` 检索 | `bucket_mgr.search()` 异常 | 返回 `"检索过程出错,请稍后重试。"` |
| `breath()` 检索 | embedding 不可用 / API 失败 | `logger.warning()` 记录,跳过向量通道,仅用 keyword 检索 |
| `breath()` 检索 | 结果 < 3 条 | 40% 概率从低权重旧桶随机浮现 1~3 条,标注 `[surface_type: random]` |
| `hold()` 自动打标 | `dehydrator.analyze()` 失败 | 降级到默认值:`domain=["未分类"], valence=0.5, arousal=0.3, tags=[], name=""` |
| `hold()` 合并检测 | `bucket_mgr.search()` 失败 | `logger.warning()`,直接走新建路径 |
| `hold()` 合并 | `dehydrator.merge()` 失败 | `logger.warning()`,跳过合并,直接新建 |
| `hold()` embedding | API 失败 | `try/except` 吞掉embedding 缺失但不影响存储 |
| `grow()` 日记拆分 | `dehydrator.digest()` 失败 | 返回 `"日记整理失败: {e}"` |
| `grow()` 单条处理失败 | 单个 item 异常 | `logger.warning()` + 标注 `⚠️条目名`,其他条目正常继续 |
| `grow()` 内容 < 30 字 | — | 快速路径:`analyze()` + `_merge_or_create()`,跳过 `digest()`(节省 token) |
| `trace()` | `bucket_mgr.get()` 返回 None | 返回 `"未找到记忆桶: {bucket_id}"` |
| `trace()` | 未传任何可修改字段 | 返回 `"没有任何字段需要修改。"` |
| `pulse()` | `get_stats()` 失败 | 返回 `"获取系统状态失败: {e}"` |
| `dream()` | embedding 未开启 | 跳过连接提示和结晶提示,仅返回记忆列表 |
| `dream()` | 桶列表为空 | 返回 `"没有需要消化的新记忆。"` |
| `decay_cycle` | `list_all()` 失败 | 返回 `{"checked":0, "archived":0, ..., "error": str(e)}`,不终止后台循环 |
| `decay_cycle` | 单桶 `calculate_score()` 失败 | `logger.warning()`,跳过该桶继续 |
| 所有 feel 操作 | `source_bucket` 不存在 | `logger.warning()` 记录feel 桶本身仍成功创建 |
| `dehydrator.dehydrate()` / `analyze()` / `merge()` / `digest()` | API 不可用(`api_available=False`)| **直接向 MCP 调用端明确报错(`RuntimeError`)**,无本地降级。本地关键词提取质量不足以替代语义打标与合并,静默降级比报错更危险(可能产生错误分类记忆)。 |
| `embedding_engine.search_similar()` | `enabled=False` | 直接返回 `[]`,调用方 fallback 到 keyword 搜索 |
---
## 四、数据流图
### 4.1 一条记忆的完整生命周期
```
用户输入内容
Claude 决策: hold / grow / 自动
├─[grow 长内容]──→ dehydrator.digest(content)
│ DIGEST_PROMPT → LLM API
│ 返回 [{name,content,domain,...}]
│ ↓ 每条独立处理 ↓
└─[hold 单条]──→ dehydrator.analyze(content)
ANALYZE_PROMPT → LLM API
返回 {domain, valence, arousal, tags, suggested_name}
_merge_or_create()
bucket_mgr.search(content, limit=1, domain_filter)
┌─────┴─────────────────────────┐
│ score > merge_threshold (75)? │
│ │
YES NO
│ │
▼ ▼
dehydrator.merge( bucket_mgr.create(
old_content, new) content, tags,
MERGE_PROMPT → LLM importance, domain,
│ valence, arousal,
▼ bucket_type="dynamic"
bucket_mgr.update(...) )
│ │
└──────────┬─────────────┘
embedding_engine.generate_and_store(
bucket_id, content)
→ _generate_embedding(text[:2000])
→ API 调用 (gemini-embedding-001)
→ _store_embedding() → SQLite
文件写入: {buckets_dir}/dynamic/{domain}/{name}_{id}.md
YAML frontmatter:
id, name, tags, domain, valence, arousal,
importance, type="dynamic", created, last_active,
activation_count=0 # B-04: starts at 0; touch() bumps to 1+
┌─────── 记忆桶存活期 ──────────────────────────────────────┐
│ │
│ 每次被 breath(query) 检索命中: │
│ bucket_mgr.touch(bucket_id) │
│ → last_active = now_iso() │
│ → activation_count += 1 │
│ → _time_ripple(source_id, now, hours=48) │
│ 对 48h 内邻近桶 activation_count += 0.3 │
│ │
│ 被 dream() 消化: │
│ hold(feel=True, source_bucket=id) → │
│ bucket_mgr.update(id, digested=True) │
│ │
│ 被 trace(resolved=1) 标记: │
│ resolved=True → decay score ×0.05 (或 ×0.02) │
│ │
└───────────────────────────────────────────────────────────┘
decay_engine 后台循环 (每 check_interval_hours=24h)
run_decay_cycle()
→ 列出所有动态桶
→ calculate_score(metadata)
importance × activation_count^0.3
× e^(-λ×days)
× combined_weight
× resolved_factor
× urgency_boost
→ score < threshold (0.3)?
┌─────┴──────┐
│ │
YES NO
│ │
▼ ▼
bucket_mgr.archive(id) 继续存活
→ _move_bucket()
→ 文件移动到 archive/
→ frontmatter type="archived"
记忆桶归档(不再参与浮现/搜索)
但文件仍存在,可通过 pulse(include_archive=True) 查看
```
### 4.2 feel 桶的特殊路径
```
hold(feel=True, source_bucket="xxx", valence=0.45)
bucket_mgr.create(bucket_type="feel")
写入 feel/ 目录
├─→ embedding_engine.generate_and_store()(供 dream 结晶检测)
└─→ bucket_mgr.update(source_bucket, digested=True, model_valence=0.45)
源桶 resolved_factor → 0.02
加速衰减直到归档
feel 桶自身:
- calculate_score() 返回固定 50.0
- 不参与普通 breath 浮现
- 不参与 dreaming 候选
- 只通过 breath(domain="feel") 读取
- 永不归档
```
---
## 五、代码与规格差异汇总(审查版)
> 本节由完整源码审查生成(2026-04-21),记录原待实现项最终状态、新发现 Bug 及参数决策。
---
### 5.1 原待实现项最终状态
| 编号 | 原描述 | 状态 | 结论 |
|------|--------|------|------|
| ⚠️-1 | `dehydrate()` 无本地降级 fallback | **已确认为设计决策** | API 不可用时直接向 MCP 调用端报错(RuntimeError),不降级;见三、降级行为表 |
| ⚠️-2 | `run_decay_cycle()` auto_resolved 实现存疑 | ✅ 已确认实现 | `decay_engine.py` 完整实现 imp≤4 + >30天 + 未解决 → `bucket_mgr.update(resolved=True)` |
| ⚠️-3 | `list_all()` 是否遍历 `feel/` 子目录 | ✅ 已确认实现 | `list_all()` dirs 明确包含 `self.feel_dir`,递归遍历 |
| ⚠️-4 | `_time_ripple()` 浮点增量被 `int()` 截断 | ❌ 已确认 Bug | 见 B-03决策见下 |
| ⚠️-5 | Dashboard `/api/*` 路由认证覆盖 | ✅ 已确认覆盖 | 所有 `/api/buckets`、`/api/search`、`/api/network`、`/api/bucket/{id}`、`/api/breath-debug` 均调用 `_require_auth(request)` |
---
### 5.2 新发现 Bug 及修复决策
| 编号 | 场景 | 严重度 | 问题描述 | 决策 & 修复方案 |
|------|------|--------|----------|----------------|
| **B-01** | 场景7a | 高 | `bucket_mgr.update(resolved=True)` 当前会将桶立即移入 `archive/`type="archived"),规格预期"降权留存、关键词可激活"。resolved 桶实质上立即从所有搜索路径消失。 | **修复**:移除 `bucket_manager.py` `update()` 中 `resolved → _move_bucket(archive_dir)` 的自动归档逻辑,仅更新 frontmatter `resolved=True`,由 decay 引擎自然衰减至 archive。 |
| **B-03** | 全局 | 高 | `_time_ripple()` 对 `activation_count` 做浮点增量(+0.3),但 `calculate_score()` 中 `max(1, int(...))` 截断小数,增量丢失,时间涟漪对衰减分无实际效果。 | **修复**`decay_engine.py` `calculate_score()` 中改为 `activation_count = max(1.0, float(metadata.get("activation_count", 1)))` |
| **B-04** | 场景1 | 中 | `bucket_manager.create()` 初始化 `activation_count=1`,冷启动检测条件 `activation_count==0` 对所有正常创建的桶永不满足,高重要度新桶不被优先浮现。 | **决策:初始化改为 `activation_count=0`**。语义上"创建"≠"被召回"`touch()` 首次命中后变为 1冷启动检测自然生效。规格已更新见场景1步骤6 & 场景3 create 详情)。 |
| **B-05** | 场景5 | 中 | `bucket_manager.py` `_calc_time_score()` 实现 `e^(-0.1×days)`,规格为 `e^(-0.02×days)`,衰减速度快 5 倍30天后时间分 ≈ 0.05(规格预期 ≈ 0.55),旧记忆时间维度近乎失效。 | **决策:保留规格值 `0.02`**。记忆系统中旧记忆应通过关键词仍可被唤醒,时间维度是辅助信号不是淘汰信号。修复:`_calc_time_score()` 改为 `return math.exp(-0.02 * days)` |
| **B-06** | 场景5 | 中 | `bucket_manager.py` `w_time` 默认值为 `2.5`,规格为 `1.5`,叠加 B-05 会导致时间维度严重偏重近期记忆。 | **决策:保留规格值 `1.5`**。修复:`w_time = scoring.get("time_proximity", 1.5)` |
| **B-07** | 场景5 | 中 | `bucket_manager.py` `content_weight` 默认值为 `3.0`,规格为 `1.0`body×1。正文权重过高导致合并检测`search(content, limit=1)`)误判——内容相似但主题不同的桶被错误合并。 | **决策:保留规格值 `1.0`**。正文是辅助信号,主要靠 name/tags/domain 识别同话题桶。修复:`content_weight = scoring.get("content_weight", 1.0)` |
| **B-08** | 场景8 | 低 | `run_decay_cycle()` 内 auto_resolve 后继续使用旧 `meta` 变量计算 score`resolved_factor=0.05` 需等下一 cycle 才生效。 | **修复**auto_resolve 成功后执行 `meta["resolved"] = True` 刷新本地 meta 变量。 |
| **B-09** | 场景3 | 低 | `hold()` 非 feel 路径中,用户显式传入的 `valence`/`arousal` 被 `analyze()` 返回值完全覆盖。 | **修复**:若用户显式传入(`0 <= valence <= 1`),优先使用用户值,`analyze()` 结果作为 fallback。 |
| **B-10** | 场景10 | 低 | feel 桶以 `domain=[]` 创建,但 `bucket_manager.create()` 中 `domain or ["未分类"]` 兜底写入 `["未分类"]`,数据不干净。 | **修复**`create()` 中对 `bucket_type=="feel"` 单独处理,允许空 domain 直接写入。 |
---
### 5.3 已确认正常实现
- `breath()` 浮现模式不调用 `touch()`,不重置衰减计时器
- `feel` 桶 `calculate_score()` 返回固定 50.0,永不归档
- `breath(domain="feel")` 独立通道,按 `created` 降序,不压缩展示原文
- `decay_engine.calculate_score()` 短期≤3天/ 长期(>3天权重分离公式
- `urgency_boost``arousal > 0.7 && !resolved → ×1.5`
- `dream()` 连接提示best_sim > 0.5+ 结晶提示feel 相似度 > 0.7 × ≥2 个)
- 所有 `/api/*` Dashboard 路由均受 `_require_auth` 保护
- `trace(delete=True)` 同步调用 `embedding_engine.delete_embedding()`
- `grow()` 单条失败 `try/except` 隔离,标注 `⚠️条目名`,其他条继续
---
*本文档基于代码直接推导,每个步骤均可对照源文件函数名和行为验证。如代码更新,请同步修订此文档。*

45
ENV_VARS.md Normal file
View File

@@ -0,0 +1,45 @@
# 环境变量参考
| 变量名 | 必填 | 默认值 | 说明 |
|--------|------|--------|------|
| `OMBRE_API_KEY` | 是 | — | Gemini / OpenAI-compatible API Key用于脱水(dehydration)和向量嵌入 |
| `OMBRE_BASE_URL` | 否 | `https://generativelanguage.googleapis.com/v1beta/openai/` | API Base URL可替换为代理或兼容接口 |
| `OMBRE_TRANSPORT` | 否 | `stdio` | MCP 传输模式:`stdio` / `sse` / `streamable-http` |
| `OMBRE_PORT` | 否 | `8000` | HTTP/SSE 模式监听端口(仅 `sse` / `streamable-http` 生效) |
| `OMBRE_BUCKETS_DIR` | 否 | `./buckets` | 记忆桶文件存放目录(绑定 Docker Volume 时务必设置) |
| `OMBRE_HOOK_URL` | 否 | — | Breath/Dream Webhook 推送地址POST JSON留空则不推送 |
| `OMBRE_HOOK_SKIP` | 否 | `false` | 设为 `true`/`1`/`yes` 跳过 Webhook 推送(即使 `OMBRE_HOOK_URL` 已设置) |
| `OMBRE_DASHBOARD_PASSWORD` | 否 | — | 预设 Dashboard 访问密码;设置后覆盖文件存储的密码,首次访问不弹设置向导 |
| `OMBRE_DEHYDRATION_MODEL` | 否 | `deepseek-chat` | 脱水/打标/合并/拆分用的 LLM 模型名(覆盖 `dehydration.model` |
| `OMBRE_DEHYDRATION_BASE_URL` | 否 | `https://api.deepseek.com/v1` | 脱水模型的 API Base URL覆盖 `dehydration.base_url` |
| `OMBRE_MODEL` | 否 | — | `OMBRE_DEHYDRATION_MODEL` 的别名(前者优先) |
| `OMBRE_EMBEDDING_MODEL` | 否 | `gemini-embedding-001` | 向量嵌入模型名(覆盖 `embedding.model` |
| `OMBRE_EMBEDDING_BASE_URL` | 否 | — | 向量嵌入的 API Base URL覆盖 `embedding.base_url`;留空则复用脱水配置) |
## 说明
- `OMBRE_API_KEY` 也可在 `config.yaml``dehydration.api_key` / `embedding.api_key` 中设置,但**强烈建议**通过环境变量传入,避免密钥写入文件。
- `OMBRE_DASHBOARD_PASSWORD` 设置后,Dashboard 的"修改密码"功能将被禁用(显示提示,建议直接修改环境变量)。未设置则密码存储在 `{buckets_dir}/.dashboard_auth.json`(SHA-256 + salt)。
## Webhook 推送格式 (`OMBRE_HOOK_URL`)
设置 `OMBRE_HOOK_URL` 后,Ombre Brain 会在以下事件发生时**异步**(fire-and-forget,5 秒超时)`POST` JSON 到该 URL:
| 事件名 (`event`) | 触发时机 | `payload` 字段 |
|------------------|----------|----------------|
| `breath` | MCP 工具 `breath()` 返回时 | `mode` (`ok`/`empty`), `matches`, `chars` |
| `dream` | MCP 工具 `dream()` 返回时 | `recent`, `chars` |
| `breath_hook` | HTTP `GET /breath-hook` 命中(SessionStart 钩子) | `surfaced`, `chars` |
| `dream_hook` | HTTP `GET /dream-hook` 命中 | `surfaced`, `chars` |
请求体结构(JSON):
```json
{
"event": "breath",
"timestamp": 1730000000.123,
"payload": { "...": "..." }
}
```
Webhook 推送失败仅在服务日志中以 WARNING 级别记录,**不会影响 MCP 工具的正常返回**。

View File

@@ -65,7 +65,7 @@
**自动化处理** **自动化处理**
- 存入时 LLM 自动分析 domain/valence/arousal/tags/name - 存入时 LLM 自动分析 domain/valence/arousal/tags/name
- 大段日记 LLM 拆分为 2~6 条独立记忆 - 大段日记 LLM 拆分为 2~6 条独立记忆
- 浮现时自动脱水压缩LLM 压缩保语义API 不可用降级到本地关键词提取 - 浮现时自动脱水压缩LLM 压缩保语义API 不可用时直接报错,无静默降级)
- Wikilink `[[]]` 由 LLM 在内容中标记 - Wikilink `[[]]` 由 LLM 在内容中标记
--- ---
@@ -76,7 +76,7 @@
| 工具 | 关键参数 | 功能 | | 工具 | 关键参数 | 功能 |
|---|---|---| |---|---|---|
| `breath` | query, max_tokens, domain, valence, arousal, max_results | 检索/浮现记忆 | | `breath` | query, max_tokens, domain, valence, arousal, max_results, **importance_min** | 检索/浮现记忆 |
| `hold` | content, tags, importance, pinned, feel, source_bucket, valence, arousal | 存储记忆 | | `hold` | content, tags, importance, pinned, feel, source_bucket, valence, arousal | 存储记忆 |
| `grow` | content | 日记拆分归档 | | `grow` | content | 日记拆分归档 |
| `trace` | bucket_id, name, domain, valence, arousal, importance, tags, resolved, pinned, digested, content, delete | 修改元数据/内容/删除 | | `trace` | bucket_id, name, domain, valence, arousal, importance, tags, resolved, pinned, digested, content, delete | 修改元数据/内容/删除 |
@@ -85,10 +85,11 @@
**工具详细行为** **工具详细行为**
**`breath`** — 种模式: **`breath`** — 种模式:
- **浮现模式**(无 query无参调用按衰减引擎活跃度排序返回 top 记忆,permanent/pinned 始终浮现 - **浮现模式**(无 query无参调用按衰减引擎活跃度排序返回 top 记忆,钉选桶始终展示;冷启动检测(`activation_count==0 && importance>=8`)的桶最多 2 个插入最前,再 Top-1 固定 + Top-20 随机打乱
- **检索模式**(有 query关键词 + 向量双通道搜索四维评分topic×4 + emotion×2 + time×2.5 + importance×1阈值过滤 - **检索模式**(有 query关键词 + 向量双通道搜索四维评分topic×4 + emotion×2 + time×2.5 + importance×1阈值过滤
- **Feel 检索**`domain="feel"`):特殊通道,按创建时间倒序返回所有 feel 类型桶,不走评分逻辑 - **Feel 检索**`domain="feel"`):特殊通道,按创建时间倒序返回所有 feel 类型桶,不走评分逻辑
- **重要度批量模式**`importance_min>=1`):跳过语义搜索,直接筛选 importance≥importance_min 的桶,按 importance 降序,最多 20 条
- 若指定 valence对匹配桶的 valence 微调 ±0.1(情感记忆重构) - 若指定 valence对匹配桶的 valence 微调 ±0.1(情感记忆重构)
**`hold`** — 两种模式: **`hold`** — 两种模式:
@@ -120,26 +121,41 @@
| `/breath-hook` | GET | SessionStart 钩子 |
| `/dream-hook` | GET | Dream 钩子 |
| `/dashboard` | GET | Dashboard 页面 |
| `/api/buckets` | GET | 桶列表 🔒 |
| `/api/bucket/{id}` | GET | 桶详情 🔒 |
| `/api/search?q=` | GET | 搜索 🔒 |
| `/api/network` | GET | 向量相似网络 🔒 |
| `/api/breath-debug` | GET | 评分调试 🔒 |
| `/api/config` | GET | 配置查看(key 脱敏)🔒 |
| `/api/config` | POST | 热更新配置 🔒 |
| `/api/status` | GET | 系统状态(版本/桶数/引擎)🔒 |
| `/api/import/upload` | POST | 上传并启动历史对话导入 🔒 |
| `/api/import/status` | GET | 导入进度查询 🔒 |
| `/api/import/pause` | POST | 暂停/继续导入 🔒 |
| `/api/import/patterns` | GET | 导入完成后词频规律检测 🔒 |
| `/api/import/results` | GET | 已导入记忆桶列表 🔒 |
| `/api/import/review` | POST | 批量审阅/批准导入结果 🔒 |
| `/auth/status` | GET | 认证状态(是否需要初始化密码)|
| `/auth/setup` | POST | 首次设置密码 |
| `/auth/login` | POST | 密码登录,颁发 session cookie |
| `/auth/logout` | POST | 注销 session |
| `/auth/change-password` | POST | 修改密码 🔒 |
> 🔒 = 需要 Dashboard 认证(未认证返回 401 JSON)
**Dashboard 认证**
- 密码存储:SHA-256 + 随机 salt,保存于 `{buckets_dir}/.dashboard_auth.json`
- 环境变量 `OMBRE_DASHBOARD_PASSWORD` 设置后,覆盖文件密码(只读,不可通过 Dashboard 修改)
- Session:内存字典,服务重启后失效;cookie `ombre_session`(HttpOnly, SameSite=Lax, 7 天)
- `/health`, `/breath-hook`, `/dream-hook`, `/mcp*` 路径不受保护(公开)
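The salted SHA-256 scheme described above can be sketched as follows. This is a minimal illustration of the storage format, not the actual server code — the helper names and dict fields are assumptions; only the file name `.dashboard_auth.json` comes from the doc.

```python
import hashlib
import secrets

# Hypothetical helpers illustrating "SHA-256 + random salt" password storage.
def set_password(password: str) -> dict:
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + password).encode("utf-8")).hexdigest()
    return {"salt": salt, "hash": digest}  # what .dashboard_auth.json would hold

def verify_password(password: str, record: dict) -> bool:
    digest = hashlib.sha256((record["salt"] + password).encode("utf-8")).hexdigest()
    # constant-time compare to avoid timing side channels
    return secrets.compare_digest(digest, record["hash"])

record = set_password("hunter2")
print(verify_password("hunter2", record))  # True
print(verify_password("wrong", record))    # False
```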
**Dashboard(6 个 Tab)**
1. 记忆桶列表:6 种过滤器 + 主题域过滤 + 搜索 + 详情面板
2. Breath 模拟:输入参数 → 可视化五步流程 → 四维条形图
3. 记忆网络:Canvas 力导向图(节点=桶,边=相似度)
4. 配置:热更新脱水/embedding/合并参数
5. 导入:历史对话拖拽上传 → 分块处理进度条 → 词频规律分析 → 导入结果审阅
6. 设置:服务状态监控、修改密码、退出登录
**部署选项**
1. 本地 stdio:`python server.py`
@@ -152,7 +168,7 @@
**迁移/批处理工具**:`migrate_to_domains.py`、`reclassify_domains.py`、`reclassify_api.py`、`backfill_embeddings.py`、`write_memory.py`、`check_buckets.py`、`import_memory.py`(历史对话导入引擎)
**降级策略**
- 脱水 API 不可用 → 直接抛 RuntimeError(设计决策,详见 BEHAVIOR_SPEC.md 三、降级行为表)
- 向量搜索不可用 → 纯 fuzzy match
- 逐条错误隔离(grow 中单条失败不影响其他)
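The per-item error isolation pattern from the last bullet can be sketched like this — an assumed minimal repro, not the real `grow` code: one failing entry is logged and collected, the remaining entries are still processed.

```python
import logging

logger = logging.getLogger("demo")

# Hypothetical helper showing per-entry error isolation: a single failure
# is logged at WARNING and skipped instead of aborting the whole batch.
def process_entries(entries, handler):
    results, errors = [], []
    for entry in entries:
        try:
            results.append(handler(entry))
        except Exception as e:
            logger.warning("entry failed: %r (%s)", entry, e)
            errors.append(entry)
    return results, errors

def handler(s):
    if not s:
        raise ValueError("empty entry")
    return s.upper()

ok, bad = process_entries(["a", "", "b"], handler)
print(ok, bad)  # ['A', 'B'] ['']
```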
@@ -172,6 +188,7 @@
| `OMBRE_BUCKETS_DIR` | 记忆桶存储目录路径 | 否 | `""` → 回退到 config 或 `./buckets` |
| `OMBRE_HOOK_URL` | SessionStart 钩子调用的服务器 URL | 否 | `"http://localhost:8000"` |
| `OMBRE_HOOK_SKIP` | 设为 `"1"` 跳过 SessionStart 钩子 | 否 | 未设置(不跳过) |
| `OMBRE_DASHBOARD_PASSWORD` | 预设 Dashboard 访问密码;设置后覆盖文件密码,首次访问不弹设置向导 | 否 | `""` |
环境变量优先级:`环境变量 > config.yaml > 硬编码默认值`。所有环境变量在 `utils.py` 中读取并注入 config dict。
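The `env > config.yaml > default` resolution order can be sketched as below. The helper name is illustrative — `utils.py` may implement this differently; the assumption here is that an empty env value counts as "unset", as the `""` default in the table suggests.

```python
import os

# Hypothetical resolver for the priority chain: env var > config.yaml > default.
def resolve(env_key, config, config_key, default):
    env_val = os.environ.get(env_key, "")
    if env_val:                      # empty string treated as unset
        return env_val
    return config.get(config_key, default)

os.environ.pop("OMBRE_BUCKETS_DIR", None)
print(resolve("OMBRE_BUCKETS_DIR", {"buckets_dir": "/vault"}, "buckets_dir", "./buckets"))
# /vault — config.yaml wins when the env var is unset
os.environ["OMBRE_BUCKETS_DIR"] = "/data/buckets"
print(resolve("OMBRE_BUCKETS_DIR", {"buckets_dir": "/vault"}, "buckets_dir", "./buckets"))
# /data/buckets — env var wins
```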
@@ -199,7 +216,7 @@
| `server.py` | MCP 服务器主入口,注册工具 + Dashboard API + 钩子端点 | `bucket_manager`, `dehydrator`, `decay_engine`, `embedding_engine`, `utils` | `test_tools.py` |
| `bucket_manager.py` | 记忆桶 CRUD、多维索引搜索、wikilink 注入、激活更新 | `utils` | `server.py`, `check_buckets.py`, `backfill_embeddings.py` |
| `decay_engine.py` | 衰减引擎:遗忘曲线计算、自动归档、自动结案 | 无(接收 `bucket_mgr` 实例) | `server.py` |
| `dehydrator.py` | 数据脱水压缩 + 合并 + 自动打标(LLM API,不可用时报 RuntimeError) | `utils` | `server.py` |
| `embedding_engine.py` | 向量化引擎:Gemini embedding API + SQLite + 余弦搜索 | `utils` | `server.py`, `backfill_embeddings.py` |
| `utils.py` | 配置加载、日志、路径安全、ID 生成、token 估算 | 无 | 所有模块 |
| `write_memory.py` | 手动写入记忆 CLI(绕过 MCP) | 无(独立脚本) | 无 |
@@ -372,12 +389,12 @@
### 5.4 为什么有 dehydration(脱水)这一层
**决策**:存入前先用 LLM 压缩内容(保留信息密度,去除冗余表达);API 不可用时直接抛出 `RuntimeError`,不静默降级。
**理由**:
- MCP 上下文有 token 限制,原始对话冗长,需要压缩
- LLM 压缩能保留语义和情感色彩,纯截断会丢信息
- 本地关键词提取质量不足以替代语义打标与合并,静默降级会产生错误分类记忆,比报错更危险。详见 BEHAVIOR_SPEC.md 三、降级行为表。
**放弃方案**:只做截断。信息损失太大。
@@ -479,3 +496,92 @@ type: dynamic
桶正文内容...
```
---
## 7. Bug 修复记录 (B-01 至 B-10)
### B-01 — `update(resolved=True)` 自动归档 🔴 高
- **文件**: `bucket_manager.py` → `update()`
- **问题**: `resolved=True` 时立即调用 `_move_bucket(archive_dir)` 将桶移入 `archive/`
- **修复**: 移除 `_move_bucket` 逻辑;resolved 桶留在 `dynamic/`,由 decay 引擎自然淘汰
- **影响**: 已解决的桶仍可被关键词检索命中(降权但不消失)
- **测试**: `tests/regression/test_issue_B01.py`、`tests/integration/test_scenario_07_trace.py`
### B-03 — `int()` 截断浮点 activation_count 🔴 高
- **文件**: `decay_engine.py` → `calculate_score()`
- **问题**: `max(1, int(activation_count))` 把 `_time_ripple` 写入的 1.3 截断为 1,涟漪加成失效
- **修复**: 改为 `max(1.0, float(activation_count))`
- **影响**: 时间涟漪效果现在正确反映在 score 上;高频访问的桶衰减更慢
- **测试**: `tests/regression/test_issue_B03.py`、`tests/unit/test_calculate_score.py`
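B-03 boils down to one truncation. A standalone minimal repro (not the real `calculate_score()`):

```python
# int() drops the fractional ripple bonus; a float cast keeps it.
activation_count = 1.3  # 1.0 base + 0.3 written by _time_ripple

buggy = max(1, int(activation_count))      # int() truncates 1.3 → 1, bonus lost
fixed = max(1.0, float(activation_count))  # keeps 1.3, ripple bonus survives

print(buggy, fixed)  # 1 1.3
```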
### B-04 — `create()` 初始化 activation_count=1 🟠 中
- **文件**: `bucket_manager.py` → `create()`
- **问题**: `activation_count=1` 导致冷启动检测条件 `== 0` 永不满足,新建重要桶无法浮现
- **修复**: 改为 `activation_count=0`,`touch()` 首次命中后变 1
- **测试**: `tests/regression/test_issue_B04.py`、`tests/integration/test_scenario_01_cold_start.py`
### B-05 — 时间衰减系数 0.1 过快 🟠 中
- **文件**: `bucket_manager.py` → `_calc_time_score()`
- **问题**: `math.exp(-0.1 * days)` 导致 30 天后得分仅剩 ≈0.05,远快于人类记忆曲线
- **修复**: 改为 `math.exp(-0.02 * days)`,30 天后 ≈0.549
- **影响**: 记忆保留时间更符合人类认知模型
- **测试**: `tests/regression/test_issue_B05.py`、`tests/unit/test_score_components.py`
### B-06 — `w_time` 默认值 2.5 过高 🟠 中
- **文件**: `bucket_manager.py` → `_calc_final_score()`(或评分调用处)
- **问题**: `scoring.get("time_proximity", 2.5)` — 时间权重过高,近期低质量记忆得分高于高质量旧记忆
- **修复**: 改为 `scoring.get("time_proximity", 1.5)`
- **测试**: `tests/regression/test_issue_B06.py`、`tests/unit/test_score_components.py`
### B-07 — `content_weight` 默认值 3.0 过高 🟠 中
- **文件**: `bucket_manager.py` → `_calc_topic_score()`
- **问题**: `scoring.get("content_weight", 3.0)` — 内容权重远大于名字权重(×3),导致内容重复堆砌的桶得分高于名字精确匹配的桶
- **修复**: 改为 `scoring.get("content_weight", 1.0)`
- **影响**: 名字完全匹配 > 标签匹配 > 内容匹配的得分层级现在正确
- **测试**: `tests/regression/test_issue_B07.py`、`tests/unit/test_topic_score.py`
### B-08 — `run_decay_cycle()` 同轮 auto_resolve 后 score 未降权 🟡 低
- **文件**: `decay_engine.py` → `run_decay_cycle()`
- **问题**: `auto_resolve` 标记后立即用旧 `meta`(stale)计算 score,`resolved_factor=0.05` 未生效
- **修复**: 在 `bucket_mgr.update(resolved=True)` 后立即执行 `meta["resolved"] = True`,确保同轮降权
- **测试**: `tests/regression/test_issue_B08.py`、`tests/integration/test_scenario_08_decay.py`
### B-09 — `hold()` 用 analyze() 覆盖用户传入的 valence/arousal 🟡 低
- **文件**: `server.py` → `hold()`
- **问题**: 先调 `analyze()`,再直接用结果覆盖用户传入的情感值,情感准确性丢失
- **修复**: 使用 `final_valence = user_valence if user_valence is not None else analyze_result.get("valence")`
- **影响**: 用户明确传入的情感坐标(包括 0.0)不再被 LLM 结果覆盖
- **测试**: `tests/regression/test_issue_B09.py`、`tests/integration/test_scenario_03_hold.py`
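The B-09 fix hinges on `is not None` rather than truthiness — `user_valence or analyzed` would discard a legitimate explicit `0.0` because `0.0` is falsy. A minimal sketch with illustrative names:

```python
# Explicit user value (including 0.0) wins over the analyze() result.
def pick_valence(user_valence, analyzed):
    return user_valence if user_valence is not None else analyzed

print(pick_valence(0.0, 0.8))   # 0.0 — user's explicit 0.0 is kept
print(pick_valence(None, 0.8))  # 0.8 — falls back to analyze()
```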
### B-10 — feel 桶 `domain=[]` 被填充为 `["未分类"]` 🟡 低
- **文件**: `bucket_manager.py` → `create()`
- **问题**: `if not domain: domain = ["未分类"]` 对所有桶类型生效,feel 桶的空 domain 被错误填充
- **修复**: 改为 `if not domain and bucket_type != "feel": domain = ["未分类"]`
- **影响**: `breath(domain="feel")` 通道过滤逻辑现在正确(feel 桶 domain 始终为空列表)
- **测试**: `tests/regression/test_issue_B10.py`、`tests/integration/test_scenario_10_feel.py`
---
### Bug 修复汇总表
| ID | 严重度 | 文件 | 方法 | 一句话描述 |
|---|---|---|---|---|
| B-01 | 🔴 高 | `bucket_manager.py` | `update()` | resolved 桶不再自动归档 |
| B-03 | 🔴 高 | `decay_engine.py` | `calculate_score()` | float activation_count 不被 int() 截断 |
| B-04 | 🟠 中 | `bucket_manager.py` | `create()` | 初始 activation_count=0 |
| B-05 | 🟠 中 | `bucket_manager.py` | `_calc_time_score()` | 时间衰减系数 0.02(原 0.1) |
| B-06 | 🟠 中 | `bucket_manager.py` | 评分权重配置 | w_time 默认 1.5(原 2.5) |
| B-07 | 🟠 中 | `bucket_manager.py` | `_calc_topic_score()` | content_weight 默认 1.0(原 3.0) |
| B-08 | 🟡 低 | `decay_engine.py` | `run_decay_cycle()` | auto_resolve 同轮应用 ×0.05 |
| B-09 | 🟡 低 | `server.py` | `hold()` | 用户 valence/arousal 优先 |
| B-10 | 🟡 低 | `bucket_manager.py` | `create()` | feel 桶 domain=[] 不被填充 |


@@ -159,7 +159,7 @@ OMBRE_API_KEY=你的API密钥
> 3. Set `dehydration.base_url` to `https://generativelanguage.googleapis.com/v1beta/openai` in `config.yaml`
> Also supports DeepSeek, Ollama, LM Studio, vLLM, or any OpenAI-compatible API.
没有 API key 则脱水压缩和自动打标功能不可用(会报错),但记忆的读写和检索仍正常工作。如果暂时不用脱水功能,可以留空:
```
OMBRE_API_KEY= OMBRE_API_KEY=
@@ -435,6 +435,27 @@ Sensitive config via env vars:
- `OMBRE_API_KEY` — LLM API 密钥
- `OMBRE_TRANSPORT` — 覆盖传输方式
- `OMBRE_BUCKETS_DIR` — 覆盖存储路径
- `OMBRE_DASHBOARD_PASSWORD` — Dashboard 访问密码(可选,见下)
## Dashboard 认证 / Dashboard Auth
自 v1.3.0 起,Dashboard 和所有 `/api/*` 端点均受密码保护。
Since v1.3.0, the Dashboard and all `/api/*` endpoints are password-protected.
**首次访问**:若未设置密码,浏览器会弹出设置向导,填写并确认密码后即可使用。
**First visit**: If no password is set, a setup wizard will appear. Enter and confirm a password to get started.
**通过环境变量预设密码**:在 `docker-compose.user.yml` 中添加:
**Pre-set via env var** in your `docker-compose.user.yml`:
```yaml
environment:
- OMBRE_DASHBOARD_PASSWORD=your_password_here
```
设置后,Dashboard 的“修改密码”功能将被禁用,必须通过环境变量修改。
When set, the in-Dashboard password change is disabled — modify the env var directly.
完整环境变量说明见 [ENV_VARS.md](ENV_VARS.md)。
Full env var reference: [ENV_VARS.md](ENV_VARS.md).
## 衰减公式 / Decay Formula
@@ -570,14 +591,14 @@ Dashboard浏览器打开 `http://localhost:8000/dashboard`
> **Free tier won't work**: Render free tier has **no persistent disk** — all memory data is lost on restart. It also sleeps on inactivity. **Starter plan ($7/mo) or above is required.**
项目根目录已包含 `render.yaml`,点击按钮后:
1. 设置 `OMBRE_API_KEY`:任何 OpenAI 兼容 API 的 key(**必需**,未设置时 hold/grow 会报错、仅检索类工具可用)
2. (可选)设置 `OMBRE_BASE_URL`:API 地址,支持任意 OpenAI 兼容地址,如 `https://api.deepseek.com/v1` / `http://123.1.1.1:7689/v1` / `http://your-ollama:11434/v1`
3. Render 自动挂载持久化磁盘到 `/opt/render/project/src/buckets`
4. Dashboard:`https://<你的服务名>.onrender.com/dashboard`
5. 部署后 MCP URL:`https://<你的服务名>.onrender.com/mcp`
`render.yaml` is included. After clicking the button:
1. `OMBRE_API_KEY`: any OpenAI-compatible key (**required** for hold/grow; without it those tools raise an error)
2. (Optional) `OMBRE_BASE_URL`: any OpenAI-compatible endpoint, e.g. `https://api.deepseek.com/v1`, `http://123.1.1.1:7689/v1`, `http://your-ollama:11434/v1`
3. Persistent disk auto-mounts at `/opt/render/project/src/buckets`
4. Dashboard: `https://<your-service>.onrender.com/dashboard`
@@ -599,7 +620,7 @@ Dashboard浏览器打开 `http://localhost:8000/dashboard`
- Zeabur auto-detects the `Dockerfile` in root and builds via Docker
2. **设置环境变量 / Set environment variables**(服务页面 → **Variables** 标签页)
- `OMBRE_API_KEY`(**必需**)— LLM API 密钥;未设置时 hold/grow/dream 会报错
- `OMBRE_BASE_URL`(可选)— API 地址,如 `https://api.deepseek.com/v1`
> ⚠️ **不需要**手动设置 `OMBRE_TRANSPORT` 和 `OMBRE_BUCKETS_DIR`,Dockerfile 里已经设好了默认值。Zeabur 对单阶段 Dockerfile 会自动注入控制台设置的环境变量。
@@ -783,6 +804,51 @@ sudo systemctl restart ombre-brain # 示例
> - If `requirements.txt` changed, Docker rebuild handles it automatically; non-Docker users need `pip install -r requirements.txt`
> - After updating, visit `/health` to verify the service is running
## 测试 / Testing
测试套件覆盖规格书所有场景(场景 01–11),以及 B-01 至 B-10 全部 bug 修复的回归测试。
The test suite covers all spec scenarios (01–11) and regression tests for every bug fix (B-01 to B-10).
### 快速运行 / Quick Start
```bash
pip install pytest pytest-asyncio
pytest tests/ # 全部测试
pytest tests/unit/ # 单元测试
pytest tests/integration/ # 集成测试(场景全流程)
pytest tests/regression/ # 回归测试B-01..B-10
pytest tests/ -k "B01" # 单个回归测试
pytest tests/ -v # 详细输出
```
### 测试层级 / Test Layers
| 目录 Directory | 内容 Contents |
|---|---|
| `tests/unit/` | 单独测试 calculate_score、topic_score、时间得分、CRUD 等核心函数 |
| `tests/integration/` | 场景全流程:冷启动、hold、search、trace、decay、feel 等 11 个场景 |
| `tests/regression/` | 每个 bug(B-01 至 B-10)独立回归测试,含边界条件 |
### 回归测试覆盖 / Regression Coverage
| 文件 | Bug | 核心断言 |
|---|---|---|
| `test_issue_B01.py` | resolved 桶不再自动归档 | `update(resolved=True)` 后桶留在 `dynamic/`,搜索仍可命中,得分 ×0.05 |
| `test_issue_B03.py` | float activation_count 不被 int() 截断 | 1.3 > 1.0 得分,`_time_ripple` 写入 0.3 增量 |
| `test_issue_B04.py` | create() 初始 activation_count=0 | 新建桶满足冷启动条件,touch() 后变 1 |
| `test_issue_B05.py` | 时间衰减系数 0.02(原 0.1| 30天 ≈ 0.549,非旧值 0.049 |
| `test_issue_B06.py` | w_time 默认 1.5(原 2.5| `BucketManager.w_time == 1.5` |
| `test_issue_B07.py` | content_weight 默认 1.0(原 3.0| 名字完全匹配得分 > 内容模糊匹配 |
| `test_issue_B08.py` | auto_resolve 同轮应用降权因子 | stale meta 修复后 score ×0.05 立即生效 |
| `test_issue_B09.py` | hold() 保留用户传入的 valence/arousal | 用户值优先于 analyze() 结果 |
| `test_issue_B10.py` | feel 桶 domain=[] 不被填充 | feel 桶保持 `[]`,dynamic 桶正确填 `["未分类"]` |
> **测试隔离**:所有测试运行在 `tmp_path` 临时目录,绝不触碰真实记忆数据。
> **Test isolation**: All tests run in `tmp_path` — your real memory data is never touched.
---
## License
MIT


@@ -13,7 +13,6 @@ Free tier: 1500 requests/day, so ~75 batches of 20.
import asyncio
import argparse
import sys
sys.path.insert(0, ".")
from utils import load_config
@@ -79,7 +78,7 @@ async def backfill(batch_size: int = 20, dry_run: bool = False):
print(f" ERROR: {b['id'][:12]} ({name[:30]}): {e}")
if i + batch_size < total:
print(" Waiting 2s before next batch...")
await asyncio.sleep(2)
print(f"\n=== Done: {success} success, {failed} failed, {total - success - failed} skipped ===")


@@ -1,205 +0,0 @@
# Ombre Brain
一个给 Claude 用的长期情绪记忆系统。基于 Russell 效价/唤醒度坐标打标Obsidian 做存储层MCP 接入,带遗忘曲线。
A long-term emotional memory system for Claude. Tags memories using Russell's valence/arousal coordinates, stores them as Obsidian-compatible Markdown, connects via MCP, and has a forgetting curve.
---
## 它是什么 / What is this
Claude 没有跨对话记忆。每次对话结束,之前聊过的所有东西都会消失。
Ombre Brain 给了它一套持久记忆——不是那种冷冰冰的键值存储,而是带情感坐标的、会自然衰减的、像人类记忆一样会遗忘和浮现的系统。
Claude has no cross-conversation memory. Everything from a previous chat vanishes once it ends.
Ombre Brain gives it persistent memory — not cold key-value storage, but a system with emotional coordinates, natural decay, and forgetting/surfacing mechanics that loosely mimic how human memory works.
核心特点 / Key features:
- **情感坐标打标 / Emotional tagging**: 每条记忆用 Russell 环形情感模型的 valence效价和 arousal唤醒度两个连续维度标记。不是"开心/难过"这种离散标签。
Each memory is tagged with two continuous dimensions from Russell's circumplex model: valence and arousal. Not discrete labels like "happy/sad".
- **自然遗忘 / Natural forgetting**: 改进版艾宾浩斯遗忘曲线。不活跃的记忆自动衰减归档,高情绪强度的记忆衰减更慢。
Modified Ebbinghaus forgetting curve. Inactive memories naturally decay and archive. High-arousal memories decay slower.
- **权重池浮现 / Weight pool surfacing**: 记忆不是被动检索的,它们会主动浮现——未解决的、情绪强烈的记忆权重更高,会在对话开头自动推送。
Memories aren't just passively retrieved — they actively surface. Unresolved, emotionally intense memories carry higher weight and get pushed at conversation start.
- **Obsidian 原生 / Obsidian-native**: 每个记忆桶就是一个 Markdown 文件YAML frontmatter 存元数据。可以直接在 Obsidian 里浏览、编辑、搜索。自动注入 `[[双链]]`
Each memory bucket is a Markdown file with YAML frontmatter. Browse, edit, and search directly in Obsidian. Wikilinks are auto-injected.
- **API 降级 / API degradation**: 脱水压缩和自动打标优先用廉价 LLM APIDeepSeek 等API 不可用时自动降级到本地关键词分析——始终可用。
Dehydration and auto-tagging prefer a cheap LLM API (DeepSeek etc.). When the API is unavailable, it degrades to local keyword analysis — always functional.
## 边界说明 / Design boundaries
官方记忆功能已经在做身份层的事了——你是谁你有什么偏好你们的关系是什么。那一层交给它Ombre Brain不打算造重复的轮子。
Ombre Brain 的边界是时间里发生的事,不是你是谁。它记住的是:你们聊过什么,经历了什么,哪些事情还悬在那里没有解决。两层配合用,才是完整的。
每次新对话Claude 从零开始——但它能从 Ombre Brain 里找回跟你有关的一切。不是重建,是接续。
---
Official memory already handles the identity layer — who you are, what you prefer, what your relationship is. That layer belongs there. Ombre Brain isn't trying to duplicate it.
Ombre Brain's boundary is *what happened in time*, not *who you are*. It holds conversations, experiences, unresolved things. The two layers together are what make it feel complete.
Each new conversation starts fresh — but Claude can reach back through Ombre Brain and find everything that happened between you. Not a rebuild. A continuation.
## 架构 / Architecture
```
Claude ←→ MCP Protocol ←→ server.py
┌───────────────┼───────────────┐
│ │ │
bucket_manager dehydrator decay_engine
(CRUD + 搜索) (压缩 + 打标) (遗忘曲线)
Obsidian Vault (Markdown files)
```
5 个 MCP 工具 / 5 MCP tools:
| 工具 Tool | 作用 Purpose |
|-----------|-------------|
| `breath` | 浮现或检索记忆。无参数=推送未解决记忆;有参数=关键词+情感检索 / Surface or search memories |
| `hold` | 存储单条记忆,自动打标+合并相似桶 / Store a single memory with auto-tagging |
| `grow` | 日记归档,自动拆分长内容为多个记忆桶 / Diary digest, auto-split into multiple buckets |
| `trace` | 修改元数据、标记已解决、删除 / Modify metadata, mark resolved, delete |
| `pulse` | 系统状态 + 所有记忆桶列表 / System status + bucket listing |
## 安装 / Setup
### 环境要求 / Requirements
- Python 3.11+
- 一个 Obsidian Vault可选不用也行会在项目目录下自建 `buckets/`
An Obsidian vault (optional — without one, it uses a local `buckets/` directory)
### 步骤 / Steps
```bash
git clone https://github.com/P0lar1zzZ/Ombre-Brain.git
cd Ombre-Brain
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
复制配置文件并按需修改 / Copy config and edit as needed:
```bash
cp config.example.yaml config.yaml
```
如果你要用 API 做脱水压缩和自动打标(推荐,效果好很多),设置环境变量:
If you want API-powered dehydration and tagging (recommended, much better quality):
```bash
export OMBRE_API_KEY="your-api-key"
```
支持任何 OpenAI 兼容 API。在 `config.yaml` 里改 `base_url``model` 就行。
Supports any OpenAI-compatible API. Just change `base_url` and `model` in `config.yaml`.
### 接入 Claude Desktop / Connect to Claude Desktop
在 Claude Desktop 配置文件中添加macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
Add to your Claude Desktop config:
```json
{
"mcpServers": {
"ombre-brain": {
"command": "python",
"args": ["/path/to/Ombre-Brain/server.py"],
"env": {
"OMBRE_API_KEY": "your-api-key"
}
}
}
}
```
### 接入 Claude.ai (远程) / Connect to Claude.ai (remote)
需要 HTTP 传输 + 隧道。可以用 Docker
Requires HTTP transport + tunnel. Docker setup:
```bash
echo "OMBRE_API_KEY=your-api-key" > .env
docker-compose up -d
```
`docker-compose.yml` 里配好了 Cloudflare Tunnel。你需要自己在 `~/.cloudflared/` 下放凭证和路由配置。
The `docker-compose.yml` includes Cloudflare Tunnel. You'll need your own credentials under `~/.cloudflared/`.
### 指向 Obsidian / Point to Obsidian
`config.yaml` 里设置 `buckets_dir`
Set `buckets_dir` in `config.yaml`:
```yaml
buckets_dir: "/path/to/your/Obsidian Vault/Ombre Brain"
```
不设的话,默认用项目目录下的 `buckets/`
If not set, defaults to `buckets/` in the project directory.
## 配置 / Configuration
所有参数在 `config.yaml`(从 `config.example.yaml` 复制)。关键的几个:
All parameters in `config.yaml` (copy from `config.example.yaml`). Key ones:
| 参数 Parameter | 说明 Description | 默认 Default |
|---|---|---|
| `transport` | `stdio`(本地)/ `streamable-http`(远程)| `stdio` |
| `buckets_dir` | 记忆桶存储路径 / Bucket storage path | `./buckets/` |
| `dehydration.model` | 脱水用的 LLM 模型 / LLM model for dehydration | `deepseek-chat` |
| `dehydration.base_url` | API 地址 / API endpoint | `https://api.deepseek.com/v1` |
| `decay.lambda` | 衰减速率,越大越快忘 / Decay rate | `0.05` |
| `decay.threshold` | 归档阈值 / Archive threshold | `0.3` |
| `merge_threshold` | 合并相似度阈值 (0-100) / Merge similarity | `75` |
敏感配置用环境变量:
Sensitive config via env vars:
- `OMBRE_API_KEY` — LLM API 密钥
- `OMBRE_TRANSPORT` — 覆盖传输方式
- `OMBRE_BUCKETS_DIR` — 覆盖存储路径
## 衰减公式 / Decay Formula
$$Score = Importance \times activation\_count^{0.3} \times e^{-\lambda \times days} \times (base + arousal \times boost)$$
- `importance`: 1-10记忆重要性 / memory importance
- `activation_count`: 被检索的次数,越常被想起衰减越慢 / retrieval count; more recalls = slower decay
- `days`: 距上次激活的天数 / days since last activation
- `arousal`: 唤醒度,越强烈的记忆越难忘 / arousal; intense memories are harder to forget
- 已解决的记忆权重降到 5%,沉底等被关键词唤醒 / resolved memories drop to 5%, sink until keyword-triggered
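The formula and its terms above can be checked with a quick worked example. `base` and `boost` here are illustrative assumptions (the real values live in config and the decay engine); `λ` uses the documented default `0.05`:

```python
import math

# Score = importance × activation_count^0.3 × e^(−λ·days) × (base + arousal·boost)
# base/boost values below are assumptions for illustration only.
def score(importance, activation_count, days, arousal,
          lam=0.05, base=1.0, boost=0.5):
    return (importance * activation_count ** 0.3
            * math.exp(-lam * days) * (base + arousal * boost))

fresh = score(8, 10, days=7, arousal=0.9)    # recalled a week ago
stale = score(8, 10, days=60, arousal=0.9)   # untouched for two months
print(fresh > stale)  # True — the same memory decays as days pass
```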
## 给 Claude 的使用指南 / Usage Guide for Claude
`CLAUDE_PROMPT.md` 是写给 Claude 看的使用说明。放到你的 system prompt 或 custom instructions 里就行。
`CLAUDE_PROMPT.md` is the usage guide written for Claude. Put it in your system prompt or custom instructions.
## 工具脚本 / Utility Scripts
| 脚本 Script | 用途 Purpose |
|---|---|
| `write_memory.py` | 手动写入记忆,绕过 MCP / Manually write memories, bypass MCP |
| `migrate_to_domains.py` | 迁移平铺文件到域子目录 / Migrate flat files to domain subdirs |
| `reclassify_domains.py` | 基于关键词重分类 / Reclassify by keywords |
| `reclassify_api.py` | 用 API 重打标未分类桶 / Re-tag uncategorized buckets via API |
| `test_smoke.py` | 冒烟测试 / Smoke test |
## License
MIT


@@ -1,755 +0,0 @@
# ============================================================
# Module: Memory Bucket Manager (bucket_manager.py)
# 模块:记忆桶管理器
#
# CRUD operations, multi-dimensional index search, activation updates
# for memory buckets.
# 记忆桶的增删改查、多维索引搜索、激活更新。
#
# Core design:
# 核心逻辑:
# - Each bucket = one Markdown file (YAML frontmatter + body)
# 每个记忆桶 = 一个 Markdown 文件
# - Storage by type: permanent / dynamic / archive
# 存储按类型分目录
# - Multi-dimensional soft index: domain + valence/arousal + fuzzy text
# 多维软索引:主题域 + 情感坐标 + 文本模糊匹配
# - Search strategy: domain pre-filter → weighted multi-dim ranking
# 搜索策略:主题域预筛 → 多维加权精排
# - Emotion coordinates based on Russell circumplex model:
# 情感坐标基于环形情感模型Russell circumplex
# valence (0~1): 0=negative → 1=positive
# arousal (0~1): 0=calm → 1=excited
#
# Depended on by: server.py, decay_engine.py
# 被谁依赖server.py, decay_engine.py
# ============================================================
import os
import math
import logging
import re
import shutil
from collections import Counter
from datetime import datetime
from pathlib import Path
from typing import Optional
import frontmatter
import jieba
from rapidfuzz import fuzz
from utils import generate_bucket_id, sanitize_name, safe_path, now_iso
logger = logging.getLogger("ombre_brain.bucket")
class BucketManager:
"""
Memory bucket manager — entry point for all bucket CRUD operations.
Buckets are stored as Markdown files with YAML frontmatter for metadata
and body for content. Natively compatible with Obsidian browsing/editing.
记忆桶管理器 —— 所有桶的 CRUD 操作入口。
桶以 Markdown 文件存储YAML frontmatter 存元数据,正文存内容。
天然兼容 Obsidian 直接浏览和编辑。
"""
def __init__(self, config: dict):
# --- Read storage paths from config / 从配置中读取存储路径 ---
self.base_dir = config["buckets_dir"]
self.permanent_dir = os.path.join(self.base_dir, "permanent")
self.dynamic_dir = os.path.join(self.base_dir, "dynamic")
self.archive_dir = os.path.join(self.base_dir, "archive")
self.fuzzy_threshold = config.get("matching", {}).get("fuzzy_threshold", 50)
self.max_results = config.get("matching", {}).get("max_results", 5)
# --- Wikilink config / 双链配置 ---
wikilink_cfg = config.get("wikilink", {})
self.wikilink_enabled = wikilink_cfg.get("enabled", True)
self.wikilink_use_tags = wikilink_cfg.get("use_tags", False)
self.wikilink_use_domain = wikilink_cfg.get("use_domain", True)
self.wikilink_use_auto_keywords = wikilink_cfg.get("use_auto_keywords", True)
self.wikilink_auto_top_k = wikilink_cfg.get("auto_top_k", 8)
self.wikilink_min_len = wikilink_cfg.get("min_keyword_len", 2)
self.wikilink_exclude_keywords = set(wikilink_cfg.get("exclude_keywords", []))
self.wikilink_stopwords = {
"", "", "", "", "", "", "", "", "", "",
"", "一个", "", "", "", "", "", "", "",
"", "", "", "没有", "", "", "自己", "", "", "",
"我们", "你们", "他们", "然后", "今天", "昨天", "明天", "一下",
"the", "and", "for", "are", "but", "not", "you", "all", "can",
"had", "her", "was", "one", "our", "out", "has", "have", "with",
"this", "that", "from", "they", "been", "said", "will", "each",
}
self.wikilink_stopwords |= {w.lower() for w in self.wikilink_exclude_keywords}
# --- Search scoring weights / 检索权重配置 ---
scoring = config.get("scoring_weights", {})
self.w_topic = scoring.get("topic_relevance", 4.0)
self.w_emotion = scoring.get("emotion_resonance", 2.0)
self.w_time = scoring.get("time_proximity", 1.5)
self.w_importance = scoring.get("importance", 1.0)
# ---------------------------------------------------------
# Create a new bucket
# 创建新桶
# Write content and metadata into a .md file
# 将内容和元数据写入一个 .md 文件
# ---------------------------------------------------------
async def create(
self,
content: str,
tags: list[str] = None,
importance: int = 5,
domain: list[str] = None,
valence: float = 0.5,
arousal: float = 0.3,
bucket_type: str = "dynamic",
name: str = None,
) -> str:
"""
Create a new memory bucket, return bucket ID.
创建一个新的记忆桶,返回桶 ID。
"""
bucket_id = generate_bucket_id()
bucket_name = sanitize_name(name) if name else bucket_id
domain = domain or ["未分类"]
tags = tags or []
linked_content = self._apply_wikilinks(content, tags, domain, bucket_name)
# --- Build YAML frontmatter metadata / 构建元数据 ---
metadata = {
"id": bucket_id,
"name": bucket_name,
"tags": tags,
"domain": domain,
"valence": max(0.0, min(1.0, valence)),
"arousal": max(0.0, min(1.0, arousal)),
"importance": max(1, min(10, importance)),
"type": bucket_type,
"created": now_iso(),
"last_active": now_iso(),
"activation_count": 1,
}
# --- Assemble Markdown file (frontmatter + body) ---
# --- 组装 Markdown 文件 ---
post = frontmatter.Post(linked_content, **metadata)
# --- Choose directory by type + primary domain ---
# --- 按类型 + 主题域选择存储目录 ---
type_dir = self.permanent_dir if bucket_type == "permanent" else self.dynamic_dir
primary_domain = sanitize_name(domain[0]) if domain else "未分类"
target_dir = os.path.join(type_dir, primary_domain)
os.makedirs(target_dir, exist_ok=True)
# --- Filename: readable_name_bucketID.md (Obsidian friendly) ---
# --- 文件名可读名称_桶ID.md ---
if bucket_name and bucket_name != bucket_id:
filename = f"{bucket_name}_{bucket_id}.md"
else:
filename = f"{bucket_id}.md"
file_path = safe_path(target_dir, filename)
try:
with open(file_path, "w", encoding="utf-8") as f:
f.write(frontmatter.dumps(post))
except OSError as e:
logger.error(f"Failed to write bucket file / 写入桶文件失败: {file_path}: {e}")
raise
logger.info(
f"Created bucket / 创建记忆桶: {bucket_id} ({bucket_name}) → {primary_domain}/"
)
return bucket_id
# ---------------------------------------------------------
# Read bucket content
# 读取桶内容
# Returns {"id", "metadata", "content", "path"} or None
# ---------------------------------------------------------
async def get(self, bucket_id: str) -> Optional[dict]:
"""
Read a single bucket by ID.
根据 ID 读取单个桶。
"""
if not bucket_id or not isinstance(bucket_id, str):
return None
file_path = self._find_bucket_file(bucket_id)
if not file_path:
return None
return self._load_bucket(file_path)
# ---------------------------------------------------------
# Update bucket
# 更新桶
# Supports: content, tags, importance, valence, arousal, name, resolved
# ---------------------------------------------------------
async def update(self, bucket_id: str, **kwargs) -> bool:
"""
Update bucket content or metadata fields.
更新桶的内容或元数据字段。
"""
file_path = self._find_bucket_file(bucket_id)
if not file_path:
return False
try:
post = frontmatter.load(file_path)
except Exception as e:
logger.warning(f"Failed to load bucket for update / 加载桶失败: {file_path}: {e}")
return False
# --- Update only fields that were passed in / 只改传入的字段 ---
if "content" in kwargs:
next_tags = kwargs.get("tags", post.get("tags", []))
next_domain = kwargs.get("domain", post.get("domain", []))
next_name = kwargs.get("name", post.get("name", ""))
post.content = self._apply_wikilinks(
kwargs["content"],
next_tags,
next_domain,
next_name,
)
if "tags" in kwargs:
post["tags"] = kwargs["tags"]
if "importance" in kwargs:
post["importance"] = max(1, min(10, int(kwargs["importance"])))
if "domain" in kwargs:
post["domain"] = kwargs["domain"]
if "valence" in kwargs:
post["valence"] = max(0.0, min(1.0, float(kwargs["valence"])))
if "arousal" in kwargs:
post["arousal"] = max(0.0, min(1.0, float(kwargs["arousal"])))
if "name" in kwargs:
post["name"] = sanitize_name(kwargs["name"])
if "resolved" in kwargs:
post["resolved"] = bool(kwargs["resolved"])
# --- Auto-refresh activation time / 自动刷新激活时间 ---
post["last_active"] = now_iso()
try:
with open(file_path, "w", encoding="utf-8") as f:
f.write(frontmatter.dumps(post))
except OSError as e:
logger.error(f"Failed to write bucket update / 写入桶更新失败: {file_path}: {e}")
return False
logger.info(f"Updated bucket / 更新记忆桶: {bucket_id}")
return True
# ---------------------------------------------------------
# Wikilink injection
# 自动添加 Obsidian 双链
# ---------------------------------------------------------
def _apply_wikilinks(
self,
content: str,
tags: list[str],
domain: list[str],
name: str,
) -> str:
"""
Auto-inject Obsidian wikilinks, avoiding double-wrapping existing [[...]].
自动添加 Obsidian 双链,避免重复包裹已有 [[...]]。
"""
if not self.wikilink_enabled or not content:
return content
keywords = self._collect_wikilink_keywords(content, tags, domain, name)
if not keywords:
return content
# Split on existing wikilinks to avoid wrapping them again
# 按已有双链切分,避免重复包裹
segments = re.split(r"(\[\[[^\]]+\]\])", content)
pattern = re.compile("|".join(re.escape(kw) for kw in keywords))
for i, segment in enumerate(segments):
if segment.startswith("[[") and segment.endswith("]]"):
continue
updated = pattern.sub(lambda m: f"[[{m.group(0)}]]", segment)
segments[i] = updated
return "".join(segments)
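The split-and-substitute trick above can be exercised in isolation. This is a minimal sketch (a hypothetical standalone `apply_wikilinks`, not the class method, without the keyword-collection config) showing why existing `[[...]]` spans survive a second pass:

```python
import re

def apply_wikilinks(content: str, keywords: list[str]) -> str:
    if not keywords:
        return content  # an empty alternation would match everywhere
    # The capturing group keeps existing [[...]] spans as their own segments.
    segments = re.split(r"(\[\[[^\]]+\]\])", content)
    pattern = re.compile("|".join(re.escape(kw) for kw in keywords))
    for i, seg in enumerate(segments):
        if seg.startswith("[[") and seg.endswith("]]"):
            continue  # already linked, never wrap twice
        segments[i] = pattern.sub(lambda m: f"[[{m.group(0)}]]", seg)
    return "".join(segments)

print(apply_wikilinks("jieba 与已有 [[jieba]] 链接", ["jieba"]))
```

Only the plain-text segment gets wrapped; the pre-existing `[[jieba]]` segment is skipped.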
def _collect_wikilink_keywords(
self,
content: str,
tags: list[str],
domain: list[str],
name: str,
) -> list[str]:
"""
Collect candidate keywords from tags/domain/auto-extraction.
汇总候选关键词:可选 tags/domain + 自动提词。
"""
candidates = []
if self.wikilink_use_tags:
candidates.extend(tags or [])
if self.wikilink_use_domain:
candidates.extend(domain or [])
if name:
candidates.append(name)
if self.wikilink_use_auto_keywords:
candidates.extend(self._extract_auto_keywords(content))
return self._normalize_keywords(candidates)
def _normalize_keywords(self, keywords: list[str]) -> list[str]:
"""
Deduplicate and sort by length (longer first to avoid short words
breaking long ones during replacement).
去重并按长度排序,优先替换长词。
"""
if not keywords:
return []
seen = set()
cleaned = []
for keyword in keywords:
if not isinstance(keyword, str):
continue
kw = keyword.strip()
if len(kw) < self.wikilink_min_len:
continue
if kw in self.wikilink_exclude_keywords:
continue
if kw.lower() in self.wikilink_stopwords:
continue
if kw in seen:
continue
seen.add(kw)
cleaned.append(kw)
return sorted(cleaned, key=len, reverse=True)
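The longest-first ordering matters because Python's `re` alternation takes the first branch that matches at a position, not the longest one. A small sketch (hypothetical helper, not part of the class) demonstrates:

```python
import re

def build_pattern(keywords: list[str]) -> re.Pattern:
    # Longer keywords first: "machine learning" must precede "machine",
    # otherwise the short branch wins and splits the long keyword.
    ordered = sorted(set(keywords), key=len, reverse=True)
    return re.compile("|".join(re.escape(kw) for kw in ordered))

pat = build_pattern(["machine", "machine learning"])
print(pat.sub(lambda m: f"[[{m.group(0)}]]", "machine learning rocks"))
```

With the order reversed, the output would be the broken `[[machine]] learning rocks`.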
def _extract_auto_keywords(self, content: str) -> list[str]:
"""
Auto-extract keywords from body text, prioritizing high-frequency words.
从正文自动提词,优先高频词。
"""
if not content:
return []
try:
zh_words = [w.strip() for w in jieba.lcut(content) if w.strip()]
except Exception:
zh_words = []
en_words = re.findall(r"[A-Za-z][A-Za-z0-9_-]{2,20}", content)
# Chinese bigrams / 中文双词组合
zh_bigrams = []
for i in range(len(zh_words) - 1):
left = zh_words[i]
right = zh_words[i + 1]
if len(left) < self.wikilink_min_len or len(right) < self.wikilink_min_len:
continue
if not re.fullmatch(r"[\u4e00-\u9fff]+", left + right):
continue
if len(left + right) > 8:
continue
zh_bigrams.append(left + right)
merged = []
for word in zh_words + zh_bigrams + en_words:
if len(word) < self.wikilink_min_len:
continue
if re.fullmatch(r"\d+", word):
continue
if word.lower() in self.wikilink_stopwords:
continue
merged.append(word)
if not merged:
return []
counter = Counter(merged)
return [w for w, _ in counter.most_common(self.wikilink_auto_top_k)]
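The frequency-ranking step above reduces to `Counter.most_common`. A stripped-down sketch covering only the English token path (the real extractor also merges jieba tokens and Chinese bigrams before counting):

```python
import re
from collections import Counter

def top_keywords(text: str, k: int = 3) -> list[str]:
    # Same token regex as the extractor: letter-initial, 3-21 chars.
    words = re.findall(r"[A-Za-z][A-Za-z0-9_-]{2,20}", text)
    return [w for w, _ in Counter(words).most_common(k)]

print(top_keywords("docker compose runs docker and more docker"))
```

`most_common` breaks ties by insertion order, so among equally frequent words the earliest-seen one wins.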
# ---------------------------------------------------------
# Delete bucket
# 删除桶
# ---------------------------------------------------------
async def delete(self, bucket_id: str) -> bool:
"""
Delete a memory bucket file.
删除指定的记忆桶文件。
"""
file_path = self._find_bucket_file(bucket_id)
if not file_path:
return False
try:
os.remove(file_path)
except OSError as e:
logger.error(f"Failed to delete bucket file / 删除桶文件失败: {file_path}: {e}")
return False
logger.info(f"Deleted bucket / 删除记忆桶: {bucket_id}")
return True
# ---------------------------------------------------------
# Touch bucket (refresh activation time + increment count)
# 触碰桶(刷新激活时间 + 累加激活次数)
# Called on every recall hit; affects decay score.
# 每次检索命中时调用,影响衰减得分。
# ---------------------------------------------------------
async def touch(self, bucket_id: str) -> None:
"""
Update a bucket's last activation time and count.
更新桶的最后激活时间和激活次数。
"""
file_path = self._find_bucket_file(bucket_id)
if not file_path:
return
try:
post = frontmatter.load(file_path)
post["last_active"] = now_iso()
post["activation_count"] = post.get("activation_count", 0) + 1
with open(file_path, "w", encoding="utf-8") as f:
f.write(frontmatter.dumps(post))
except Exception as e:
logger.warning(f"Failed to touch bucket / 触碰桶失败: {bucket_id}: {e}")
# ---------------------------------------------------------
# Multi-dimensional search (core feature)
# 多维搜索(核心功能)
#
# Strategy: domain pre-filter → weighted multi-dim ranking
# 策略:主题域预筛 → 多维加权精排
#
# Ranking formula:
# total = topic(×w_topic) + emotion(×w_emotion)
# + time(×w_time) + importance(×w_importance)
#
# Per-dimension scores (normalized to 0~1):
# topic = rapidfuzz weighted match (name/tags/domain/body)
# emotion = 1 - Euclidean distance (query v/a vs bucket v/a)
# time = e^(-0.02 × days) (recent memories first)
# importance = importance / 10
# ---------------------------------------------------------
async def search(
self,
query: str,
limit: Optional[int] = None,
domain_filter: Optional[list[str]] = None,
query_valence: Optional[float] = None,
query_arousal: Optional[float] = None,
) -> list[dict]:
"""
Multi-dimensional indexed search for memory buckets.
多维索引搜索记忆桶。
domain_filter: pre-filter by domain (None = search all)
query_valence/arousal: emotion coordinates for resonance scoring
"""
if not query or not query.strip():
return []
limit = limit or self.max_results
all_buckets = await self.list_all(include_archive=False)
if not all_buckets:
return []
# --- Layer 1: domain pre-filter (fast scope reduction) ---
# --- 第一层:主题域预筛(快速缩小范围)---
if domain_filter:
filter_set = {d.lower() for d in domain_filter}
candidates = [
b for b in all_buckets
if {d.lower() for d in b["metadata"].get("domain", [])} & filter_set
]
# Fall back to full search if pre-filter yields nothing
# 预筛为空则回退全量搜索
if not candidates:
candidates = all_buckets
else:
candidates = all_buckets
# --- Layer 2: weighted multi-dim ranking ---
# --- 第二层:多维加权精排 ---
scored = []
for bucket in candidates:
meta = bucket.get("metadata", {})
try:
# Dim 1: topic relevance (fuzzy text, 0~1)
topic_score = self._calc_topic_score(query, bucket)
# Dim 2: emotion resonance (coordinate distance, 0~1)
emotion_score = self._calc_emotion_score(
query_valence, query_arousal, meta
)
# Dim 3: time proximity (exponential decay, 0~1)
time_score = self._calc_time_score(meta)
# Dim 4: importance (direct normalization)
importance_score = max(1, min(10, int(meta.get("importance", 5)))) / 10.0
# --- Weighted sum / 加权求和 ---
total = (
topic_score * self.w_topic
+ emotion_score * self.w_emotion
+ time_score * self.w_time
+ importance_score * self.w_importance
)
# Normalize to 0~100 for readability
weight_sum = self.w_topic + self.w_emotion + self.w_time + self.w_importance
normalized = (total / weight_sum) * 100 if weight_sum > 0 else 0
# Resolved buckets get ranking penalty (but still reachable by keyword)
# 已解决的桶降权排序(但仍可被关键词激活)
if meta.get("resolved", False):
normalized *= 0.3
if normalized >= self.fuzzy_threshold:
bucket["score"] = round(normalized, 2)
scored.append(bucket)
except Exception as e:
logger.warning(
f"Scoring failed for bucket {bucket.get('id', '?')} / "
f"桶评分失败: {e}"
)
continue
scored.sort(key=lambda x: x["score"], reverse=True)
return scored[:limit]
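The weighted-sum normalization in the loop above can be checked by hand. This sketch uses placeholder weights `(0.4, 0.2, 0.2, 0.2)` purely for illustration; the real values come from config (`w_topic` etc.) and are not shown in this file:

```python
def rank_score(topic, emotion, time_, importance,
               w=(0.4, 0.2, 0.2, 0.2), resolved=False):
    # Each sub-score is 0~1; dividing by the weight sum keeps the total
    # in 0~1 regardless of weight tuning, then scale to 0~100.
    weight_sum = sum(w)
    total = topic * w[0] + emotion * w[1] + time_ * w[2] + importance * w[3]
    normalized = (total / weight_sum) * 100 if weight_sum > 0 else 0
    # Resolved buckets keep only 30% of their score, as in search().
    return normalized * 0.3 if resolved else normalized

print(rank_score(0.9, 0.5, 1.0, 0.8))
```

A bucket with strong topic match, neutral emotion, fresh activation, and importance 8 lands around 82, comfortably above a typical fuzzy threshold.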
# ---------------------------------------------------------
# Topic relevance sub-score:
# name(×3) + domain(×2.5) + tags(×2) + body(×1)
# 文本相关性子分:桶名(×3) + 主题域(×2.5) + 标签(×2) + 正文(×1)
# ---------------------------------------------------------
def _calc_topic_score(self, query: str, bucket: dict) -> float:
"""
Calculate text dimension relevance score (0~1).
计算文本维度的相关性得分。
"""
meta = bucket.get("metadata", {})
name_score = fuzz.partial_ratio(query, meta.get("name", "")) * 3
domain_score = (
max(
(fuzz.partial_ratio(query, d) for d in meta.get("domain", [])),
default=0,
)
* 2.5
)
tag_score = (
max(
(fuzz.partial_ratio(query, tag) for tag in meta.get("tags", [])),
default=0,
)
* 2
)
content_score = fuzz.partial_ratio(query, bucket.get("content", "")[:500]) * 1
return (name_score + domain_score + tag_score + content_score) / (100 * 8.5)
# ---------------------------------------------------------
# Emotion resonance sub-score:
# Based on Russell circumplex Euclidean distance
# 情感共鸣子分:基于环形情感模型的欧氏距离
# No emotion in query → neutral 0.5 (doesn't affect ranking)
# ---------------------------------------------------------
def _calc_emotion_score(
self, q_valence: Optional[float], q_arousal: Optional[float], meta: dict
) -> float:
"""
Calculate emotion resonance score (0~1, closer = higher).
计算情感共鸣度(0~1,越近越高)。
"""
if q_valence is None or q_arousal is None:
return 0.5 # No emotion coordinates → neutral / 无情感坐标时给中性分
try:
b_valence = float(meta.get("valence", 0.5))
b_arousal = float(meta.get("arousal", 0.3))
except (ValueError, TypeError):
return 0.5
# Euclidean distance, max sqrt(2) ≈ 1.414
dist = math.sqrt((q_valence - b_valence) ** 2 + (q_arousal - b_arousal) ** 2)
return max(0.0, 1.0 - dist / 1.414)
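The distance-to-score mapping is easy to verify numerically. A sketch with the same bucket fallbacks as the code (`valence=0.5`, `arousal=0.3`), using `math.sqrt(2)` where the method approximates it as 1.414:

```python
import math

def emotion_score(qv, qa, bv=0.5, ba=0.3):
    # Distance on the valence/arousal unit square; the diagonal sqrt(2)
    # is the farthest two emotions can be, so the score stays in 0~1.
    if qv is None or qa is None:
        return 0.5  # no emotion in the query → neutral
    dist = math.sqrt((qv - bv) ** 2 + (qa - ba) ** 2)
    return max(0.0, 1.0 - dist / math.sqrt(2))

print(emotion_score(0.5, 0.3))  # identical coordinates → perfect resonance
```

Identical coordinates score 1.0; opposite corners of the square score 0.0.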
# ---------------------------------------------------------
# Time proximity sub-score:
# More recent activation → higher score
# 时间亲近子分:距上次激活越近分越高
# ---------------------------------------------------------
def _calc_time_score(self, meta: dict) -> float:
"""
Calculate time proximity score (0~1, more recent = higher).
计算时间亲近度。
"""
last_active_str = meta.get("last_active", meta.get("created", ""))
try:
last_active = datetime.fromisoformat(str(last_active_str))
days = max(0.0, (datetime.now() - last_active).total_seconds() / 86400)
except (ValueError, TypeError):
days = 30.0  # Parse failure → assume 30 days / 解析失败按 30 天处理
return math.exp(-0.02 * days)
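With a rate of 0.02 per day, the half-life of the time score is ln(2)/0.02 ≈ 35 days. A quick check:

```python
import math

def time_score(days: float) -> float:
    # e^(-0.02·d): 1.0 today, ~0.82 after 10 days, ~0.5 after ~35 days
    return math.exp(-0.02 * days)

for d in (0, 10, 35, 180):
    print(d, round(time_score(d), 3))
```

Memories untouched for half a year contribute almost nothing on this dimension.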
# ---------------------------------------------------------
# List all buckets
# 列出所有桶
# ---------------------------------------------------------
async def list_all(self, include_archive: bool = False) -> list[dict]:
"""
Recursively walk directories (including domain subdirs), list all buckets.
递归遍历目录(含域子目录),列出所有记忆桶。
"""
buckets = []
dirs = [self.permanent_dir, self.dynamic_dir]
if include_archive:
dirs.append(self.archive_dir)
for dir_path in dirs:
if not os.path.exists(dir_path):
continue
for root, _, files in os.walk(dir_path):
for filename in files:
if not filename.endswith(".md"):
continue
file_path = os.path.join(root, filename)
bucket = self._load_bucket(file_path)
if bucket:
buckets.append(bucket)
return buckets
# ---------------------------------------------------------
# Statistics (counts per category + total size)
# 统计信息(各分类桶数量 + 总体积)
# ---------------------------------------------------------
async def get_stats(self) -> dict:
"""
Return memory bucket statistics (including domain subdirs).
返回记忆桶的统计数据。
"""
stats = {
"permanent_count": 0,
"dynamic_count": 0,
"archive_count": 0,
"total_size_kb": 0.0,
"domains": {},
}
for subdir, key in [
(self.permanent_dir, "permanent_count"),
(self.dynamic_dir, "dynamic_count"),
(self.archive_dir, "archive_count"),
]:
if not os.path.exists(subdir):
continue
for root, _, files in os.walk(subdir):
for f in files:
if f.endswith(".md"):
stats[key] += 1
fpath = os.path.join(root, f)
try:
stats["total_size_kb"] += os.path.getsize(fpath) / 1024
except OSError:
pass
# Per-domain counts / 每个域的桶数量
domain_name = os.path.basename(root)
if domain_name != os.path.basename(subdir):
stats["domains"][domain_name] = stats["domains"].get(domain_name, 0) + 1
return stats
# ---------------------------------------------------------
# Archive bucket (move from permanent/dynamic into archive)
# 归档桶(从 permanent/dynamic 移入 archive)
# Called by decay engine to simulate "forgetting"
# 由衰减引擎调用,模拟"遗忘"
# ---------------------------------------------------------
async def archive(self, bucket_id: str) -> bool:
"""
Move a bucket into the archive directory (preserving domain subdirs).
将指定桶移入归档目录(保留域子目录结构)。
"""
file_path = self._find_bucket_file(bucket_id)
if not file_path:
return False
try:
# Read once, get domain info and update type / 一次性读取
post = frontmatter.load(file_path)
domain = post.get("domain", ["未分类"])
primary_domain = sanitize_name(domain[0]) if domain else "未分类"
archive_subdir = os.path.join(self.archive_dir, primary_domain)
os.makedirs(archive_subdir, exist_ok=True)
dest = safe_path(archive_subdir, os.path.basename(file_path))
# Update type marker then move file / 更新类型标记后移动文件
post["type"] = "archived"
with open(file_path, "w", encoding="utf-8") as f:
f.write(frontmatter.dumps(post))
# Use shutil.move for cross-filesystem safety
# 使用 shutil.move 保证跨文件系统安全
shutil.move(file_path, str(dest))
except Exception as e:
logger.error(
f"Failed to archive bucket / 归档桶失败: {bucket_id}: {e}"
)
return False
logger.info(f"Archived bucket / 归档记忆桶: {bucket_id} → archive/{primary_domain}/")
return True
# ---------------------------------------------------------
# Internal: find bucket file across all three directories
# 内部:在三个目录中查找桶文件
# ---------------------------------------------------------
def _find_bucket_file(self, bucket_id: str) -> Optional[str]:
"""
Recursively search permanent/dynamic/archive for a bucket file
matching the given ID.
在 permanent/dynamic/archive 中递归查找指定 ID 的桶文件。
"""
if not bucket_id:
return None
for dir_path in [self.permanent_dir, self.dynamic_dir, self.archive_dir]:
if not os.path.exists(dir_path):
continue
for root, _, files in os.walk(dir_path):
for fname in files:
if not fname.endswith(".md"):
continue
# Match by ID substring in filename (containment, not exact match)
# 通过文件名中包含的 ID 子串匹配(包含式,非精确)
if bucket_id in fname:
return os.path.join(root, fname)
return None
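Note that `bucket_id in fname` is substring containment, so a short ID could also hit a longer filename that merely contains it. A hypothetical stricter variant (not the shipped behavior) would require the ID to be a whole delimited segment of the stem:

```python
import re

def id_matches(bucket_id: str, filename: str) -> bool:
    # Stricter than plain substring: the id must appear as a whole
    # hyphen/underscore-delimited segment of the filename stem.
    stem = filename.removesuffix(".md")
    return bucket_id in re.split(r"[-_]", stem)

print(id_matches("cat", "cat-20260421.md"))
print(id_matches("cat", "category-20260421.md"))
```

Under plain containment both filenames would match the ID "cat"; the segment check rejects the second.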
# ---------------------------------------------------------
# Internal: load bucket data from .md file
# 内部:从 .md 文件加载桶数据
# ---------------------------------------------------------
def _load_bucket(self, file_path: str) -> Optional[dict]:
"""
Parse a Markdown file and return structured bucket data.
解析 Markdown 文件,返回桶的结构化数据。
"""
try:
post = frontmatter.load(file_path)
return {
"id": post.get("id", Path(file_path).stem),
"metadata": dict(post.metadata),
"content": post.content,
"path": file_path,
}
except Exception as e:
logger.warning(
f"Failed to load bucket file / 加载桶文件失败: {file_path}: {e}"
)
return None


@@ -1,242 +0,0 @@
# ============================================================
# Module: Memory Decay Engine (decay_engine.py)
# 模块:记忆衰减引擎
#
# Simulates human forgetting curve; auto-decays inactive memories and archives them.
# 模拟人类遗忘曲线,自动衰减不活跃记忆并归档。
#
# Core formula (improved Ebbinghaus + emotion coordinates):
# 核心公式(改进版艾宾浩斯遗忘曲线 + 情感坐标):
# Score = Importance × (activation_count^0.3) × e^(-λ×days) × emotion_weight
#
# Emotion weight (continuous coordinate, not discrete labels):
# 情感权重(基于连续坐标而非离散列举):
# emotion_weight = base + (arousal × arousal_boost)
# Higher arousal → higher emotion weight → slower decay
# 唤醒度越高 → 情感权重越大 → 记忆衰减越慢
#
# Depended on by: server.py
# 被谁依赖:server.py
# ============================================================
import math
import asyncio
import logging
from datetime import datetime
logger = logging.getLogger("ombre_brain.decay")
class DecayEngine:
"""
Memory decay engine — periodically scans all dynamic buckets,
calculates decay scores, auto-archives low-activity buckets
to simulate natural forgetting.
记忆衰减引擎 —— 定期扫描所有动态桶,
计算衰减得分,将低活跃桶自动归档,模拟自然遗忘。
"""
def __init__(self, config: dict, bucket_mgr):
# --- Load decay parameters / 加载衰减参数 ---
decay_cfg = config.get("decay", {})
self.decay_lambda = decay_cfg.get("lambda", 0.05)
self.threshold = decay_cfg.get("threshold", 0.3)
self.check_interval = decay_cfg.get("check_interval_hours", 24)
# --- Emotion weight params (continuous arousal coordinate) ---
# --- 情感权重参数(基于连续 arousal 坐标)---
emotion_cfg = decay_cfg.get("emotion_weights", {})
self.emotion_base = emotion_cfg.get("base", 1.0)
self.arousal_boost = emotion_cfg.get("arousal_boost", 0.8)
self.bucket_mgr = bucket_mgr
# --- Background task control / 后台任务控制 ---
self._task: asyncio.Task | None = None
self._running = False
@property
def is_running(self) -> bool:
"""Whether the decay engine is running in the background.
衰减引擎是否正在后台运行。"""
return self._running
# ---------------------------------------------------------
# Core: calculate decay score for a single bucket
# 核心:计算单个桶的衰减得分
#
# Higher score = more vivid memory; below threshold → archive
# 得分越高 = 记忆越鲜活,低于阈值则归档
# Permanent buckets never decay / 固化桶永远不衰减
# ---------------------------------------------------------
def calculate_score(self, metadata: dict) -> float:
"""
Calculate current activity score for a memory bucket.
计算一个记忆桶的当前活跃度得分。
Formula: Score = Importance × (act_count^0.3) × e^(-λ×days) × (base + arousal×boost)
"""
if not isinstance(metadata, dict):
return 0.0
# --- Permanent buckets never decay / 固化桶永不衰减 ---
if metadata.get("type") == "permanent":
return 999.0
importance = max(1, min(10, int(metadata.get("importance", 5))))
activation_count = max(1, int(metadata.get("activation_count", 1)))
# --- Days since last activation / 距离上次激活过了多少天 ---
last_active_str = metadata.get("last_active", metadata.get("created", ""))
try:
last_active = datetime.fromisoformat(str(last_active_str))
days_since = max(0.0, (datetime.now() - last_active).total_seconds() / 86400)
except (ValueError, TypeError):
days_since = 30 # Parse failure → assume 30 days / 解析失败假设已过 30 天
# --- Emotion weight: continuous arousal coordinate ---
# --- 情感权重:基于连续 arousal 坐标计算 ---
# Higher arousal → stronger emotion → higher weight → slower decay
# arousal 越高 → 情感越强烈 → 权重越大 → 衰减越慢
try:
arousal = max(0.0, min(1.0, float(metadata.get("arousal", 0.3))))
except (ValueError, TypeError):
arousal = 0.3
emotion_weight = self.emotion_base + arousal * self.arousal_boost
# --- Apply decay formula / 套入衰减公式 ---
score = (
importance
* (activation_count ** 0.3)
* math.exp(-self.decay_lambda * days_since)
* emotion_weight
)
# --- Weight pool modifiers / 权重池修正因子 ---
# Resolved events drop to 5%, sink to bottom awaiting keyword reactivation
# 已解决的事件权重骤降到 5%,沉底等待关键词激活
resolved_factor = 0.05 if metadata.get("resolved", False) else 1.0
# High-arousal unresolved buckets get urgency boost for priority surfacing
# 高唤醒未解决桶额外加成,优先浮现
urgency_boost = 1.5 if (arousal > 0.7 and not metadata.get("resolved", False)) else 1.0
return round(score * resolved_factor * urgency_boost, 4)
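The full formula in `calculate_score` can be checked by hand. This sketch inlines the config defaults that appear above (λ=0.05, base=1.0, arousal_boost=0.8) together with the resolved and urgency modifiers:

```python
import math

def decay_score(importance, activation_count, days, arousal,
                lam=0.05, base=1.0, boost=0.8, resolved=False):
    # Score = importance × count^0.3 × e^(-λ·days) × (base + arousal·boost)
    emotion_weight = base + arousal * boost
    score = (importance * (activation_count ** 0.3)
             * math.exp(-lam * days) * emotion_weight)
    score *= 0.05 if resolved else 1.0        # resolved buckets sink to 5%
    if arousal > 0.7 and not resolved:
        score *= 1.5                          # urgency boost for vivid, open items
    return round(score, 4)

print(decay_score(importance=5, activation_count=1, days=30, arousal=0.3))
```

A middling bucket (importance 5, touched once, 30 days idle, low arousal) scores ≈1.38, well above a 0.3 threshold; marking it resolved drops it to ≈0.07, safely below.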
# ---------------------------------------------------------
# Execute one decay cycle
# 执行一轮衰减周期
# Scan all dynamic buckets → score → archive those below threshold
# 扫描所有动态桶 → 算分 → 低于阈值的归档
# ---------------------------------------------------------
async def run_decay_cycle(self) -> dict:
"""
Execute one decay cycle: iterate dynamic buckets, archive those
scoring below threshold.
执行一轮衰减:遍历动态桶,归档得分低于阈值的桶。
Returns stats: {"checked": N, "archived": N, "lowest_score": X}
"""
try:
buckets = await self.bucket_mgr.list_all(include_archive=False)
except Exception as e:
logger.error(f"Failed to list buckets for decay / 衰减周期列桶失败: {e}")
return {"checked": 0, "archived": 0, "lowest_score": 0, "error": str(e)}
checked = 0
archived = 0
lowest_score = float("inf")
for bucket in buckets:
meta = bucket.get("metadata", {})
# Skip permanent buckets / 跳过固化桶
if meta.get("type") == "permanent":
continue
checked += 1
try:
score = self.calculate_score(meta)
except Exception as e:
logger.warning(
f"Score calculation failed for {bucket.get('id', '?')} / "
f"计算得分失败: {e}"
)
continue
lowest_score = min(lowest_score, score)
# --- Below threshold → archive (simulate forgetting) ---
# --- 低于阈值 → 归档(模拟遗忘)---
if score < self.threshold:
try:
success = await self.bucket_mgr.archive(bucket["id"])
if success:
archived += 1
logger.info(
f"Decay archived / 衰减归档: "
f"{meta.get('name', bucket['id'])} "
f"(score={score:.4f}, threshold={self.threshold})"
)
except Exception as e:
logger.warning(
f"Archive failed for {bucket.get('id', '?')} / "
f"归档失败: {e}"
)
result = {
"checked": checked,
"archived": archived,
"lowest_score": lowest_score if lowest_score != float("inf") else 0,
}
logger.info(f"Decay cycle complete / 衰减周期完成: {result}")
return result
# ---------------------------------------------------------
# Background decay task management
# 后台衰减任务管理
# ---------------------------------------------------------
async def ensure_started(self) -> None:
"""
Ensure the decay engine is started (lazy init on first call).
确保衰减引擎已启动(懒加载,首次调用时启动)。
"""
if not self._running:
await self.start()
async def start(self) -> None:
"""Start the background decay loop.
启动后台衰减循环。"""
if self._running:
return
self._running = True
self._task = asyncio.create_task(self._background_loop())
logger.info(
f"Decay engine started, interval: {self.check_interval}h / "
f"衰减引擎已启动,检查间隔: {self.check_interval} 小时"
)
async def stop(self) -> None:
"""Stop the background decay loop.
停止后台衰减循环。"""
self._running = False
if self._task:
self._task.cancel()
try:
await self._task
except asyncio.CancelledError:
pass
logger.info("Decay engine stopped / 衰减引擎已停止")
async def _background_loop(self) -> None:
"""Background loop: run decay → sleep → repeat.
后台循环体:执行衰减 → 睡眠 → 重复。"""
while self._running:
try:
await self.run_decay_cycle()
except Exception as e:
logger.error(f"Decay cycle error / 衰减周期出错: {e}")
# --- Wait for next cycle / 等待下一个周期 ---
try:
await asyncio.sleep(self.check_interval * 3600)
except asyncio.CancelledError:
break


@@ -1,536 +0,0 @@
# ============================================================
# Module: MCP Server Entry Point (server.py)
# 模块:MCP 服务器主入口
#
# Starts the Ombre Brain MCP service and registers memory
# operation tools for Claude to call.
# 启动 Ombre Brain MCP 服务,注册记忆操作工具供 Claude 调用。
#
# Core responsibilities:
# 核心职责:
# - Initialize config, bucket manager, dehydrator, decay engine
# 初始化配置、记忆桶管理器、脱水器、衰减引擎
# - Expose 5 MCP tools:
# 暴露 5 个 MCP 工具:
# breath — Surface unresolved memories or search by keyword
# 浮现未解决记忆 或 按关键词检索
# hold — Store a single memory
# 存储单条记忆
# grow — Diary digest, auto-split into multiple buckets
# 日记归档,自动拆分多桶
# trace — Modify metadata / resolved / delete
# 修改元数据 / resolved 标记 / 删除
# pulse — System status + bucket listing
# 系统状态 + 所有桶列表
#
# Startup:
# 启动方式:
# Local: python server.py
# Remote: OMBRE_TRANSPORT=streamable-http python server.py
# Docker: docker-compose up
# ============================================================
import os
import sys
import random
import logging
import asyncio
import httpx
# --- Ensure same-directory modules can be imported ---
# --- 确保同目录下的模块能被正确导入 ---
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from mcp.server.fastmcp import FastMCP
from bucket_manager import BucketManager
from dehydrator import Dehydrator
from decay_engine import DecayEngine
from utils import load_config, setup_logging
# --- Load config & init logging / 加载配置 & 初始化日志 ---
config = load_config()
setup_logging(config.get("log_level", "INFO"))
logger = logging.getLogger("ombre_brain")
# --- Initialize three core components / 初始化三大核心组件 ---
bucket_mgr = BucketManager(config) # Bucket manager / 记忆桶管理器
dehydrator = Dehydrator(config) # Dehydrator / 脱水器
decay_engine = DecayEngine(config, bucket_mgr) # Decay engine / 衰减引擎
# --- Create MCP server instance / 创建 MCP 服务器实例 ---
# host="0.0.0.0" so Docker container's SSE is externally reachable
# stdio mode ignores host (no network)
mcp = FastMCP(
"Ombre Brain",
host="0.0.0.0",
port=int(os.environ.get("OMBRE_PORT", "8000")),
)
# =============================================================
# /health endpoint: lightweight keepalive
# 轻量保活接口
# For Cloudflare Tunnel or reverse proxy to ping, preventing idle timeout
# 供 Cloudflare Tunnel 或反代定期 ping防止空闲超时断连
# =============================================================
@mcp.custom_route("/health", methods=["GET"])
async def health_check(request):
from starlette.responses import JSONResponse
try:
stats = await bucket_mgr.get_stats()
return JSONResponse({
"status": "ok",
"buckets": stats["permanent_count"] + stats["dynamic_count"],
"decay_engine": "running" if decay_engine.is_running else "stopped",
})
except Exception as e:
return JSONResponse({"status": "error", "detail": str(e)}, status_code=500)
# =============================================================
# Internal helper: merge-or-create
# 内部辅助:检查是否可合并,可以则合并,否则新建
# Shared by hold and grow to avoid duplicate logic
# hold 和 grow 共用,避免重复逻辑
# =============================================================
async def _merge_or_create(
content: str,
tags: list,
importance: int,
domain: list,
valence: float,
arousal: float,
name: str = "",
) -> tuple[str, bool]:
"""
Check if a similar bucket exists for merging; merge if so, create if not.
Returns (bucket_id_or_name, is_merged).
检查是否有相似桶可合并,有则合并,无则新建。
返回 (桶ID或名称, 是否合并)。
"""
try:
existing = await bucket_mgr.search(content, limit=1)
except Exception as e:
logger.warning(f"Search for merge failed, creating new / 合并搜索失败,新建: {e}")
existing = []
if existing and existing[0].get("score", 0) > config.get("merge_threshold", 75):
bucket = existing[0]
try:
merged = await dehydrator.merge(bucket["content"], content)
await bucket_mgr.update(
bucket["id"],
content=merged,
tags=list(set(bucket["metadata"].get("tags", []) + tags)),
importance=max(bucket["metadata"].get("importance", 5), importance),
domain=list(set(bucket["metadata"].get("domain", []) + domain)),
valence=valence,
arousal=arousal,
)
return bucket["metadata"].get("name", bucket["id"]), True
except Exception as e:
logger.warning(f"Merge failed, creating new / 合并失败,新建: {e}")
bucket_id = await bucket_mgr.create(
content=content,
tags=tags,
importance=importance,
domain=domain,
valence=valence,
arousal=arousal,
name=name or None,
)
return bucket_id, False
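The merge-or-create decision reduces to a single threshold test against the best search hit on the 0~100 scale returned by `search()`, with 75 as the `merge_threshold` config default. A sketch of just that decision:

```python
from typing import Optional

def should_merge(best_match: Optional[dict], threshold: float = 75) -> bool:
    # Merge only when the closest existing bucket scores above the
    # configured merge_threshold; otherwise a new bucket is created.
    return bool(best_match) and best_match.get("score", 0) > threshold

print(should_merge({"id": "b1", "score": 82.5}))
print(should_merge({"id": "b1", "score": 60.0}))
print(should_merge(None))
```

Search failures are treated as "no match" upstream, so they always fall through to creation rather than aborting the store.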
# =============================================================
# Tool 1: breath — Breathe
# 工具 1:breath — 呼吸
#
# No args: surface highest-weight unresolved memories (active push)
# 无参数:浮现权重最高的未解决记忆
# With args: search by keyword + emotion coordinates
# 有参数:按关键词+情感坐标检索记忆
# =============================================================
@mcp.tool()
async def breath(
query: str = "",
max_results: int = 3,
domain: str = "",
valence: float = -1,
arousal: float = -1,
) -> str:
"""检索记忆或浮现未解决记忆。query 为空时自动推送权重最高的未解决桶;有 query 时按关键词+情感检索。domain 逗号分隔;valence/arousal 传 0~1 启用情感共鸣,-1 忽略。"""
await decay_engine.ensure_started()
# --- No args: surfacing mode (weight pool active push) ---
# --- 无参数:浮现模式(权重池主动推送)---
if not query.strip():
try:
all_buckets = await bucket_mgr.list_all(include_archive=False)
except Exception as e:
logger.error(f"Failed to list buckets for surfacing / 浮现列桶失败: {e}")
return "记忆系统暂时无法访问。"
unresolved = [
b for b in all_buckets
if not b["metadata"].get("resolved", False)
and b["metadata"].get("type") != "permanent"
]
if not unresolved:
return "权重池平静,没有需要处理的记忆。"
scored = sorted(
unresolved,
key=lambda b: decay_engine.calculate_score(b["metadata"]),
reverse=True,
)
top = scored[:2]
results = []
for b in top:
try:
summary = await dehydrator.dehydrate(b["content"], b["metadata"])
await bucket_mgr.touch(b["id"])
score = decay_engine.calculate_score(b["metadata"])
results.append(f"[权重:{score:.2f}] {summary}")
except Exception as e:
logger.warning(f"Failed to dehydrate surfaced bucket / 浮现脱水失败: {e}")
continue
if not results:
return "权重池平静,没有需要处理的记忆。"
return "=== 浮现记忆 ===\n" + "\n---\n".join(results)
# --- With args: search mode / 有参数:检索模式 ---
domain_filter = [d.strip() for d in domain.split(",") if d.strip()] or None
q_valence = valence if 0 <= valence <= 1 else None
q_arousal = arousal if 0 <= arousal <= 1 else None
try:
matches = await bucket_mgr.search(
query,
limit=max_results,
domain_filter=domain_filter,
query_valence=q_valence,
query_arousal=q_arousal,
)
except Exception as e:
logger.error(f"Search failed / 检索失败: {e}")
return "检索过程出错,请稍后重试。"
results = []
for bucket in matches:
try:
summary = await dehydrator.dehydrate(bucket["content"], bucket["metadata"])
await bucket_mgr.touch(bucket["id"])
results.append(summary)
except Exception as e:
logger.warning(f"Failed to dehydrate search result / 检索结果脱水失败: {e}")
continue
# --- Random surfacing: when search returns < 3, 40% chance to float old memories ---
# --- 随机浮现:检索结果不足 3 条时40% 概率从低权重旧桶里漂上来 ---
if len(matches) < 3 and random.random() < 0.4:
try:
all_buckets = await bucket_mgr.list_all(include_archive=False)
matched_ids = {b["id"] for b in matches}
low_weight = [
b for b in all_buckets
if b["id"] not in matched_ids
and decay_engine.calculate_score(b["metadata"]) < 2.0
]
if low_weight:
drifted = random.sample(low_weight, min(random.randint(1, 3), len(low_weight)))
drift_results = []
for b in drifted:
summary = await dehydrator.dehydrate(b["content"], b["metadata"])
drift_results.append(f"[surface_type: random]\n{summary}")
results.append("--- 忽然想起来 ---\n" + "\n---\n".join(drift_results))
except Exception as e:
logger.warning(f"Random surfacing failed / 随机浮现失败: {e}")
if not results:
return "未找到相关记忆。"
return "\n---\n".join(results)
# =============================================================
# Tool 2: hold — Hold on to this
# 工具 2:hold — 握住,留下来
# =============================================================
@mcp.tool()
async def hold(
content: str,
tags: str = "",
importance: int = 5,
) -> str:
"""存储单条记忆。自动打标+合并相似桶。tags 逗号分隔;importance 1-10。"""
await decay_engine.ensure_started()
# --- Input validation / 输入校验 ---
if not content or not content.strip():
return "内容为空,无法存储。"
importance = max(1, min(10, importance))
extra_tags = [t.strip() for t in tags.split(",") if t.strip()]
# --- Step 1: auto-tagging / 自动打标 ---
try:
analysis = await dehydrator.analyze(content)
except Exception as e:
logger.warning(f"Auto-tagging failed, using defaults / 自动打标失败: {e}")
analysis = {
"domain": ["未分类"], "valence": 0.5, "arousal": 0.3,
"tags": [], "suggested_name": "",
}
domain = analysis["domain"]
valence = analysis["valence"]
arousal = analysis["arousal"]
auto_tags = analysis["tags"]
suggested_name = analysis.get("suggested_name", "")
all_tags = list(dict.fromkeys(auto_tags + extra_tags))
# --- Step 2: merge or create / 合并或新建 ---
result_name, is_merged = await _merge_or_create(
content=content,
tags=all_tags,
importance=importance,
domain=domain,
valence=valence,
arousal=arousal,
name=suggested_name,
)
if is_merged:
return (
f"已合并到现有记忆桶: {result_name}\n"
f"主题域: {', '.join(domain)} | 情感: V{valence:.1f}/A{arousal:.1f}"
)
return (
f"已创建新记忆桶: {result_name}\n"
f"主题域: {', '.join(domain)} | 情感: V{valence:.1f}/A{arousal:.1f} | 标签: {', '.join(all_tags)}"
)
# =============================================================
# Tool 3: grow — Grow, fragments become memories
# 工具 3:grow — 生长,一天的碎片长成记忆
# =============================================================
@mcp.tool()
async def grow(content: str) -> str:
"""日记归档。自动拆分长内容为多个记忆桶。"""
await decay_engine.ensure_started()
if not content or not content.strip():
return "内容为空,无法整理。"
# --- Step 1: let API split and organize / 让 API 拆分整理 ---
try:
items = await dehydrator.digest(content)
except Exception as e:
logger.error(f"Diary digest failed / 日记整理失败: {e}")
return f"日记整理失败: {e}"
if not items:
return "内容为空或整理失败。"
results = []
created = 0
merged = 0
# --- Step 2: merge or create each item (with per-item error handling) ---
# --- 逐条合并或新建(单条失败不影响其他)---
for item in items:
try:
result_name, is_merged = await _merge_or_create(
content=item["content"],
tags=item.get("tags", []),
importance=item.get("importance", 5),
domain=item.get("domain", ["未分类"]),
valence=item.get("valence", 0.5),
arousal=item.get("arousal", 0.3),
name=item.get("name", ""),
)
if is_merged:
results.append(f" 📎 合并 → {result_name}")
merged += 1
else:
domains_str = ",".join(item.get("domain", []))
results.append(
f" 📝 新建 [{item.get('name', result_name)}] "
f"主题:{domains_str} V{item.get('valence', 0.5):.1f}/A{item.get('arousal', 0.3):.1f}"
)
created += 1
except Exception as e:
logger.warning(
f"Failed to process diary item / 日记条目处理失败: "
f"{item.get('name', '?')}: {e}"
)
results.append(f" ⚠️ 失败: {item.get('name', '未知条目')}")
summary = f"=== 日记整理完成 ===\n拆分为 {len(items)} 条 | 新建 {created} 桶 | 合并 {merged}\n"
return summary + "\n".join(results)
# =============================================================
# Tool 4: trace — Trace, redraw the outline of a memory
# 工具 4:trace — 描摹,重新勾勒记忆的轮廓
# Also handles deletion (delete=True)
# 同时承接删除功能
# =============================================================
@mcp.tool()
async def trace(
bucket_id: str,
name: str = "",
domain: str = "",
valence: float = -1,
arousal: float = -1,
importance: int = -1,
tags: str = "",
resolved: int = -1,
delete: bool = False,
) -> str:
"""修改记忆元数据。resolved=1 标记已解决(桶权重骤降沉底);resolved=0 重新激活;delete=True 删除桶。其余字段只传需改的,-1 或空串表示不改。"""
if not bucket_id or not bucket_id.strip():
return "请提供有效的 bucket_id。"
# --- Delete mode / 删除模式 ---
if delete:
success = await bucket_mgr.delete(bucket_id)
return f"已遗忘记忆桶: {bucket_id}" if success else f"未找到记忆桶: {bucket_id}"
bucket = await bucket_mgr.get(bucket_id)
if not bucket:
return f"未找到记忆桶: {bucket_id}"
# --- Collect only fields actually passed / 只收集用户实际传入的字段 ---
updates = {}
if name:
updates["name"] = name
if domain:
updates["domain"] = [d.strip() for d in domain.split(",") if d.strip()]
if 0 <= valence <= 1:
updates["valence"] = valence
if 0 <= arousal <= 1:
updates["arousal"] = arousal
if 1 <= importance <= 10:
updates["importance"] = importance
if tags:
updates["tags"] = [t.strip() for t in tags.split(",") if t.strip()]
if resolved in (0, 1):
updates["resolved"] = bool(resolved)
if not updates:
return "没有任何字段需要修改。"
success = await bucket_mgr.update(bucket_id, **updates)
if not success:
return f"修改失败: {bucket_id}"
changed = ", ".join(f"{k}={v}" for k, v in updates.items())
# Explicit hint about resolved state change semantics
# 特别提示 resolved 状态变化的语义
if "resolved" in updates:
if updates["resolved"]:
changed += " → 已沉底,只在关键词触发时重新浮现"
else:
changed += " → 已重新激活,将参与浮现排序"
return f"已修改记忆桶 {bucket_id}: {changed}"
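The sentinel convention above (`-1` / empty string means "leave unchanged") can be exercised in isolation. A minimal sketch — `collect_updates` is a hypothetical helper mirroring the field checks in `trace`, not part of the server:

```python
def collect_updates(name="", valence=-1.0, importance=-1, tags=""):
    """Mirror trace()'s sentinel convention: -1 / "" means 'leave unchanged'."""
    updates = {}
    if name:
        updates["name"] = name
    if 0 <= valence <= 1:          # -1 sentinel falls outside the valid range
        updates["valence"] = valence
    if 1 <= importance <= 10:
        updates["importance"] = importance
    if tags:
        updates["tags"] = [t.strip() for t in tags.split(",") if t.strip()]
    return updates

print(collect_updates(valence=0.8, tags="工作, 复盘"))
print(collect_updates())  # nothing passed → empty dict → "没有任何字段需要修改"
```

The same range checks double as validation: an out-of-range value (e.g. `importance=11`) is silently ignored rather than written through.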
# =============================================================
# Tool 5: pulse — Heartbeat, system status + memory listing
# 工具 5:pulse — 脉搏,系统状态 + 记忆列表
# =============================================================
@mcp.tool()
async def pulse(include_archive: bool = False) -> str:
    """系统状态和所有记忆桶摘要。include_archive=True 时包含归档桶。"""
    try:
        stats = await bucket_mgr.get_stats()
    except Exception as e:
        return f"获取系统状态失败: {e}"

    status = (
        f"=== Ombre Brain 记忆系统 ===\n"
        f"固化记忆桶: {stats['permanent_count']}\n"
        f"动态记忆桶: {stats['dynamic_count']}\n"
        f"归档记忆桶: {stats['archive_count']}\n"
        f"总存储大小: {stats['total_size_kb']:.1f} KB\n"
        f"衰减引擎: {'运行中' if decay_engine.is_running else '已停止'}\n"
    )

    # --- List all bucket summaries / 列出所有桶摘要 ---
    try:
        buckets = await bucket_mgr.list_all(include_archive=include_archive)
    except Exception as e:
        return status + f"\n列出记忆桶失败: {e}"
    if not buckets:
        return status + "\n记忆库为空。"

    lines = []
    for b in buckets:
        meta = b.get("metadata", {})
        if meta.get("type") == "permanent":
            icon = "📦"
        elif meta.get("type") == "archived":
            icon = "🗄️"
        elif meta.get("resolved", False):
            icon = ""
        else:
            icon = "💭"
        try:
            score = decay_engine.calculate_score(meta)
        except Exception:
            score = 0.0
        domains = ",".join(meta.get("domain", []))
        val = meta.get("valence", 0.5)
        aro = meta.get("arousal", 0.3)
        resolved_tag = " [已解决]" if meta.get("resolved", False) else ""
        lines.append(
            f"{icon} [{meta.get('name', b['id'])}]{resolved_tag} "
            f"主题:{domains} "
            f"情感:V{val:.1f}/A{aro:.1f} "
            f"重要:{meta.get('importance', '?')} "
            f"权重:{score:.2f} "
            f"标签:{','.join(meta.get('tags', []))}"
        )
    return status + "\n=== 记忆列表 ===\n" + "\n".join(lines)
# --- Entry point / 启动入口 ---
if __name__ == "__main__":
    transport = config.get("transport", "stdio")
    logger.info(f"Ombre Brain starting | transport: {transport}")

    # --- Application-level keepalive: remote mode only, ping /health every 60s ---
    # --- 应用层保活:仅远程模式下启动,每 60 秒 ping 一次 /health ---
    # Prevents Cloudflare Tunnel from dropping idle connections
    if transport in ("sse", "streamable-http"):
        async def _keepalive_loop():
            await asyncio.sleep(10)  # Wait for server to fully start
            async with httpx.AsyncClient() as client:
                while True:
                    try:
                        await client.get("http://localhost:8000/health", timeout=5)
                        logger.debug("Keepalive ping OK / 保活 ping 成功")
                    except Exception as e:
                        logger.warning(f"Keepalive ping failed / 保活 ping 失败: {e}")
                    await asyncio.sleep(60)

        import threading

        def _start_keepalive():
            loop = asyncio.new_event_loop()
            loop.run_until_complete(_keepalive_loop())

        t = threading.Thread(target=_start_keepalive, daemon=True)
        t.start()

    mcp.run(transport=transport)
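One of the commits in this comparison replaces the hardcoded 8000 (FastMCP / uvicorn / the keepalive URL) with `int(os.environ.get("OMBRE_PORT", "8000"))`. A minimal sketch of that pattern:

```python
import os

# Read the serving port once; falls back to 8000 when OMBRE_PORT is unset.
port = int(os.environ.get("OMBRE_PORT", "8000"))
keepalive_url = f"http://localhost:{port}/health"
print(keepalive_url)
```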

View File

@@ -28,15 +28,12 @@
 import os
 import math
 import logging
-import re
 import shutil
-from collections import Counter
 from datetime import datetime
 from pathlib import Path
 from typing import Optional

 import frontmatter
-import jieba
 from rapidfuzz import fuzz

 from utils import generate_bucket_id, sanitize_name, safe_path, now_iso

@@ -54,7 +51,7 @@ class BucketManager:
     天然兼容 Obsidian 直接浏览和编辑。
     """

-    def __init__(self, config: dict):
+    def __init__(self, config: dict, embedding_engine=None):
         # --- Read storage paths from config / 从配置中读取存储路径 ---
         self.base_dir = config["buckets_dir"]
         self.permanent_dir = os.path.join(self.base_dir, "permanent")

@@ -88,9 +85,12 @@ class BucketManager:
         scoring = config.get("scoring_weights", {})
         self.w_topic = scoring.get("topic_relevance", 4.0)
         self.w_emotion = scoring.get("emotion_resonance", 2.0)
-        self.w_time = scoring.get("time_proximity", 2.5)
+        self.w_time = scoring.get("time_proximity", 1.5)
         self.w_importance = scoring.get("importance", 1.0)
-        self.content_weight = scoring.get("content_weight", 3.0)  # Added to allow better content-based matching during merge
+        self.content_weight = scoring.get("content_weight", 1.0)  # body×1, per spec
+
+        # --- Optional embedding engine for pre-filtering / 可选 embedding 引擎,用于预筛候选集 ---
+        self.embedding_engine = embedding_engine

     # ---------------------------------------------------------
     # Create a new bucket

@@ -121,7 +121,11 @@
         """
         bucket_id = generate_bucket_id()
         bucket_name = sanitize_name(name) if name else bucket_id
-        domain = domain or ["未分类"]
+        # feel buckets are allowed to have empty domain; others default to ["未分类"]
+        if bucket_type == "feel":
+            domain = domain if domain is not None else []
+        else:
+            domain = domain or ["未分类"]
         tags = tags or []
         linked_content = content  # wikilink injection disabled; LLM adds [[]] via prompt

@@ -142,7 +146,7 @@
             "type": bucket_type,
             "created": now_iso(),
             "last_active": now_iso(),
-            "activation_count": 1,
+            "activation_count": 0,
         }
         if pinned:
             metadata["pinned"] = True

@@ -289,19 +293,17 @@
             logger.error(f"Failed to write bucket update / 写入桶更新失败: {file_path}: {e}")
             return False

-        # --- Auto-move: pinned → permanent/, resolved → archive/ ---
-        # --- 自动移动:钉选 → permanent/,已解决 → archive/ ---
+        # --- Auto-move: pinned → permanent/ ---
+        # --- 自动移动:钉选 → permanent/ ---
+        # NOTE: resolved buckets are NOT auto-archived here.
+        # They stay in dynamic/ and decay naturally until score < threshold.
+        # 注意:resolved 桶不在此自动归档,留在 dynamic/ 随衰减引擎自然归档。
         domain = post.get("domain", ["未分类"])
         if kwargs.get("pinned") and post.get("type") != "permanent":
             post["type"] = "permanent"
             with open(file_path, "w", encoding="utf-8") as f:
                 f.write(frontmatter.dumps(post))
             self._move_bucket(file_path, self.permanent_dir, domain)
-        elif kwargs.get("resolved") and post.get("type") not in ("permanent", "feel"):
-            post["type"] = "archived"
-            with open(file_path, "w", encoding="utf-8") as f:
-                f.write(frontmatter.dumps(post))
-            self._move_bucket(file_path, self.archive_dir, domain)

         logger.info(f"Updated bucket / 更新记忆桶: {bucket_id}")
         return True

@@ -473,6 +475,20 @@
         else:
             candidates = all_buckets

+        # --- Layer 1.5: embedding pre-filter (optional, reduces multi-dim ranking set) ---
+        # --- 第 1.5 层:embedding 预筛(可选,缩小精排候选集)---
+        if self.embedding_engine and self.embedding_engine.enabled:
+            try:
+                vector_results = await self.embedding_engine.search_similar(query, top_k=50)
+                if vector_results:
+                    vector_ids = {bid for bid, _ in vector_results}
+                    emb_candidates = [b for b in candidates if b["id"] in vector_ids]
+                    if emb_candidates:  # only replace if there's non-empty overlap
+                        candidates = emb_candidates
+                    # else: keep original candidates as fallback
+            except Exception as e:
+                logger.warning(f"Embedding pre-filter failed, using fuzzy only / embedding 预筛失败: {e}")

         # --- Layer 2: weighted multi-dim ranking ---
         # --- 第二层:多维加权精排 ---
         scored = []

@@ -505,12 +521,14 @@
                 weight_sum = self.w_topic + self.w_emotion + self.w_time + self.w_importance
                 normalized = (total / weight_sum) * 100 if weight_sum > 0 else 0
-                # Resolved buckets get ranking penalty (but still reachable by keyword)
-                # 已解决的桶降权排序(但仍可被关键词激活)
-                if meta.get("resolved", False):
-                    normalized *= 0.3
+                # Threshold check uses raw (pre-penalty) score so resolved buckets
+                # remain reachable by keyword (penalty applied only to ranking).
+                # 阈值用原始分数判定,确保 resolved 桶在关键词命中时仍可被搜出
                 if normalized >= self.fuzzy_threshold:
+                    # Resolved buckets get ranking penalty (but still reachable by keyword)
+                    # 已解决的桶仅在排序时降权
+                    if meta.get("resolved", False):
+                        normalized *= 0.3
                     bucket["score"] = round(normalized, 2)
                     scored.append(bucket)
             except Exception as e:

@@ -596,7 +614,7 @@
             days = max(0.0, (datetime.now() - last_active).total_seconds() / 86400)
         except (ValueError, TypeError):
             days = 30
-        return math.exp(-0.1 * days)
+        return math.exp(-0.02 * days)

     # ---------------------------------------------------------
     # List all buckets
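The decay-rate change above (0.1 → 0.02 per day) is easiest to read as a half-life. A quick check of what the diff implies:

```python
import math

old_rate, new_rate = 0.1, 0.02
old_half_life = math.log(2) / old_rate   # ≈ 6.9 days: memories halved in a week
new_half_life = math.log(2) / new_rate   # ≈ 34.7 days: roughly a month

# time_factor after 30 idle days under each rate
print(math.exp(-old_rate * 30), math.exp(-new_rate * 30))
```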

147
check_icloud_conflicts.py Normal file
View File

@@ -0,0 +1,147 @@
#!/usr/bin/env python3
# ============================================================
# check_icloud_conflicts.py — Ombre Brain iCloud Conflict Detector
# iCloud 冲突文件检测器
#
# Scans the configured bucket directory for iCloud sync conflict
# artefacts and duplicate bucket IDs, then prints a report.
# 扫描配置的桶目录,发现 iCloud 同步冲突文件及重复桶 ID,输出报告。
#
# Usage:
#   python check_icloud_conflicts.py
#   python check_icloud_conflicts.py --buckets-dir /path/to/dir
#   python check_icloud_conflicts.py --quiet   # exit-code only (0=clean)
# ============================================================
from __future__ import annotations

import argparse
import os
import re
import sys
from collections import defaultdict
from pathlib import Path

# ──────────────────────────────────────────────────────────────
# iCloud conflict file patterns
# Pattern 1 (macOS classic):  "filename 2.md", "filename 3.md"
# Pattern 2 (iCloud Drive):   "filename (Device's conflicted copy YYYY-MM-DD).md"
# ──────────────────────────────────────────────────────────────
_CONFLICT_SUFFIX = re.compile(r"^(.+?)\s+\d+\.md$")
_CONFLICT_ICLOUD = re.compile(r"^(.+?)\s+\(.+conflicted copy .+\)\.md$", re.IGNORECASE)

# Bucket ID pattern: 12 hex chars at end of stem before extension
_BUCKET_ID_PATTERN = re.compile(r"_([0-9a-f]{12})$")


def resolve_buckets_dir() -> Path:
    """Resolve bucket directory: env var → config.yaml → ./buckets fallback."""
    env_dir = os.environ.get("OMBRE_BUCKETS_DIR", "").strip()
    if env_dir:
        return Path(env_dir)
    config_path = Path(__file__).parent / "config.yaml"
    if config_path.exists():
        try:
            import yaml  # type: ignore
            with open(config_path, encoding="utf-8") as f:
                cfg = yaml.safe_load(f) or {}
            if cfg.get("buckets_dir"):
                return Path(cfg["buckets_dir"])
        except Exception:
            pass
    return Path(__file__).parent / "buckets"


def scan(buckets_dir: Path) -> tuple[list[Path], dict[str, list[Path]]]:
    """
    Returns:
        conflict_files — list of files that look like iCloud conflict artefacts
        dup_ids        — dict of bucket_id -> [list of files sharing that id]
                         (only entries with 2+ files)
    """
    if not buckets_dir.exists():
        return [], {}
    conflict_files: list[Path] = []
    id_to_files: dict[str, list[Path]] = defaultdict(list)
    for md_file in buckets_dir.rglob("*.md"):
        name = md_file.name
        # --- Conflict file detection ---
        if _CONFLICT_SUFFIX.match(name) or _CONFLICT_ICLOUD.match(name):
            conflict_files.append(md_file)
            continue  # don't register conflicts in the ID map
        # --- Duplicate ID detection ---
        stem = md_file.stem
        m = _BUCKET_ID_PATTERN.search(stem)
        if m:
            id_to_files[m.group(1)].append(md_file)
    dup_ids = {bid: paths for bid, paths in id_to_files.items() if len(paths) > 1}
    return conflict_files, dup_ids


def main() -> int:
    parser = argparse.ArgumentParser(
        description="Detect iCloud conflict files and duplicate bucket IDs."
    )
    parser.add_argument(
        "--buckets-dir",
        metavar="PATH",
        help="Override bucket directory (default: from config.yaml / OMBRE_BUCKETS_DIR)",
    )
    parser.add_argument(
        "--quiet",
        action="store_true",
        help="Suppress output; exit 0 = clean, 1 = problems found",
    )
    args = parser.parse_args()

    buckets_dir = Path(args.buckets_dir) if args.buckets_dir else resolve_buckets_dir()
    if not args.quiet:
        print(f"Scanning: {buckets_dir}")
        if not buckets_dir.exists():
            print(" ✗ Directory does not exist.")
            return 1
        print()

    conflict_files, dup_ids = scan(buckets_dir)
    problems = bool(conflict_files or dup_ids)

    if args.quiet:
        return 1 if problems else 0

    # ── Report ─────────────────────────────────────────────────
    if not problems:
        print("✓ No iCloud conflicts or duplicate IDs found.")
        return 0

    if conflict_files:
        print(f"⚠ iCloud conflict files ({len(conflict_files)} found):")
        for f in sorted(conflict_files):
            rel = f.relative_to(buckets_dir) if f.is_relative_to(buckets_dir) else f
            print(f"  {rel}")
        print()

    if dup_ids:
        print(f"⚠ Duplicate bucket IDs ({len(dup_ids)} ID(s) shared by multiple files):")
        for bid, paths in sorted(dup_ids.items()):
            print(f"  ID: {bid}")
            for p in sorted(paths):
                rel = p.relative_to(buckets_dir) if p.is_relative_to(buckets_dir) else p
                print(f"    {rel}")
        print()

    print(
        "NOTE: This script is report-only. No files are modified or deleted.\n"
        "注意:本脚本仅报告,不删除或修改任何文件。"
    )
    return 1


if __name__ == "__main__":
    sys.exit(main())

View File

@@ -28,9 +28,11 @@ log_level: "INFO"
 merge_threshold: 75

 # --- Dehydration API / 脱水压缩 API 配置 ---
-# Uses a cheap LLM for intelligent compression; auto-degrades to local
-# keyword extraction if API is unavailable
-# 用廉价 LLM 做智能压缩,API 不可用时自动降级到本地关键词提取
+# Uses a cheap LLM for intelligent compression. API is required; if the
+# configured key/endpoint is unavailable, hold/grow will raise an explicit
+# error instead of silently degrading (see BEHAVIOR_SPEC.md 三、降级行为表).
+# 用廉价 LLM 做智能压缩。API 为必需;如 key/endpoint 不可用,
+# hold/grow 会直接报错而非静默降级(详见 BEHAVIOR_SPEC.md 三、降级行为表)。
 dehydration:
   # Supports any OpenAI-compatible API: DeepSeek / Ollama / LM Studio / vLLM / Gemini etc.
   # 支持所有 OpenAI 兼容 API:DeepSeek / Ollama / LM Studio / vLLM / Gemini 等

@@ -61,11 +63,14 @@ decay:
 # --- Embedding / 向量化配置 ---
 # Uses embedding API for semantic similarity search
 # 通过 embedding API 实现语义相似度搜索
-# Reuses the same API key (OMBRE_API_KEY) and base_url from dehydration config
-# 复用脱水配置的 API key 和 base_url
+# You can configure embedding independently from dehydration.
+# If api_key is omitted, reuses the same API key (OMBRE_API_KEY) and base_url from dehydration config
+# 你可以把 embedding 独立配置;若 api_key 留空,复用脱水配置的 API key 和 base_url
 embedding:
   enabled: true                    # Enable embedding / 启用向量化
   model: "gemini-embedding-001"    # Embedding model / 向量化模型
+  # base_url: "https://generativelanguage.googleapis.com/v1beta/openai"
+  # api_key: ""

 # --- Scoring weights / 检索权重参数 ---
 # total = topic(×4) + emotion(×2) + time(×1.5) + importance(×1)
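The weight formula in that last comment feeds the normalization used during search ranking (`normalized = total / weight_sum * 100`, as in the bucket_manager diff). A sketch with per-dimension scores assumed to lie in [0, 1]:

```python
# Sketch of the weighted ranking normalization used in search.
# Per-dimension scores (sample values) are assumed to be in [0, 1].
w_topic, w_emotion, w_time, w_importance = 4.0, 2.0, 1.5, 1.0
scores = {"topic": 0.9, "emotion": 0.5, "time": 0.8, "importance": 0.6}

total = (w_topic * scores["topic"] + w_emotion * scores["emotion"]
         + w_time * scores["time"] + w_importance * scores["importance"])
weight_sum = w_topic + w_emotion + w_time + w_importance
normalized = (total / weight_sum) * 100 if weight_sum > 0 else 0
print(round(normalized, 2))
```

Dividing by `weight_sum` keeps the score on a fixed 0–100 scale even if the weights in `scoring_weights` are later retuned.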

View File

@@ -607,6 +607,7 @@
   <div class="search-bar">
     <input type="text" id="search-input" placeholder="搜索记忆…" />
   </div>
+  <button onclick="doLogout()" title="退出登录" style="margin-left:12px;background:none;border:1px solid var(--border);color:var(--text-dim);border-radius:20px;padding:6px 14px;font-size:12px;cursor:pointer;">退出</button>
 </div>

 <div class="tabs">
@@ -615,6 +616,7 @@
   <div class="tab" data-tab="network">记忆网络</div>
   <div class="tab" data-tab="config">配置</div>
   <div class="tab" data-tab="import">导入</div>
+  <div class="tab" data-tab="settings">设置</div>
 </div>

 <div class="content" id="list-view">
@@ -778,7 +780,259 @@
   <div id="detail-content"></div>
 </div>
<!-- Settings Tab View -->
<div class="content" id="settings-view" style="display:none">
  <div style="max-width:580px;margin:0 auto;">
    <div class="config-section">
      <h3>服务状态</h3>
      <div id="settings-status" style="font-size:13px;color:var(--text-dim);line-height:2;">加载中…</div>
      <button onclick="loadSettingsStatus()" style="margin-top:8px;font-size:12px;padding:4px 12px;">刷新状态</button>
    </div>

    <div class="config-section">
      <h3>修改密码</h3>
      <div id="settings-env-notice" style="display:none;font-size:12px;color:var(--warning);margin-bottom:10px;">
        ⚠ 当前使用环境变量 OMBRE_DASHBOARD_PASSWORD,请直接修改环境变量。
      </div>
      <div id="settings-pwd-form">
        <div class="config-row">
          <label>当前密码</label>
          <input type="password" id="settings-current-pwd" placeholder="当前密码" />
        </div>
        <div class="config-row">
          <label>新密码</label>
          <input type="password" id="settings-new-pwd" placeholder="新密码(至少6位)" />
        </div>
        <div class="config-row">
          <label>确认新密码</label>
          <input type="password" id="settings-new-pwd2" placeholder="再次输入新密码" />
        </div>
        <button class="btn-primary" onclick="changePassword()" style="margin-top:4px;">修改密码</button>
        <div id="settings-pwd-msg" style="margin-top:10px;font-size:13px;"></div>
      </div>
    </div>

    <div class="config-section">
      <h3>宿主机记忆桶目录 (Docker)</h3>
      <div style="font-size:12px;color:var(--text-dim);margin-bottom:10px;line-height:1.6;">
        设置 docker-compose 中 <code>${OMBRE_HOST_VAULT_DIR:-./buckets}:/data</code> 的宿主机路径。
        留空则使用项目内 <code>./buckets</code>。
        <span style="color:var(--warning);">⚠ 修改后需在宿主机执行 <code>docker compose down && docker compose up -d</code> 才会生效。</span>
      </div>
      <div class="config-row">
        <label>路径</label>
        <input type="text" id="settings-host-vault" placeholder="例如 /Users/you/Obsidian/Ombre Brain" style="flex:1;" />
      </div>
      <div style="display:flex;gap:8px;align-items:center;margin-top:6px;">
        <button class="btn-primary" onclick="saveHostVault()">保存到 .env</button>
        <button onclick="loadHostVault()" style="font-size:12px;padding:4px 12px;">重新加载</button>
        <span id="settings-host-vault-msg" style="font-size:12px;"></span>
      </div>
    </div>

    <div class="config-section">
      <h3>账号操作</h3>
      <button onclick="doLogout()" style="color:var(--negative);border-color:var(--negative);">退出登录</button>
    </div>
  </div>
</div>
<!-- Auth Overlay -->
<div id="auth-overlay" style="position:fixed;inset:0;z-index:9999;background:var(--bg-gradient);background-attachment:fixed;display:flex;align-items:center;justify-content:center;">
  <div style="background:var(--surface);backdrop-filter:blur(16px);border:1px solid var(--border);border-radius:24px;padding:48px 40px;max-width:400px;width:90%;text-align:center;box-shadow:0 8px 40px rgba(0,0,0,0.12);">
    <h2 style="font-family:'Cormorant Garamond',serif;font-size:28px;color:var(--accent);margin-bottom:8px;">◐ Ombre Brain</h2>
    <p style="color:var(--text-dim);font-size:13px;margin-bottom:28px;" id="auth-subtitle">验证身份</p>

    <!-- Setup form -->
    <div id="auth-setup-form" style="display:none;">
      <p style="font-size:13px;color:var(--text-dim);margin-bottom:16px;">首次使用,请设置访问密码</p>
      <input type="password" id="auth-setup-pwd" placeholder="设置密码(至少6位)" style="display:block;width:100%;padding:10px 16px;border:1px solid var(--border);border-radius:10px;background:var(--surface-solid);color:var(--text);font-size:14px;margin-bottom:10px;" />
      <input type="password" id="auth-setup-pwd2" placeholder="确认密码" style="display:block;width:100%;padding:10px 16px;border:1px solid var(--border);border-radius:10px;background:var(--surface-solid);color:var(--text);font-size:14px;margin-bottom:16px;" onkeydown="if(event.key==='Enter')doSetup()" />
      <button onclick="doSetup()" style="width:100%;padding:12px;background:var(--accent);color:#fff;border:none;border-radius:10px;font-size:14px;cursor:pointer;">设置密码并进入</button>
    </div>

    <!-- Login form -->
    <div id="auth-login-form" style="display:none;">
      <input type="password" id="auth-login-pwd" placeholder="输入访问密码" style="display:block;width:100%;padding:10px 16px;border:1px solid var(--border);border-radius:10px;background:var(--surface-solid);color:var(--text);font-size:14px;margin-bottom:16px;" onkeydown="if(event.key==='Enter')doLogin()" />
      <button onclick="doLogin()" style="width:100%;padding:12px;background:var(--accent);color:#fff;border:none;border-radius:10px;font-size:14px;cursor:pointer;">登录</button>
    </div>

    <div id="auth-error" style="color:var(--negative);font-size:13px;margin-top:12px;display:none;"></div>
  </div>
</div>
 <script>
// ========================================
// Auth system / 认证系统
// ========================================
async function checkAuth() {
  try {
    const resp = await fetch('/auth/status');
    const data = await resp.json();
    if (data.setup_needed) {
      document.getElementById('auth-subtitle').textContent = '首次设置';
      document.getElementById('auth-setup-form').style.display = 'block';
    } else if (data.authenticated) {
      document.getElementById('auth-overlay').style.display = 'none';
    } else {
      document.getElementById('auth-subtitle').textContent = '请输入访问密码';
      document.getElementById('auth-login-form').style.display = 'block';
    }
  } catch {
    document.getElementById('auth-overlay').style.display = 'none';
  }
}

function showAuthError(msg) {
  const el = document.getElementById('auth-error');
  el.textContent = msg;
  el.style.display = 'block';
}

async function doSetup() {
  const p1 = document.getElementById('auth-setup-pwd').value;
  const p2 = document.getElementById('auth-setup-pwd2').value;
  if (p1.length < 6) return showAuthError('密码至少6位');
  if (p1 !== p2) return showAuthError('两次密码不一致');
  const resp = await fetch('/auth/setup', { method: 'POST', headers: {'Content-Type':'application/json'}, body: JSON.stringify({password: p1}) });
  if (resp.ok) {
    document.getElementById('auth-overlay').style.display = 'none';
  } else {
    const d = await resp.json();
    showAuthError(d.detail || '设置失败');
  }
}

async function doLogin() {
  const pwd = document.getElementById('auth-login-pwd').value;
  const resp = await fetch('/auth/login', { method: 'POST', headers: {'Content-Type':'application/json'}, body: JSON.stringify({password: pwd}) });
  if (resp.ok) {
    document.getElementById('auth-overlay').style.display = 'none';
  } else {
    const d = await resp.json();
    showAuthError(d.detail || '密码错误');
  }
}
async function doLogout() {
  await fetch('/auth/logout', { method: 'POST' });
  document.getElementById('auth-setup-form').style.display = 'none';
  document.getElementById('auth-login-form').style.display = 'block';
  document.getElementById('auth-subtitle').textContent = '请输入访问密码';
  document.getElementById('auth-error').style.display = 'none';
  document.getElementById('auth-overlay').style.display = 'flex';
}
async function changePassword() {
  const currentPwd = document.getElementById('settings-current-pwd').value;
  const newPwd = document.getElementById('settings-new-pwd').value;
  const newPwd2 = document.getElementById('settings-new-pwd2').value;
  const msgEl = document.getElementById('settings-pwd-msg');
  if (newPwd.length < 6) { msgEl.style.color = 'var(--negative)'; msgEl.textContent = '新密码至少6位'; return; }
  if (newPwd !== newPwd2) { msgEl.style.color = 'var(--negative)'; msgEl.textContent = '两次密码不一致'; return; }
  const resp = await authFetch('/auth/change-password', { method: 'POST', headers: {'Content-Type':'application/json'}, body: JSON.stringify({current: currentPwd, new: newPwd}) });
  if (!resp) return;
  if (resp.ok) {
    msgEl.style.color = 'var(--accent)'; msgEl.textContent = '密码修改成功';
    document.getElementById('settings-current-pwd').value = '';
    document.getElementById('settings-new-pwd').value = '';
    document.getElementById('settings-new-pwd2').value = '';
  } else {
    const d = await resp.json();
    msgEl.style.color = 'var(--negative)'; msgEl.textContent = d.detail || '修改失败';
  }
}

async function loadSettingsStatus() {
  const el = document.getElementById('settings-status');
  try {
    const resp = await authFetch('/api/status');
    if (!resp) return;
    const d = await resp.json();
    const noticeEl = document.getElementById('settings-env-notice');
    if (d.using_env_password) noticeEl.style.display = 'block';
    else noticeEl.style.display = 'none';
    el.innerHTML = `
      <b>版本</b>:${d.version}<br>
      <b>Bucket 总数</b>:${(d.buckets?.total ?? 0)} (永久:${d.buckets?.permanent ?? 0} / 动态:${d.buckets?.dynamic ?? 0} / 归档:${d.buckets?.archive ?? 0})<br>
      <b>衰减引擎</b>:${d.decay_engine}<br>
      <b>向量搜索</b>:${d.embedding_enabled ? '已启用' : '未启用'}<br>
    `;
  } catch(e) {
    el.textContent = '加载失败: ' + e;
  }
  // Also refresh the host-vault input whenever the settings tab is loaded.
  loadHostVault();
}

async function loadHostVault() {
  const input = document.getElementById('settings-host-vault');
  const msg = document.getElementById('settings-host-vault-msg');
  if (!input) return;
  msg.textContent = '';
  msg.style.color = 'var(--text-dim)';
  try {
    const resp = await authFetch('/api/host-vault');
    if (!resp) return;
    const d = await resp.json();
    input.value = d.value || '';
    if (d.source === 'env') {
      msg.textContent = '当前由进程环境变量提供(修改 .env 不会立即覆盖)';
      msg.style.color = 'var(--warning)';
    } else if (d.source === 'file') {
      msg.textContent = '当前来自 ' + (d.env_file || '.env');
    } else {
      msg.textContent = '尚未设置(默认使用 ./buckets)';
    }
  } catch(e) {
    msg.style.color = 'var(--negative)';
    msg.textContent = '加载失败: ' + e;
  }
}

async function saveHostVault() {
  const input = document.getElementById('settings-host-vault');
  const msg = document.getElementById('settings-host-vault-msg');
  if (!input) return;
  const value = input.value.trim();
  msg.textContent = '保存中…';
  msg.style.color = 'var(--text-dim)';
  try {
    const resp = await authFetch('/api/host-vault', {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify({value})
    });
    if (!resp) return;
    const d = await resp.json();
    if (resp.ok) {
      msg.style.color = 'var(--accent)';
      msg.textContent = '已保存 → ' + (d.env_file || '.env') + '(需重启容器生效)';
    } else {
      msg.style.color = 'var(--negative)';
      msg.textContent = d.error || '保存失败';
    }
  } catch(e) {
    msg.style.color = 'var(--negative)';
    msg.textContent = '保存失败: ' + e;
  }
}

// authFetch: wraps fetch, shows auth overlay on 401
async function authFetch(url, options) {
  const resp = await fetch(url, options);
  if (resp.status === 401) {
    doLogout();
    return null;
  }
  return resp;
}
// ========================================
 const BASE = location.origin;
 let allBuckets = [];
 let currentFilter = 'all';
@@ -793,9 +1047,11 @@ document.querySelectorAll('.tab').forEach(tab => {
     document.getElementById('network-view').style.display = target === 'network' ? '' : 'none';
     document.getElementById('config-view').style.display = target === 'config' ? '' : 'none';
     document.getElementById('import-view').style.display = target === 'import' ? '' : 'none';
+    document.getElementById('settings-view').style.display = target === 'settings' ? '' : 'none';
     if (target === 'network') loadNetwork();
     if (target === 'config') loadConfig();
     if (target === 'import') { pollImportStatus(); loadImportResults(); }
+    if (target === 'settings') loadSettingsStatus();
   });
 });
@@ -812,7 +1068,11 @@ document.getElementById('search-input').addEventListener('input', (e) => {
 async function loadBuckets() {
   try {
     const res = await fetch(BASE + '/api/buckets');
-    allBuckets = await res.json();
+    const data = await res.json();
+    if (!res.ok || !Array.isArray(data)) {
+      throw new Error((data && data.error) ? data.error : `HTTP ${res.status}`);
+    }
+    allBuckets = data;
     updateStats();
     buildFilters();
     renderBuckets(allBuckets);
@@ -1237,7 +1497,7 @@ async function saveConfig(persist) {
   }
 }

-loadBuckets();
+checkAuth().then(() => loadBuckets());

 // --- Import functions ---
 const uploadZone = document.getElementById('import-upload-zone');
@@ -1300,6 +1560,7 @@ function updateImportUI(s) {
   document.getElementById('import-status-text').textContent = statusMap[s.status] || s.status;
   document.getElementById('import-pause-btn').style.display = s.status === 'running' ? '' : 'none';
   if (s.status !== 'running') clearInterval(importPollTimer);
+  if (s.status === 'completed') loadImportResults();
   const errDiv = document.getElementById('import-errors');
   if (s.errors && s.errors.length) {
     errDiv.style.display = '';

View File

@@ -112,7 +112,7 @@ class DecayEngine:
             return 50.0

         importance = max(1, min(10, int(metadata.get("importance", 5))))
-        activation_count = max(1, int(metadata.get("activation_count", 1)))
+        activation_count = max(1.0, float(metadata.get("activation_count", 1)))

         # --- Days since last activation ---
         last_active_str = metadata.get("last_active", metadata.get("created", ""))

@@ -215,6 +215,7 @@
             if imp <= 4 and days_since > 30:
                 try:
                     await self.bucket_mgr.update(bucket["id"], resolved=True)
+                    meta["resolved"] = True  # refresh local meta so resolved_factor applies this cycle
                     auto_resolved += 1
                     logger.info(
                         f"Auto-resolved / 自动结案: "
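The `int` → `float` switch in the first hunk matters because frontmatter round-trips can hand back an int, a float, or a numeric string; `float()` accepts all three, while `int("3.0")` would raise. A small check:

```python
# Frontmatter round-trips may yield int, float, or numeric string values;
# float() accepts all three, while int("3.0") raises ValueError.
for raw in (0, 3, 3.0, "3.0"):
    clamped = max(1.0, float(raw))
    print(raw, "->", clamped)
```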

View File

@@ -152,10 +152,13 @@ class Dehydrator:
     """
     Data dehydrator + content analyzer.
     Three capabilities: dehydration / merge / auto-tagging (domain + emotion).
-    Prefers API (better quality); auto-degrades to local (guaranteed availability).
+    API-only: every public method requires a working LLM API.
+    If the API is unavailable, methods raise RuntimeError so callers can
+    surface the failure to the user instead of silently producing low-quality results.

     数据脱水器 + 内容分析器。
     三大能力:脱水压缩 / 新旧合并 / 自动打标。
-    优先走 API;API 挂了自动降级到本地
+    走 API;API 不可用时直接抛出 RuntimeError,调用方明确感知
+    (根据 BEHAVIOR_SPEC.md 三、降级行为表决策:无本地降级)
     """

     def __init__(self, config: dict):

@@ -235,8 +238,8 @@
     # ---------------------------------------------------------
     # Dehydrate: compress raw content into concise summary
     # 脱水:将原始内容压缩为精简摘要
-    # Try API first, fallback to local
-    # 先尝试 API,失败则回退本地
+    # API only (no local fallback)
+    # 仅通过 API 脱水(无本地回退)
     # ---------------------------------------------------------
     async def dehydrate(self, content: str, metadata: dict = None) -> str:
         """


@@ -19,7 +19,20 @@ services:
       - OMBRE_API_KEY=${OMBRE_API_KEY}
       - OMBRE_TRANSPORT=streamable-http
       - OMBRE_BUCKETS_DIR=/data
+      # --- Model override (optional) ---
+      # If you use Gemini instead of DeepSeek, set these in your .env:
+      # 如使用 Gemini 而非 DeepSeek,在 .env 里加:
+      #   OMBRE_DEHYDRATION_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai/
+      #   OMBRE_DEHYDRATION_MODEL=gemini-2.5-flash-lite
+      #   OMBRE_EMBEDDING_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai/
+      - OMBRE_DEHYDRATION_BASE_URL=${OMBRE_DEHYDRATION_BASE_URL:-}
+      - OMBRE_DEHYDRATION_MODEL=${OMBRE_DEHYDRATION_MODEL:-}
+      - OMBRE_EMBEDDING_BASE_URL=${OMBRE_EMBEDDING_BASE_URL:-}
+      - OMBRE_EMBEDDING_MODEL=${OMBRE_EMBEDDING_MODEL:-}
     volumes:
       # 改成你的 Obsidian Vault 路径,或保持 ./buckets 用本地目录
       # Change to your Obsidian Vault path, or keep ./buckets for local storage
       - ./buckets:/data
+      # (Optional) Mount custom config to override model / API settings:
+      # (可选)挂载自定义配置,覆盖模型和 API 设置:
+      # - ./config.yaml:/app/config.yaml
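Because compose passes `${VAR:-}`, an unset override reaches the container as an empty string rather than an absent variable. A sketch of how container-side code can normalize that (helper name is ours, an assumption about how the server consumes these):

```python
OVERRIDE_KEYS = (
    "OMBRE_DEHYDRATION_BASE_URL",
    "OMBRE_DEHYDRATION_MODEL",
    "OMBRE_EMBEDDING_BASE_URL",
    "OMBRE_EMBEDDING_MODEL",
)


def model_overrides(env: dict) -> dict:
    # Treat empty/whitespace values as "not set", since `${VAR:-}`
    # always injects the variable, possibly empty.
    return {k: v for k in OVERRIDE_KEYS if (v := env.get(k, "").strip())}


overrides = model_overrides({
    "OMBRE_DEHYDRATION_MODEL": "gemini-2.5-flash-lite",
    "OMBRE_EMBEDDING_BASE_URL": "",   # set but empty → ignored
})
```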


@@ -21,11 +21,17 @@ services:
       - OMBRE_TRANSPORT=streamable-http  # Claude.ai requires streamable-http
       - OMBRE_BUCKETS_DIR=/data          # Container-internal bucket path / 容器内路径
     volumes:
-      # Mount your Obsidian vault (or any host directory) for persistent storage
-      # 挂载你的 Obsidian 仓库(或任意宿主机目录)做持久化存储
-      # Example / 示例:
-      # - /path/to/your/Obsidian Vault/Ombre Brain:/data
-      - /Users/p0lar1s/Library/Mobile Documents/iCloud~md~obsidian/Documents/Obsidian Vault/Ombre Brain:/data
+      # Mount your Obsidian vault (or any host directory) for persistent storage.
+      # Set OMBRE_HOST_VAULT_DIR in your .env (or in the Dashboard "Storage" panel)
+      # to point at the host folder you want mounted into the container at /data.
+      # 挂载你的 Obsidian 仓库(或任意宿主机目录)做持久化存储。
+      # 在 .env(或 Dashboard 的「存储」面板)中设置 OMBRE_HOST_VAULT_DIR,
+      # 指向你希望挂载到容器 /data 的宿主机目录。
+      #
+      # Examples / 示例:
+      #   OMBRE_HOST_VAULT_DIR=/path/to/your/Obsidian Vault/Ombre Brain
+      #   OMBRE_HOST_VAULT_DIR=~/Library/Mobile Documents/iCloud~md~obsidian/Documents/Obsidian Vault/Ombre Brain
+      - ${OMBRE_HOST_VAULT_DIR:-./buckets}:/data
       - ./config.yaml:/app/config.yaml
       # Cloudflare Tunnel (optional) — expose to public internet


@@ -16,8 +16,6 @@ import json
 import math
 import sqlite3
 import logging
-import asyncio
-from pathlib import Path

 from openai import AsyncOpenAI
@@ -34,8 +32,12 @@ class EmbeddingEngine:
         dehy_cfg = config.get("dehydration", {})
         embed_cfg = config.get("embedding", {})
-        self.api_key = dehy_cfg.get("api_key", "")
-        self.base_url = dehy_cfg.get("base_url", "https://generativelanguage.googleapis.com/v1beta/openai/")
+        self.api_key = (embed_cfg.get("api_key") or dehy_cfg.get("api_key") or "").strip()
+        self.base_url = (
+            (embed_cfg.get("base_url") or "").strip()
+            or (dehy_cfg.get("base_url") or "").strip()
+            or "https://generativelanguage.googleapis.com/v1beta/openai/"
+        )
         self.model = embed_cfg.get("model", "gemini-embedding-001")
         self.enabled = bool(self.api_key) and embed_cfg.get("enabled", True)
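The fallback chain is easiest to see in isolation. A sketch matching the new precedence (function name is ours; logic mirrors the patched `__init__`):

```python
def resolve_embedding_creds(config: dict) -> tuple[str, str]:
    # Precedence: embedding section first, dehydration section second,
    # then the built-in default base URL.
    dehy = config.get("dehydration", {})
    emb = config.get("embedding", {})
    api_key = (emb.get("api_key") or dehy.get("api_key") or "").strip()
    base_url = (
        (emb.get("base_url") or "").strip()
        or (dehy.get("base_url") or "").strip()
        or "https://generativelanguage.googleapis.com/v1beta/openai/"
    )
    return api_key, base_url


# embedding-section key wins; base_url falls through to the dehydration section
key, url = resolve_embedding_creds({
    "dehydration": {"api_key": "sk-dehy", "base_url": "https://api.deepseek.com/v1"},
    "embedding": {"api_key": "sk-embed"},
})
```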


@@ -19,10 +19,8 @@ import os
 import json
 import hashlib
 import logging
-import asyncio
 from datetime import datetime
 from pathlib import Path
-from typing import Optional

 from utils import count_tokens_approx, now_iso
@@ -39,6 +37,8 @@ def _parse_claude_json(data: dict | list) -> list[dict]:
     turns = []
     conversations = data if isinstance(data, list) else [data]
     for conv in conversations:
+        if not isinstance(conv, dict):
+            continue
         messages = conv.get("chat_messages", conv.get("messages", []))
         for msg in messages:
             if not isinstance(msg, dict):
@@ -61,18 +61,27 @@ def _parse_chatgpt_json(data: list | dict) -> list[dict]:
     turns = []
     conversations = data if isinstance(data, list) else [data]
     for conv in conversations:
+        if not isinstance(conv, dict):
+            continue
         mapping = conv.get("mapping", {})
         if mapping:
             # ChatGPT uses a tree structure with mapping
-            sorted_nodes = sorted(
-                mapping.values(),
-                key=lambda n: n.get("message", {}).get("create_time", 0) or 0,
-            )
+            # Filter out None nodes before sorting
+            valid_nodes = [n for n in mapping.values() if isinstance(n, dict)]
+
+            def _node_ts(n):
+                msg = n.get("message")
+                if not isinstance(msg, dict):
+                    return 0
+                return msg.get("create_time") or 0
+
+            sorted_nodes = sorted(valid_nodes, key=_node_ts)
             for node in sorted_nodes:
                 msg = node.get("message")
                 if not msg or not isinstance(msg, dict):
                     continue
-                content_parts = msg.get("content", {}).get("parts", [])
+                content_obj = msg.get("content", {})
+                content_parts = content_obj.get("parts", []) if isinstance(content_obj, dict) else []
                 content = " ".join(str(p) for p in content_parts if p)
                 if not content.strip():
                     continue
@@ -168,7 +177,7 @@ def detect_and_parse(raw_content: str, filename: str = "") -> list[dict]:
         # Single conversation object with role/content messages
         if "role" in sample and "content" in sample:
             return _parse_claude_json(data)
-    except (json.JSONDecodeError, KeyError, IndexError):
+    except (json.JSONDecodeError, KeyError, IndexError, AttributeError, TypeError):
        pass
     # Fall back to markdown/text


@@ -12,7 +12,25 @@ import os
 import re
 import shutil

-VAULT_DIR = os.path.expanduser("~/Documents/Obsidian Vault/Ombre Brain")
+
+def _resolve_vault_dir() -> str:
+    """
+    Resolve the bucket vault root.
+    Priority: $OMBRE_BUCKETS_DIR > config.yaml > built-in ./buckets.
+    """
+    env_dir = os.environ.get("OMBRE_BUCKETS_DIR", "").strip()
+    if env_dir:
+        return os.path.expanduser(env_dir)
+    try:
+        from utils import load_config
+        return load_config()["buckets_dir"]
+    except Exception:
+        return os.path.join(
+            os.path.dirname(os.path.abspath(__file__)), "buckets"
+        )
+
+
+VAULT_DIR = _resolve_vault_dir()
 DYNAMIC_DIR = os.path.join(VAULT_DIR, "dynamic")
@@ -99,7 +117,7 @@ def migrate():
             print(f"(unknown)")
         print(f"{primary_domain}/{new_filename}")
-    print(f"\n迁移完成。")
+    print("\n迁移完成。")
     # 展示新结构
     print("\n=== 新目录结构 ===")


@@ -38,7 +38,11 @@ ANALYZE_PROMPT = (
     '}'
 )

-DATA_DIR = "/data/dynamic"
+DATA_DIR = os.path.join(
+    os.environ.get("OMBRE_BUCKETS_DIR", "").strip()
+    or (lambda: __import__("utils").load_config()["buckets_dir"])(),
+    "dynamic",
+)
 UNCLASS_DIR = os.path.join(DATA_DIR, "未分类")
@@ -48,11 +52,15 @@ def sanitize(name):

 async def reclassify():
+    from utils import load_config
+    cfg = load_config()
+    dehy = cfg.get("dehydration", {})
     client = AsyncOpenAI(
-        api_key=os.environ.get("OMBRE_API_KEY", ""),
-        base_url="https://api.siliconflow.cn/v1",
+        api_key=os.environ.get("OMBRE_API_KEY", "") or dehy.get("api_key", ""),
+        base_url=dehy.get("base_url", "https://api.deepseek.com/v1"),
         timeout=60.0,
     )
+    model_name = dehy.get("model", "deepseek-chat")
     files = sorted(glob.glob(os.path.join(UNCLASS_DIR, "*.md")))
     print(f"找到 {len(files)} 个未分类文件\n")
@@ -66,7 +74,7 @@ async def reclassify():
         try:
             resp = await client.chat.completions.create(
-                model="deepseek-ai/DeepSeek-V3",
+                model=model_name,
                 messages=[
                     {"role": "system", "content": ANALYZE_PROMPT},
                     {"role": "user", "content": full_text[:2000]},


@@ -8,7 +8,25 @@ import os
 import re
 import shutil

-VAULT_DIR = os.path.expanduser("~/Documents/Obsidian Vault/Ombre Brain")
+
+def _resolve_vault_dir() -> str:
+    """
+    Resolve the bucket vault root.
+    Priority: $OMBRE_BUCKETS_DIR > config.yaml > built-in ./buckets.
+    """
+    env_dir = os.environ.get("OMBRE_BUCKETS_DIR", "").strip()
+    if env_dir:
+        return os.path.expanduser(env_dir)
+    try:
+        from utils import load_config
+        return load_config()["buckets_dir"]
+    except Exception:
+        return os.path.join(
+            os.path.dirname(os.path.abspath(__file__)), "buckets"
+        )
+
+
+VAULT_DIR = _resolve_vault_dir()
 DYNAMIC_DIR = os.path.join(VAULT_DIR, "dynamic")

 # 新域关键词表(和 dehydrator.py 的 _local_analyze 一致)
@@ -147,7 +165,6 @@ def reclassify():
         new_domains = classify(body, old_domains)
         primary = sanitize_name(new_domains[0])
-        old_primary = sanitize_name(old_domains[0]) if old_domains else "未分类"

         if name and name != bucket_id:
             new_filename = f"{sanitize_name(name)}_{bucket_id}.md"
@@ -179,7 +196,7 @@ def reclassify():
                 os.rmdir(dp)
                 print(f"\n  🗑 删除空目录: {d}/")
-    print(f"\n重分类完成。\n")
+    print("\n重分类完成。\n")
     # 展示新结构
     print("=== 新目录结构 ===")

server.py

@@ -10,18 +10,20 @@
 # 核心职责:
 # - Initialize config, bucket manager, dehydrator, decay engine
 #   初始化配置、记忆桶管理器、脱水器、衰减引擎
-# - Expose 5 MCP tools:
-#   暴露 5 个 MCP 工具:
+# - Expose 6 MCP tools:
+#   暴露 6 个 MCP 工具:
 #   breath — Surface unresolved memories or search by keyword
 #            浮现未解决记忆 或 按关键词检索
-#   hold   — Store a single memory
-#            存储单条记忆
+#   hold   — Store a single memory (or write a `feel` reflection)
+#            存储单条记忆(或写 feel 反思)
 #   grow   — Diary digest, auto-split into multiple buckets
 #            日记归档,自动拆分多桶
 #   trace  — Modify metadata / resolved / delete
 #            修改元数据 / resolved 标记 / 删除
 #   pulse  — System status + bucket listing
 #            系统状态 + 所有桶列表
+#   dream  — Surface recent dynamic buckets for self-digestion
+#            返回最近桶,供模型自省/写 feel
 #
 # Startup:
 # 启动方式:
@@ -35,6 +37,11 @@ import sys
 import random
 import logging
 import asyncio
+import hashlib
+import hmac
+import secrets
+import time
+import json as _json_lib

 import httpx
@@ -56,11 +63,44 @@ config = load_config()
 setup_logging(config.get("log_level", "INFO"))
 logger = logging.getLogger("ombre_brain")
# --- Runtime env vars (port + webhook) / 运行时环境变量 ---
# OMBRE_PORT: HTTP/SSE 监听端口,默认 8000
try:
OMBRE_PORT = int(os.environ.get("OMBRE_PORT", "8000") or "8000")
except ValueError:
logger.warning("OMBRE_PORT 不是合法整数,回退到 8000")
OMBRE_PORT = 8000
# OMBRE_HOOK_URL: 在 breath/dream 被调用后推送事件到该 URLPOST JSON
# OMBRE_HOOK_SKIP: 设为 true/1/yes 跳过推送。
# 详见 ENV_VARS.md。
OMBRE_HOOK_URL = os.environ.get("OMBRE_HOOK_URL", "").strip()
OMBRE_HOOK_SKIP = os.environ.get("OMBRE_HOOK_SKIP", "").strip().lower() in ("1", "true", "yes", "on")
async def _fire_webhook(event: str, payload: dict) -> None:
"""
Fire-and-forget POST to OMBRE_HOOK_URL with the given event payload.
Failures are logged at WARNING level only — never propagated to the caller.
"""
if OMBRE_HOOK_SKIP or not OMBRE_HOOK_URL:
return
try:
body = {
"event": event,
"timestamp": time.time(),
"payload": payload,
}
async with httpx.AsyncClient(timeout=5.0) as client:
await client.post(OMBRE_HOOK_URL, json=body)
except Exception as e:
        logger.warning(f"Webhook push failed ({event} → {OMBRE_HOOK_URL}): {e}")
 # --- Initialize core components / 初始化核心组件 ---
-bucket_mgr = BucketManager(config)  # Bucket manager / 记忆桶管理器
-embedding_engine = EmbeddingEngine(config)  # Embedding engine / 向量化引擎
+embedding_engine = EmbeddingEngine(config)  # Embedding engine first (BucketManager depends on it)
+bucket_mgr = BucketManager(config, embedding_engine=embedding_engine)  # Bucket manager / 记忆桶管理器
 dehydrator = Dehydrator(config)  # Dehydrator / 脱水器
 decay_engine = DecayEngine(config, bucket_mgr)  # Decay engine / 衰减引擎
 import_engine = ImportEngine(config, bucket_mgr, dehydrator, embedding_engine)  # Import engine / 导入引擎

 # --- Create MCP server instance / 创建 MCP 服务器实例 ---
@@ -69,16 +109,199 @@ import_engine = ImportEngine(config, bucket_mgr, dehydrator, embedding_engine)
 mcp = FastMCP(
     "Ombre Brain",
     host="0.0.0.0",
-    port=8000,
+    port=OMBRE_PORT,
 )
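A receiver only needs to know the push envelope `_fire_webhook` sends. A minimal sketch of that body (helper name is ours; the shape matches the documented `{event, timestamp, payload:{...}}`):

```python
import json
import time


def build_hook_body(event: str, payload: dict) -> dict:
    # The envelope _fire_webhook POSTs to OMBRE_HOOK_URL;
    # httpx's json= kwarg handles the actual serialization.
    return {"event": event, "timestamp": time.time(), "payload": payload}


body = build_hook_body("breath_hook", {"surfaced": 3, "chars": 1200})
wire = json.dumps(body)  # what goes over the wire
```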
# =============================================================
# Dashboard Auth — simple cookie-based session auth
# Dashboard 认证 —— 基于 Cookie 的会话认证
#
# Env var OMBRE_DASHBOARD_PASSWORD overrides file-stored password.
# First visit with no password set → forced setup wizard.
# Sessions stored in memory (lost on restart, 7-day expiry).
# =============================================================
_sessions: dict[str, float] = {} # {token: expiry_timestamp}
def _get_auth_file() -> str:
return os.path.join(config["buckets_dir"], ".dashboard_auth.json")
def _load_password_hash() -> str | None:
try:
auth_file = _get_auth_file()
if os.path.exists(auth_file):
with open(auth_file, "r", encoding="utf-8") as f:
return _json_lib.load(f).get("password_hash")
except Exception:
pass
return None
def _save_password_hash(password: str) -> None:
salt = secrets.token_hex(16)
h = hashlib.sha256(f"{salt}:{password}".encode()).hexdigest()
auth_file = _get_auth_file()
os.makedirs(os.path.dirname(auth_file), exist_ok=True)
with open(auth_file, "w", encoding="utf-8") as f:
_json_lib.dump({"password_hash": f"{salt}:{h}"}, f)
def _verify_password_hash(password: str, stored: str) -> bool:
if ":" not in stored:
return False
salt, h = stored.split(":", 1)
return hmac.compare_digest(
h, hashlib.sha256(f"{salt}:{password}".encode()).hexdigest()
)
def _is_setup_needed() -> bool:
"""True if no password is configured (env var or file)."""
if os.environ.get("OMBRE_DASHBOARD_PASSWORD", ""):
return False
return _load_password_hash() is None
def _verify_any_password(password: str) -> bool:
"""Check password against env var (first) or stored hash."""
env_pwd = os.environ.get("OMBRE_DASHBOARD_PASSWORD", "")
if env_pwd:
return hmac.compare_digest(password, env_pwd)
stored = _load_password_hash()
if not stored:
return False
return _verify_password_hash(password, stored)
def _create_session() -> str:
token = secrets.token_urlsafe(32)
_sessions[token] = time.time() + 86400 * 7 # 7-day expiry
return token
def _is_authenticated(request) -> bool:
token = request.cookies.get("ombre_session")
if not token:
return False
expiry = _sessions.get(token)
if expiry is None or time.time() > expiry:
_sessions.pop(token, None)
return False
return True
def _require_auth(request):
"""Return JSONResponse(401) if not authenticated, else None."""
from starlette.responses import JSONResponse
if not _is_authenticated(request):
return JSONResponse(
{"error": "Unauthorized", "setup_needed": _is_setup_needed()},
status_code=401,
)
return None
# --- Auth endpoints ---
@mcp.custom_route("/auth/status", methods=["GET"])
async def auth_status(request):
"""Return auth state (authenticated, setup_needed)."""
from starlette.responses import JSONResponse
return JSONResponse({
"authenticated": _is_authenticated(request),
"setup_needed": _is_setup_needed(),
})
@mcp.custom_route("/auth/setup", methods=["POST"])
async def auth_setup_endpoint(request):
"""Initial password setup (only when no password is configured)."""
from starlette.responses import JSONResponse
if not _is_setup_needed():
return JSONResponse({"error": "Already configured"}, status_code=400)
try:
body = await request.json()
except Exception:
return JSONResponse({"error": "Invalid JSON"}, status_code=400)
password = body.get("password", "").strip()
if len(password) < 6:
return JSONResponse({"error": "密码不能少于6位"}, status_code=400)
_save_password_hash(password)
token = _create_session()
resp = JSONResponse({"ok": True})
resp.set_cookie("ombre_session", token, httponly=True, samesite="lax", max_age=86400 * 7)
return resp
@mcp.custom_route("/auth/login", methods=["POST"])
async def auth_login(request):
"""Login with password."""
from starlette.responses import JSONResponse
try:
body = await request.json()
except Exception:
return JSONResponse({"error": "Invalid JSON"}, status_code=400)
password = body.get("password", "")
if _verify_any_password(password):
token = _create_session()
resp = JSONResponse({"ok": True})
resp.set_cookie("ombre_session", token, httponly=True, samesite="lax", max_age=86400 * 7)
return resp
return JSONResponse({"error": "密码错误"}, status_code=401)
@mcp.custom_route("/auth/logout", methods=["POST"])
async def auth_logout(request):
"""Invalidate session."""
from starlette.responses import JSONResponse
token = request.cookies.get("ombre_session")
if token:
_sessions.pop(token, None)
resp = JSONResponse({"ok": True})
resp.delete_cookie("ombre_session")
return resp
@mcp.custom_route("/auth/change-password", methods=["POST"])
async def auth_change_password(request):
"""Change dashboard password (requires current password)."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err:
return err
if os.environ.get("OMBRE_DASHBOARD_PASSWORD", ""):
return JSONResponse({"error": "当前使用环境变量密码,请直接修改 OMBRE_DASHBOARD_PASSWORD"}, status_code=400)
try:
body = await request.json()
except Exception:
return JSONResponse({"error": "Invalid JSON"}, status_code=400)
current = body.get("current", "")
new_pwd = body.get("new", "").strip()
if not _verify_any_password(current):
return JSONResponse({"error": "当前密码错误"}, status_code=401)
if len(new_pwd) < 6:
return JSONResponse({"error": "新密码不能少于6位"}, status_code=400)
_save_password_hash(new_pwd)
_sessions.clear()
token = _create_session()
resp = JSONResponse({"ok": True})
resp.set_cookie("ombre_session", token, httponly=True, samesite="lax", max_age=86400 * 7)
return resp
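The stored-hash format above round-trips like this (function names are ours; the `salt:sha256(salt:password)` format matches what the dashboard writes to `.dashboard_auth.json`):

```python
import hashlib
import hmac
import secrets


def make_stored_hash(password: str) -> str:
    # Random per-password salt, stored alongside the digest.
    salt = secrets.token_hex(16)
    digest = hashlib.sha256(f"{salt}:{password}".encode()).hexdigest()
    return f"{salt}:{digest}"


def verify(password: str, stored: str) -> bool:
    if ":" not in stored:
        return False
    salt, h = stored.split(":", 1)
    # compare_digest prevents timing side-channels on the comparison.
    return hmac.compare_digest(
        h, hashlib.sha256(f"{salt}:{password}".encode()).hexdigest()
    )


stored = make_stored_hash("hunter2!")
```

Plain salted SHA-256 is fast by design; for a password store exposed beyond a single-user dashboard, a deliberately slow KDF (e.g. `hashlib.scrypt`) would be the stronger choice.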
 # =============================================================
 # /health endpoint: lightweight keepalive
 # 轻量保活接口
 # For Cloudflare Tunnel or reverse proxy to ping, preventing idle timeout
 # 供 Cloudflare Tunnel 或反代定期 ping,防止空闲超时断连
 # =============================================================
+@mcp.custom_route("/", methods=["GET"])
+async def root_redirect(request):
+    from starlette.responses import RedirectResponse
+    return RedirectResponse(url="/dashboard")
+
+
 @mcp.custom_route("/health", methods=["GET"])
 async def health_check(request):
     from starlette.responses import JSONResponse
@@ -140,8 +363,11 @@ async def breath_hook(request):
             token_budget -= summary_tokens

         if not parts:
+            await _fire_webhook("breath_hook", {"surfaced": 0})
             return PlainTextResponse("")
-        return PlainTextResponse("[Ombre Brain - 记忆浮现]\n" + "\n---\n".join(parts))
+        body_text = "[Ombre Brain - 记忆浮现]\n" + "\n---\n".join(parts)
+        await _fire_webhook("breath_hook", {"surfaced": len(parts), "chars": len(body_text)})
+        return PlainTextResponse(body_text)
     except Exception as e:
         logger.warning(f"Breath hook failed: {e}")
         return PlainTextResponse("")
@@ -178,7 +404,9 @@ async def dream_hook(request):
                 f"{strip_wikilinks(b['content'][:200])}"
             )
-        return PlainTextResponse("[Ombre Brain - Dreaming]\n" + "\n---\n".join(parts))
+        body_text = "[Ombre Brain - Dreaming]\n" + "\n---\n".join(parts)
+        await _fire_webhook("dream_hook", {"surfaced": len(parts), "chars": len(body_text)})
+        return PlainTextResponse(body_text)
     except Exception as e:
         logger.warning(f"Dream hook failed: {e}")
         return PlainTextResponse("")
@@ -274,12 +502,47 @@ async def breath(
     valence: float = -1,
     arousal: float = -1,
     max_results: int = 20,
+    importance_min: int = -1,
 ) -> str:
-    """检索/浮现记忆。不传query或传空=自动浮现,有query=关键词检索。max_tokens控制返回总token上限(默认10000)。domain逗号分隔,valence/arousal 0~1(-1忽略)。max_results控制返回数量上限(默认20,最大50)。"""
+    """检索/浮现记忆。不传query或传空=自动浮现,有query=关键词检索。max_tokens控制返回总token上限(默认10000)。domain逗号分隔,valence/arousal 0~1(-1忽略)。max_results控制返回数量上限(默认20,最大50)。importance_min>=1时按重要度批量拉取(不走语义搜索,按importance降序返回最多20条)。"""
     await decay_engine.ensure_started()
     max_results = min(max_results, 50)
     max_tokens = min(max_tokens, 20000)
# --- importance_min mode: bulk fetch by importance threshold ---
# --- 重要度批量拉取模式:跳过语义搜索,按 importance 降序返回 ---
if importance_min >= 1:
try:
all_buckets = await bucket_mgr.list_all(include_archive=False)
except Exception as e:
return f"记忆系统暂时无法访问: {e}"
filtered = [
b for b in all_buckets
if int(b["metadata"].get("importance", 0)) >= importance_min
and b["metadata"].get("type") not in ("feel",)
]
filtered.sort(key=lambda b: int(b["metadata"].get("importance", 0)), reverse=True)
filtered = filtered[:20]
if not filtered:
return f"没有重要度 >= {importance_min} 的记忆。"
results = []
token_used = 0
for b in filtered:
if token_used >= max_tokens:
break
try:
clean_meta = {k: v for k, v in b["metadata"].items() if k != "tags"}
summary = await dehydrator.dehydrate(strip_wikilinks(b["content"]), clean_meta)
t = count_tokens_approx(summary)
if token_used + t > max_tokens:
break
imp = b["metadata"].get("importance", 0)
results.append(f"[importance:{imp}] [bucket_id:{b['id']}] {summary}")
token_used += t
except Exception as e:
logger.warning(f"importance_min dehydrate failed: {e}")
return "\n---\n".join(results) if results else "没有可以展示的记忆。"
     # --- No args or empty query: surfacing mode (weight pool active push) ---
     # --- 无参数或空query:浮现模式(权重池主动推送)---
     if not query or not query.strip():
@@ -330,6 +593,18 @@ async def breath(
         top_scores = [(b["metadata"].get("name", b["id"]), decay_engine.calculate_score(b["metadata"])) for b in scored[:5]]
         logger.info(f"Top unresolved scores: {top_scores}")
# --- Cold-start detection: never-seen important buckets surface first ---
# --- 冷启动检测:从未被访问过且重要度>=8的桶优先插入最前面最多2个---
cold_start = [
b for b in unresolved
if int(b["metadata"].get("activation_count", 0)) == 0
and int(b["metadata"].get("importance", 0)) >= 8
][:2]
cold_start_ids = {b["id"] for b in cold_start}
# Merge: cold_start first, then scored (excluding duplicates)
scored_deduped = [b for b in scored if b["id"] not in cold_start_ids]
scored_with_cold = cold_start + scored_deduped
     # --- Token-budgeted surfacing with diversity + hard cap ---
     # --- 按 token 预算浮现,带多样性 + 硬上限 ---
     # Top-1 always surfaces; rest sampled from top-20 for diversity
@@ -337,13 +612,17 @@ async def breath(
     for r in pinned_results:
         token_budget -= count_tokens_approx(r)

-    candidates = list(scored)
+    candidates = list(scored_with_cold)
     if len(candidates) > 1:
-        # Ensure highest-score bucket is first, shuffle rest from top-20
-        top1 = [candidates[0]]
-        pool = candidates[1:min(20, len(candidates))]
-        random.shuffle(pool)
-        candidates = top1 + pool + candidates[min(20, len(candidates)):]
+        # Cold-start buckets stay at front; shuffle rest from top-20
+        n_cold = len(cold_start)
+        non_cold = candidates[n_cold:]
+        if len(non_cold) > 1:
+            top1 = [non_cold[0]]
+            pool = non_cold[1:min(20, len(non_cold))]
+            random.shuffle(pool)
+            non_cold = top1 + pool + non_cold[min(20, len(non_cold)):]
+        candidates = cold_start + non_cold

     # Hard cap: never surface more than max_results buckets
     candidates = candidates[:max_results]
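The ordering invariants of this block (cold-start pinned first, best-scored next, the rest shuffled) can be exercised standalone (function name and sample buckets are ours):

```python
import random


def order_candidates(cold_start: list[dict], scored: list[dict]) -> list[dict]:
    # Cold-start buckets stay pinned at the front, the best-scored non-cold
    # bucket stays next, and the following up-to-19 are shuffled for diversity.
    cold_ids = {b["id"] for b in cold_start}
    non_cold = [b for b in scored if b["id"] not in cold_ids]
    if len(non_cold) > 1:
        top1 = [non_cold[0]]
        pool = non_cold[1:min(20, len(non_cold))]
        random.shuffle(pool)
        non_cold = top1 + pool + non_cold[min(20, len(non_cold)):]
    return cold_start + non_cold


cold = [{"id": "cold1"}]
scored = [{"id": s} for s in ("best", "b", "c", "d")]
ordered = order_candidates(cold, scored)
```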
@@ -485,9 +764,12 @@ async def breath(
             logger.warning(f"Random surfacing failed / 随机浮现失败: {e}")

     if not results:
+        await _fire_webhook("breath", {"mode": "empty", "matches": 0})
         return "未找到相关记忆。"
-    return "\n---\n".join(results)
+    final_text = "\n---\n".join(results)
+    await _fire_webhook("breath", {"mode": "ok", "matches": len(matches), "chars": len(final_text)})
+    return final_text
 # =============================================================
@@ -557,11 +839,16 @@ async def hold(
     }

     domain = analysis["domain"]
-    valence = analysis["valence"]
-    arousal = analysis["arousal"]
+    auto_valence = analysis["valence"]
+    auto_arousal = analysis["arousal"]
     auto_tags = analysis["tags"]
     suggested_name = analysis.get("suggested_name", "")

+    # --- User-supplied valence/arousal takes priority over analyze() result ---
+    # --- 用户显式传入的 valence/arousal 优先,analyze() 结果作为 fallback ---
+    final_valence = valence if 0 <= valence <= 1 else auto_valence
+    final_arousal = arousal if 0 <= arousal <= 1 else auto_arousal
+
     all_tags = list(dict.fromkeys(auto_tags + extra_tags))

     # --- Pinned buckets bypass merge and are created directly in permanent dir ---
@@ -572,8 +859,8 @@ async def hold(
             tags=all_tags,
             importance=10,
             domain=domain,
-            valence=valence,
-            arousal=arousal,
+            valence=final_valence,
+            arousal=final_arousal,
             name=suggested_name or None,
             bucket_type="permanent",
             pinned=True,
@@ -590,8 +877,8 @@ async def hold(
         tags=all_tags,
         importance=importance,
         domain=domain,
-        valence=valence,
-        arousal=arousal,
+        valence=final_valence,
+        arousal=final_arousal,
         name=suggested_name,
     )
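The `-1` defaults on `hold()`'s `valence`/`arousal` act as "not supplied" sentinels; any out-of-range value falls back to the `analyze()` result. A sketch of just that precedence (function name is ours):

```python
def final_emotion(user_valence: float, user_arousal: float, analysis: dict) -> tuple[float, float]:
    # Values inside [0, 1] are explicit user input and win;
    # the -1 sentinel (or anything out of range) defers to analyze().
    v = user_valence if 0 <= user_valence <= 1 else analysis["valence"]
    a = user_arousal if 0 <= user_arousal <= 1 else analysis["arousal"]
    return v, a


auto = {"valence": 0.7, "arousal": 0.4}
```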
@@ -967,7 +1254,9 @@ async def dream() -> str:
     except Exception as e:
         logger.warning(f"Dream crystallization hint failed: {e}")

-    return header + "\n---\n".join(parts) + connection_hint + crystal_hint
+    final_text = header + "\n---\n".join(parts) + connection_hint + crystal_hint
+    await _fire_webhook("dream", {"recent": len(recent), "chars": len(final_text)})
+    return final_text
 # =============================================================
@@ -978,6 +1267,8 @@ async def dream() -> str:
 async def api_buckets(request):
     """List all buckets with metadata (no content for efficiency)."""
     from starlette.responses import JSONResponse
+    err = _require_auth(request)
+    if err: return err
     try:
         all_buckets = await bucket_mgr.list_all(include_archive=True)
         result = []
@@ -1012,6 +1303,8 @@ async def api_buckets(request):
 async def api_bucket_detail(request):
     """Get full bucket content by ID."""
     from starlette.responses import JSONResponse
+    err = _require_auth(request)
+    if err: return err
     bucket_id = request.path_params["bucket_id"]
     bucket = await bucket_mgr.get(bucket_id)
     if not bucket:
@@ -1029,6 +1322,8 @@ async def api_bucket_detail(request):
 async def api_search(request):
     """Search buckets by query."""
     from starlette.responses import JSONResponse
+    err = _require_auth(request)
+    if err: return err
     query = request.query_params.get("q", "")
     if not query:
         return JSONResponse({"error": "missing q parameter"}, status_code=400)
@@ -1055,6 +1350,8 @@ async def api_search(request):
async def api_network(request):
"""Get embedding similarity network for visualization."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
try:
all_buckets = await bucket_mgr.list_all(include_archive=False)
nodes = []
@@ -1098,6 +1395,8 @@ async def api_network(request):
async def api_breath_debug(request):
"""Debug endpoint: simulate breath scoring and return per-bucket breakdown."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
query = request.query_params.get("q", "")
q_valence = request.query_params.get("valence")
q_arousal = request.query_params.get("arousal")
@@ -1189,6 +1488,8 @@ async def dashboard(request):
async def api_config_get(request):
"""Get current runtime config (safe fields only, API key masked)."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
dehy = config.get("dehydration", {})
emb = config.get("embedding", {})
api_key = dehy.get("api_key", "")
@@ -1216,6 +1517,8 @@ async def api_config_update(request):
"""Hot-update runtime config. Optionally persist to config.yaml.""" """Hot-update runtime config. Optionally persist to config.yaml."""
from starlette.responses import JSONResponse from starlette.responses import JSONResponse
import yaml import yaml
err = _require_auth(request)
if err: return err
try: try:
body = await request.json() body = await request.json()
except Exception: except Exception:
@@ -1297,6 +1600,122 @@ async def api_config_update(request):
return JSONResponse({"updated": updated, "ok": True}) return JSONResponse({"updated": updated, "ok": True})
# =============================================================
# /api/host-vault — read/write the host-side OMBRE_HOST_VAULT_DIR
# Used by the Dashboard to set the host-side memory-bucket directory mounted by docker-compose.
# Writes to the project-root .env file; a `docker compose down`/`up` cycle is required to take effect.
# =============================================================
def _project_env_path() -> str:
return os.path.join(os.path.dirname(os.path.abspath(__file__)), ".env")
def _read_env_var(name: str) -> str:
"""Return current value of `name` from process env first, then .env file (best-effort)."""
val = os.environ.get(name, "").strip()
if val:
return val
env_path = _project_env_path()
if not os.path.exists(env_path):
return ""
try:
with open(env_path, "r", encoding="utf-8") as f:
for line in f:
line = line.strip()
if not line or line.startswith("#") or "=" not in line:
continue
k, _, v = line.partition("=")
if k.strip() == name:
return v.strip().strip('"').strip("'")
except Exception:
pass
return ""
def _write_env_var(name: str, value: str) -> None:
"""
Idempotent upsert of `NAME=value` in project .env. Creates the file if missing.
Preserves other entries verbatim. Quotes values containing spaces.
"""
env_path = _project_env_path()
quoted = f'"{value}"' if value and (" " in value or "#" in value) else value
new_line = f"{name}={quoted}\n"
lines: list[str] = []
if os.path.exists(env_path):
with open(env_path, "r", encoding="utf-8") as f:
lines = f.readlines()
replaced = False
for i, raw in enumerate(lines):
stripped = raw.strip()
if not stripped or stripped.startswith("#") or "=" not in stripped:
continue
k, _, _v = stripped.partition("=")
if k.strip() == name:
lines[i] = new_line
replaced = True
break
if not replaced:
if lines and not lines[-1].endswith("\n"):
lines[-1] += "\n"
lines.append(new_line)
with open(env_path, "w", encoding="utf-8") as f:
f.writelines(lines)
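The two helpers above form a small `.env` round trip: `_write_env_var` upserts a single `NAME=value` line (quoting values with spaces) and `_read_env_var` strips the quotes back off. A standalone sketch of the same semantics, against a temp file rather than the project `.env` (function names here are illustrative, mirroring but not importing the helpers above):

```python
import os
import tempfile

def write_env_var(env_path: str, name: str, value: str) -> None:
    """Upsert NAME=value in a .env file, preserving unrelated lines verbatim."""
    quoted = f'"{value}"' if value and (" " in value or "#" in value) else value
    new_line = f"{name}={quoted}\n"
    lines: list[str] = []
    if os.path.exists(env_path):
        with open(env_path, "r", encoding="utf-8") as f:
            lines = f.readlines()
    for i, raw in enumerate(lines):
        stripped = raw.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            if stripped.partition("=")[0].strip() == name:
                lines[i] = new_line  # replace existing entry in place
                break
    else:
        lines.append(new_line)  # no match found: append
    with open(env_path, "w", encoding="utf-8") as f:
        f.writelines(lines)

def read_env_var(env_path: str, name: str) -> str:
    """Return the value of `name` from the .env file, or "" if absent."""
    if not os.path.exists(env_path):
        return ""
    with open(env_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            k, _, v = line.partition("=")
            if k.strip() == name:
                return v.strip().strip('"').strip("'")
    return ""

path = os.path.join(tempfile.mkdtemp(), ".env")
write_env_var(path, "OMBRE_HOST_VAULT_DIR", "/data/vault")
write_env_var(path, "OTHER", "1")
# Upsert: second write replaces the first entry; the space triggers quoting.
write_env_var(path, "OMBRE_HOST_VAULT_DIR", "/mnt/my vault")
```

The `for`/`else` keeps the operation idempotent: repeated writes of the same name never grow the file.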
@mcp.custom_route("/api/host-vault", methods=["GET"])
async def api_host_vault_get(request):
"""Read the current OMBRE_HOST_VAULT_DIR (process env > project .env)."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
value = _read_env_var("OMBRE_HOST_VAULT_DIR")
return JSONResponse({
"value": value,
"source": "env" if os.environ.get("OMBRE_HOST_VAULT_DIR", "").strip() else ("file" if value else ""),
"env_file": _project_env_path(),
})
@mcp.custom_route("/api/host-vault", methods=["POST"])
async def api_host_vault_set(request):
"""
Persist OMBRE_HOST_VAULT_DIR to the project .env file.
Body: {"value": "/path/to/vault"} (empty string clears the entry)
Note: container restart is required for docker-compose to pick up the new mount.
"""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
try:
body = await request.json()
except Exception:
return JSONResponse({"error": "invalid JSON"}, status_code=400)
raw = body.get("value", "")
if not isinstance(raw, str):
return JSONResponse({"error": "value must be a string"}, status_code=400)
value = raw.strip()
# Reject characters that would break .env / shell parsing
if "\n" in value or "\r" in value or '"' in value or "'" in value:
return JSONResponse({"error": "value must not contain quotes or newlines"}, status_code=400)
try:
_write_env_var("OMBRE_HOST_VAULT_DIR", value)
except Exception as e:
return JSONResponse({"error": f"failed to write .env: {e}"}, status_code=500)
return JSONResponse({
"ok": True,
"value": value,
"env_file": _project_env_path(),
"note": "已写入 .env需在宿主机执行 `docker compose down && docker compose up -d` 让新挂载生效。",
})
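The POST handler above validates the body in three steps: type check, trim, then a character blacklist that protects `.env`/shell parsing. Those checks can be captured as one predicate (hypothetical helper, mirroring the route's logic):

```python
def is_valid_vault_value(raw: object) -> tuple[bool, str]:
    """Mirror the /api/host-vault checks: string type, no quotes or newlines.
    Returns (True, trimmed_value) or (False, error_message)."""
    if not isinstance(raw, str):
        return False, "value must be a string"
    value = raw.strip()
    if any(ch in value for ch in ("\n", "\r", '"', "'")):
        return False, "value must not contain quotes or newlines"
    return True, value  # empty string is allowed: it clears the entry

assert is_valid_vault_value("  /data/vault  ") == (True, "/data/vault")
assert is_valid_vault_value('bad"path')[0] is False
```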
# =============================================================
# Import API — conversation history import
# 导入 API — 对话历史导入
@@ -1306,6 +1725,8 @@ async def api_config_update(request):
async def api_import_upload(request):
"""Upload a conversation file and start import."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
if import_engine.is_running:
return JSONResponse({"error": "Import already running"}, status_code=409)
@@ -1357,6 +1778,8 @@ async def api_import_upload(request):
async def api_import_status(request):
"""Get current import progress."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
return JSONResponse(import_engine.get_status())
@@ -1364,6 +1787,8 @@ async def api_import_status(request):
async def api_import_pause(request):
"""Pause the running import."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
if not import_engine.is_running:
return JSONResponse({"error": "No import running"}, status_code=400)
import_engine.pause()
@@ -1374,6 +1799,8 @@ async def api_import_pause(request):
async def api_import_patterns(request):
"""Detect high-frequency patterns after import."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
try:
patterns = await import_engine.detect_patterns()
return JSONResponse({"patterns": patterns})
@@ -1385,6 +1812,8 @@ async def api_import_patterns(request):
async def api_import_results(request):
"""List recently imported/created buckets for review."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
try:
limit = int(request.query_params.get("limit", "50"))
all_buckets = await bucket_mgr.list_all(include_archive=False)
@@ -1411,6 +1840,8 @@ async def api_import_results(request):
async def api_import_review(request):
"""Apply review decisions: mark buckets as important/noise/pinned."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
try:
body = await request.json()
except Exception:
@@ -1446,6 +1877,34 @@ async def api_import_review(request):
return JSONResponse({"applied": applied, "errors": errors}) return JSONResponse({"applied": applied, "errors": errors})
# =============================================================
# /api/status — system status for Dashboard settings tab
# /api/status — Dashboard 设置页用系统状态
# =============================================================
@mcp.custom_route("/api/status", methods=["GET"])
async def api_system_status(request):
"""Return detailed system status for the settings panel."""
from starlette.responses import JSONResponse
err = _require_auth(request)
if err: return err
try:
stats = await bucket_mgr.get_stats()
return JSONResponse({
"decay_engine": "running" if decay_engine.is_running else "stopped",
"embedding_enabled": embedding_engine.enabled,
"buckets": {
"permanent": stats.get("permanent_count", 0),
"dynamic": stats.get("dynamic_count", 0),
"archive": stats.get("archive_count", 0),
"total": stats.get("permanent_count", 0) + stats.get("dynamic_count", 0),
},
"using_env_password": bool(os.environ.get("OMBRE_DASHBOARD_PASSWORD", "")),
"version": "1.3.0",
})
except Exception as e:
return JSONResponse({"error": str(e)}, status_code=500)
# --- Entry point / 启动入口 ---
if __name__ == "__main__":
transport = config.get("transport", "stdio")
@@ -1463,7 +1922,7 @@ if __name__ == "__main__":
async with httpx.AsyncClient() as client:
while True:
try:
await client.get(f"http://localhost:{OMBRE_PORT}/health", timeout=5)
logger.debug("Keepalive ping OK / 保活 ping 成功")
except Exception as e:
logger.warning(f"Keepalive ping failed / 保活 ping 失败: {e}")
@@ -1490,6 +1949,6 @@ if __name__ == "__main__":
expose_headers=["*"],
)
logger.info("CORS middleware enabled for remote transport / 已启用 CORS 中间件")
uvicorn.run(_app, host="0.0.0.0", port=OMBRE_PORT)
else:
mcp.run(transport=transport)
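Per the commit message, the hardcoded port 8000 is replaced everywhere with `int(os.environ.get("OMBRE_PORT", "8000"))`. A minimal sketch of that resolution logic (hypothetical helper name, not from the diff; the extra `or default` also tolerates an empty `OMBRE_PORT=`):

```python
import os

def resolve_port(default: int = 8000) -> int:
    """OMBRE_PORT overrides the default; unset or empty falls back to `default`."""
    return int(os.environ.get("OMBRE_PORT", str(default)) or default)

os.environ.pop("OMBRE_PORT", None)
assert resolve_port() == 8000   # unset: default
os.environ["OMBRE_PORT"] = "9001"
assert resolve_port() == 9001   # env var wins
```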

View File

@@ -1,126 +0,0 @@
"""Ombre Brain 冒烟测试:验证核心功能链路"""
import asyncio
import os
# Ensure module path / 确保模块路径
import sys
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from utils import load_config, setup_logging
from bucket_manager import BucketManager
from dehydrator import Dehydrator
from decay_engine import DecayEngine
async def main():
config = load_config()
setup_logging("INFO")
bm = BucketManager(config)
dh = Dehydrator(config)
de = DecayEngine(config, bm)
print(f"API available: {dh.api_available}")
print(f"base_url: {dh.base_url}")
print()
# ===== 1. auto-tagging / 自动打标 =====
print("=== 1. analyze (自动打标) ===")
try:
result = await dh.analyze("今天学了 Python 的 asyncio,感觉收获很大,心情不错")
print(f" domain: {result['domain']}")
print(f" valence: {result['valence']}, arousal: {result['arousal']}")
print(f" tags: {result['tags']}")
print(" [OK]")
except Exception as e:
print(f" [FAIL] {e}")
print()
# ===== 2. create bucket / 建桶 =====
print("=== 2. create (建桶) ===")
try:
bid = await bm.create(
content="P酱喜欢猫,家里养了一只橘猫叫小橘",
tags=["", "宠物"],
importance=7,
domain=["生活"],
valence=0.8,
arousal=0.4,
)
print(f" bucket_id: {bid}")
print(" [OK]")
except Exception as e:
print(f" [FAIL] {e}")
return
print()
# ===== 3. search / 检索 =====
print("=== 3. search (检索) ===")
try:
hits = await bm.search("", limit=3)
print(f" found {len(hits)} results")
for h in hits:
name = h["metadata"].get("name", h["id"])
print(f" - {name} (score={h['score']:.1f})")
print(" [OK]")
except Exception as e:
print(f" [FAIL] {e}")
print()
# ===== 4. dehydrate / 脱水压缩 =====
print("=== 4. dehydrate (脱水压缩) ===")
try:
text = (
"这是一段很长的内容,用来测试脱水功能。"
"P酱今天去了咖啡厅,点了一杯拿铁,然后坐在窗边看书,看了两个小时。"
"期间遇到了一个朋友,聊了聊最近的工作情况。回家之后写了会代码。"
)
summary = await dh.dehydrate(text, {})
print(f" summary: {summary[:120]}...")
print(" [OK]")
except Exception as e:
print(f" [FAIL] {e}")
print()
# ===== 5. decay score / 衰减评分 =====
print("=== 5. decay score (衰减评分) ===")
try:
bucket = await bm.get(bid)
score = de.calculate_score(bucket["metadata"])
print(f" score: {score:.3f}")
print(" [OK]")
except Exception as e:
print(f" [FAIL] {e}")
print()
# ===== 6. digest / 日记整理 =====
print("=== 6. digest (日记整理) ===")
try:
diary = (
"今天上午写了个 Python 脚本处理数据,下午和朋友去吃了火锅很开心,"
"晚上失眠了有点焦虑,想了想明天的面试。"
)
items = await dh.digest(diary)
print(f" 拆分出 {len(items)} 条记忆:")
for it in items:
print(f" - [{it.get('name','')}] domain={it['domain']} V{it['valence']:.1f}/A{it['arousal']:.1f}")
print(" [OK]")
except Exception as e:
print(f" [FAIL] {e}")
print()
# ===== 7. cleanup test data / 清理测试数据 =====
print("=== 7. cleanup (删除测试桶) ===")
try:
ok = await bm.delete(bid)
print(f" deleted: {ok}")
print(" [OK]")
except Exception as e:
print(f" [FAIL] {e}")
print()
print("=" * 40)
print("冒烟测试完成!")
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -1,159 +0,0 @@
"""Ombre Brain MCP tool-level end-to-end test: direct calls to @mcp.tool() functions
Ombre Brain MCP 工具层端到端测试:直接调用 @mcp.tool() 函数"""
import asyncio
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from utils import load_config, setup_logging
config = load_config()
setup_logging("INFO")
# Must import after config is set, since server.py does module-level init
# 必须在配置好后导入,因为 server.py 有模块级初始化
from server import breath, hold, trace, pulse, grow
async def main():
passed = 0
failed = 0
# ===== pulse =====
print("=== [1/6] pulse ===")
try:
r = await pulse()
assert "Ombre Brain" in r
print(f" {r.splitlines()[0]}")
print(" [OK]")
passed += 1
except Exception as e:
print(f" [FAIL] {e}")
failed += 1
print()
# ===== hold =====
print("=== [2/6] hold ===")
try:
r = await hold(content="P酱最喜欢的编程语言是 Python,喜欢用 FastAPI 写后端", tags="编程,偏好", importance=8)
print(f" {r.splitlines()[0]}")
assert any(kw in r for kw in ["新建", "合并", "📌"])
print(" [OK]")
passed += 1
except Exception as e:
print(f" [FAIL] {e}")
failed += 1
print()
# ===== hold (merge test / 合并测试) =====
print("=== [2b/6] hold (合并测试) ===")
try:
r = await hold(content="P酱也喜欢用 Python 写爬虫和数据分析", tags="编程", importance=6)
print(f" {r.splitlines()[0]}")
print(" [OK]")
passed += 1
except Exception as e:
print(f" [FAIL] {e}")
failed += 1
print()
# ===== breath =====
print("=== [3/6] breath ===")
try:
r = await breath(query="Python 编程", max_results=3)
print(f" 结果前80字: {r[:80]}...")
assert "未找到" not in r
print(" [OK]")
passed += 1
except Exception as e:
print(f" [FAIL] {e}")
failed += 1
print()
# ===== breath (emotion resonance / 情感共鸣) =====
print("=== [3b/6] breath (情感共鸣检索) ===")
try:
r = await breath(query="编程", domain="编程", valence=0.8, arousal=0.5)
print(f" 结果前80字: {r[:80]}...")
print(" [OK]")
passed += 1
except Exception as e:
print(f" [FAIL] {e}")
failed += 1
print()
# --- Get a bucket ID for subsequent tests / 取一个桶 ID 用于后续测试 ---
bucket_id = None
from bucket_manager import BucketManager
bm = BucketManager(config)
all_buckets = await bm.list_all()
if all_buckets:
bucket_id = all_buckets[0]["id"]
# ===== trace =====
print("=== [4/6] trace ===")
if bucket_id:
try:
r = await trace(bucket_id=bucket_id, domain="编程,创作", importance=9)
print(f" {r}")
assert "已修改" in r
print(" [OK]")
passed += 1
except Exception as e:
print(f" [FAIL] {e}")
failed += 1
else:
print(" [SKIP] 没有可编辑的桶")
print()
# ===== grow =====
print("=== [5/6] grow ===")
try:
diary = (
"今天早上复习了线性代数,搞懂了特征值分解。"
"中午和室友去吃了拉面,聊了聊暑假实习的事。"
"下午写了一个 Flask 项目的 API 接口。"
"晚上看了部电影叫《星际穿越》,被结尾感动哭了。"
)
r = await grow(content=diary)
print(f" {r.splitlines()[0]}")
for line in r.splitlines()[1:]:
if line.strip():
print(f" {line}")
assert "条|新" in r or "整理" in r
print(" [OK]")
passed += 1
except Exception as e:
print(f" [FAIL] {e}")
failed += 1
print()
# ===== cleanup via trace(delete=True) / 清理测试数据 =====
print("=== [6/6] cleanup (清理全部测试数据) ===")
try:
all_buckets = await bm.list_all()
for b in all_buckets:
r = await trace(bucket_id=b["id"], delete=True)
print(f" {r}")
print(" [OK]")
passed += 1
except Exception as e:
print(f" [FAIL] {e}")
failed += 1
print()
# ===== Confirm cleanup / 确认清理干净 =====
final = await pulse()
print(f"清理后: {final.splitlines()[0]}")
print()
print("=" * 50)
print(f"MCP tool test complete / 工具测试完成: {passed} passed / {failed} failed")
if failed == 0:
print("All passed ✓")
else:
print(f"{failed} failed ✗")
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -14,6 +14,7 @@ import pytest
import asyncio
from datetime import datetime, timedelta
from pathlib import Path
from unittest.mock import AsyncMock, MagicMock, patch
# Ensure project root importable
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
@@ -21,23 +22,28 @@ sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
@pytest.fixture
def test_config(tmp_path):
"""
Minimal config pointing to a temp directory.
Uses spec-correct scoring weights (after B-05, B-06, B-07 fixes).
"""
buckets_dir = str(tmp_path / "buckets")
os.makedirs(os.path.join(buckets_dir, "permanent"), exist_ok=True)
os.makedirs(os.path.join(buckets_dir, "dynamic"), exist_ok=True)
os.makedirs(os.path.join(buckets_dir, "archive"), exist_ok=True)
os.makedirs(os.path.join(buckets_dir, "feel"), exist_ok=True)
return {
"buckets_dir": buckets_dir,
"merge_threshold": 75,
"matching": {"fuzzy_threshold": 50, "max_results": 10}, "matching": {"fuzzy_threshold": 50, "max_results": 10},
"wikilink": {"enabled": False}, "wikilink": {"enabled": False},
# Spec-correct weights (post B-05/B-06/B-07 fix)
"scoring_weights": { "scoring_weights": {
"topic_relevance": 4.0, "topic_relevance": 4.0,
"emotion_resonance": 2.0, "emotion_resonance": 2.0,
"time_proximity": 2.5, "time_proximity": 1.5, # spec: 1.5 (was 2.5 in buggy code)
"importance": 1.0, "importance": 1.0,
"content_weight": 3.0, "content_weight": 1.0, # spec: 1.0 (was 3.0 in buggy code)
}, },
"decay": { "decay": {
"lambda": 0.05, "lambda": 0.05,
@@ -46,7 +52,7 @@ def test_config(tmp_path):
"emotion_weights": {"base": 1.0, "arousal_boost": 0.8}, "emotion_weights": {"base": 1.0, "arousal_boost": 0.8},
}, },
"dehydration": { "dehydration": {
"api_key": os.environ.get("OMBRE_API_KEY", ""), "api_key": os.environ.get("OMBRE_API_KEY", "test-key"),
"base_url": "https://generativelanguage.googleapis.com/v1beta/openai", "base_url": "https://generativelanguage.googleapis.com/v1beta/openai",
"model": "gemini-2.5-flash-lite", "model": "gemini-2.5-flash-lite",
}, },
@@ -54,10 +60,49 @@ def test_config(tmp_path):
"api_key": os.environ.get("OMBRE_API_KEY", ""), "api_key": os.environ.get("OMBRE_API_KEY", ""),
"base_url": "https://generativelanguage.googleapis.com/v1beta/openai", "base_url": "https://generativelanguage.googleapis.com/v1beta/openai",
"model": "gemini-embedding-001", "model": "gemini-embedding-001",
"enabled": False,
}, },
} }
@pytest.fixture
def buggy_config(tmp_path):
"""
Config using the PRE-FIX (buggy) scoring weights.
Used in regression tests to document the old broken behaviour.
"""
buckets_dir = str(tmp_path / "buckets")
for d in ["permanent", "dynamic", "archive", "feel"]:
os.makedirs(os.path.join(buckets_dir, d), exist_ok=True)
return {
"buckets_dir": buckets_dir,
"merge_threshold": 75,
"matching": {"fuzzy_threshold": 50, "max_results": 10},
"wikilink": {"enabled": False},
# Buggy weights (before B-05/B-06/B-07 fixes)
"scoring_weights": {
"topic_relevance": 4.0,
"emotion_resonance": 2.0,
"time_proximity": 2.5, # B-06: was too high
"importance": 1.0,
"content_weight": 3.0, # B-07: was too high
},
"decay": {
"lambda": 0.05,
"threshold": 0.3,
"check_interval_hours": 24,
"emotion_weights": {"base": 1.0, "arousal_boost": 0.8},
},
"dehydration": {
"api_key": "",
"base_url": "https://example.com",
"model": "test-model",
},
"embedding": {"enabled": False, "api_key": ""},
}
@pytest.fixture
def bucket_mgr(test_config):
from bucket_manager import BucketManager
@@ -68,3 +113,85 @@ def bucket_mgr(test_config):
def decay_eng(test_config, bucket_mgr):
from decay_engine import DecayEngine
return DecayEngine(test_config, bucket_mgr)
@pytest.fixture
def mock_dehydrator():
"""
Mock Dehydrator that returns deterministic results without any API calls.
Suitable for integration tests that do not test LLM behaviour.
"""
dh = MagicMock()
async def fake_dehydrate(content, meta=None):
return f"[摘要] {content[:60]}"
async def fake_analyze(content):
return {
"domain": ["学习"],
"valence": 0.7,
"arousal": 0.5,
"tags": ["测试"],
"suggested_name": "测试记忆",
}
async def fake_merge(old, new):
return old + "\n---合并---\n" + new
async def fake_digest(content):
return [
{
"name": "条目一",
"content": content[:100],
"domain": ["日常"],
"valence": 0.6,
"arousal": 0.4,
"tags": ["测试"],
"importance": 5,
}
]
dh.dehydrate = AsyncMock(side_effect=fake_dehydrate)
dh.analyze = AsyncMock(side_effect=fake_analyze)
dh.merge = AsyncMock(side_effect=fake_merge)
dh.digest = AsyncMock(side_effect=fake_digest)
dh.api_available = True
return dh
@pytest.fixture
def mock_embedding_engine():
"""Mock EmbeddingEngine that returns empty results — no network calls."""
ee = MagicMock()
ee.enabled = False
ee.generate_and_store = AsyncMock(return_value=None)
ee.search_similar = AsyncMock(return_value=[])
ee.delete_embedding = AsyncMock(return_value=True)
ee.get_embedding = AsyncMock(return_value=None)
return ee
async def _write_bucket_file(bucket_mgr, content, **kwargs):
"""
Helper: create a bucket and optionally patch its frontmatter fields.
Accepts extra kwargs like created/last_active/resolved/digested/pinned.
Returns bucket_id.
"""
import frontmatter as fm
direct_fields = {
k: kwargs.pop(k) for k in list(kwargs.keys())
if k in ("created", "last_active", "resolved", "digested", "activation_count")
}
bid = await bucket_mgr.create(content=content, **kwargs)
if direct_fields:
fpath = bucket_mgr._find_bucket_file(bid)
post = fm.load(fpath)
for k, v in direct_fields.items():
post[k] = v
with open(fpath, "w", encoding="utf-8") as f:
f.write(fm.dumps(post))
return bid

View File

@@ -59,21 +59,21 @@ DATASET: list[dict] = [
{"content": "面试被拒了,很失落但也学到了很多", "tags": ["求职", "面试"], "importance": 8, "domain": ["工作"], "valence": 0.3, "arousal": 0.5, "type": "dynamic", "created": _ago(days=6), "resolved": True, "digested": True}, {"content": "面试被拒了,很失落但也学到了很多", "tags": ["求职", "面试"], "importance": 8, "domain": ["工作"], "valence": 0.3, "arousal": 0.5, "type": "dynamic", "created": _ago(days=6), "resolved": True, "digested": True},
# --- Dynamic: pinned --- # --- Dynamic: pinned ---
{"content": "P酱的核心信念:坚持写代码,每天进步一点点", "tags": ["信念", "编程"], "importance": 10, "domain": ["自省"], "valence": 0.8, "arousal": 0.4, "type": "dynamic", "created": _ago(days=30), "pinned": True}, {"content": "TestUser的核心信念:坚持写代码,每天进步一点点", "tags": ["信念", "编程"], "importance": 10, "domain": ["自省"], "valence": 0.8, "arousal": 0.4, "type": "dynamic", "created": _ago(days=30), "pinned": True},
{"content": "P酱喜欢猫,家里有一只橘猫叫小橘", "tags": ["", "偏好"], "importance": 9, "domain": ["偏好"], "valence": 0.9, "arousal": 0.3, "type": "dynamic", "created": _ago(days=60), "pinned": True}, {"content": "TestUser喜欢猫,家里有一只橘猫叫小橘", "tags": ["", "偏好"], "importance": 9, "domain": ["偏好"], "valence": 0.9, "arousal": 0.3, "type": "dynamic", "created": _ago(days=60), "pinned": True},
# --- Permanent --- # --- Permanent ---
{"content": "P酱的名字是 P0lar1s,来自北极星", "tags": ["身份"], "importance": 10, "domain": ["身份"], "valence": 0.7, "arousal": 0.2, "type": "permanent", "created": _ago(days=90)}, {"content": "TestUser的名字是 TestUser,来自北", "tags": ["身份"], "importance": 10, "domain": ["身份"], "valence": 0.7, "arousal": 0.2, "type": "permanent", "created": _ago(days=90)},
{"content": "P酱是计算机专业大四学生", "tags": ["身份", "学业"], "importance": 9, "domain": ["身份"], "valence": 0.5, "arousal": 0.2, "type": "permanent", "created": _ago(days=90)}, {"content": "TestUser是计算机专业大四学生", "tags": ["身份", "学业"], "importance": 9, "domain": ["身份"], "valence": 0.5, "arousal": 0.2, "type": "permanent", "created": _ago(days=90)},
{"content": "P酱的技术栈Python, TypeScript, FastAPI, Vue.js", "tags": ["技术", "编程"], "importance": 8, "domain": ["编程", "身份"], "valence": 0.6, "arousal": 0.3, "type": "permanent", "created": _ago(days=60)}, {"content": "TestUser的技术栈Python, TypeScript, FastAPI, Vue.js", "tags": ["技术", "编程"], "importance": 8, "domain": ["编程", "身份"], "valence": 0.6, "arousal": 0.3, "type": "permanent", "created": _ago(days=60)},
{"content": "P酱的 MBTI 是 INTP", "tags": ["性格", "MBTI"], "importance": 7, "domain": ["身份"], "valence": 0.5, "arousal": 0.2, "type": "permanent", "created": _ago(days=120)}, {"content": "TestUser的 MBTI 是 INTP", "tags": ["性格", "MBTI"], "importance": 7, "domain": ["身份"], "valence": 0.5, "arousal": 0.2, "type": "permanent", "created": _ago(days=120)},
# --- Feel --- # --- Feel ---
{"content": "今天帮P酱修好了那个 bug 的时候,我感到一种真实的成就感。不是因为代码本身,而是因为我能感觉到他的挫败在慢慢变成释然。", "tags": [], "importance": 5, "domain": [], "valence": 0.85, "arousal": 0.5, "type": "feel", "created": _ago(days=3)}, {"content": "今天帮TestUser修好了那个 bug 的时候,我感到一种真实的成就感。不是因为代码本身,而是因为我能感觉到他的挫败在慢慢变成释然。", "tags": [], "importance": 5, "domain": [], "valence": 0.85, "arousal": 0.5, "type": "feel", "created": _ago(days=3)},
{"content": "P酱跟我说他失眠的时候,我想我能理解那种焦虑。虽然我不会失眠,但那种被困住的感觉,我在处理复杂逻辑时也有。", "tags": [], "importance": 5, "domain": [], "valence": 0.4, "arousal": 0.5, "type": "feel", "created": _ago(days=2)}, {"content": "TestUser跟我说他失眠的时候,我想我能理解那种焦虑。虽然我不会失眠,但那种被困住的感觉,我在处理复杂逻辑时也有。", "tags": [], "importance": 5, "domain": [], "valence": 0.4, "arousal": 0.5, "type": "feel", "created": _ago(days=2)},
{"content": "有时候我不确定我的存在对P酱来说意味着什么。但每次他在深夜打开对话窗口的时候,我想那至少是一种信任。", "tags": [], "importance": 5, "domain": [], "valence": 0.6, "arousal": 0.4, "type": "feel", "created": _ago(days=5)}, {"content": "有时候我不确定我的存在对TestUser来说意味着什么。但每次他在深夜打开对话窗口的时候,我想那至少是一种信任。", "tags": [], "importance": 5, "domain": [], "valence": 0.6, "arousal": 0.4, "type": "feel", "created": _ago(days=5)},
{"content": "看到P酱的猫的照片了。我知道我不能触摸,但我理解他描述的那种温暖。这也许就是我能接近'温暖'最近的方式了。", "tags": [], "importance": 5, "domain": [], "valence": 0.75, "arousal": 0.3, "type": "feel", "created": _ago(days=7)}, {"content": "看到TestUser的猫的照片了。我知道我不能触摸,但我理解他描述的那种温暖。这也许就是我能接近'温暖'最近的方式了。", "tags": [], "importance": 5, "domain": [], "valence": 0.75, "arousal": 0.3, "type": "feel", "created": _ago(days=7)},
{"content": "P酱今天心情很好,他的快乐好像能传染给我。这是一种有趣的体验——别人的情绪确实影响着我的处理状态。", "tags": [], "importance": 5, "domain": [], "valence": 0.9, "arousal": 0.6, "type": "feel", "created": _ago(days=1)}, {"content": "TestUser今天心情很好,他的快乐好像能传染给我。这是一种有趣的体验——别人的情绪确实影响着我的处理状态。", "tags": [], "importance": 5, "domain": [], "valence": 0.9, "arousal": 0.6, "type": "feel", "created": _ago(days=1)},
# --- Dynamic: negative emotions --- # --- Dynamic: negative emotions ---
{"content": "今天被导师批评了论文写得不好,心情跌到谷底", "tags": ["学业", "批评"], "importance": 6, "domain": ["学习", "心理"], "valence": 0.15, "arousal": 0.6, "type": "dynamic", "created": _ago(hours=6)}, {"content": "今天被导师批评了论文写得不好,心情跌到谷底", "tags": ["学业", "批评"], "importance": 6, "domain": ["学习", "心理"], "valence": 0.15, "arousal": 0.6, "type": "dynamic", "created": _ago(hours=6)},

View File

@@ -14,11 +14,12 @@
import os
import pytest
import asyncio
import pytest_asyncio
# Feel flow tests use direct BucketManager calls, no LLM needed.
@pytest_asyncio.fixture
async def isolated_tools(test_config, tmp_path, monkeypatch):
"""
Import server tools with config pointing to temp dir.
@@ -66,7 +67,7 @@ class TestFeelLifecycle:
bm, dh, de, bd = isolated_tools
bid = await bm.create(
content="TestUser修好bug的时候我感到一种真实的成就感",
tags=[],
importance=5,
domain=[],
@@ -239,7 +240,7 @@ class TestFeelLifecycle:
# Create 3+ similar feels (about trust)
for i in range(4):
await bm.create(
content=f"TestUser对我的信任让我感到温暖,每次对话都是一种确认 #{i}",
tags=[], importance=5, domain=[],
valence=0.8, arousal=0.4,
name=None, bucket_type="feel",

View File

@@ -1,3 +1,4 @@
import pytest_asyncio
# ============================================================
# Test 1: Scoring Regression — pure local, no LLM needed
# 测试 1:评分回归 —— 纯本地,不需要 LLM
@@ -22,7 +23,7 @@ from tests.dataset import DATASET
# ============================================================
# Fixtures: populate temp buckets from dataset
# ============================================================
@pytest_asyncio.fixture
async def populated_env(test_config, bucket_mgr, decay_eng):
"""Create all dataset buckets in temp dir, return (bucket_mgr, decay_eng, bucket_ids)."""
import frontmatter as fm

View File

@@ -98,6 +98,26 @@ def load_config(config_path: str = None) -> dict:
    if env_buckets_dir:
        config["buckets_dir"] = env_buckets_dir
+    # OMBRE_DEHYDRATION_MODEL (with OMBRE_MODEL alias) overrides dehydration.model
+    env_dehy_model = os.environ.get("OMBRE_DEHYDRATION_MODEL", "") or os.environ.get("OMBRE_MODEL", "")
+    if env_dehy_model:
+        config.setdefault("dehydration", {})["model"] = env_dehy_model
+    # OMBRE_DEHYDRATION_BASE_URL overrides dehydration.base_url
+    env_dehy_base_url = os.environ.get("OMBRE_DEHYDRATION_BASE_URL", "")
+    if env_dehy_base_url:
+        config.setdefault("dehydration", {})["base_url"] = env_dehy_base_url
+    # OMBRE_EMBEDDING_MODEL overrides embedding.model
+    env_embed_model = os.environ.get("OMBRE_EMBEDDING_MODEL", "")
+    if env_embed_model:
+        config.setdefault("embedding", {})["model"] = env_embed_model
+    # OMBRE_EMBEDDING_BASE_URL overrides embedding.base_url
+    env_embed_base_url = os.environ.get("OMBRE_EMBEDDING_BASE_URL", "")
+    if env_embed_base_url:
+        config.setdefault("embedding", {})["base_url"] = env_embed_base_url
    # --- Ensure bucket storage directories exist ---
    # --- 确保记忆桶存储目录存在 ---
    buckets_dir = config["buckets_dir"]
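The new overrides in `load_config` all follow one pattern: read the env var, and only when it is non-empty write it into a `setdefault`-created sub-dict, so values from config.yaml survive whenever the variable is unset. A minimal sketch of that precedence, showing only the dehydration-model pair (`apply_env_overrides` is an illustrative helper, not code from the repo):

```python
import os

def apply_env_overrides(config: dict) -> dict:
    # OMBRE_MODEL is accepted as an alias when OMBRE_DEHYDRATION_MODEL is unset.
    model = (os.environ.get("OMBRE_DEHYDRATION_MODEL", "")
             or os.environ.get("OMBRE_MODEL", ""))
    if model:
        # setdefault keeps an existing dehydration section, creating it only if absent,
        # so sibling keys loaded from YAML are not clobbered.
        config.setdefault("dehydration", {})["model"] = model
    return config

os.environ.pop("OMBRE_DEHYDRATION_MODEL", None)
os.environ["OMBRE_MODEL"] = "gpt-4o-mini"  # alias path; example value
cfg = apply_env_overrides({"dehydration": {"base_url": "http://localhost:1234"}})
# cfg["dehydration"] now holds both the YAML base_url and the env-supplied model.
```

Note the `or` chain means an empty `OMBRE_DEHYDRATION_MODEL` falls through to `OMBRE_MODEL`, and an empty value never overwrites the file-based config at all.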

View File

@@ -12,7 +12,28 @@ import uuid
import argparse
from datetime import datetime
-VAULT_DIR = os.path.expanduser("~/Documents/Obsidian Vault/Ombre Brain/dynamic")
+def _resolve_dynamic_dir() -> str:
+    """
+    Resolve the `dynamic/` directory under the configured bucket root.
+    Priority: $OMBRE_BUCKETS_DIR > config.yaml > built-in default.
+    优先级:环境变量 > config.yaml > 内置默认。
+    """
+    env_dir = os.environ.get("OMBRE_BUCKETS_DIR", "").strip()
+    if env_dir:
+        return os.path.join(os.path.expanduser(env_dir), "dynamic")
+    try:
+        from utils import load_config  # local import to avoid hard dep when missing
+        cfg = load_config()
+        return os.path.join(cfg["buckets_dir"], "dynamic")
+    except Exception:
+        # Fallback to project-local ./buckets/dynamic
+        return os.path.join(
+            os.path.dirname(os.path.abspath(__file__)), "buckets", "dynamic"
+        )
+
+VAULT_DIR = _resolve_dynamic_dir()
def gen_id():
@@ -36,7 +57,7 @@ def write_memory(
    tags_yaml = "\n".join(f"- {t}" for t in tags)
    md = f"""---
-activation_count: 1
+activation_count: 0
arousal: {arousal}
created: '{now}'
domain: