Add semantic-index service, deployment assets, and tests
@@ -0,0 +1,76 @@
# Semantic Index Production Notes

These notes capture the current production direction for the Redmine semantic
index. The service is still local-agent oriented, but the refresh command is now
shaped so it can later be run by cron or systemd without changing the command.
Use `docs/semantic_index_deployment_runbook.md` for the full deploy, validation,
and rollback checklist.

## Routine Refresh

Use the wrapper from the repository root:

```sh
semantic_index/refresh.sh
```

By default this is a dry run: it does not call OpenAI for document embeddings
and does not write to Qdrant. To apply a rolling refresh:

```sh
semantic_index/refresh.sh --apply
```

The wrapper writes a timestamped log under `.cache/semantic_index/logs` and uses
`.cache/semantic_index/refresh_state.json` for rolling refresh state.
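
To read the log the wrapper just wrote, a small helper can pick the newest file in the log directory. This is a minimal sketch: the function name `latest_refresh_log` is hypothetical, and it assumes the directory holds only refresh logs with simple file names, since the wrapper's exact naming scheme is not described here.

```sh
# Print the path of the most recently modified file in a log directory.
# Assumption: the directory contains only refresh logs, and file names
# contain no newlines. The helper name is illustrative, not part of the repo.
latest_refresh_log() {
    # Argument wins, then the documented override, then the default path.
    log_dir="${1:-${SEMANTIC_INDEX_LOG_DIR:-.cache/semantic_index/logs}}"
    # ls -t lists newest first; head keeps only the first entry.
    newest=$(ls -t "$log_dir" 2>/dev/null | head -n 1)
    if [ -n "$newest" ]; then
        printf '%s/%s\n' "$log_dir" "$newest"
    else
        echo "no logs found in $log_dir" >&2
        return 1
    fi
}

# Example: page through the newest log after a dry run.
#   less "$(latest_refresh_log)"
```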

## Production Overrides

Use environment variables rather than editing the script:

```sh
SEMANTIC_INDEX_PROJECT_LIMITS='customer-service=500,hiring=200,todo-jason=200,sales-inbox=100,business-development=100,dock-scheduling=100,prep-standardization=100'
SEMANTIC_INDEX_LOG_DIR=/var/log/semantic-index
SEMANTIC_INDEX_STATE_PATH=/var/lib/semantic-index/refresh_state.json
SEMANTIC_INDEX_OVERLAP_MINUTES=15
```
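
The `SEMANTIC_INDEX_PROJECT_LIMITS` value above looks like a comma-separated list of `project=limit` pairs. Assuming that reading is right (the wrapper's own parsing is not shown here), a sketch like the following can list the pairs one per line to check a value before exporting it. The function name `print_project_limits` is hypothetical.

```sh
# Print one "project limit" pair per line from a comma-separated spec.
# Assumption: the spec is "name=limit,name=limit,..." as in the example value.
print_project_limits() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS='=' read -r project limit; do
        # Skip empty entries such as a trailing comma.
        if [ -n "$project" ]; then
            printf '%s %s\n' "$project" "$limit"
        fi
    done
}

# Example:
#   print_project_limits "$SEMANTIC_INDEX_PROJECT_LIMITS"
```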

Keep `OPENAI_API_KEY`, `QDRANT_URL`, `REDMINE_URL`, and `REDMINE_API_KEY` in the
existing `.env` workflow or in the service manager environment.

For production-style deployment, use `/opt/semantic-index` for code,
`/etc/semantic-index.env` for service environment, `/var/lib/semantic-index`
for refresh state, and `/var/log/semantic-index` for refresh logs. Systemd
templates live in `deploy/semantic-index/`.
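
For a manual run against the production layout, the service environment file can be loaded into the current shell before calling the wrapper. The sketch below assumes the file contains plain `KEY=value` lines that are also valid shell assignments; systemd's environment file syntax is similar but not identical, so quoting may need adjusting. The helper name `load_env_file` is hypothetical.

```sh
# Export every KEY=value assignment from an env file into the current shell.
# Assumption: the file is valid shell syntax (simple KEY=value lines).
load_env_file() {
    env_file="$1"
    [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; return 1; }
    set -a          # auto-export every variable assigned while sourcing
    . "$env_file"
    set +a          # restore normal (non-exporting) assignment behaviour
}

# Example manual dry run using the paths named above:
#   load_env_file /etc/semantic-index.env && semantic_index/refresh.sh
```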

## Embedding Cost Guard

Normal refresh embeds only documents that are new or whose Redmine-derived
`source_hash` changed. Unchanged documents are left alone. Stale indexed
documents for refreshed issues are deleted without embedding.

Do not schedule `--force-rebuild`. Use it only as a manual maintenance action
when intentionally re-embedding unchanged documents.
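
The decision described in this section can be sketched as a comparison of two hashes. This is an illustration of the rule only, not the service's code: the function name `refresh_action`, its arguments, and the use of an empty string to mean "missing" are all assumptions made for the example.

```sh
# Decide what a refresh would do with one document.
#   $1  source_hash stored in the index ("" if the document is not indexed)
#   $2  source_hash derived from current Redmine content ("" if the
#       document no longer exists for the refreshed issue)
#   $3  "yes" to model --force-rebuild (optional, defaults to "no")
refresh_action() {
    indexed_hash="$1"
    current_hash="$2"
    force="${3:-no}"
    if [ -z "$current_hash" ]; then
        echo delete         # stale: removed without any embedding call
    elif [ -z "$indexed_hash" ]; then
        echo embed          # new document
    elif [ "$indexed_hash" != "$current_hash" ]; then
        echo embed          # content changed since the last refresh
    elif [ "$force" = "yes" ]; then
        echo embed          # unchanged, but a forced rebuild re-embeds it
    else
        echo unchanged      # no embedding cost
    fi
}
```

Only the `embed` outcomes cost an embedding call, which is why a scheduled `--force-rebuild` would pay for every unchanged document on every run.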

## Cron Shape

A later cron entry can call the same wrapper:

```cron
*/30 * * * * cd /home/iadnah/redmine && semantic_index/refresh.sh --apply
```

Before adding a real schedule, run the wrapper manually and confirm the log
shows expected `embedded_documents`, `unchanged_documents`, and
`skipped_issues` counts.

For a quick wrapper smoke check, reduce the project limits:

```sh
SEMANTIC_INDEX_PROJECT_LIMITS='customer-service=5' semantic_index/refresh.sh
```

After refresh state exists, routine dry runs should show old issues as
`skipped_issues` without matching `detail_fetched_issues`. That indicates the
refresh is avoiding unnecessary Redmine detail requests before it reaches the
embedding cost guard.
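
To check the counters named in this section without reading a whole log, a filter like the one below prints only the lines that mention them. It is a sketch: the helper name `show_refresh_counts` is hypothetical, and it assumes only that each counter name appears literally on its log line, not any particular line format.

```sh
# Print every log line that mentions one of the refresh counters.
# Assumption: counter names appear verbatim in the log; the surrounding
# line format is not relied upon.
show_refresh_counts() {
    grep -E 'embedded_documents|unchanged_documents|skipped_issues|detail_fetched_issues' "$1"
}

# Example, given the path of a log written by the wrapper:
#   show_refresh_counts path/to/refresh.log
```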