Semantic Index Production Notes
These notes capture the current production direction for the Redmine semantic
index. The service is still local-agent oriented, but the refresh command is now
shaped so it can later be run by cron or systemd without changing the command.
Use docs/semantic_index_deployment_runbook.md for the full deploy, validation,
and rollback checklist.
Routine Refresh
Use the wrapper from the repository root:
semantic_index/refresh.sh
By default the wrapper performs a dry-run: it does not call OpenAI for document embeddings and does not write to Qdrant. To apply a rolling refresh, pass --apply:
semantic_index/refresh.sh --apply
The wrapper writes a timestamped log under .cache/semantic_index/logs and uses
.cache/semantic_index/refresh_state.json for rolling refresh state.
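Since each run writes its own timestamped log, it helps to be able to locate the most recent one. The helper below is a minimal sketch, not part of the wrapper: latest_refresh_log is a hypothetical name, and it assumes the log directory contains only refresh logs and that SEMANTIC_INDEX_LOG_DIR, when set, points at the same directory the wrapper used.

```shell
# Hypothetical helper (not shipped with the wrapper): print the newest file
# in the refresh log directory, judged by modification time.
# The body runs in a subshell so its variables do not leak into the caller.
latest_refresh_log() (
  # Assumption: fall back to the default path named in these notes.
  dir=${1:-${SEMANTIC_INDEX_LOG_DIR:-.cache/semantic_index/logs}}
  newest=
  for f in "$dir"/*; do
    [ -f "$f" ] || continue          # skip directories and unmatched globs
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest=$f
    fi
  done
  # Non-zero status when the directory holds no log files.
  [ -n "$newest" ] && printf '%s\n' "$newest"
)

# Example use after a dry-run:
#   less "$(latest_refresh_log)"
```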
Production Overrides
Use environment variables rather than editing the script:
SEMANTIC_INDEX_PROJECT_LIMITS='customer-service=500,hiring=200,todo-jason=200,sales-inbox=100,business-development=100,dock-scheduling=100,prep-standardization=100'
SEMANTIC_INDEX_LOG_DIR=/var/log/semantic-index
SEMANTIC_INDEX_STATE_PATH=/var/lib/semantic-index/refresh_state.json
SEMANTIC_INDEX_OVERLAP_MINUTES=15
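A typo in the comma-separated project limits is easy to miss, so a quick pre-flight check can be useful before exporting the value. The function below is a sketch only: validate_project_limits is a hypothetical name, and the rules it enforces (every entry is project=limit with a non-negative integer limit) are inferred from the example value above rather than taken from the wrapper itself.

```shell
# Hypothetical pre-flight check for a SEMANTIC_INDEX_PROJECT_LIMITS value.
# Prints one "project limit" pair per line; returns non-zero on the first
# malformed entry. The format rules are assumptions based on the example.
validate_project_limits() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r pair; do
    case $pair in
      *=*) ;;
      *) echo "missing '=' in entry: $pair" >&2; exit 1 ;;
    esac
    project=${pair%%=*}
    limit=${pair#*=}
    if [ -z "$project" ]; then
      echo "empty project name in entry: $pair" >&2; exit 1
    fi
    case $limit in
      ''|*[!0-9]*) echo "non-numeric limit in entry: $pair" >&2; exit 1 ;;
    esac
    printf '%s %s\n' "$project" "$limit"
  done
  # The loop runs in a pipeline subshell, so "exit 1" above only ends the
  # loop; its status becomes the function's return status.
}

# Example use:
#   validate_project_limits "$SEMANTIC_INDEX_PROJECT_LIMITS" || echo "fix limits first"
```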
Keep OPENAI_API_KEY, QDRANT_URL, REDMINE_URL, and REDMINE_API_KEY in the
existing .env workflow or in the service manager environment.
For production-style deployment, use /opt/semantic-index for code,
/etc/semantic-index.env for service environment, /var/lib/semantic-index
for refresh state, and /var/log/semantic-index for refresh logs. Systemd
templates live in deploy/semantic-index/.
Embedding Cost Guard
Normal refresh embeds only documents that are new or whose Redmine-derived
source_hash changed. Unchanged documents are left alone. Stale indexed
documents for refreshed issues are deleted without embedding.
Do not schedule --force-rebuild. Use it only as a manual maintenance action
when intentionally re-embedding unchanged documents.
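The embed-or-skip decision described above can be illustrated with a small sketch. It is not the indexer's code: source_hash here uses SHA-256 over the document text (the real hash inputs and algorithm are not specified in these notes), refresh_action is a hypothetical name, and deletion of stale documents is not modeled.

```shell
# Illustrative only: the real source_hash derivation is not documented here;
# SHA-256 over the text is an assumption. Requires sha256sum (GNU coreutils).
source_hash() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Hypothetical decision helper.
#   $1 = hash stored with the indexed document (empty if never indexed)
#   $2 = current Redmine-derived text
#   $3 = "force" to mimic a manual --force-rebuild
# Prints "embed" or "unchanged". Runs in a subshell to keep variables local.
refresh_action() (
  stored=$1
  current=$(source_hash "$2")
  if [ "${3:-}" = force ]; then
    echo embed            # forced rebuild re-embeds even unchanged documents
  elif [ -z "$stored" ]; then
    echo embed            # new document
  elif [ "$stored" != "$current" ]; then
    echo embed            # content changed since it was indexed
  else
    echo unchanged        # no embedding call, no cost
  fi
)
```

The last branch is where the cost saving comes from: on a routine refresh most documents should land there, which is why a scheduled --force-rebuild would defeat the guard.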
Cron Shape
A later cron entry can call the same wrapper:
*/30 * * * * cd /home/iadnah/redmine && semantic_index/refresh.sh --apply
Before adding a real schedule, run the wrapper manually and confirm the log
shows expected embedded_documents, unchanged_documents, and
skipped_issues counts.
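The manual check above can be partly scripted. The log format is not described in these notes, so the sketch below assumes each counter appears as a name followed by = or : and an integer (for example embedded_documents=3 or "embedded_documents": 3); log_counter and check_refresh_log are hypothetical names. Adjust the pattern to whatever the real log lines look like.

```shell
# Hypothetical: print the last integer that follows a counter name in a log.
# Assumes "name=N", "name: N", or "\"name\": N"; the real format may differ.
#   $1 = counter name, $2 = log file
log_counter() {
  sed -n "s/.*$1[\"']*[[:space:]]*[=:][[:space:]]*\([0-9][0-9]*\).*/\1/p" "$2" | tail -n 1
}

# Hypothetical: confirm the three counters named in these notes are present
# and print their values. Runs in a subshell, so "exit" only ends the function.
#   $1 = log file
check_refresh_log() (
  for name in embedded_documents unchanged_documents skipped_issues; do
    value=$(log_counter "$name" "$1")
    if [ -z "$value" ]; then
      echo "missing counter: $name" >&2
      exit 1
    fi
    printf '%s=%s\n' "$name" "$value"
  done
)
```

This only confirms that the counters are present; whether the values are the expected ones still needs a human judgment against the project limits in use.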
For a quick wrapper smoke check, reduce the project limits:
SEMANTIC_INDEX_PROJECT_LIMITS='customer-service=5' semantic_index/refresh.sh
After refresh state exists, routine dry-runs should show old issues as
skipped_issues without matching detail_fetched_issues. That indicates the
refresh is avoiding unnecessary Redmine detail requests before it reaches the
embedding cost guard.
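One way to picture why old issues are skipped before any detail request is a cutoff derived from the refresh state and the overlap window. The sketch below is an assumption about the shape of that check, not the indexer's actual logic: issue_action is a hypothetical name, timestamps are epoch seconds for simplicity, and the state file's real fields are not described in these notes.

```shell
# Hypothetical model of the skip decision.
#   $1 = issue updated_on            (epoch seconds)
#   $2 = last successful refresh     (epoch seconds, from refresh state)
#   $3 = overlap in minutes          (defaults to 15, as in
#                                     SEMANTIC_INDEX_OVERLAP_MINUTES above)
# Prints "fetch_detail" or "skip". Runs in a subshell to keep variables local.
issue_action() (
  overlap=${3:-15}
  # Re-examine a little before the last refresh so that updates landing
  # during the previous run are not missed.
  cutoff=$(( $2 - overlap * 60 ))
  if [ "$1" -ge "$cutoff" ]; then
    echo fetch_detail
  else
    echo skip             # counted as skipped; no Redmine detail request
  fi
)

# Example: an issue untouched since well before the last refresh is skipped.
#   issue_action 1699990000 1700000000 15
```

Under this model, an issue that is skipped never reaches the hash comparison at all, which matches the expectation that skipped_issues grows without a corresponding rise in detail_fetched_issues.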