redmine/docs/semantic_index_production_notes.md
2026-05-04 09:50:03 -04:00

Semantic Index Production Notes

These notes capture the current production direction for the Redmine semantic index. The service is still local-agent oriented, but the refresh command is now shaped so it can later be run by cron or systemd without changing the command. Use docs/semantic_index_deployment_runbook.md for the full deploy, validation, and rollback checklist.

Routine Refresh

Use the wrapper from the repository root:

semantic_index/refresh.sh

By default this is a dry run: it does not call OpenAI for document embeddings and does not write to Qdrant. To apply a rolling refresh, pass --apply:

semantic_index/refresh.sh --apply

The wrapper writes a timestamped log under .cache/semantic_index/logs and uses .cache/semantic_index/refresh_state.json for rolling refresh state.
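
To inspect the most recent run, a small helper along these lines may be convenient. It is only a sketch: the function name latest_refresh_log is hypothetical, and it assumes each wrapper run leaves one file in the log directory. It picks the newest file by modification time, so it does not depend on how the wrapper names its logs.

```shell
# Hypothetical helper, not part of the repository.
# Prints the path of the most recently modified file in a log directory.
# Assumption: each wrapper run writes one file there.
latest_refresh_log() (
    log_dir="${1:-.cache/semantic_index/logs}"
    # ls -t sorts by modification time, newest first.
    newest="$(ls -t "$log_dir" 2>/dev/null | head -n 1)"
    # Returns non-zero when the directory is missing or empty.
    [ -n "$newest" ] && printf '%s/%s\n' "$log_dir" "$newest"
)
```

From the repository root, less "$(latest_refresh_log)" would then open the newest log.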

Production Overrides

Use environment variables rather than editing the script:

SEMANTIC_INDEX_PROJECT_LIMITS='customer-service=500,hiring=200,todo-jason=200,sales-inbox=100,business-development=100,dock-scheduling=100,prep-standardization=100'
SEMANTIC_INDEX_LOG_DIR=/var/log/semantic-index
SEMANTIC_INDEX_STATE_PATH=/var/lib/semantic-index/refresh_state.json
SEMANTIC_INDEX_OVERLAP_MINUTES=15

Keep OPENAI_API_KEY, QDRANT_URL, REDMINE_URL, and REDMINE_API_KEY in the existing .env workflow or in the service manager environment.
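
The override pattern can be pictured with the sketch below. It is not the contents of semantic_index/refresh.sh; the function names and the LOG_DIR and STATE_PATH variables are hypothetical. It assumes the wrapper falls back to the repository-local paths mentioned under Routine Refresh when an override is unset, and that SEMANTIC_INDEX_PROJECT_LIMITS is a comma-separated list of project=limit pairs as in the example above.

```shell
# Hypothetical sketch, not the real wrapper.
# Each setting falls back to a repository-local default when the
# corresponding environment variable is unset or empty.
resolve_refresh_settings() {
    LOG_DIR="${SEMANTIC_INDEX_LOG_DIR:-.cache/semantic_index/logs}"
    STATE_PATH="${SEMANTIC_INDEX_STATE_PATH:-.cache/semantic_index/refresh_state.json}"
}

# Print one "project limit" pair per line from a comma-separated
# project=limit list such as SEMANTIC_INDEX_PROJECT_LIMITS.
print_project_limits() (
    set -f      # no glob expansion while splitting
    IFS=','     # split the list on commas
    for pair in $1; do
        project="${pair%%=*}"   # text before the first '='
        limit="${pair#*=}"      # text after the first '='
        printf '%s %s\n' "$project" "$limit"
    done
)
```

Running print_project_limits "$SEMANTIC_INDEX_PROJECT_LIMITS" before a refresh is a quick way to confirm that an override parses into the intended projects and limits.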

For production-style deployment, use /opt/semantic-index for code, /etc/semantic-index.env for service environment, /var/lib/semantic-index for refresh state, and /var/log/semantic-index for refresh logs. Systemd templates live in deploy/semantic-index/.
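
For a manual run against this layout, outside the service manager, the environment file can be loaded into an interactive shell first. The helper below is a sketch with a hypothetical name, and it assumes /etc/semantic-index.env contains only plain KEY=value lines that are also valid shell assignments.

```shell
# Hypothetical helper, not part of the repository.
# Exports every KEY=value line from an env file into the current shell.
# Assumption: the file contains only plain assignments that are valid shell.
load_env_file() {
    set -a      # auto-export every variable assigned while sourcing
    . "$1"
    set +a
}

# Example manual dry run against the production-style layout (not executed here):
#   load_env_file /etc/semantic-index.env
#   cd /opt/semantic-index && semantic_index/refresh.sh
```

Pass an absolute path, since the shell's dot command may search PATH for names without a slash.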

Embedding Cost Guard

Normal refresh embeds only documents that are new or whose Redmine-derived source_hash changed. Unchanged documents are left alone. Stale indexed documents for refreshed issues are deleted without embedding.
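
The decision rule can be illustrated with a short sketch. It is not the indexer's implementation: embed_decision is a hypothetical name, SHA-256 is an assumed choice of hash, and the exact Redmine fields that feed source_hash are not specified here. It relies on sha256sum from GNU coreutils; on systems without it, shasum -a 256 is a common substitute.

```shell
# Hypothetical helper, not the indexer's code.
# Classifies one document as "embed" or "unchanged" by comparing a SHA-256
# of its source text with the hash stored from the previous index run.
embed_decision() (
    source_text="$1"
    stored_hash="$2"    # empty when the document is not yet indexed
    current_hash="$(printf '%s' "$source_text" | sha256sum | cut -d' ' -f1)"
    if [ "$current_hash" = "$stored_hash" ]; then
        echo unchanged  # same content: no embedding call needed
    else
        echo embed      # new or changed content: embed and store the new hash
    fi
)
```

Only the embed branch would cost an OpenAI call, which is why an unchanged hash keeps a routine refresh cheap.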

Do not schedule --force-rebuild. Use it only as a manual maintenance action when intentionally re-embedding unchanged documents.

Cron Shape

A later cron entry can call the same wrapper:

*/30 * * * * cd /home/iadnah/redmine && semantic_index/refresh.sh --apply

Before adding a real schedule, run the wrapper manually and confirm the log shows expected embedded_documents, unchanged_documents, and skipped_issues counts.
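
One way to pull those counts out of a log is sketched below. The function name is hypothetical, and the only assumption about the log is that the counter names appear literally in it; the surrounding format (JSON, key=value, or otherwise) is not relied upon. The pattern also includes detail_fetched_issues, which is useful for the dry-run check described at the end of this section.

```shell
# Hypothetical helper, not part of the repository.
# Prints every log line that mentions one of the refresh counters.
# Assumption: counter names appear literally in the log text.
show_refresh_counts() {
    grep -E 'embedded_documents|unchanged_documents|skipped_issues|detail_fetched_issues' "$1"
}
```

Pass the path of the log written by the manual run, for example a file under .cache/semantic_index/logs.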

For a quick wrapper smoke check, reduce the project limits:

SEMANTIC_INDEX_PROJECT_LIMITS='customer-service=5' semantic_index/refresh.sh

After refresh state exists, routine dry runs should count old issues under skipped_issues without also counting them under detail_fetched_issues. That indicates the refresh is avoiding unnecessary Redmine detail requests before it reaches the embedding cost guard.