Skip to content
githome

Observability

health endpoints, structured logs, log shipping, and alerting patterns for githome

Health endpoints

Githome exposes two HTTP health endpoints that require no authentication.

GET /readyz   200 = instance is ready to serve traffic
              503 = not ready (startup incomplete, database unreachable)

GET /healthz  200 = process is alive

Use /readyz for load balancer health checks and container healthcheck directives. Use /healthz for simple liveness probes where you only need to know the process is running.

curl -s -o /dev/null -w "%{http_code}" http://localhost:3000/readyz
# 200

A 503 from /readyz means githome is alive but cannot serve traffic. Check GITHOME_DATABASE_URL connectivity and logs.

Log format

Githome logs via Go's slog package. Every log line includes a standard set of fields.

Common fields across all log lines:

Field Type Description
time string RFC3339 timestamp
level string debug, info, warn, error
msg string human-readable event description
service string always githome
version string binary version

HTTP request fields (access log lines):

Field Type Description
method string HTTP method
path string request path
status int HTTP response status code
duration string request duration, e.g. 12ms
ip string client IP (respects X-Real-IP)

Error fields:

Field Type Description
err string error message

Log levels

Set the level with GITHOME_LOG_LEVEL.

  • debug - all of the above, plus SQL queries, git command invocations, and authentication token validation steps. Do not use in production unless diagnosing a specific issue; it logs sensitive data.
  • info (default) - HTTP access log, startup/shutdown events, webhook deliveries, background job completions and errors.
  • warn - anomalies that are not fatal: retried database operations, slow queries, webhook delivery failures.
  • error - unhandled errors, panics recovered, database connectivity failures.

Text vs JSON format

Set GITHOME_LOG_FORMAT=text for human-readable output during development:

2026-06-10T02:00:00Z INFO  GET /repos/alice/myrepo status=200 duration=4ms

Set GITHOME_LOG_FORMAT=json in production for log aggregation:

{"time":"2026-06-10T02:00:00Z","level":"INFO","msg":"request","service":"githome","version":"0.1.2","method":"GET","path":"/repos/alice/myrepo","status":200,"duration":"4ms","ip":"10.0.0.5"}

JSON format is the right choice any time logs flow into a collector or aggregation system.

Shipping logs

Vector

# /etc/vector/vector.toml

[sources.githome]
type = "journald"
include_units = ["githome.service"]

[transforms.parse_json]
type   = "remap"
inputs = ["githome"]
source = '''
  . = parse_json!(.message)
'''

[sinks.loki]
type     = "loki"
inputs   = ["parse_json"]
endpoint = "http://loki:3100"

  [sinks.loki.labels]
  service = "githome"
  level   = "{{ level }}"

Fluentd

<source>
  @type systemd
  tag githome
  matches [{"_SYSTEMD_UNIT": "githome.service"}]
</source>

<filter githome>
  @type parser
  key_name message
  <parse>
    @type json
  </parse>
</filter>

<match githome>
  @type elasticsearch
  host elasticsearch
  port 9200
  logstash_format true
  logstash_prefix githome
</match>

Datadog agent

Add to /etc/datadog-agent/conf.d/githome.d/conf.yaml:

logs:
  - type: journald
    source: githome
    service: githome
    log_processing_rules:
      - type: multi_line
        name: new_log_start
        pattern: '^\{"time"'

Set GITHOME_LOG_FORMAT=json and restart the Datadog agent.

Metrics

Githome does not expose a Prometheus metrics endpoint in the current release. Use log-based metrics as a substitute.

Extract request rates and latencies from access logs with Vector or a log aggregation query:

# Count 5xx responses in the last minute from journald
journalctl -u githome --since "1 minute ago" -o json \
  | jq 'select(.MESSAGE | fromjson? | .status >= 500) | .MESSAGE' \
  | wc -l

With Loki + LogQL:

# 5xx rate
rate({service="githome"} | json | status >= 500 [5m])

# p99 request duration (requires duration field parsed as a number)
quantile_over_time(0.99, {service="githome"} | json | unwrap duration [5m])

Background jobs

Githome runs background jobs for webhook delivery and cleanup tasks. Each job logs completion and any errors at info or warn level:

{"level":"INFO","msg":"webhook delivered","hook_id":42,"delivery_id":"abc123","status":200,"duration":"87ms"}
{"level":"WARN","msg":"webhook delivery failed","hook_id":42,"delivery_id":"def456","status":503,"attempt":3}

Webhook delivery failures

Failed webhook deliveries are stored and visible via the API. Inspect them directly:

# List recent deliveries for a hook
curl -s \
  -H "Authorization: Bearer <token>" \
  "https://git.example.com/repos/alice/myrepo/hooks/1/deliveries" \
  | jq '.[] | {id, event, status_code, delivered_at}'

Redeliver a failed delivery:

curl -s -X POST \
  -H "Authorization: Bearer <token>" \
  "https://git.example.com/repos/alice/myrepo/hooks/deliveries/456/attempts"

Alerting patterns

The following conditions warrant alerts in a production deployment:

5xx rate elevated. Any sustained rate of HTTP 5xx responses above your baseline. Query access logs grouped by status >= 500.

Health check failing. /readyz returning anything other than 200. This is the most important alert; it means githome is not serving traffic.

Git push failures. Log lines with msg containing git-receive-pack and a non-zero exit code. These indicate failed pushes, which users notice immediately.

Webhook delivery error rate. A stream of webhook delivery failed log lines, especially across multiple repositories, often indicates a misconfigured endpoint or a connectivity problem.

Slow requests. HTTP requests with duration > 5s on non-upload paths signal database or git performance problems.

Database connectivity errors. Any err field containing the string connection refused or database is locked (SQLite under high write load). These precede downtime.