2026-05-19 · DataMesh Consulting
19 May — Kimi model migrated to k2.6, Doffin extractor lands, tender-page hang fix
Two pieces of forced-march work and one nice piece of new coverage. Moonshot is deprecating the k2 series on 2026-05-25, so every Kimi call now points at kimi-k2.6 (with kimi-k2-thinking kept around just long enough to drain in-flight cached reasoning). A Doffin (Norway) Step-0 extractor went live via the public search API — no Playwright needed. And the public tender pages had been hanging up to 60 seconds waiting on a synchronous AI-summary call during SSR; that's now capped at 3 seconds with the summary backfilling client-side afterwards.
Kimi model migration: k2-thinking → k2.6
Moonshot's communications were specific: the k2 series (kimi-k2-base, kimi-k2-thinking, kimi-k2-chat) is discontinued on 2026-05-25. After that date all calls to the k2 family return errors. We've been on kimi-k2-thinking for the reasoning passes and an older model for embeddings since launch.
The migration is straightforward — flip the model id in two places (kimi.service.ts for chat, embedding.service.ts for vector) — but k2.6 has a different response shape:
- It now returns reasoning_content as a separate field alongside content, where k2-thinking inlined chain-of-thought into content. The downstream parser had been written assuming content was the only field; it now reads reasoning_content || content and surfaces finish_reason so we can detect "stopped due to length" vs "complete answer."
- The request shape requires a different response_format
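As a rough illustration of the parser change, here is a minimal TypeScript sketch. The field names reasoning_content, content, and finish_reason come from the description above. The interface names, the parseKimiChoice function, and the "length" value (borrowed from the OpenAI-style convention) are assumptions for illustration, not the actual kimi.service.ts code.

```typescript
// Hypothetical response shapes; only the three field names are from the text.
interface KimiMessage {
  content?: string | null;
  reasoning_content?: string | null;
}

interface KimiChoice {
  message: KimiMessage;
  finish_reason?: string | null;
}

interface ParsedKimiReply {
  text: string;
  finishReason: string | null;
  truncated: boolean; // true when the model stopped because of length
}

function parseKimiChoice(choice: KimiChoice): ParsedKimiReply {
  const { reasoning_content, content } = choice.message;
  // Mirrors the fallback described above: reasoning_content || content.
  const text = reasoning_content || content || "";
  const finishReason = choice.finish_reason ?? null;
  // Assumption: "length" marks a cut-off answer, as in OpenAI-style APIs.
  return { text, finishReason, truncated: finishReason === "length" };
}
```

Callers can then branch on truncated to decide whether to retry with a larger token budget or accept the answer as complete.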
A telemetry fix shipped at the same time: the
LlmCallProvider union didn't include 'kimi-cli' as a
value, so all CLI-fallback calls were being tagged as the
HTTP provider in analytics.llm_call_logs. That broke the
"Cached vs Billable" dashboard split from yesterday — CLI
calls (which are plan-billed, not per-token) were being
counted as billable. Fixed in KimiService and the
TypeScript union now includes 'kimi-cli' so future
provider additions get a compile error rather than silent
mis-tagging.
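One way to get that compile-time guarantee is an exhaustive switch over the union. The sketch below assumes a two-member union; 'kimi-cli' is the value named above, while 'kimi-http' and the isPerTokenBillable helper are placeholder names, not the real identifiers.

```typescript
// 'kimi-cli' is from the text; 'kimi-http' is a placeholder for the HTTP provider tag.
type LlmCallProvider = "kimi-http" | "kimi-cli";

// Hypothetical helper: decides which side of a billable/plan-billed split a call lands on.
function isPerTokenBillable(provider: LlmCallProvider): boolean {
  switch (provider) {
    case "kimi-http":
      return true; // per-token billing
    case "kimi-cli":
      return false; // plan-billed, not per-token
    default: {
      // If a new member is added to the union and not handled above,
      // this assignment stops compiling.
      const unhandled: never = provider;
      throw new Error(`Unhandled provider: ${String(unhandled)}`);
    }
  }
}
```

Adding a third provider to the union without a matching case turns the never assignment into a type error, which is the behaviour the fix relies on.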
Doffin (Norway) — Step-0 extractor via public API
Doffin (Database for offentlige innkjøp) is the Norwegian
national procurement portal. It exposes a public search API
that returns notices in OCDS-adjacent JSON. No Playwright
needed — axios.get(searchEndpoint) returns paginated
results we can iterate directly.
Coverage:
- Listing: paged via ?page=N, ~100 notices per page.
- Detail enrichment: each notice has a documents
- Country / NUTS: Doffin uses Norwegian NUTS codes, mapped to NO for countryCode and preserved as the original code in nutsCodes[].
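The listing loop can be sketched roughly as follows. This is not the extractor itself: the Notice shape, the collectNotices name, the assumption that pages start at 1, and the assumption that an empty page marks the end of results are all illustrative. The page fetcher is injected so the loop can be exercised without the network; in practice it could wrap a call such as axios.get(searchEndpoint, { params: { page } }).

```typescript
// Hypothetical notice shape for illustration only.
interface Notice {
  id: string;
}

// Injected fetcher: given a page number, resolves to that page's notices.
type FetchPage = (page: number) => Promise<Notice[]>;

async function collectNotices(
  fetchPage: FetchPage,
  maxPages = 100, // safety cap so a misbehaving API cannot loop forever
): Promise<Notice[]> {
  const all: Notice[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const batch = await fetchPage(page);
    if (batch.length === 0) break; // assumed end-of-results signal
    all.push(...batch);
  }
  return all;
}
```

At roughly 100 notices per page, the ~3,400 active notices would take about 34 requests, so the default cap of 100 pages leaves headroom.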
Live count after first scrape: ~3,400 active notices. About 40% have CPV codes attached at source; the rest fall back to keyword-driven matching.
Tender pages were hanging up to 60s on SSR
A user-reported regression: tender detail pages on the
public web portal were taking up to 60 seconds to render.
The cause was a synchronous Kimi summary call inside
generateMetadata — Next.js can't stream the page until
metadata resolves, and we'd been awaiting up to the full
Kimi HTTP timeout (60s) before falling through.
Two-part fix:
- Cap the AI-summary SSR wait at 3 seconds. If the
- Remove a redundant inline Kimi call inside the page body
Page TTFB on /tenders/[id] dropped from a P95 of ~12s to
~600ms. The summary appears within ~2s after page load
when the cache is warm, ~6s on a cold-cache fetch.
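The 3-second cap can be approximated with a generic timeout wrapper around the summary promise. This is a minimal sketch, not the code that shipped: withTimeout and its parameter names are hypothetical, and how the client-side backfill is triggered is left out.

```typescript
// Resolves with the work's value if it finishes within `ms`,
// otherwise resolves with `fallback`. The underlying work is not cancelled.
async function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Clear the timer so a fast result does not leave it pending.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Inside generateMetadata, a call along the lines of withTimeout<string | null>(fetchSummary(id), 3000, null), where fetchSummary is a stand-in name, would let metadata resolve after at most 3 seconds, with a null result signalling that the summary should be fetched client-side.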
Hermes scheduling — doubled all intervals
The scheduling intervals across Hermes (listing-poll, detail-poll, relearn-cron) were halved on 2026-05-10 during a backfill push. We kept those intervals after the backfill ended, which was over-aggressive for steady-state operation. Doubled them today:
- Listing poll: 30 min → 60 min
- Detail poll: 5 min → 10 min
- Relearn cron: 1 hour → 2 hours
This roughly halves the Kimi API spend without affecting freshness in any material way (most portals don't update faster than hourly anyway).
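The "roughly halves" estimate follows from the run counts, assuming each run costs about the same in Kimi calls. A small sketch of the arithmetic, with the interval values taken from the list above and the key and function names chosen for illustration:

```typescript
const MINUTE_MS = 60 * 1000;
const DAY_MS = 24 * 60 * MINUTE_MS;

// Interval values are from the list above; the key names are illustrative.
const previousIntervals: Record<string, number> = {
  listingPoll: 30 * MINUTE_MS,
  detailPoll: 5 * MINUTE_MS,
  relearnCron: 60 * MINUTE_MS,
};

const currentIntervals: Record<string, number> = {
  listingPoll: 60 * MINUTE_MS,
  detailPoll: 10 * MINUTE_MS,
  relearnCron: 120 * MINUTE_MS,
};

function runsPerDay(intervalMs: number): number {
  return DAY_MS / intervalMs;
}

function totalRunsPerDay(intervals: Record<string, number>): number {
  return Object.keys(intervals).reduce(
    (sum, key) => sum + runsPerDay(intervals[key]),
    0,
  );
}

// previous: 48 + 288 + 24 = 360 runs/day
// current:  24 + 144 + 12 = 180 runs/day
```

The per-run cost is not uniform in practice (a detail poll and a relearn pass likely differ), so the halving is an approximation rather than an exact figure.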
What's next
- Watch the migration rollout — there's always one model-
- Functional report sweep tomorrow — there's a backlog of