2026-05-19 · DataMesh Consulting

19 May — Kimi model migrated to k2.6, Doffin extractor lands, tender-page hang fix

Two pieces of forced-march work and one nice piece of new coverage. Moonshot is deprecating the k2 series on 2026-05-25, so every Kimi call now points at kimi-k2.6 (with kimi-k2-thinking kept around just long enough to drain in-flight cached reasoning). A Doffin (Norway) Step-0 extractor went live via the public search API — no Playwright needed. And the public tender pages had been hanging up to 60 seconds waiting on a synchronous AI-summary call during SSR; that's now capped at 3 seconds with the summary backfilling client-side afterwards.

Kimi model migration: k2-thinking → k2.6

Moonshot's communications were specific: the k2 series (kimi-k2-base, kimi-k2-thinking, kimi-k2-chat) is discontinued on 2026-05-25. After that date all calls to the k2 family return errors. We've been on kimi-k2-thinking for the reasoning passes and an older model for embeddings since launch.

The migration is mostly a matter of flipping the model id in two places (kimi.service.ts for chat, embedding.service.ts for embeddings), but k2.6 differs from k2-thinking in both its response shape and its request shape:

  • It now returns reasoning_content as a separate field alongside content, where k2-thinking inlined chain-of-thought into content. The downstream parser had been written assuming content was the only field; it now reads reasoning_content || content and surfaces finish_reason so we can detect "stopped due to length" vs "complete answer."
  • The request shape requires a different response_format for structured JSON output. The match-evaluation prompts were silently returning empty strings under the new model until we updated the request body.
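
The parser change described above can be sketched roughly as follows. This is a minimal illustration, not the code in kimi.service.ts: the KimiChoice and ParsedKimiReply types and the parseKimiChoice function are hypothetical names, and treating finish_reason === "length" as the truncation signal is an assumption based on the usual OpenAI-compatible convention.

```typescript
// Hypothetical shape of one chat-completion choice. The field names
// reasoning_content, content and finish_reason come from the post;
// everything else here is an assumption for illustration.
interface KimiChoice {
  message: { content?: string | null; reasoning_content?: string | null };
  finish_reason?: string | null;
}

interface ParsedKimiReply {
  text: string;
  finishReason: string | null;
  truncated: boolean;
}

function parseKimiChoice(choice: KimiChoice): ParsedKimiReply {
  // Same precedence the post describes: reasoning_content || content.
  // Empty strings and nulls fall through to the next candidate.
  const text = choice.message.reasoning_content || choice.message.content || "";
  const finishReason = choice.finish_reason ?? null;
  // Assumption: "length" marks an answer cut off by the token limit.
  return { text, finishReason, truncated: finishReason === "length" };
}
```

Surfacing truncated as its own flag lets callers decide whether to retry with a larger token budget instead of parsing a half-finished JSON payload.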

A telemetry fix shipped at the same time: the LlmCallProvider union didn't include 'kimi-cli' as a value, so all CLI-fallback calls were being tagged as the HTTP provider in analytics.llm_call_logs. That broke the "Cached vs Billable" dashboard split from yesterday — CLI calls (which are plan-billed, not per-token) were being counted as billable. Fixed in KimiService and the TypeScript union now includes 'kimi-cli' so future provider additions get a compile error rather than silent mis-tagging.
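
One way to get the compile-time guarantee mentioned above is an exhaustive switch over the union. The sketch below is illustrative only: 'kimi-cli' is the one member named in this post, while "kimi-http", the BillingBucket type and the billingBucket function are hypothetical.

```typescript
// Only 'kimi-cli' is taken from the post; "kimi-http" is a placeholder
// name for the HTTP provider.
type LlmCallProvider = "kimi-http" | "kimi-cli";

type BillingBucket = "billable" | "plan-billed";

function billingBucket(provider: LlmCallProvider): BillingBucket {
  switch (provider) {
    case "kimi-http":
      return "billable"; // per-token billing
    case "kimi-cli":
      return "plan-billed"; // covered by the plan, not per-token
    default: {
      // If a new member is added to LlmCallProvider without a case above,
      // this assignment stops compiling.
      const unhandled: never = provider;
      throw new Error(`Unhandled provider: ${String(unhandled)}`);
    }
  }
}
```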

Doffin (Norway) — Step-0 extractor via public API

Doffin (Database for offentlige innkjøp) is the Norwegian national procurement portal. It exposes a public search API that returns notices in OCDS-adjacent JSON. No Playwright needed — axios.get(searchEndpoint) returns paginated results we can iterate directly.

Coverage:

  • Listing — paged via ?page=N, ~100 notices per page.
  • Detail enrichment — each notice has a documents array we follow for the full description, CPV codes, contact info, and value.
  • Country / NUTS — Doffin uses Norwegian NUTS (NO-prefixed). Mapped to the canonical 2-char NO for countryCode and preserved as the original code in nutsCodes[].
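
A rough sketch of the paging loop and the NUTS mapping follows. It does not reflect Doffin's actual response schema: DoffinNotice, FetchPage, collectNotices and normalizeRegion are hypothetical names, the page fetcher is injected so the HTTP client (axios in our case) stays out of the example, and both "pages start at 1" and "an empty page means the end" are assumptions.

```typescript
// Hypothetical, heavily simplified notice shape.
interface DoffinNotice {
  id: string;
  nutsCode?: string;
}

// Injected fetcher, e.g. a thin wrapper around a GET with ?page=N.
type FetchPage = (page: number) => Promise<DoffinNotice[]>;

async function collectNotices(fetchPage: FetchPage, maxPages = 100): Promise<DoffinNotice[]> {
  const all: DoffinNotice[] = [];
  // Assumption: paging is 1-based and an empty page marks the end.
  for (let page = 1; page <= maxPages; page++) {
    const batch = await fetchPage(page);
    if (batch.length === 0) break;
    all.push(...batch);
  }
  return all;
}

function normalizeRegion(nutsCode: string | undefined): {
  countryCode: string | null;
  nutsCodes: string[];
} {
  if (!nutsCode) return { countryCode: null, nutsCodes: [] };
  // NO-prefixed codes map to the two-character country code NO;
  // the original code is kept untouched in nutsCodes.
  const isNorwegian = nutsCode.trim().toUpperCase().startsWith("NO");
  return { countryCode: isNorwegian ? "NO" : null, nutsCodes: [nutsCode] };
}
```

The maxPages cap is a guard against an endpoint that never returns an empty page; at roughly 100 notices per page, the default comfortably covers the current notice count.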

Live count after first scrape: ~3,400 active notices. About 40% have CPV codes attached at source; the rest fall back to keyword-driven matching.

Tender pages were hanging up to 60s on SSR

A user-reported regression: tender detail pages on the public web portal were taking up to 60 seconds to render. The cause was a synchronous Kimi summary call inside generateMetadata — Next.js can't stream the page until metadata resolves, and we'd been awaiting up to the full Kimi HTTP timeout (60s) before falling through.

Two-part fix:

  • Cap the AI-summary SSR wait at 3 seconds. If the Kimi call hasn't returned by then, render the page without the summary; the client-side hook backfills it on mount.
  • Remove a redundant inline Kimi call inside the page body that was firing in addition to the SSR call. One call per page-view, not two.
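
The 3-second cap in the first item can be expressed as a small timeout wrapper. This is a sketch under stated assumptions rather than the shipped code: withTimeout and summaryForSsr are hypothetical helpers, and the sketch only stops waiting for the call; it does not cancel the underlying request.

```typescript
// Resolve with `fallback` if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Hypothetical usage on the server-side render path: a null result means
// "render without the summary and let the client backfill it".
function summaryForSsr(fetchSummary: () => Promise<string>): Promise<string | null> {
  return withTimeout<string | null>(
    fetchSummary().catch(() => null), // a failed call also degrades to "no summary"
    3_000,
    null,
  );
}
```

Returning null instead of throwing keeps the render path simple: the page template only has to handle "summary present" and "summary absent".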

Page TTFB on /tenders/[id] dropped from a P95 of ~12s to ~600ms. The summary appears within ~2s after page load when the cache is warm, ~6s on a cold-cache fetch.

Hermes scheduling — doubled all intervals

The scheduling intervals across Hermes (listing-poll, detail-poll, relearn-cron) were halved on 2026-05-10 during a backfill push. We kept those intervals after the backfill ended, which was over-aggressive for steady-state operation. Doubled them today:

  • Listing poll: 30 min → 60 min
  • Detail poll: 5 min → 10 min
  • Relearn cron: 1 hour → 2 hours

This roughly halves the Kimi API spend without affecting freshness in any material way (most portals don't update faster than hourly anyway).
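
The arithmetic behind the "roughly halves" estimate, as a quick sketch. The interval values are the ones listed above; the job keys and the runsPerDay helper are illustrative, and the estimate assumes Kimi spend scales with the number of scheduled runs.

```typescript
const MINUTES_PER_DAY = 24 * 60;

// Interval values from the list above; key names are illustrative.
const intervalsMinutes = {
  listingPoll: { before: 30, after: 60 },
  detailPoll: { before: 5, after: 10 },
  relearnCron: { before: 60, after: 120 },
} as const;

function runsPerDay(intervalMinutes: number): number {
  return MINUTES_PER_DAY / intervalMinutes;
}

for (const [job, { before, after }] of Object.entries(intervalsMinutes)) {
  console.log(`${job}: ${runsPerDay(before)} -> ${runsPerDay(after)} runs/day`);
}
// listingPoll: 48 -> 24 runs/day
// detailPoll: 288 -> 144 runs/day
// relearnCron: 24 -> 12 runs/day
```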

What's next

  • Watch the migration rollout: there's always one model-behaviour difference you only notice in prod (different hallucination patterns, different JSON-faithfulness on the structured-output path).
  • Functional report sweep tomorrow: there's a backlog of small UI/data-quality items from the last QA pass.

Methodology: drawn from the week ending 2026-05-19 tender corpus. Tender data sourced from public procurement portals worldwide; see our methodology for the extraction pipeline.