<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Supra-50m on DRM HSE</title><link>https://www.drmhse.com/tags/supra-50m/</link><description>Recent content in Supra-50m on DRM HSE</description><generator>Hugo</generator><language>en</language><lastBuildDate>Fri, 26 Jun 2026 22:13:09 +0300</lastBuildDate><atom:link href="https://www.drmhse.com/tags/supra-50m/index.xml" rel="self" type="application/rss+xml"/><item><title>Building a Local OCR API for Kenyan ID Extraction</title><link>https://www.drmhse.com/posts/the-ocr-engine-is-not-the-document-system/</link><pubDate>Fri, 26 Jun 2026 13:00:00 +0300</pubDate><guid>https://www.drmhse.com/posts/the-ocr-engine-is-not-the-document-system/</guid><description>&lt;p>The first temptation with ID extraction is to send the image to the strongest vision model available and move on.&lt;/p>
&lt;p>That works for a demo. It is less comfortable when the image is a national ID, the output becomes part of a customer record, and someone asks where the document was processed.&lt;/p>
&lt;p>This setup keeps the OCR path local. It uses PP-OCRv6 for text detection and recognition, then an optional small understanding model with one job: turning Kenyan ID-style OCR lines into JSON fields.&lt;/p></description></item></channel></rss>