Flutter OCR Tutorial: How to Create a KTP & NPWP Scanner with Google ML Kit

This Flutter OCR tutorial shows how to build a scanner for Indonesia’s national ID card (KTP) and taxpayer identification card (NPWP) using Google ML Kit. The need to extract information from physical documents quickly and accurately keeps growing. With the Optical Character Recognition (OCR) capability of Google ML Kit, we can integrate document scanning directly into a Flutter application.

Many modern applications use OCR in their e-KYC (Electronic Know Your Customer) processes. This includes scanning Indonesian identity documents such as the KTP (Kartu Tanda Penduduk) and the NPWP (Nomor Pokok Wajib Pajak).

The good news? This technology can now run directly on the user’s device, with no internet connection required. Because the image never leaves the device, processing is faster and sensitive data stays local, which makes this approach a good fit for privacy-focused apps.

In this Flutter OCR tutorial, we will discuss how to create a simple Flutter application that can scan and extract data from KTP and NPWP offline using Google ML Kit. Check out the explanation below!

Flutter OCR Tutorial: Creating a KTP & NPWP Scanner with Google ML Kit

Step 1: Set Up Flutter Project


Start by creating a new Flutter project:

flutter create ktp_npwp_scanner

Then, add the following dependencies to pubspec.yaml:

dependencies:
  flutter:
    sdk: flutter
  google_mlkit_text_recognition: ^0.15.0
  image_picker: ^1.1.2
  image: ^4.5.4 #optional for image preprocessing

Run flutter pub get to install the packages.

Step 2: Build a Simple UI for Image Selection

Create a user interface that allows users to pick an image from the camera or gallery, and display the OCR results.

import 'package:flutter/material.dart';
import 'package:image_picker/image_picker.dart';
import 'dart:io';

class MyHomePage extends StatefulWidget {
  const MyHomePage({super.key});

  @override
  State<MyHomePage> createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  File? _selectedImage;
  String _result = "";

  final ImagePicker _picker = ImagePicker();

  Future<void> _pickImage(ImageSource source) async {
    final XFile? pickedFile = await _picker.pickImage(source: source);

    if (pickedFile != null) {
      setState(() {
        _selectedImage = File(pickedFile.path);
      });
      final result = await recognizeText(_selectedImage!);
      // The widget may have been removed from the tree while OCR was running.
      if (!mounted) return;
      setState(() {
        _result = result;
      });
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('KTP & NPWP Scanner')),
      body: Column(
        crossAxisAlignment: CrossAxisAlignment.stretch,
        children: [
          if (_selectedImage != null)
            Image.file(_selectedImage!, height: 300, fit: BoxFit.contain),
          const SizedBox(height: 20),
          Row(
            mainAxisAlignment: MainAxisAlignment.spaceEvenly,
            children: [
              ElevatedButton.icon(
                icon: const Icon(Icons.camera_alt),
                label: const Text('Camera'),
                onPressed: () => _pickImage(ImageSource.camera),
              ),
              ElevatedButton.icon(
                icon: const Icon(Icons.image),
                label: const Text('Gallery'),
                onPressed: () => _pickImage(ImageSource.gallery),
              ),
            ],
          ),
          const SizedBox(height: 20),
          const Text('Data:',
              style: TextStyle(fontSize: 18, fontWeight: FontWeight.bold)),
          const SizedBox(height: 10),
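          // Raw OCR output can be long, so let it scroll instead of overflowing.
          Expanded(child: SingleChildScrollView(child: Text(_result))),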
        ],
      ),
    );
  }
}

Step 3: Add OCR Logic Using Google ML Kit

Here’s how to recognize text from the selected image using Google ML Kit. 

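import 'dart:io';
import 'package:google_mlkit_text_recognition/google_mlkit_text_recognition.dart';

Future<String> recognizeText(File imageFile) async {
  final inputImage = InputImage.fromFile(imageFile);
  // The Latin script model covers the text printed on KTP and NPWP cards.
  final textRecognizer = TextRecognizer(script: TextRecognitionScript.latin);
  try {
    final RecognizedText recognizedText =
        await textRecognizer.processImage(inputImage);
    return recognizedText.text;
  } finally {
    // Release the recognizer's native resources once OCR is done.
    await textRecognizer.close();
  }
}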
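
Pattern matching on digits is one way to read the result. Another is to look for a printed label such as "NIK" and take the text next to it. The following is a minimal sketch of that idea, assuming the lines of the recognized text are separated by newline characters and that the value sits either on the same line as the label or on the next one. `findValueAfterLabel` is a hypothetical helper written for this example, not part of ML Kit, and the sample values in the usage note are made up.

```dart
/// Hypothetical helper: returns the text printed after [label], looking first
/// on the same line and then on the following line. Returns null if the label
/// is not found or nothing follows it. This is a simple substring match, so
/// pick labels that are unlikely to appear inside other words.
String? findValueAfterLabel(String ocrText, String label) {
  final lines = ocrText
      .split('\n')
      .map((line) => line.trim())
      .where((line) => line.isNotEmpty)
      .toList();
  final labelPattern = RegExp(RegExp.escape(label), caseSensitive: false);
  // Matches optional whitespace and a colon at the start of a value.
  final leadingSeparator = RegExp(r'^\s*:?\s*');

  for (var i = 0; i < lines.length; i++) {
    final match = labelPattern.firstMatch(lines[i]);
    if (match == null) continue;

    // Text on the same line, after the label and an optional colon.
    final rest =
        lines[i].substring(match.end).replaceFirst(leadingSeparator, '');
    if (rest.isNotEmpty) return rest;

    // OCR often puts the value on the line after the label.
    if (i + 1 < lines.length) {
      return lines[i + 1].replaceFirst(leadingSeparator, '');
    }
    return null;
  }
  return null;
}
```

For example, given the text 'NIK : 3204123456780001' followed by a line 'Nama' and a line ': BUDI SANTOSO', calling findValueAfterLabel(text, 'NIK') returns the 16-digit string and findValueAfterLabel(text, 'nama') returns 'BUDI SANTOSO'.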

Step 4: Extract NIK and NPWP from OCR Text


Now let’s extract only the important part: the NIK, which is 16 digits long, or the NPWP, which is 15 digits in its legacy format and 16 digits in the newer format. Here’s a basic extraction function:

  String? extractNIK(String ocrText) {
    // Step 1: Remove non-digit characters but keep spacing between words
    final cleaned = ocrText.replaceAll(RegExp(r'[^0-9\s]'), '');

    // Step 2: Remove extra spaces and split into words
    final words = cleaned.split(RegExp(r'\s+'));

    // Step 3: Search for 15–16 digit numbers
    for (final word in words) {
      if (RegExp(r'^\d{15,16}$').hasMatch(word)) {
        return word;
      }
    }

    return null; // No NIK found
  }

What This Function Does:

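  • Cleans OCR text by stripping away everything except digits and whitespace, so a formatted NPWP such as 12.345.678.9-012.345 collapses into one run of digits.
  • Splits the text on whitespace to isolate individual number sequences.
  • Returns the first sequence with exactly 15 or 16 digits (the lengths used by NPWP and NIK), or null if there is none.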
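
Two limitations of this function are worth noting. It drops any letter the recognizer produced in place of a digit (for example `O` instead of `0`), which shortens the number so it no longer matches, and it does not say whether the result is a NIK or an NPWP. The sketch below shows one possible way to handle both. The look-alike table and the names `lookAlikes`, `IdKind`, `IdMatch`, `repairNumberToken`, and `classifyIdNumber` are assumptions made for this illustration, and the length rule cannot tell a NIK from a 16-digit NPWP.

```dart
/// Guessed document type, based only on the number of digits.
enum IdKind { nik, npwp }

/// Hypothetical result type for this example.
class IdMatch {
  const IdMatch(this.number, this.kind);
  final String number;
  final IdKind kind;
}

/// Letters that OCR commonly confuses with digits. Illustrative, not exhaustive.
const Map<String, String> lookAlikes = {
  'O': '0',
  'o': '0',
  'I': '1',
  'l': '1',
  'Z': '2',
  'S': '5',
  'B': '8',
};

/// Replaces look-alike letters with digits, but only in tokens that are
/// already mostly digits, so ordinary words are left alone.
String? repairNumberToken(String token) {
  // Drop separators such as the dots and dash in a formatted NPWP.
  final compact = token.replaceAll(RegExp(r'[.\-]'), '');
  if (compact.isEmpty) return null;

  final digitCount = RegExp(r'\d').allMatches(compact).length;
  // Require at least 80% real digits before guessing at the rest.
  if (digitCount * 5 < compact.length * 4) return null;

  final repaired = compact.split('').map((c) => lookAlikes[c] ?? c).join();
  return RegExp(r'^\d+$').hasMatch(repaired) ? repaired : null;
}

/// Returns the first 15- or 16-digit number in [ocrText] with its guessed
/// kind, or null if no such number is found.
IdMatch? classifyIdNumber(String ocrText) {
  for (final token in ocrText.split(RegExp(r'\s+'))) {
    final number = repairNumberToken(token);
    if (number == null) continue;
    if (number.length == 16) return IdMatch(number, IdKind.nik);
    if (number.length == 15) return IdMatch(number, IdKind.npwp);
  }
  return null;
}
```

The 80% threshold is a deliberate trade-off: it allows a few misread characters in a long number while keeping words such as "PROVINSI" from being turned into digits.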

Step 5: View the Results

Here’s what the app shows after scanning a KTP or NPWP image:

  • Left Section: A screenshot of the app displaying raw text extracted directly from OCR.
  • Right Section: A screenshot of the app after applying data extraction, showing only the relevant NIK and NPWP values.
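
Before presenting a 16-digit match as a NIK, it can be worth checking that the number is at least plausible. A NIK encodes the holder's date of birth in digits 7 to 12 as DDMMYY, with 40 added to the day for women. The sketch below uses that layout as a sanity check. `NikInfo` and `parseNik` are hypothetical names introduced for this example, and passing the check does not prove that a NIK is genuine or registered.

```dart
/// Fields decoded from a NIK. A hypothetical container for this example.
class NikInfo {
  const NikInfo({
    required this.regionCode,
    required this.day,
    required this.month,
    required this.twoDigitYear,
    required this.isFemale,
  });

  final String regionCode; // first six digits: province, regency/city, district
  final int day;
  final int month;
  final int twoDigitYear; // the century is not encoded in the NIK
  final bool isFemale;
}

/// Returns the decoded fields, or null if [nik] is not exactly 16 digits or
/// its date-of-birth field is out of range.
NikInfo? parseNik(String nik) {
  if (!RegExp(r'^\d{16}$').hasMatch(nik)) return null;

  var day = int.parse(nik.substring(6, 8));
  final month = int.parse(nik.substring(8, 10));
  final year = int.parse(nik.substring(10, 12));

  // Women have 40 added to the day of birth.
  final isFemale = day > 40;
  if (isFemale) day -= 40;

  // Coarse range check only; it does not validate days per month.
  if (day < 1 || day > 31 || month < 1 || month > 12) return null;

  return NikInfo(
    regionCode: nik.substring(0, 6),
    day: day,
    month: month,
    twoDigitYear: year,
    isFemale: isFemale,
  );
}
```

A null result is a hint that the recognizer misread a digit or that the 16-digit number is something other than a NIK, so the app could ask the user to rescan instead of showing the value.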


Bonus Step: Preprocess the Image for Better OCR Accuracy

OCR performance is heavily influenced by image quality. Poor lighting, blur, or low contrast can cause the text recognizer to miss or misread characters. To improve accuracy, it’s a good idea to preprocess the image before passing it to the OCR engine. Here’s a simple preprocessing function:

import 'dart:io';
import 'package:image/image.dart' as img;

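Future<File> preprocessImage(File file) async {
  final bytes = await file.readAsBytes();
  final img.Image? image = img.decodeImage(bytes);
  // Fall back to the original file if the format cannot be decoded.
  if (image == null) return file;

  final gray = img.grayscale(image); // drop color information
  // Values above 100 increase contrast; 100 leaves the image unchanged.
  final contrast = img.contrast(gray, contrast: 175);
  final denoised = img.gaussianBlur(contrast, radius: 1); // light smoothing

  // Save the result next to the original without blocking on synchronous I/O.
  final output = img.encodeJpg(denoised);
  final path = '${file.parent.path}/processed_${file.uri.pathSegments.last}';
  return File(path).writeAsBytes(output);
}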

What This Function Does:

  • Converts the image to grayscale to reduce color noise.
  • Increases contrast to make the text stand out more.
  • Applies a Gaussian blur to smooth out small imperfections and reduce noise.
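
To make the first two steps concrete, here is a small, dependency-free sketch of what they do to a single pixel. It uses the Rec. 601 luminance weights and a linear stretch around mid-gray. Both are common textbook formulations and an assumption here; the arithmetic inside the `image` package may differ. The function names `toGray` and `stretchContrast` are made up for this illustration.

```dart
/// Converts an RGB pixel (0-255 per channel) to a gray level using
/// Rec. 601 luminance weights.
int toGray(int r, int g, int b) {
  return (0.299 * r + 0.587 * g + 0.114 * b).round().clamp(0, 255).toInt();
}

/// Stretches [value] away from mid-gray (128) by [factor].
/// A factor of 1.0 leaves the value unchanged; results are clamped to 0-255.
int stretchContrast(int value, double factor) {
  return ((value - 128) * factor + 128).round().clamp(0, 255).toInt();
}
```

With a factor of 1.75, a dark stroke at gray level 60 moves to 9 and a light background at 180 moves to 219, so the gap between text and background widens from 120 to 210 levels.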

Full Source Code

Below is the complete code for the scanner screen. It covers the user interface, OCR with Google ML Kit, NIK/NPWP extraction, and image preprocessing for better OCR accuracy. The listing defines the MyHomePage widget only, so set it as the home of the MaterialApp in your main.dart.

import 'dart:io';
import 'package:flutter/material.dart';
import 'package:google_mlkit_text_recognition/google_mlkit_text_recognition.dart';
import 'package:image/image.dart' as img;
import 'package:image_picker/image_picker.dart';

class MyHomePage extends StatefulWidget {
  const MyHomePage({super.key});

  @override
  MyHomePageState createState() => MyHomePageState();
}

class MyHomePageState extends State<MyHomePage> {
  File? _selectedImage;
  String _result = "";

  final ImagePicker _picker = ImagePicker();

  Future<void> _pickImage(ImageSource source) async {
    final XFile? pickedFile = await _picker.pickImage(source: source);

    if (pickedFile != null) {
      setState(() {
        _selectedImage = File(pickedFile.path);
      });
      final result = await recognizeText(_selectedImage!);
      final nik = extractNIK(result);
      // The widget may have been removed from the tree while OCR was running.
      if (!mounted) return;
      setState(() {
        _result = nik ?? "NIK not found";
      });
    }
  }

  String? extractNIK(String ocrText) {
    final cleaned = ocrText.replaceAll(RegExp(r'[^0-9\s]'), '');
    final words = cleaned.split(RegExp(r'\s+'));

    for (final word in words) {
      if (RegExp(r'^\d{15,16}$').hasMatch(word)) {
        return word;
      }
    }

    return null;
  }

  Future<String> recognizeText(File imageFile) async {
    final inputImage = InputImage.fromFile(await preprocessImage(imageFile));
    final textRecognizer = TextRecognizer(script: TextRecognitionScript.latin);
    try {
      final RecognizedText recognizedText =
          await textRecognizer.processImage(inputImage);
      return recognizedText.text;
    } finally {
      // Release the recognizer's native resources once OCR is done.
      await textRecognizer.close();
    }
  }

  Future<File> preprocessImage(File file) async {
    final bytes = await file.readAsBytes();
    img.Image? image = img.decodeImage(bytes);
    if (image == null) return file;

    final gray = img.grayscale(image);
    final contrast = img.contrast(gray, contrast: 175);
    final denoised = img.gaussianBlur(contrast, radius: 1);

    final output = img.encodeJpg(denoised);
    final path = '${file.parent.path}/processed_${file.uri.pathSegments.last}';
    // Write asynchronously so the file I/O does not block the UI thread.
    return File(path).writeAsBytes(output);
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('KTP & NPWP Scanner')),
      body: SingleChildScrollView(
        child: Column(
          crossAxisAlignment: CrossAxisAlignment.center,
          children: [
            if (_selectedImage != null)
              Image.file(_selectedImage!, height: 300, fit: BoxFit.contain),
            const SizedBox(height: 20),
            Row(
              mainAxisAlignment: MainAxisAlignment.spaceEvenly,
              children: [
                ElevatedButton.icon(
                  icon: const Icon(Icons.camera_alt),
                  label: const Text('Camera'),
                  onPressed: () => _pickImage(ImageSource.camera),
                ),
                ElevatedButton.icon(
                  icon: const Icon(Icons.image),
                  label: const Text('Gallery'),
                  onPressed: () => _pickImage(ImageSource.gallery),
                ),
              ],
            ),
            const SizedBox(height: 20),
            const Text('Data:',
                style: TextStyle(fontSize: 18, fontWeight: FontWeight.bold)),
            const SizedBox(height: 10),
            Text(_result),
          ],
        ),
      ),
    );
  }
}


Conclusion

With just a few key tools, we can now build an offline OCR feature in Flutter that reads KTP and NPWP documents directly from images. By combining Google ML Kit with simple image preprocessing, our app can extract critical data like NIK and NPWP more accurately without needing internet access.

By following this Flutter OCR tutorial, you now understand the basic steps for building an OCR feature in a Flutter application that scans and extracts data from documents like KTP and NPWP. OCR lets you build features such as e-KYC onboarding, which are increasingly in demand across various industries.

If you need Flutter app development services that integrate OCR features, LOGIQUE is ready to assist. Our team has experience in developing applications with the latest technologies, including OCR and Flutter-based solutions. Feel free to contact us and start building your dream Flutter application with advanced features tailored to your business needs.