# Optical Character Recognition (OCR)

Extract text from images

Optical Character Recognition (OCR) is a technology that converts different types of documents, such as scanned paper documents, PDF files, or images, into editable and searchable data.

### OCR Strategy <a href="#ocr-strategy" id="ocr-strategy"></a>

The Fractur AI API always returns OCR results. You can choose how OCR is applied with the `ocr_strategy` parameter.

Two strategies are available:

* `All` (Default): Processes all pages with our OCR model.
* `Auto`: Intelligently applies OCR only to pages with missing or low-quality text. When a text layer is present, the bounding boxes from that layer are used instead of running OCR.


```python
from chunkr_ai import Chunkr
from chunkr_ai.models import Configuration, OcrStrategy

# Create the client
chunkr = Chunkr()

# Upload a file and run OCR on every page
chunkr.upload("path/to/file", Configuration(
    ocr_strategy=OcrStrategy.ALL  # can also be OcrStrategy.AUTO
))
```

The `Auto` strategy provides the best balance between accuracy and performance for most use cases. Use the `All` strategy when you need to ensure consistent text extraction across all pages or when you suspect the existing text layer might be unreliable.
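To picture the difference between the two strategies, here is a minimal sketch of a per-page decision. It is not how the API is implemented: the `Page` record, the `text_layer_is_usable` heuristic, and its thresholds are all assumptions made for illustration, and none of these names exist in the `chunkr_ai` SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical page record (not part of the chunkr_ai SDK)."""
    number: int
    text_layer: str  # text embedded in the file, "" if there is none


def text_layer_is_usable(text: str, min_chars: int = 20,
                         min_clean_ratio: float = 0.8) -> bool:
    """Assumed heuristic: enough characters, and most of them readable."""
    stripped = text.strip()
    if len(stripped) < min_chars:
        return False  # missing or nearly empty text layer
    # Count characters that are printable or whitespace and are not the
    # Unicode replacement character that often marks a broken text layer.
    clean = sum(
        1 for ch in stripped
        if (ch.isprintable() or ch.isspace()) and ch != "\ufffd"
    )
    return clean / len(stripped) >= min_clean_ratio


def pages_needing_ocr(pages: List[Page], strategy: str) -> List[int]:
    """Return the page numbers that would be sent through OCR."""
    if strategy == "All":
        return [p.number for p in pages]
    if strategy == "Auto":
        return [p.number for p in pages
                if not text_layer_is_usable(p.text_layer)]
    raise ValueError(f"unknown strategy: {strategy!r}")


pages = [
    Page(1, "This page has a clean embedded text layer."),
    Page(2, ""),              # scanned page, no text layer
    Page(3, "\ufffd" * 30),   # garbled text layer
]
print(pages_needing_ocr(pages, "All"))
print(pages_needing_ocr(pages, "Auto"))
```

In this sketch, `All` selects every page, while `Auto` skips page 1 because its text layer passes the assumed quality check.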

### OCR + Layout Analysis <a href="#ocr-layout-analysis" id="ocr-layout-analysis"></a>

OCR and Layout Analysis are a powerful combination. Together they provide word-level bounding boxes and text along with an understanding of the document's layout.

You can use this to build experiences such as:

* Highlighting exact numbers in a table
* Highlighting text in images
* Embedding the text from pictures for semantic search
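As a rough illustration of the first item, the sketch below finds the boxes to highlight for an exact value, given word-level OCR results. The `Word` shape and its field names are assumptions made for this example, not the API's response schema, so adapt them to the structure you actually receive.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """Hypothetical word-level OCR result; field names are assumptions."""
    text: str
    left: float
    top: float
    width: float
    height: float


# Matches simple numeric tokens such as "42", "1,250.00", "$19.99" or "8%".
NUMBER = re.compile(r"^[-+]?[$€£]?\d[\d,]*(\.\d+)?%?$")


def find_highlights(words: List[Word],
                    query: str) -> List[Tuple[float, float, float, float]]:
    """Return (left, top, right, bottom) boxes for words equal to the query."""
    return [
        (w.left, w.top, w.left + w.width, w.top + w.height)
        for w in words
        if w.text == query
    ]


def numeric_words(words: List[Word]) -> List[Word]:
    """Keep only the words that look like numbers."""
    return [w for w in words if NUMBER.match(w.text)]


words = [
    Word("Total", 10, 100, 40, 12),
    Word("1,250.00", 200, 100, 60, 12),
    Word("Tax", 10, 120, 30, 12),
    Word("8%", 200, 120, 20, 12),
]
print(find_highlights(words, "1,250.00"))
print([w.text for w in numeric_words(words)])
```

The returned rectangles can be drawn over the page image by a viewer. Matching here is exact string equality, so a real application would likely normalize formatting before comparing.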

### Other common use cases <a href="#other-common-use-cases" id="other-common-use-cases"></a>

* Digitizing old books and documents
* Processing invoices and receipts
* Automating form data entry
* Reading license plates
* Converting handwritten notes to digital text
* Extracting text from screenshots and images

