# Segment Processing

Post-processing of segments

Fractur post-processes segments after they have been extracted. You can rely on our defaults or configure how each segment type is processed.

#### Processing Methods <a href="#processing-methods" id="processing-methods"></a>

* **Vision Language Models (VLM)**: Leverage AI models to generate HTML/Markdown content and run custom prompts
* **Heuristic-based Processing**: Apply rule-based algorithms for consistent HTML/Markdown generation

#### Additional Features <a href="#additional-features" id="additional-features"></a>

* **Cropping**: Return the cropped image of each segment

These processing options allow you to build highly specific pipelines. Our default processing works for most documents and RAG use cases.

### Defaults <a href="#defaults" id="defaults"></a>

By default, Fractur applies the following processing strategies for each segment type. You can override these defaults by specifying custom configuration in your `SegmentProcessing` settings. HTML and Markdown are always returned.

Tables and Formulas:

```python
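from chunkr_ai.models import (
    Configuration,
    CroppingStrategy,
    GenerationConfig,
    GenerationStrategy,
    SegmentProcessing,
)

# By default, Table and Formula segments are processed using an LLM.
# Formulas are returned as LaTeX.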

default_llm_config = GenerationConfig(
    html=GenerationStrategy.LLM,
    markdown=GenerationStrategy.LLM,
    crop_image=CroppingStrategy.AUTO
)

default_config = Configuration(
    segment_processing=SegmentProcessing(
        Table=default_llm_config,
        Formula=default_llm_config,
    )
)
```
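To make the override behavior easier to picture, here is a minimal sketch that models it with plain dataclasses. It does not import `chunkr_ai`: `Strategy`, `SegmentSettings`, `DEFAULTS`, and `resolve` are hypothetical stand-ins written for illustration only. The sketch assumes that a custom entry replaces the whole default for its segment type; how the library actually merges individual fields is not stated on this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Strategy(Enum):
    """Hypothetical stand-in for the library's generation strategies."""
    AUTO = "auto"  # heuristic-based processing
    LLM = "llm"    # model-based processing


@dataclass(frozen=True)
class SegmentSettings:
    """Hypothetical stand-in for a per-segment configuration."""
    html: Strategy = Strategy.AUTO
    markdown: Strategy = Strategy.AUTO
    llm_prompt: Optional[str] = None


# Defaults stated on this page: Table and Formula use an LLM for both formats.
DEFAULTS: Dict[str, SegmentSettings] = {
    "Table": SegmentSettings(html=Strategy.LLM, markdown=Strategy.LLM),
    "Formula": SegmentSettings(html=Strategy.LLM, markdown=Strategy.LLM),
}


def resolve(overrides: Dict[str, SegmentSettings]) -> Dict[str, SegmentSettings]:
    """Return effective settings: an override replaces the default for its
    segment type (an assumption); other segment types keep their defaults."""
    effective = dict(DEFAULTS)
    effective.update(overrides)
    return effective


# Override only Table; Formula keeps its default.
effective = resolve(
    {"Table": SegmentSettings(html=Strategy.AUTO, markdown=Strategy.LLM)}
)
print(effective["Table"].html.name)    # AUTO
print(effective["Formula"].html.name)  # LLM
```

Only the `Table` entry changes here; any segment type you leave out of the overrides keeps its default.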

### Example <a href="#example" id="example"></a>

Here is a quick example of how to use Fractur to process a document with different segment processing configurations. This configuration will:

* Summarize the key trends of all `Table` segments
* Crop all `SectionHeader` segments to the bounding box
* Generate HTML using heuristics and Markdown using a VLM for all `Text` segments


```python
from chunkr_ai import Chunkr
from chunkr_ai.models import (
    Configuration, 
    CroppingStrategy, 
    GenerationConfig, 
    GenerationStrategy, 
    SegmentProcessing
)

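# Create the client
chunkr = Chunkr()

chunkr.upload("path/to/file", Configuration(
    segment_processing=SegmentProcessing(
        # Run a custom prompt on every Table segment
        Table=GenerationConfig(
            llm="Summarize the key trends in this table"
        ),
        # Crop every SectionHeader segment to its bounding box
        SectionHeader=GenerationConfig(
            crop_image=CroppingStrategy.ALL
        ),
        # Text: HTML from heuristics (AUTO), Markdown from a model (LLM)
        Text=GenerationConfig(
            html=GenerationStrategy.AUTO,
            markdown=GenerationStrategy.LLM
        ),
    ),
))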
```


