camt-parser

A high-performance C++ library for parsing SEPA CAMT XML banking formats.

Supports:

camt.052 Bank-to-Customer Account Report
camt.053 Bank-to-Customer Statement
camt.054 Bank-to-Customer Debit/Credit Notification

The library provides a structured, domain-friendly data model and an optional CSV export layer optimized for accounting, reconciliation, and audit workloads.

Features

Fully supports camt.052 / .053 / .054
Extracts complete transaction details (counterparty, remittance, references)
Unicode-aware free-text normalization (optional USE_UTF8PROC)
Bank transaction code → GVC mapping included
Optional canonical transaction hash for duplicate detection
CSV export designed for accounting systems
Zero locale-dependencies (monetary values kept as text until formatted)
MIT License (commercial-friendly)

Installation / Build

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build

Optional dependencies:

pugixml (MIT) for XML parsing
utf8proc (MIT) for Unicode normalization (if USE_UTF8PROC is defined)

High-Level Data Model

Document
 └─ Statement(s)
     └─ Entry(s)
         └─ EntryTransaction(s)

`Document`

Represents a full CAMT file (052 / 053 / 054).

`Statement`

Contains metadata (account, creation time, balances) and Entry records.

`Entry`

Represents a booked transaction line (booking date, value date, amount).

May contain multiple EntryTransaction elements.

`EntryTransaction`

Contains full payment details:

Counterparty (IBAN/BIC/Name)
Remittance text (structured + unstructured)
ISO 20022 BankTransactionCode (Domain / Family / SubFamily)
Proprietary bank code
Charges, fees, reversal indicator, etc.

Parsing API

#include <camt_parser_pugi.hpp>

camt::Parser p;
camt::Document doc;
std::string err;

if (!p.parse_file("statement.xml", doc, &err)) {
    std::cerr << "Parse error: " << err << "\n";
}

CSV Export

#include <camt_csv.hpp>

std::ofstream out("export.csv");

camt::ExportOptions opt;
opt.write_utf8_bom = true;          // Excel compatible
opt.remittance_separator = " | ";   // readable multi-part purpose text
opt.signed_amount = true;
opt.credit_as_bool = true;

camt::export_entries_csv(doc, &out, nullptr, opt);

Quick Start Example

This example demonstrates parsing a CAMT file and exporting transactions to CSV:

#include <camt_parser_pugi.hpp>
#include <camt_csv.hpp>
#include <fstream>
#include <iostream>

int main() {
    camt::Parser parser;
    camt::Document doc;
    std::string err;

    // Parse CAMT XML from file
    if (!parser.parse_file("statement.camt053.xml", doc, &err)) {
        std::cerr << "Parse error: " << err << "\n";
        return 1;
    }

    // Export to CSV
    camt::ExportData data;
    camt::ExportOptions opt;
    opt.include_header = true;

    std::ofstream out("export.csv", std::ios::binary);
    camt::export_entries_csv(doc, &out, &data, opt);

    std::cout << "Export completed: export.csv\n";
    return 0;
}

Field Semantics: `first` vs `second`

Each exported field consists of a pair:

first → Human-readable formatted value
second → Canonical normalized value used for sorting, hashing, and deduplication

second is never empty: if not present, it is generated via normalization rules.

Supported Input Methods

Input type	Function
File path	`parse_file(const std::string& path, ...)`
Input stream	`parse_file(std::istream&, ...)`
Memory buffer	`parse_string(const char* buffer, ...)`

Complete Demonstration

A fully working test and demonstration is available here: examples/main.cpp

ExportOptions

Option	Default	Description
`delimiter`	`';'`	CSV separator (`;`, `,`, or `\t`)
`include_header`	`true`	Adds header row
`write_utf8_bom`	`false`	Necessary for Excel UTF-8 import
`signed_amount`	`true`	Credit positive / Debit negative
`credit_as_bool`	`true`	`IsCredit = 1/0` instead of `CRDT/DBIT`
`remittance_separator`	`""`	Join multiple `Ustrd[]` lines
`use_effective_credit`	`false`	Apply reversal indicator
`prefer_ultimate_counterparty`	`true`	Prefer `UltmtDbtr` / `UltmtCdtr`

Exported Row Format: Display Value (`first`) vs Normalized Value (`second`)

Each exported field consists of a pair:

first: Human-readable display value (formatted, signed, date-formatted)
second: Canonical normalized value used for sorting, comparison, and deterministic hashing.

second is never empty — if not assigned directly, it is generated via normalization rules.

Field overview (C1 display naming)

Field	`first` (display)	`second` (normalized raw)	Influenced by options
Value Date	`YYYY-MM-DD`	`YYYYMMDD` digits	Sorting (`useBookingDate=false`)
Booking Date	`YYYY-MM-DD`	`YYYYMMDD` digits	Sorting (`useBookingDate=true`)
Amount	Formatted signed or unsigned	Absolute numeric text	`signed_amount`, `use_effective_credit`
Is Credit	`1`/`0` or `CRDT`/`DBIT`	`1`/`0` original direction	`credit_as_bool`, `use_effective_credit`
Reversal	`1` or `0`	`1` or `0`	Affects sign if `use_effective_credit=true`
Currency	`EUR` etc.	Uppercased, trimmed	Normalization only
Counterparty Name	Human-friendly chosen party	NFC/casefold/trim normalized	`prefer_ultimate_counterparty`, normalization
Counterparty IBAN	Formatted IBAN	Uppercase, no spaces	Normalization
Counterparty BIC	Formatted BIC	Uppercase, no spaces	Normalization
Remittance Line	Joined text lines	GS-joined normalized tokens	`remittance_separator`, `USE_UTF8PROC`
Remittance Structured	Display text	Normalized base	Normalization
End-to-End ID	Shown as provided	Uppercase, no spaces	Normalization
Mandate ID	Shown as provided	Uppercase, no spaces	Normalization
Transaction ID	Shown as provided	Uppercase, no spaces	Normalization
Bank Reference	Display reference	Normalized	Normalization
Account IBAN	Statement IBAN	Uppercase, no spaces	Normalization
Account BIC	Statement BIC	Uppercase, no spaces	Normalization
Booking Code	Code as text	Uppercase trimmed	Normalization
Status	Display text	Trimmed	Normalization
Running Balance	Formatted running total	Same as display	Always signed logically (`CRDT = +`, `DBIT = −`)
Charges Amount	Display charges	Same as display	Independent of `signed_amount`
Charges Currency	Display currency	Normalized	Normalization
Charges Included	`1`/`0`	Same	None
Entry Ordinal	Display index	Same	Used as stable tiebreaker
Transaction Ordinal	Display index	Same	Used as stable tiebreaker

This section ensures consistent interpretation when converting to CSV, databases, or accounting systems.

(first) vs Normalized Value (second)

Each exported field consists of a pair:

first: Human-readable display value (formatted, signed, date-formatted)
second: Canonical normalized value used for sorting, comparison, and deterministic hashing.

second is never empty — if not assigned directly, it is generated via normalization.

Display Name	`first` (human output)	`second` (normalized / for sorting)	Relevant Options
Value Date	`YYYY-MM-DD`	`YYYYMMDD`	Affects sorting order
Booking Date	`YYYY-MM-DD`	`YYYYMMDD`	Sorting if selected
Amount	Signed or unsigned amount text	Absolute numeric amount	`signed_amount`, `use_effective_credit`
Is Credit	`1` (credit) / `0` (debit) or `CRDT/DBIT`	Always `1/0` for original direction	`credit_as_bool`, `use_effective_credit`
Reversal	`1` or `0`	Same	May flip meaning under `use_effective_credit`
Counterparty Name	Best resolved party name	Case-normalized canonical text	`prefer_ultimate_counterparty`
Counterparty IBAN	Standard IBAN text	Uppercased, spaces removed	Normalization
Counterparty BIC	Standard BIC text	Uppercased, spaces removed	Normalization
Remittance	Joined free text lines	Canonical token-joined, normalized lines	`remittance_separator`, `USE_UTF8PROC`
Structured Reference	Display reference text	Normalized canonical text	`USE_UTF8PROC`
End-to-End ID	Displayed as present	Uppercased, spaces removed	Normalization
Mandate ID / Transaction ID / Bank Reference	Display text	Normalized ID	Normalization
Running Balance	Accumulated signed balance	Same as display	Determined by sorting order
Opening / Closing Balance	Displayed balance	Same	Placement depends on statement order

Normalization rules:

Text fields are trimmed and unified
Codes & currency uppercased
IBAN/BIC spaces removed
Date .second always YYYYMMDD
Amount.second is always absolute value
Advanced Unicode normalization if compiled with USE_UTF8PROC

Option	Default	Description
`delimiter`	`';'`	CSV separator (`;`, `,`, or `\t`)
`include_header`	`true`	Adds header row
`write_utf8_bom`	`false`	Necessary for Excel UTF-8 import
`signed_amount`	`true`	Credit positive / Debit negative
`credit_as_bool`	`true`	`IsCredit = 1/0` instead of `CRDT/DBIT`
`remittance_separator`	`""`	Join multiple `Ustrd[]` lines
`use_effective_credit`	`false`	Apply reversal indicator
`prefer_ultimate_counterparty`	`true`	Prefer `UltmtDbtr` / `UltmtCdtr`

Exported Row Format: first vs second

Every column in ExportData is stored as a pair:

first → human-readable, formatted display value
second → normalized, canonical value used for sorting, hashing, and stable processing

If a field does not explicitly assign second, it is filled from first via normalization.

Normalization rules include:

Date fields → YYYYMMDD
IBAN/BIC → uppercase, no spaces
Free-text fields → trimmed, casefolded (with utf8proc if enabled)
Amounts → second stores absolute value

Sorting uses .second for date ordering. Running balance uses signed logic independent of display formatting.

Optional Unicode Normalization (`USE_UTF8PROC`)

This library supports optional full Unicode normalization of free-text fields (e.g., remittance lines, counterparty names, references).

If compiled with -DUSE_UTF8PROC, the following normalization is applied using the MIT-licensed utf8proc library:

Normalize text to NFC
Unicode-aware case folding
Removal of zero-width characters
Stable whitespace normalization

If USE_UTF8PROC is not defined:

A lightweight ASCII-only fallback is used
No external dependencies are required
No utf8proc code is linked or distributed

Canonical Transaction Hashing

std::string hash = camt::accumulate_hash_row(row);

Stable fingerprint for:

Duplicate detection
Ledger synchronization
Audit trails

License

Released under the MIT License.

SPDX-License-Identifier: MIT
SPDX-FileCopyrightText: 2025 Psynetic

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
examples/qt-csv-export-demo		examples/qt-csv-export-demo
external		external
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
README.md		README.md
THIRD_PARTY.md		THIRD_PARTY.md
camt-parser.pri		camt-parser.pri
camt_csv.hpp		camt_csv.hpp
camt_model.hpp		camt_model.hpp
camt_parser_pugi.hpp		camt_parser_pugi.hpp
gvc_map.cpp		gvc_map.cpp
gvc_map.hpp		gvc_map.hpp
licenses.html		licenses.html

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

camt-parser

Features

Installation / Build

High-Level Data Model

`Document`

`Statement`

`Entry`

`EntryTransaction`

Parsing API

CSV Export

Quick Start Example

Field Semantics: `first` vs `second`

Supported Input Methods

Complete Demonstration

ExportOptions

Exported Row Format: Display Value (`first`) vs Normalized Value (`second`)

Field overview (C1 display naming)

Exported Row Format: first vs second

Optional Unicode Normalization (`USE_UTF8PROC`)

Canonical Transaction Hashing

License

About

Uh oh!

Releases

Packages

Languages

License

psynetic-software/camt-parser

Folders and files

Latest commit

History

Repository files navigation

camt-parser

Features

Installation / Build

High-Level Data Model

Document

Statement

Entry

EntryTransaction

Parsing API

CSV Export

Quick Start Example

Field Semantics: first vs second

Supported Input Methods

Complete Demonstration

ExportOptions

Exported Row Format: Display Value (first) vs Normalized Value (second)

Field overview (C1 display naming)

Exported Row Format: first vs second

Optional Unicode Normalization (USE_UTF8PROC)

Canonical Transaction Hashing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

`Document`

`Statement`

`Entry`

`EntryTransaction`

Field Semantics: `first` vs `second`

Exported Row Format: Display Value (`first`) vs Normalized Value (`second`)

Optional Unicode Normalization (`USE_UTF8PROC`)

Packages