Financial Contract Extraction Calendar Rule Case Study | Nie Er

A financial contract extraction and transaction-calendar rule engine turns asset-management contract clauses into standard rules and daily subscription or redemption availability. This case served an anonymized A+H-listed Tier 1 securities firm’s asset-management workflow.

The goal was not a prompt demo. The useful output had to survive standard-rule mapping, trading-calendar logic, and human review.

Background

Asset-management contracts can be around 100k Chinese characters. Operations teams need to extract fee rules, performance compensation terms, subscription and redemption windows, confirmation methods, lockups, and closed periods before maintaining them in internal systems.

The first version had limited production value because accuracy was too low. Standard-rule mapping broke often. One common example was tiered performance compensation: a contract might state that excess return between 8% and 15% is charged at 5%, and anything above 15% at 10%. The implicit below-8% tier still needs to be represented as 0%. If the extraction misses that tier, the downstream rule is incomplete.
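A minimal sketch of that tier-completion step, using the 8% / 15% example above. The `Tier` type, the `fill_implicit_tiers` name, and the assumption that the lowest band starts at 0% excess return are illustrative, not the production schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    lower: float            # inclusive lower bound of excess return, as a fraction
    upper: Optional[float]  # exclusive upper bound; None means unbounded
    rate: float             # compensation rate charged on this band


def fill_implicit_tiers(tiers: list[Tier], floor: float = 0.0) -> list[Tier]:
    """Sort tiers by lower bound and add a 0% tier for every uncovered band."""
    filled: list[Tier] = []
    cursor: Optional[float] = floor  # start of the band not yet covered
    for tier in sorted(tiers, key=lambda t: t.lower):
        if cursor is None:
            raise ValueError("tier found after an unbounded tier")
        if tier.lower < cursor:
            raise ValueError("overlapping tiers")
        if tier.lower > cursor:
            # The contract is silent about this band, so make the 0% explicit.
            filled.append(Tier(cursor, tier.lower, 0.0))
        filled.append(tier)
        cursor = tier.upper
    return filled


# What the contract states explicitly: 8%-15% at 5%, above 15% at 10%.
extracted = [Tier(0.08, 0.15, 0.05), Tier(0.15, None, 0.10)]
completed = fill_implicit_tiers(extracted)
for tier in completed:
    print(tier)
```

Doing this in code rather than in the prompt means the downstream rule always receives a contiguous tier table, whichever bands the extraction actually returned.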

What Made It Difficult

This was not generic document extraction. The extracted clauses had to become executable rules.

Fee and performance-compensation clauses involve tier tables, Chinese numeric expressions, percentages, implicit ranges, and field boundaries. Subscription and redemption clauses involve open days, postponement, advancement, calendar days, business days, and trading days. A rule such as “the Tuesday after the third Friday of September and the next two business days, postponed if not a business day” has to become structured fields before it can generate calendar dates.
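To make "structured fields" concrete, here is a sketch of how the September example might be represented and expanded into dates. The `OpenDayRule` field names are hypothetical, and business days are approximated as weekdays outside a caller-supplied holiday set. The sketch also assumes one reading of the clause: the first open day is postponed if needed, and the following open days are the next business days after it.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class OpenDayRule:
    month: int                # calendar month the rule applies to
    anchor_weekday: int       # Monday=0 ... Sunday=6
    anchor_nth: int           # e.g. 3 for "the third Friday"
    target_weekday: int       # weekday strictly after the anchor
    extra_business_days: int  # further open days after the first one


def is_business_day(day: date, holidays: frozenset) -> bool:
    return day.weekday() < 5 and day not in holidays


def nth_weekday(year: int, month: int, weekday: int, nth: int) -> date:
    # No overflow check: nth <= 4 always stays inside the month.
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (nth - 1))


def open_days(rule: OpenDayRule, year: int, holidays: frozenset = frozenset()) -> list:
    anchor = nth_weekday(year, rule.month, rule.anchor_weekday, rule.anchor_nth)
    # "after" is read as strictly after the anchor, so a zero gap becomes 7 days.
    gap = (rule.target_weekday - anchor.weekday()) % 7 or 7
    cursor = anchor + timedelta(days=gap)
    while not is_business_day(cursor, holidays):  # postpone the first open day
        cursor += timedelta(days=1)
    days = [cursor]
    while len(days) < 1 + rule.extra_business_days:
        cursor += timedelta(days=1)
        if is_business_day(cursor, holidays):
            days.append(cursor)
    return days


# "The Tuesday after the third Friday of September and the next two business days"
rule = OpenDayRule(month=9, anchor_weekday=4, anchor_nth=3,
                   target_weekday=1, extra_business_days=2)
september_2024 = open_days(rule, 2024)
print(september_2024)
```

With no holidays supplied, the 2024 result is 24, 25, and 26 September: the third Friday is 20 September and the following Tuesday is the 24th.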

Returning plausible JSON is not enough. Most of the risk sits in the systems that consume that JSON.

My Role

I led the engineering implementation across contract extraction, standard-rule mapping, EDD evaluation, and the transaction-calendar rule engine.

On the extraction side, I split the contract into five modules: fees, performance compensation, subscription liquidity, redemption liquidity, and other fields. Higher-risk modules used two prompt styles, with up to eight parallel LLM calls per case. The dual-prompt approach improved results by about 3 percentage points over the single-prompt baseline, mainly by improving coverage for tier tables and numeric rules.

On the cleanup side, I implemented enum matching and numeric normalization. Variants such as T+2, T+2日, and N+2 are mapped into one confirmation enum. Terms such as postponed, business day, Chinese percentage expressions, and positive-return language are converted into system-readable fields.
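The sketch below illustrates this kind of cleanup for two of the cases: confirmation variants and percentage expressions. The canonical value `T+n`, the function names, and the supported patterns are assumptions for illustration. The Chinese numeral helper only covers values up to 99.

```python
import re
from typing import Optional

# Accepts T+2, T+2日, N+2, and spacing or full-width plus variants.
_CONFIRMATION = re.compile(r"^\s*[TN]\s*[+＋]\s*(\d+)\s*(?:日|个?工作日)?\s*$", re.IGNORECASE)

_DIGITS = {"零": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
           "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}


def normalize_confirmation(raw: str) -> Optional[str]:
    """Map confirmation variants to one canonical value, or None if unrecognized."""
    match = _CONFIRMATION.match(raw)
    return f"T+{int(match.group(1))}" if match else None


def _chinese_int(text: str) -> int:
    """Parse Chinese numerals up to 99, e.g. 五, 十, 十五, 二十, 二十五."""
    if "十" in text:
        tens, _, ones = text.partition("十")
        return (_DIGITS[tens] if tens else 1) * 10 + (_DIGITS[ones] if ones else 0)
    return _DIGITS[text]


def parse_percentage(raw: str) -> Optional[float]:
    """Return a percentage as a fraction, or None so the field goes to review."""
    text = raw.strip()
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*[%％]", text)
    if match:
        return float(match.group(1)) / 100
    match = re.fullmatch(r"百分之([零一二两三四五六七八九十]+)", text)
    if match:
        try:
            return _chinese_int(match.group(1)) / 100
        except KeyError:  # malformed numeral such as 一五
            return None
    return None


for raw in ("T+2", "T+2日", "N+2"):
    print(raw, "->", normalize_confirmation(raw))
for raw in ("5%", "百分之十五"):
    print(raw, "->", parse_percentage(raw))
```

Returning `None` for anything unrecognized, instead of guessing, keeps ambiguous values visible to reviewers.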

On the calendar side, I split open-day generation into calendar-day selection, trading-calendar handling, and holiday alignment. The rule engine supports nine cycle types, and openDay=32 means the last day of the month. Postponement and advancement share the same alignment logic with opposite search directions.
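A sketch of the first and last steps, assuming a monthly cycle and a trading calendar supplied as a set of dates. The function names, the `direction` convention (+1 to postpone, -1 to advance), and the decision to reject an out-of-range `open_day` are assumptions. Only the `openDay=32` sentinel comes from the engine described above.

```python
import calendar
from datetime import date, timedelta

LAST_DAY_SENTINEL = 32  # openDay=32 means the last day of the month


def resolve_calendar_day(year: int, month: int, open_day: int) -> date:
    """Calendar-day selection: turn an openDay value into a concrete date."""
    last = calendar.monthrange(year, month)[1]  # number of days in the month
    if open_day == LAST_DAY_SENTINEL:
        return date(year, month, last)
    if not 1 <= open_day <= last:
        raise ValueError(f"open_day {open_day} does not exist in {year}-{month:02d}")
    return date(year, month, open_day)


def align_to_trading_day(day: date, trading_days: set, direction: int,
                         max_steps: int = 31) -> date:
    """Shared alignment: search forward (+1) or backward (-1) for a trading day."""
    if direction not in (1, -1):
        raise ValueError("direction must be +1 (postpone) or -1 (advance)")
    for _ in range(max_steps + 1):
        if day in trading_days:
            return day
        day += timedelta(days=direction)
    raise LookupError("no trading day found within max_steps")


# Hypothetical sample calendar around the end of June 2024 (30 June is a Sunday).
sample_trading_days = {date(2024, 6, 28), date(2024, 7, 1)}
month_end = resolve_calendar_day(2024, 6, LAST_DAY_SENTINEL)
postponed = align_to_trading_day(month_end, sample_trading_days, direction=1)
advanced = align_to_trading_day(month_end, sample_trading_days, direction=-1)
print(month_end, postponed, advanced)
```

Because both adjustments go through one function that differs only in the sign of the step, a fix to the alignment logic applies to postponement and advancement alike.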

Tradeoffs

Dual prompts were not added for decoration. A single prompt was cheaper, but missed tier tables more often. For high-risk fields, it was better to pay extra token cost and give reviewers more complete candidates.

The deterministic cleanup layer was another deliberate tradeoff. Enum normalization, percentage parsing, and key-value shape repair are more reproducible in code than in repeated prompt instructions. But cleanup cannot replace business definitions. Fields whose definitions had not settled were not treated as solved.

The transaction calendar had one small but important rule: multiple candidate open days inside the same continuous market-closed block cannot all be postponed to the first trading day after the holiday. The engine sorts candidates inside the block and maps candidate i to the ith trading day after the holiday, or before it for advancement.
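The mapping can be sketched as follows. The function names and the sample trading calendar are hypothetical. For advancement, the sketch assumes candidates are ranked from the end of the block so that their relative order is preserved. It does not handle the case where a shifted date lands on another candidate that was already a trading day.

```python
from datetime import date, timedelta


def _seek(day: date, trading_days: set, step: timedelta, limit: int = 60) -> date:
    """Return the first trading day at or beyond `day` in the step direction."""
    for _ in range(limit):
        if day in trading_days:
            return day
        day += step
    raise LookupError("no trading day within search limit")


def spread_candidates(candidates: list, trading_days: set, direction: int = 1) -> dict:
    """Map each candidate to a trading day; direction +1 postpones, -1 advances."""
    step = timedelta(days=direction)
    placed: dict = {}   # block edge -> number of candidates already mapped
    mapping: dict = {}
    # Process candidates starting from the one nearest the block's exit side.
    for day in sorted(candidates, reverse=direction < 0):
        if day in trading_days:
            mapping[day] = day
            continue
        edge = _seek(day, trading_days, step)  # first trading day past the block
        target = edge
        for _ in range(placed.get(edge, 0)):   # skip days taken by earlier candidates
            target = _seek(target + step, trading_days, step)
        placed[edge] = placed.get(edge, 0) + 1
        mapping[day] = target
    return mapping


# Hypothetical calendar: the market is closed from 1 to 7 October.
sample_trading_days = {
    date(2024, 9, 26), date(2024, 9, 27), date(2024, 9, 30),
    date(2024, 10, 8), date(2024, 10, 9), date(2024, 10, 10),
}
block_candidates = [date(2024, 10, 1), date(2024, 10, 3), date(2024, 10, 5)]
postponed_map = spread_candidates(block_candidates, sample_trading_days, direction=1)
advanced_map = spread_candidates(block_candidates, sample_trading_days, direction=-1)
print(postponed_map)
print(advanced_map)
```

Keying the counter on the block's edge date means candidates are grouped by closed block without the block boundaries having to be computed in advance.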

Caching used an MD5 parameter fingerprint. A saved calendar row is reused only when task, month, year, and rule parameters all match. If a user changes liquidity or lockup parameters, the system recalculates instead of reusing an old calendar.
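A minimal sketch of such a fingerprint, assuming the rule parameters are JSON-serializable. The function name, the key names, and the in-memory dict standing in for saved calendar rows are illustrative.

```python
import hashlib
import json


def calendar_fingerprint(task_id: str, year: int, month: int, rule_params: dict) -> str:
    """MD5 over a canonical JSON form of everything that affects the calendar."""
    payload = json.dumps(
        {"task": task_id, "year": year, "month": month, "rules": rule_params},
        sort_keys=True,          # key order must not change the fingerprint
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


cache: dict = {}  # fingerprint -> saved calendar row (stand-in for storage)


def get_calendar(task_id: str, year: int, month: int, rule_params: dict, compute):
    """Reuse a saved row only when every parameter matches; otherwise recompute."""
    key = calendar_fingerprint(task_id, year, month, rule_params)
    if key not in cache:
        cache[key] = compute(task_id, year, month, rule_params)
    return cache[key]


# Hypothetical parameters: same content in a different key order, then a changed lockup.
params_a = {"openDay": 32, "lockupMonths": 6}
params_b = {"lockupMonths": 6, "openDay": 32}
params_c = {"openDay": 32, "lockupMonths": 12}
fp_a = calendar_fingerprint("task-1", 2024, 6, params_a)
fp_b = calendar_fingerprint("task-1", 2024, 6, params_b)
fp_c = calendar_fingerprint("task-1", 2024, 6, params_c)
print(fp_a == fp_b, fp_a == fp_c)
```

MD5 serves here only as a compact equality check on parameters, not as a security measure.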

Result

The confirmed result was that contract handling time fell from hours to about two minutes per contract, and field-level accuracy improved from the 60%+ range to around 90%. Dual prompts contributed roughly 3 percentage points over the single-prompt setup.

This did not remove human review. It changed where human attention starts. Reviewers no longer begin with the full contract; they begin with structured candidates, evidence, conflicts, and a generated calendar.

My main takeaway is that reliable financial contract automation depends as much on field-level evaluation and deterministic cleanup as it does on the model.

If you are evaluating contract extraction, document parsing, field-level evaluation, or financial rule engines, contact me by email at contact@aildnc.com. For China-based inquiries, use the WeChat QR code below the article.

Contact

Discuss Similar Work

If you are evaluating a similar document AI, enterprise RAG, knowledge base, or AI workflow project, share the context first by email at contact@aildnc.com. Telegram is available for a faster reply.

Telegram: @NieErAI