Compliance · Jon Burgess, Founder, Trinito · Published 15 May 2026 · Updated 20 May 2026 · 4 min read

The five things UK businesses keep pasting into ChatGPT

Every incident review tells the same story

When we speak to UK firms after a near-miss — or a regulator letter — the details differ but the payload does not. Staff paste structured business data into a public chat box because the model is good at the task and nobody stopped them at the keyboard. You do not need a forensic engagement to guess what went in. These five categories cover the majority of real-world leaks we see in professional services, finance, property, and healthcare admin.

1. Client and customer names tied to context

Not just “John Smith,” but “John Smith: the insolvency we are advising on, deadline Thursday, opposing counsel is…” Once a name is paired with matter context, it is personal data about an identifiable individual and often confidential client information as well. Even if the provider does not retain the prompt permanently in the way people fear, it still crossed into a US provider’s environment, and your DPA almost certainly required you to minimise that.

What to watch for: matter titles in the first line, “our client X,” CRM exports pasted as bullet lists.

2. Draft contracts, letters, and board packs

Whole clauses, NDAs, employment agreements, and investment memos go in because “rewrite this more clearly” works. The risk is not only personal data — it is unpublished commercial terms, pricing, and counterparties. For law firms this can touch legal professional privilege; for everyone else it is straight confidentiality.

What to watch for: PDF text dumps, tracked-change paragraphs, “here is the LOI we received.”

3. Spreadsheets: payroll, pipeline, and performance data

Finance and ops love pasting tables. Pasting a CSV with employee names, salaries, and NI numbers takes a single action. So does pasting the sales pipeline with deal values and competitor notes. Spreadsheets feel less “serious” than a database export; to a classifier they look worse, because every value sits under an explicit label.

What to watch for: tab-separated rows, “column A is revenue,” bonus calculations, redundancy lists.
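One way to spot a pasted table before it is sent is to check whether most lines split into the same number of columns. The sketch below is a rough heuristic, not a description of any particular product: the function name `looks_tabular`, the delimiter list, and the thresholds are all assumptions chosen for illustration, and it will misfire on some prose (for example, several short sentences that each contain one comma).

```python
from collections import Counter

# Delimiters commonly produced by spreadsheet copy/paste and CSV exports (assumed list).
DELIMITERS = ("\t", ",", ";", "|")


def looks_tabular(text, min_rows=3, min_cols=2, agreement=0.8):
    """Return True when most non-empty lines share the same column count."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < min_rows:
        return False
    for delimiter in DELIMITERS:
        # Count how many lines produce each column width for this delimiter.
        widths = Counter(len(ln.split(delimiter)) for ln in lines)
        width, count = widths.most_common(1)[0]
        if width >= min_cols and count / len(lines) >= agreement:
            return True
    return False


# Hypothetical paste: a header row plus two payroll rows, tab-separated.
paste = "name\tsalary\tni\nA Example\t42000\tQQ123456C\nB Example\t38000\tQQ654321A"
prose = "Please rewrite this paragraph.\nIt should sound friendlier.\nThanks for the help."
flag_paste = looks_tabular(paste)
flag_prose = looks_tabular(prose)
```

A positive result is a reason to look more closely at the prompt, not proof that it contains payroll or pipeline data.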

4. UK identifiers the regex layer was built for

National Insurance numbers, UK postcodes in address blocks, company registration numbers, VAT IDs, sort codes and account numbers, NHS numbers in admin notes — these are machine-detectable and should never reach a public model unredacted. They appear constantly in HR tickets, KYC packs, and property completions because staff assume the model “needs the real number to be useful.”

What to watch for: onboarding forms, AML packs, “validate this address,” patient admin copied from the PAS.
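Several of the identifiers listed above follow fixed formats, which is what makes them detectable by pattern. The sketch below shows the general idea for four of them. The patterns are deliberately simplified assumptions (for instance, the NI pattern accepts any two leading letters, and the postcode pattern does not cover every special case), the names `PATTERNS`, `redact`, and `nhs_check_ok` are hypothetical, and matching is uppercase-only. The NHS number check uses the standard modulus 11 check digit, which helps cut false positives on arbitrary 10-digit numbers.

```python
import re


def nhs_check_ok(candidate):
    """Validate an NHS number with the modulus 11 check digit."""
    digits = [int(c) for c in candidate if c.isdigit()]
    if len(digits) != 10:
        return False
    # Weights 10 down to 2 apply to the first nine digits.
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check of 10 means the number is invalid.
    return check != 10 and check == digits[9]


# (label, simplified pattern, optional validator); illustrative only.
PATTERNS = [
    ("NI_NUMBER", re.compile(r"\b[A-Z]{2}\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-D]\b"), None),
    ("NHS_NUMBER", re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b"), nhs_check_ok),
    ("SORT_CODE", re.compile(r"\b\d{2}-\d{2}-\d{2}\b"), None),
    ("POSTCODE", re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b"), None),
]


def redact(text):
    """Replace matches with placeholders; return the new text and per-label counts."""
    counts = {}
    for label, pattern, validator in PATTERNS:
        def replace(match, label=label, validator=validator):
            if validator and not validator(match.group()):
                return match.group()  # leave values that fail validation untouched
            counts[label] = counts.get(label, 0) + 1
            return f"[{label}]"
        text = pattern.sub(replace, text)
    return text, counts


# Dummy values: QQ-prefixed NI number and a commonly used test NHS number.
sample = "Employee QQ 12 34 56 C, NHS 943 476 5919, sort code 12-34-56, lives at SW1A 1AA."
redacted, counts = redact(sample)
```

Replacing the value with a labelled placeholder keeps the prompt readable for the model, which addresses the assumption that it needs the real number to be useful.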

5. Your own client list and internal codenames

Generic named-entity recognition (NER) misses contextual references that pattern packs catch: Project Falcon, client ref 8842, the renewal we cannot afford to lose. One paste can teach the session everything a competitor would pay for. Pre-Send Preview and the audit log are the safety net when classifiers miss.

What to watch for: “here are our top 20 accounts,” sprint names, deal codenames in subject lines.
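A firm-specific block list can be as simple as a case-insensitive match against known client names, references, and codenames. The following is a minimal sketch under that assumption; `build_block_pattern` and `find_blocked` are hypothetical names, and the terms are taken from the examples in this section plus one made-up company name.

```python
import re


def build_block_pattern(terms):
    """Compile one case-insensitive pattern from a list of sensitive terms."""
    # Deduplicate, drop blanks, and put longer terms first so they win over shorter ones.
    cleaned = sorted({t.strip() for t in terms if t.strip()}, key=len, reverse=True)
    if not cleaned:
        return None
    alternation = "|".join(re.escape(t) for t in cleaned)
    # Lookarounds stop a term matching inside a longer word or number.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def find_blocked(text, pattern):
    """Return every block-listed term found in the text, in order of appearance."""
    return [m.group() for m in pattern.finditer(text)] if pattern else []


# Hypothetical seed list, e.g. loaded from a client/matter export.
terms = ["Project Falcon", "client ref 8842", "Example Holdings Ltd"]
pattern = build_block_pattern(terms)
hits = find_blocked("Summarise the project falcon update and mention client ref 8842.", pattern)
```

Adding a term later means rebuilding the pattern from the extended list, which is cheap at the scale of a typical client list.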

What to do Monday morning

You will not fix this with another policy PDF. Short, practical steps:

Run a 30-minute workshop with team leads using these five headings — ask “which did we almost do last month?”
Put inspection on the path staff already use — not a separate portal — so redaction happens before send.
Seed rules from your client/matter export on day one; expand when someone clicks “add to block list.”
Log sessions so when the board asks, you show prevention counts — not a ban that nobody obeyed.
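The logging step can be sketched as a record of what was prevented rather than what was typed. In the example below, the `AuditLog` class and its field names are assumptions for illustration. It stores only per-category counts for each session, never the raw values, and sums them into the kind of prevention totals a board report would quote.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditLog:
    """Hypothetical in-memory log of redaction events."""
    events: list = field(default_factory=list)

    def record(self, user, counts):
        # Store categories and counts only, not the redacted values themselves.
        self.events.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "counts": dict(counts),
        })

    def prevention_totals(self):
        """Sum counts across all recorded events, per category."""
        totals = Counter()
        for event in self.events:
            totals.update(event["counts"])
        return dict(totals)


log = AuditLog()
log.record("user-a", {"NI_NUMBER": 2, "POSTCODE": 1})
log.record("user-b", {"NI_NUMBER": 1})
totals = log.prevention_totals()
```

A real deployment would persist these events somewhere durable and access-controlled; the in-memory list here only keeps the sketch self-contained.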

The goal is not to scare people off AI. It is to stop these five payloads crossing the line while your team keeps the productivity win. Recognise the patterns early and you will not be the firm explaining to the ICO why a whole spreadsheet lived in a chat log for thirty seconds — which was thirty seconds too long.

See it running

Curated UK examples — the same redaction pipeline that runs on the appliance.

