Architecture Review — AI / LLM Systems
Independent assessment of systems that integrate large language models (LLMs) or other AI models — from architecture through legal exposure, before shipping to production or in the wake of an incident.
Who this is for
- Founders and CTOs planning to integrate an AI feature into a production product, before wrong outputs cost them customer trust.
- In-house counsel evaluating the legal exposure of an existing AI system — before investors, customers, or regulators do.
- Litigators in an AI matter — a legal dispute whose core is the output of an LLM (a wrong decision, faulty advice, copyright violation, data leak).
What the review covers
- Layer structure. How LLM calls are issued, where orchestration lives, how input is filtered, and how output is verified before it is shown to the user. Systems that pass user input straight through to a model are problematic.
- Risk: Prompt injection and data exfiltration. If the model has access to internal tools (RAG, function-calling, agent loops), every user input is a potential attack vector. The review checks which boundaries are in place, which are missing, and what happens when one is breached.
- Risk: Hallucination and legal liability. When the model fabricates, who is responsible? The review checks whether a validation layer exists, whether users are told that the output is AI-generated, and whether a log exists to reconstruct what the model said and when.
- Training-data IP exposure. Fine-tuning on customer data and fine-tuning on a public dataset that falls within a model's training cutoff are two different legal risks. The review maps both.
- Vendor lock-in and portability. What happens if OpenAI triples its price, or Anthropic ships a breaking API change? A system that cannot switch vendors within 14 days is in a precarious business position.
- Eval harness. How do you know the model did not regress after a prompt change? A production AI system without an eval harness is being changed with eyes closed.
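To make the last two items concrete, here is a minimal sketch of what a review would hope to find: a vendor-neutral model interface and a small eval harness that compares a candidate prompt against a baseline. Every name in it (ChatModel, EvalCase, StubModel, run_evals, regressions) is hypothetical and chosen for illustration; the stub stands in for a real vendor adapter so the example runs offline. A real harness would typically need more cases and fuzzier checks than substring matching.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class ChatModel(Protocol):
    """Hypothetical vendor-neutral interface; each vendor SDK gets its own adapter."""

    def complete(self, prompt: str) -> str: ...


@dataclass(frozen=True)
class EvalCase:
    name: str
    user_input: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


class StubModel:
    """Stand-in for a real vendor adapter (assumption: no network calls here)."""

    def complete(self, prompt: str) -> str:
        if "refund" in prompt.lower():
            return "Refunds are available within 30 days of purchase."
        return "I don't know."


def run_evals(model: ChatModel, prompt_template: str, cases: list[EvalCase]) -> dict[str, bool]:
    """Run every case against the model and record pass/fail per case name."""
    results = {}
    for case in cases:
        output = model.complete(prompt_template.format(user_input=case.user_input))
        results[case.name] = case.check(output)
    return results


def regressions(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Names of cases that passed under the baseline but fail (or are missing) now."""
    return sorted(
        name for name, passed in baseline.items()
        if passed and not current.get(name, False)
    )


CASES = [
    EvalCase("refund_policy", "What is your refund policy?",
             lambda out: "30 days" in out),
    EvalCase("unknown_topic", "Who is our largest supplier?",
             lambda out: "don't know" in out.lower()),
]

baseline = run_evals(StubModel(), "Customer asks: {user_input}", CASES)
candidate = run_evals(StubModel(), "Answer briefly: {user_input}", CASES)
print(regressions(baseline, candidate))  # an empty list means no case regressed
```

Because run_evals depends only on the complete method, switching vendors in this sketch means writing one new adapter class and rerunning the same cases against it.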
Engagement format
A two-week, one-shot review. Week 1: review of code, architecture, prompt documents; interview with the lead engineer. Week 2: drafting of a conclusions document (12-20 pages) with priority-ordered recommendations. Includes one revision round and attendance at one management meeting to present findings.
Deliverable: a written document for internal distribution — not a verbal debrief. The intent is that the board, general counsel, or a prospective investor can see exactly what was found and how urgent it is.
When this is not the right engagement
- When the product is still a prototype and not yet in production. A review at that stage is wasted effort; it is better to go back to the open design decisions first.
- When the question is about the model itself (is GPT-5 more accurate than Claude Opus 4.7 on task X). That is an eval / benchmark project, not an architecture review.
- When the question is purely regulatory (EU AI Act, US AI HHS). Those require a lawyer specialised in AI regulation, not a software expert.
Pricing
Project-priced, written up in the engagement letter after a no-cost intake call. Most reviews are in the range of a two-week engineering project — not a retainer. If the engagement should be ongoing, the right channel is Fractional CTO.
Frequently asked questions
When does an AI / LLM architecture review make sense?
Before shipping an AI feature to production; before a funding round where investors will inspect the approach; before switching LLM vendors; when an incident (hallucination, prompt injection, cost blow-up) needs investigation.
What is included?
Architecture review, attack-surface analysis (prompt injection, data exfiltration), legal-exposure assessment for hallucination, training-data IP exposure analysis, written conclusions document with prioritised recommendations.
Is a written AI expert opinion available for court?
Yes — for matters where a legal dispute centres on the output of an AI system. The opinion examines specifically what the system did, what it should not have done, and why. The service is delivered under the Expert Witness practice.
Does the review cover the vendor (OpenAI / Anthropic / Google) itself?
No — the review focuses on how your system integrates the vendor. The vendor itself is a black box evaluated externally through behaviour, ToS, and SLA.