

Written by Protopia

Unlocking Sensitive Data for AI in Financial Services

Financial institutions are under pressure to move GenAI from pilots to production. The challenge is not a lack of models or infrastructure. It is how to use their most sensitive data, safely, with those models.

In a recent webinar hosted with AI Collective, leaders from BNY Mellon, SanctifAI, and Protopia AI discussed how banks and AI builders can run sensitive workflows on shared or serverless environments without exposing plaintext at inference time, while improving GPU utilization and cost.

The Panel

  • Eiman Ebrahimi, CEO and founder, Protopia AI
  • Jeremiah Sadow, former Chief Compliance Officer for AI and Strategic Partnerships, BNY Mellon
  • Brad Lawler, founder of SanctifAI, building AI for wealth management and financial institutions
  • Moderated by Brittany Carambio, Protopia AI

Keep reading to get the session highlights, or check out the whole session below:

Why sensitive data slows AI projects in FSI

Everyone agrees AI can help with fraud, compliance, and personalized advice. The difficulty starts when projects depend on high-risk data such as client information, transactions, and internal code.

From Jeremiah’s experience, serious GenAI use cases in a bank go through a cross-functional review that looks at:

  • What the use case is trying to achieve
  • Who the users are and what they are allowed to see
  • What data will be used and how it is classified
  • Whether outputs are internal only or client facing

As he put it:

“When a line of business comes to a cross functional committee and says we have this use case, you really have to think about what is the use case, who are the users, their entitlements, what type of data are the users going to use.”

High-value use cases usually involve higher-risk data classes. That often leads to an isolation-first approach, such as:

  • Building and hosting everything inside the bank
  • Asking third parties to deploy on dedicated, tightly controlled infrastructure inside the bank’s environment

These patterns can satisfy policy requirements, but they are slow to stand up and expensive to run. Smaller and mid-sized institutions may not have the people or infrastructure to apply the same model to every new workload.

The builder view

Demand is strong, but privacy is the bottleneck

Brad brought the perspective of SanctifAI, which serves families and financial institutions.

SanctifAI’s application needs to:

  • Read and reason over trusts, wills, and estate planning documents
  • Combine that information with balance sheets and financial statements
  • Provide guidance to families and their advisers on taxes, legacy, and wealth transfer

This is exactly the type of use case banks want to explore, and it depends on some of the most sensitive data they hold.

On the technology side, Brad noted that access to capable models and managed inference is no longer the main constraint:

“We have incredible access to powerful inference and intelligence technologies at a cost that allows startups like ours to address problems and business models we would not really have considered before. We can now focus on what used to be niche problems.”

The constraint is whether that data can leave the institution’s environment and be processed by a third party.

A common way to make institutions more comfortable is per customer dedicated deployments. That reduces some risk, but it has clear cost implications. For example, if a family office or institution insists on its own dedicated environment, hosting costs for that account can be several times higher than if multiple customers shared infrastructure. Either the buyer pays more, or the provider absorbs the cost and compresses margins.
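The cost math behind that trade-off can be sketched in a few lines. All figures below are illustrative assumptions, not actual SanctifAI or cloud pricing; they simply show how attributing a pooled endpoint across tenants drives down per-customer cost:

```python
# Toy cost comparison: dedicated per-customer deployments vs. shared inference.
# All dollar figures and tenant counts are hypothetical assumptions.

DEDICATED_MONTHLY_COST = 6000  # hypothetical: one endpoint reserved per customer
SHARED_MONTHLY_COST = 8000     # hypothetical: one pooled endpoint for many tenants
TENANTS_ON_SHARED = 4          # hypothetical number of customers sharing the pool

def monthly_cost_per_customer(dedicated: bool) -> float:
    """Return the hosting cost attributed to a single customer."""
    if dedicated:
        return float(DEDICATED_MONTHLY_COST)
    return SHARED_MONTHLY_COST / TENANTS_ON_SHARED

shared = monthly_cost_per_customer(dedicated=False)
dedicated = monthly_cost_per_customer(dedicated=True)
print(f"Dedicated is {dedicated / shared:.0f}x the per-customer cost of shared")
# → Dedicated is 3x the per-customer cost of shared
```

With these assumed numbers, the dedicated path lands at exactly the "two or three times" multiple Brad describes below.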

Brad summarized it simply:

“To deliver the cost to these families could end up becoming two or three times as much if they want the most secure solution and run it in a dedicated environment for that individual family.”

The architectural gap 

Eiman walked through why protecting data at inference time is a persistent challenge, even when data at rest and data in transit are already encrypted.

In a typical GenAI architecture:

  • Data is stored inside the institution’s root of trust, encrypted at rest
  • It is encrypted in transit to a model endpoint
  • At the inference host, the data is decrypted into plaintext so the model can run

Once decrypted, sensitive data can appear in:

  • GPU and CPU memory
  • Temporary storage if long contexts spill to disk
  • Logs and telemetry that are sometimes written in plaintext for performance reasons

This is true whether the endpoint is a managed service, a neocloud, or a third party platform. It is not about intent. Any system that processes plaintext data can become an exposure point if it is misconfigured or compromised.

That inference-time plaintext exposure is the privacy gap that makes risk and compliance teams cautious, and that slows or blocks some of the highest value FSI use cases.

How Stained Glass Transform changes the picture

The central part of the webinar was how Protopia’s Stained Glass Transform (SGT) removes plaintext from the shared or managed environment while keeping LLMs usable.

Using SanctifAI’s advisor scenario as an example, consider a family asking, “What are our estimated taxes at this person’s death?” and providing:

  • Trust and estate planning documents
  • Balance sheets with detailed financial positions

With SGT in place, the flow looks different:

    1. Inside the institution’s or SanctifAI’s environment, documents and prompts remain inside the root of trust.
    2. SGT runs as a small neural network that converts the combined input into stochastic embeddings. These embeddings preserve the signal the model needs but cannot be reversed into readable text.
    3. Only the transformed embeddings are sent to the inference endpoint. The LLM, such as Llama served with vLLM or another supported host, consumes those embeddings without modifications to model weights. The hosting environment never sees plaintext input.
    4. The model response is returned in protected form. Only the original environment that initiated the request has the keys needed to render the final plaintext answer to the user.

This flow is called round trip protection. Sensitive content never appears in plaintext on shared or managed compute, yet the model still generates useful answers based on the underlying information.
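The shape of this round-trip flow can be illustrated with a small toy. To be clear, this is not Protopia's SGT algorithm; the transform below is an invented stand-in, and its only purpose is to show what the hosting side does and does not see: vectors leave the root of trust, plaintext never does, and the same prompt never produces the same vector twice.

```python
# Conceptual toy of the round-trip flow. NOT Protopia's actual SGT algorithm;
# it only illustrates the shape of the data at each hop.
import hashlib
import random

def toy_transform(text: str, secret: bytes) -> list[float]:
    """Hypothetical stand-in for SGT: derive a stochastic embedding
    from the prompt using a secret seed plus fresh randomness."""
    seed = hashlib.sha256(secret + text.encode()).digest()
    rng = random.Random(seed)
    # Fresh noise is added on every call, so even identical prompts
    # never yield identical vectors (the "stochastic" in stochastic embeddings).
    return [rng.gauss(0, 1) + random.gauss(0, 0.1) for _ in range(16)]

secret = b"held-only-inside-the-root-of-trust"  # hypothetical secret
prompt = "What are our estimated taxes at this person's death?"

embedding = toy_transform(prompt, secret)

# What leaves the trusted environment is the vector, never the prompt text.
assert prompt not in repr(embedding)

# Two transforms of the identical prompt differ.
assert toy_transform(prompt, secret) != toy_transform(prompt, secret)
```

In the real system the embeddings remain useful model inputs, which is the hard part this toy does not attempt; the point here is only the boundary: readable text stays on one side of it.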

From an implementation standpoint:

  • SGT runs as a container in the inference pipeline
  • Applications call an SGT proxy that exposes an OpenAI-compatible API on the left and forwards prompt embeddings to the model endpoint on the right
  • The transform adds only a small amount of latency relative to the overall model generation time

This lets institutions and builders continue using their preferred models, platforms, and RAG components, while removing plaintext exposure from the hosting layer.
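Because the proxy speaks the OpenAI-compatible chat-completions API, the integration change for an application is essentially swapping the base URL. The sketch below shows that a standard request body needs no changes; both URLs are hypothetical placeholders:

```python
# Sketch of the integration point: an OpenAI-style request body is the same
# whether it goes to the model host directly or through the SGT proxy.
# Both URLs below are hypothetical placeholders, not real endpoints.
import json

SGT_PROXY_URL = "http://sgt-proxy.internal:8080/v1/chat/completions"  # hypothetical
MODEL_URL = "https://api.example-inference.com/v1/chat/completions"   # hypothetical

def build_chat_request(model: str, user_prompt: str) -> str:
    """Build a standard OpenAI-style chat-completions request body."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
    })

body = build_chat_request("meta-llama/Llama-3.1-8B-Instruct",
                          "Summarize the key terms of this trust.")

# The application posts `body` to SGT_PROXY_URL instead of MODEL_URL; the
# proxy transforms the prompt and forwards embeddings to the model host.
payload = json.loads(body)
print(payload["messages"][0]["role"])  # → user
```

Keeping the wire format unchanged is what allows existing RAG components and client SDKs to stay in place while the plaintext boundary moves.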

What this enables for risk teams and builders

For risk and compliance teams, SGT becomes a reusable control rather than a one off exception.

Jeremiah described the effect this way:

“Once the solution is applied to the enterprise, the entire GenAI risk profile across subsequent use cases is really brought down. It might not take a high risk use case to a low risk, but it is one less thing for legal, cyber, tech risk, compliance, and the line of business teams to be concerned about.”

That has several practical implications:

  • Subsequent use cases that follow the same data-safe inference pattern can move more quickly
  • Institutions without large, specialized AI risk teams can adopt a stronger baseline for protecting inference workloads
  • Explainability and auditability remain intact, since the original inputs and outputs still reside in the root of trust and can be traced when needed

For builders like SanctifAI, the impact shows up in product design and sales conversations. As Brad put it:

“It changes the conversation and it makes it easier. Protopia is making it easier. Our new data privacy and security slide should just be Protopia.”

Instead of defending a bespoke dedicated deployment for every client, they can present a clear message:

  • The provider does not need plaintext access to customer data at inference time
  • Shared or serverless deployments are compatible with privacy requirements
  • Cost and performance can be aligned with how customers actually use the application

This gives institutions a way to work with innovative AI applications while keeping sensitive data under stronger control.

Key takeaways

1. You do not have to choose between privacy and infrastructure efficiency

With Stained Glass Transform, financial institutions can use sensitive data with LLMs on shared or serverless infrastructure while removing plaintext exposure at inference time.

2. A repeatable pattern accelerates approvals

Once risk and compliance teams approve a data-safe inference architecture, they can apply it across multiple GenAI use cases rather than starting from scratch each time.

3. Builders can support enterprise requirements without dedicated stacks for every customer

Startups and internal teams can serve multiple tenants on shared platforms and still meet strict privacy and governance expectations.

4. Auditability is preserved

SGT does not break the link between inputs and outputs inside the institution’s environment, which supports explainability and regulatory review.

Watch the full webinar

This recap provides a high-level view of the discussion. The full session includes:

  • A deeper technical walkthrough from Eiman of how SGT fits into modern LLM stacks
  • SanctifAI’s product example for multi generational wealth planning
  • Audience questions on latency, Langflow integration, and working with legacy systems

Watch the full webinar: “Unlocking Sensitive Data for AI in Financial Services”

If you are exploring how to move GenAI from pilot to production in financial services while protecting sensitive data, contact our team to schedule a technical discussion.
