
What Does Zero Data Retention Mean for SaaS Integrations?

Learn what zero data retention means for SaaS integrations, why sync-and-store APIs fail enterprise security reviews, and how pass-through architectures unblock deals.

Sidharth Verma · 13 min read

Your enterprise deal just stalled in procurement. The buyer's InfoSec team reviewed your vendor risk assessment and flagged a massive problem: your integration middleware caches their sensitive HRIS records and CRM contacts on shared infrastructure. They classified it as an unmanaged sub-processor, refused to sign the Business Associate Agreement (BAA), and the deal is effectively dead.

If you sell B2B SaaS to enterprise clients, healthcare organizations, or financial institutions, this scenario isn't hypothetical—it happens every week. Integration compliance is a binary go/no-go for revenue. The middleware that helped you ship basic integrations quickly in the SMB market will actively disqualify you upmarket. To pass strict InfoSec reviews and unblock revenue, you need an architecture that processes data in transit without ever writing it to a database.

Zero Data Retention (ZDR) for SaaS integrations means that your integration middleware processes third-party API payloads entirely in memory and never writes customer data to persistent storage. The payload enters, gets transformed into a normalized format, gets delivered to your application, and is immediately discarded. No cache. No replica. No 30-day retention window.

This guide breaks down exactly what ZDR means for integrations, why traditional sync-and-store architectures fail enterprise security audits, and how to build a stateless pass-through proxy that keeps your compliance footprint small.

What is Zero Data Retention (ZDR) in SaaS Integrations?

When evaluating integration architectures, the keyword you'll hear from every security team is "data at rest."

The term "zero data retention" has been popularized by AI API providers, but the concept applies directly to any middleware that touches your customers' sensitive data. Anthropic defines ZDR for its Claude API as an arrangement where "customer data is not stored at rest after the API response is returned, except where needed to comply with law or combat misuse." OpenRouter puts it even more simply: "Zero Data Retention (ZDR) means that a provider will not store your data for any period of time."

Apply that same principle to integration middleware and the definition becomes concrete:

  • API payloads (CRM contacts, HRIS employee records, financial transactions) are processed in transit
  • No persistent storage — no database tables, no object storage buckets, no disk-based caches hold your customers' data
  • Transformation happens in memory — field mapping, schema normalization, and pagination assembly all occur without writing intermediate state
  • Credentials are encrypted at rest, but the data flowing through the pipe never touches persistent storage
Info

Zero Data Retention (ZDR) Definition: A data processing standard where an integration layer processes information entirely in memory. The system routes, transforms, and delivers the payload to its final destination without ever writing the data to a hard drive, database, or persistent cache.

Consider a concrete example: your SaaS application pulls a list of employees from a customer's Workday instance. A ZDR integration proxy fetches that data, normalizes the JSON payload in memory, and hands it directly to your application backend. If the middleware provider's servers were physically seized five seconds later, there would be zero trace of that customer's employee data on the disks.

This is a fundamentally different architecture from the traditional "sync-and-store" model where integration platforms poll APIs on a schedule, dump results into a database, and serve cached records to your application.

Why Enterprise InfoSec Teams Demand Zero Data Retention

The financial math behind enterprise security scrutiny is straightforward. IBM's Cost of a Data Breach Report 2024 found the global average breach hit a record USD 4.88 million—a 10% increase from 2023 and the largest spike since the pandemic. For the 14th year in a row, healthcare saw the costliest breaches across industries, with average breach costs reaching USD 9.77 million.

These numbers explain why every system that stores customer data is a potential breach surface, and every breach surface gets scrutinized during procurement.

When a SaaS company uses an integration platform that stores customer data, that SaaS company inherits the security posture of the middleware provider. If the integration platform suffers a breach, the SaaS company's customers are compromised. Enterprise InfoSec teams understand this chain of liability intimately. They actively seek to minimize the number of sub-processors that handle their data at rest.

The moment your integration vendor becomes a sub-processor, you inherit their compliance obligations:

  • You need a Data Processing Agreement (DPA) with them
  • They appear on your sub-processor list, which your enterprise customers review
  • Their SOC 2 report, penetration test results, and data residency policies all come under scrutiny
  • If they store data in a region your customer's policy prohibits, the deal is dead
  • If the data includes Protected Health Information (PHI), the integration platform must sign a BAA—and many developer tools flatly refuse to do this

A ZDR integration architecture eliminates this entire category of risk. If the middleware never stores customer data, it's not a sub-processor in the traditional sense—it's a conduit. The compliance conversation shrinks dramatically.

The SIG Core Questionnaire and the Sub-Processor Trap

When you move your SaaS integration strategy upmarket, procurement teams rely heavily on standardized risk assessments. The most common is the Standardized Information Gathering (SIG) questionnaire published by Shared Assessments.

The SIG Core Questionnaire is a comprehensive third-party risk assessment designed to evaluate vendors that store or maintain sensitive, regulated information. It covers 21 risk topics across hundreds of questions.

Domain 10—Third-Party Risk Management—is where integration deals go to die.

Warning

The Tripwire Question: Somewhere in the SIG Core assessment, you will be asked: "Does any third-party sub-processor store, cache, or replicate our data?"

If your application relies on a traditional integration platform to sync data between your SaaS and the customer's internal systems, the answer is yes. Here's how the trap plays out in practice:

  1. Your B2B SaaS product integrates with your customer's Salesforce, Workday, or QuickBooks
  2. You use an integration middleware vendor that syncs data on a schedule and caches records in their database
  3. The buyer's InfoSec team asks about your sub-processors
  4. You disclose that your integration vendor stores their CRM contacts and HRIS records on shared infrastructure
  5. InfoSec flags this as an unmanaged sub-processor with an unacceptable data footprint
  6. The deal stalls for weeks—or months—while your vendor scrambles to provide a DPA, BAA, and acceptable answers to follow-up questions

By using an integration tool with a pass-through architecture, you bypass this trap entirely. Because the middleware does not store the data, it's classified as a conduit rather than a data custodian. You answer the question differently: "Our integration layer processes data in transit. No customer data is stored, cached, or replicated by any sub-processor." The follow-up questions disappear, and you can pass enterprise security reviews in days instead of months.

Evaluating Unified APIs: Sync-and-Store vs. Real-Time Pass-Through

When engineering teams evaluate unified APIs, they often treat them as interchangeable. They aren't. The industry is split between two fundamentally different architectures, and the distinction determines your compliance posture.

The Sync-and-Store Model

Most first-generation unified APIs use a polling and caching model:

sequenceDiagram
    participant App as Your App
    participant MW as Integration<br>Middleware
    participant DB as Middleware<br>Database
    participant API as Third-Party<br>API (e.g. Salesforce)

    MW->>API: Poll for new/updated records (scheduled)
    API-->>MW: Return records
    MW->>DB: Write records to cache
    App->>MW: GET /contacts
    MW->>DB: Read from cache
    DB-->>MW: Return cached records
    MW-->>App: Serve cached response

The data sits in the middleware's database. It's typically retained for 30 to 60 days, replicated across availability zones, and backed up. When your application requests data, you aren't actually querying the third-party API—you're querying the unified API provider's database.

The problems compound quickly:

  • Massive compliance footprint: They store full copies of your customers' data. This fails the SIG Core questionnaire data storage requirements outright.
  • Stale data: Because data is synced on a schedule (every 5 to 60 minutes), your application is always reading stale data. If a user updates a record in Salesforce, it won't reflect in your app until the next sync cycle completes.
  • Faked webhooks: Many of these platforms simulate webhooks by diffing their database against the source API during syncs, leading to delayed and sometimes missing events.
  • Data residency headaches: If the cache is in one region and your customer demands storage in another, you have a problem you can't easily solve.

The trade-off: Sync-and-store gives you faster read latency and the ability to run complex queries across records. For some use cases—analytics dashboards, bulk data processing—this trade-off makes sense.

The Real-Time Pass-Through Model

sequenceDiagram
    participant App as Your App
    participant MW as Integration<br>Middleware
    participant API as Third-Party<br>API (e.g. Salesforce)

    App->>MW: GET /contacts
    MW->>API: Forward request (with auth, mapping)
    API-->>MW: Return raw response
    Note over MW: Transform in memory<br>(field mapping, normalization)
    MW-->>App: Return unified response

No data is stored. The middleware acts as a real-time proxy: it receives your request, translates it into the third-party's native format, makes the API call, transforms the response into a unified schema in memory, and returns it. The entire lifecycle happens in a single request/response cycle.

  • True Zero Data Retention: Data is never written to disk at any point in the request lifecycle.
  • Real-time accuracy: You always interact with the live state of the third-party system. No sync delays.
  • Enterprise ready: Procurement teams approve these architectures rapidly because there is no persistent shadow database to audit.
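The pass-through lifecycle described above can be sketched as a single function. This is an illustrative sketch, not any vendor's actual SDK: `passThrough`, `fetchProvider`, and `toUnified` are hypothetical names, and a stub stands in for the real Salesforce call.

```javascript
// Minimal sketch of a ZDR pass-through request lifecycle.
// All names here are illustrative, not a specific vendor's API.
async function passThrough(request, { fetchProvider, transform }) {
  // 1. Forward the request to the third-party API (auth injection omitted)
  const raw = await fetchProvider(request);
  // 2. Transform in memory — no intermediate write to disk or database
  const unified = transform(raw);
  // 3. Return directly; `raw` becomes garbage-collectable here
  return unified;
}

// Demo with a stubbed provider standing in for a live Salesforce call
const stubProvider = async () => ({
  records: [{ FirstName: "Jane", LastName: "Doe" }],
});
const toUnified = (raw) =>
  raw.records.map((r) => ({ first_name: r.FirstName, last_name: r.LastName }));

passThrough({ path: "/contacts" }, { fetchProvider: stubProvider, transform: toUnified })
  .then((contacts) => console.log(contacts[0].first_name)); // logs "Jane"
```

The key property is that nothing outside the function's scope ever holds the payload: once the response is returned, no reference to the raw data survives.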
Warning

Be honest with yourself about which model you actually need. If your product requires running analytical queries across thousands of CRM records, a pass-through proxy alone won't cut it. But if you're reading and writing individual records or small lists—syncing records on user action, pulling employee data during onboarding, reading CRM context for an AI agent—real-time pass-through is almost always the better architecture for enterprise sales.

For a deeper comparison, see Tradeoffs Between Real-time and Cached Unified APIs.

How a Pass-Through Integration Architecture Actually Works

A ZDR pass-through architecture has three layers, each designed to avoid persisting customer data:

graph TD
  A[Your SaaS Application] -->|Unified Request| B[Integration Proxy Engine]
  B --> C[Load Auth Context & Config<br>From Metadata Store]
  B --> D[Fetch Live Data from<br>Third-Party API]
  D --> E[Provider Returns<br>Native JSON Response]
  E --> F[In-Memory JSONata<br>Transformation Engine]
  F --> G[Normalized Unified Payload]
  G -->|Direct Response| A
  G -.-> H[Garbage Collection<br>Payload Destroyed]

1. Credential Storage (Encrypted, Not Customer Data)

The middleware does store OAuth tokens, API keys, and connection metadata. This is necessary to authenticate against third-party APIs on your behalf. But credentials are not customer data—they're access tokens that get encrypted at rest and rotated automatically.

A well-designed system refreshes OAuth tokens proactively—shortly before they expire—so API calls never fail due to stale credentials. If a refresh fails, the account is flagged for re-authorization and a webhook event is fired to your application. No customer payload data is read from or written to the database at this layer.
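The proactive-refresh check can be reduced to a small predicate: treat a token as expired once it enters a skew window before its real expiry, so in-flight API calls never race an expiring credential. This is a hypothetical helper — the field name `expiresAtMs` and the five-minute skew are assumptions, not a specific platform's contract.

```javascript
// Hypothetical proactive-refresh check. A token is refreshed once it is
// within `skewMs` of expiry, rather than waiting for a live call to fail.
function needsRefresh(token, nowMs = Date.now(), skewMs = 5 * 60 * 1000) {
  return token.expiresAtMs - nowMs <= skewMs;
}

// A token with 2 minutes left gets refreshed ahead of time...
needsRefresh({ expiresAtMs: Date.now() + 2 * 60 * 1000 }); // → true
// ...while one with an hour left is used as-is
needsRefresh({ expiresAtMs: Date.now() + 60 * 60 * 1000 }); // → false
```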

2. Declarative Transformation via JSONata (In-Memory)

The hardest part of a unified API is mapping provider-specific fields to a common model. In a sync-and-store system, this transformation happens during the sync job and the result is written to a database. In a pass-through system, transformation happens in memory during the request lifecycle.

A declarative mapping language like JSONata evaluates expressions against the raw API response and produces the unified output—without ever writing intermediate results to disk:

// Inside an async request handler — the jsonata npm package compiles the mapping
const jsonata = require("jsonata");

// Raw Salesforce response (in memory)
const raw = {
  FirstName: "Jane",
  LastName: "Doe",
  Email: "jane@acme.com",
  Account: { Name: "Acme Corp" }
};

// JSONata expression (stored as config, not customer data)
const mapping = `{
  "first_name": FirstName,
  "last_name": LastName,
  "email": Email,
  "company_name": Account.Name
}`;

// Evaluated in memory, returned directly to the caller
// (evaluate is async in jsonata v2)
const unified = await jsonata(mapping).evaluate(raw);
// { first_name: "Jane", last_name: "Doe", email: "jane@acme.com", company_name: "Acme Corp" }
// Raw response is garbage-collected after the request completes

The mapping configuration is stored—it's platform config, not customer data. The payload that flows through it is never persisted. Because JSONata is side-effect free and evaluates entirely within the execution context, the transformation completes without requiring temporary database tables or persistent caching.

3. Webhook Forwarding (Process and Discard)

Inbound webhooks from third-party platforms follow the same principle: the middleware receives the webhook payload, verifies its signature, transforms it into a unified event format, and forwards it to your registered endpoint. The raw payload is not written to a database.

The honest caveat: webhook delivery is harder to make reliable without some form of intermediate storage. If your endpoint is down when the webhook arrives, a pure ZDR system can't replay it from its own storage. The practical solution is to use a transient message queue with a very short TTL (seconds to minutes) for delivery retries, then discard the message. This is a design trade-off worth understanding before you commit to a fully ZDR architecture.

What ZDR Does Not Cover (And Why That Matters)

Being precise about what ZDR means also requires being precise about what it doesn't mean:

| What ZDR Covers | What ZDR Does Not Cover |
| --- | --- |
| Third-party API payloads (contacts, employees, invoices) | OAuth tokens and connection credentials |
| Request/response bodies flowing through the middleware | API call logs and metadata (timestamps, status codes, latency) |
| Intermediate transformation state | Mapping configurations and integration definitions |
| Webhook payloads from third parties | Your own application's storage of integrated data |

Zero data retention does not automatically mean "no data ever exists." It refers specifically to storage practices after processing.

A ZDR integration vendor still stores your configuration—which integrations you've connected, what mappings you've defined, what OAuth apps you've registered. What it doesn't store is the actual CRM contacts, employee records, or financial transactions that flow through the pipe. That's the distinction that matters for procurement.

This approach minimizes your attack surface, reduces compliance scope, and virtually eliminates the risk of data breaches within the automation layer.

The Honest Trade-Offs of a Pass-Through Architecture

Being radically honest about architectural decisions is essential for engineering teams. While a zero-storage pass-through architecture solves your enterprise compliance blockers, it introduces specific engineering trade-offs you must design around.

1. Provider Rate Limits Are Real

Because you're not querying a middleware cache, every request hits the third-party provider's API directly. If you barrage a customer's HubSpot instance with 10,000 requests a minute, HubSpot will rate-limit you. You must build intelligent queuing and exponential backoff into your own application layer to respect provider limits.
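A common way to respect provider limits in your own application layer is exponential backoff with jitter. The sketch below is one illustrative policy, not a vendor API: the function names, the 429-only retry rule, and the default base/cap values are all assumptions you would tune per provider.

```javascript
// Exponential backoff with "full jitter": each retry waits a random
// duration up to an exponentially growing ceiling, capped at capMs.
function backoffDelayMs(attempt, { baseMs = 500, capMs = 30_000 } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(Math.random() * ceiling);
}

// Retry wrapper: retries only on HTTP 429 (rate limited),
// surfaces every other error immediately.
async function withRetries(fn, { maxAttempts = 5, baseMs = 500 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (err.status !== 429 || attempt === maxAttempts - 1) throw err;
      await new Promise((r) => setTimeout(r, backoffDelayMs(attempt, { baseMs })));
    }
  }
}
```

Jitter matters because many tenants retrying on the same schedule would otherwise hammer the provider in synchronized waves.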

2. Network Latency Is Variable

A pass-through proxy adds a minor network hop. More importantly, the response time of your API call is entirely dependent on the speed of the third-party provider. If an older on-premise ERP takes 4 seconds to return a query, your application will wait 4 seconds. You cannot fall back on a sub-millisecond local cache. Engineers must design their systems asynchronously to handle variable third-party response times.

3. Search and Filtering Capabilities Are Constrained

When you query a cached database, you can use complex SQL joins and filters. In a pass-through model, your filtering capabilities are limited to what the third-party API natively supports. If the provider's API doesn't support filtering by a specific custom field, the proxy cannot magically invent that capability without pulling all records into memory—which defeats the purpose of an efficient API.

4. Webhook Replay Requires Your Own Storage

As noted above, if your endpoint is down when a webhook arrives, a ZDR middleware can't replay from its own logs. You need to handle missed events through reconciliation on your side—periodic polling to catch anything your webhook handler missed.
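Reconciliation can be as simple as a scheduled job that asks the provider for anything modified since the last successful sync and re-applies it. In this sketch, `listContacts`, the ISO-timestamp cursor, and the event shape are all hypothetical; the one real requirement is that your event handler be idempotent.

```javascript
// Reconciliation sketch: recover webhooks missed during downtime by
// polling for records modified since the last successful sync.
// `listContacts` and the cursor/event shapes are hypothetical.
async function reconcile(listContacts, lastSyncedAt, applyEvent) {
  const missed = await listContacts({ updatedAfter: lastSyncedAt });
  for (const record of missed) {
    // applyEvent must be idempotent: re-applying a record already
    // processed via a live webhook should be a no-op upsert
    applyEvent({ type: "contact.updated", record });
  }
  return new Date().toISOString(); // becomes the next cursor
}
```

Run this on a cadence matched to your tolerance for missed events (e.g. hourly); because events are idempotent upserts, overlap between webhooks and reconciliation is harmless.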

What to Ask Your Integration Vendor

Before you sign with any integration middleware provider, ask these five questions:

  1. "Do you cache or replicate third-party API responses on your infrastructure?" — If yes, for how long? In which regions? Under what retention policy?
  2. "Will you appear as a sub-processor on our customer's data processing agreements?" — If the vendor stores data, the answer is almost certainly yes.
  3. "Can you provide a SOC 2 Type II report that covers your data handling practices?" — SOC 2 without ZDR just proves they securely store data. SOC 2 with ZDR proves they don't store it at all.
  4. "What happens to a webhook payload if my endpoint is unreachable?" — This reveals whether they have intermediate storage for retries.
  5. "Can we get a BAA signed for HIPAA-covered data?" — If you're in healthtech, this is non-negotiable. A vendor with true ZDR makes the BAA conversation far simpler.

Unblock Enterprise Revenue with Zero Storage

ZDR is not a feature checkbox—it's an architectural decision that ripples through your compliance posture, your enterprise sales velocity, and your engineering team's ability to ship integrations without a six-week security review.

If you're evaluating integration vendors today, start by mapping your data flows. Identify every point where customer data could be persisted by a third party. Then ask whether each persistence point is necessary or just an artifact of a sync-and-store architecture that made sense five years ago but creates compliance drag today. This architectural shift is exactly why Truto is the best zero-storage unified API for compliance-strict SaaS.

If you need an integration tool that doesn't store customer data, look past the marketing pages of legacy API aggregators and examine their underlying architecture. Enterprise procurement teams will not compromise on data security. By adopting a true zero data retention architecture, you eliminate the friction of sub-processor audits, protect your customers from expanded attack surfaces, and empower your sales team to close upmarket deals without InfoSec blockers.

Info

Honest trade-off: If you need your integration platform to run scheduled sync jobs that write data to your data store, Truto supports that too. In that model, data flows through Truto to your infrastructure—Truto still doesn't retain it, but the sync process does involve buffering records in transit. The ZDR guarantee applies to Truto's infrastructure, not to the destination you configure.

Frequently Asked Questions

What does zero data retention mean for integrations?
It means third-party API data is processed entirely in memory during transit and is never written to a persistent database or cache. Once the response is returned to the calling application, the payload is destroyed by the runtime's garbage collector.
Why do enterprise security teams care about integration data storage?
Enterprise procurement teams use vendor risk assessments like the SIG Core questionnaire to check if third-party sub-processors store or cache sensitive data. If your integration middleware caches CRM or HRIS records, it becomes an unmanaged sub-processor that can stall or kill deals.
What is the difference between sync-and-store and pass-through integration architectures?
Sync-and-store architectures poll third-party APIs on a schedule and cache data in a database. Pass-through architectures proxy each request in real time, transform it in memory using declarative languages like JSONata, and return it directly — never writing customer data to disk.
Is zero data retention the same as data minimization?
They're related but distinct. Data minimization means collecting only what you need. Zero data retention goes further — it means the middleware never persists customer data at all, even temporarily, reducing your compliance footprint to near zero.
Does zero data retention work for webhooks and real-time sync?
Yes, but it requires careful design. Inbound webhooks can be verified, transformed, and forwarded in a single request lifecycle without writing the payload to a database. The trade-off is that you lose the ability to replay events from the middleware layer if delivery fails.
