DataGrail report finds your vendor may be sending data to AI models you never approved

The data processing agreement (DPA) — the bedrock contract companies use to evaluate how vendors handle personal data — can no longer be trusted at face value. That is the central, and arguably most alarming, conclusion of DataGrail’s Privacy and AI Trends Report 2026, released today.

The San Francisco-based privacy platform analyzed 2,400 popular business software providers and found that 63.6% of vendors that prominently advertise AI capabilities do not disclose a third-party AI subprocessor in their legal documentation. The implication: the majority of companies purchasing AI-enabled software may be unknowingly exposing their customers’ data to AI models and pipelines they never reviewed, never approved, and may not even know exist.

“All software vendors are trying to move to become AI vendors, which makes sense, but the technologies are moving faster than AI governance can actually keep up,” DataGrail co-founder and CEO Daniel Barber told VentureBeat in an exclusive interview ahead of the report’s release. “The DPA should be the reliable document that teams use to evaluate AI risk, but based on that number, that’s not enough in 2026.”

The finding lands in an enterprise landscape where organizations with high levels of shadow AI already experience average breach costs of $4.63 million, which is $670,000 more than those with low or no shadow AI, according to IBM’s 2025 Cost of a Data Breach Report. It also follows a year in which U.S. states issued $3.425 billion in privacy-related fines, more than in the previous five years combined, a trend Gartner expects to accelerate through 2028.

How researchers uncovered the growing gap between AI vendor contracts and reality

DataGrail’s methodology for arriving at the 63.6% figure goes well beyond reading contracts. The company’s research team cross-referenced DPA disclosures against product documentation, GitHub environments, API connections, and marketing materials for each of the 2,400 vendors in its tracking universe.

Barber walked VentureBeat through the process: “We looked at the DPA as the baseline, but then what we also looked at is the GitHub environment, the API connections that a particular vendor has, the product documentation, the marketing documentation, and triangulate that information to discern — okay, so the DPA document says use OpenAI, but actually you’ve got these three AI subprocessors over here in your product documentation outlining features and functionality, but that is not reflected in your DPA.”
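The cross-referencing Barber describes amounts to a set comparison: whatever appears in a vendor’s product documentation, API connections, or marketing but not in its DPA is a candidate disclosure gap. The Python sketch below illustrates that idea only. The function name, the source labels, and the sample vendor are hypothetical, and it is not DataGrail’s actual tooling, which the report does not publish.

```python
def undisclosed_subprocessors(dpa_listed, observed_by_source):
    """Return AI subprocessors seen in other evidence but missing from the DPA.

    dpa_listed: names the DPA discloses.
    observed_by_source: dict mapping an evidence source (hypothetical labels
    such as "product_docs") to the names observed there.
    Result: dict mapping each undisclosed name to the sources that revealed it.
    """
    # Normalize names so "Anthropic" and "anthropic" compare equal.
    disclosed = {name.strip().lower() for name in dpa_listed}
    gaps = {}
    for source, names in observed_by_source.items():
        for name in names:
            key = name.strip().lower()
            if key not in disclosed:
                gaps.setdefault(key, set()).add(source)
    return gaps


# Hypothetical vendor: the DPA lists one provider, other evidence shows more.
example = undisclosed_subprocessors(
    ["OpenAI"],
    {
        "product_docs": ["OpenAI", "Anthropic"],
        "api_connections": ["Google Gemini", "anthropic"],
        "marketing": ["OpenAI"],
    },
)
print(example)
```

A real review would also need to reconcile aliases (a model name versus the company behind it) and weigh how reliable each source is, which simple name matching does not attempt.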

When asked directly about how confident he was that these gaps represent actual shadow AI risk rather than vendors using proprietary technology, Barber was unequivocal. “Very confident, because we looked at the sample of the 2,400 systems, and we spent a substantial amount of time actually looking at product documentation, GitHub environments, looking at actual API connections, because we integrate with these systems as well, so we know how they process personal information. It is from primary research.”

The disclosure gap matters because it undermines the entire chain of trust that privacy programs rely on. Consider a scenario Barber described: A company invests in an AI recruiting tool. The tool’s DPA lists Claude as its foundational model. The company dutifully performs a security review of Anthropic’s AI. But the recruiting tool also quietly uses OpenAI and Gemini behind the scenes — models the company never evaluated.

Those undisclosed models then process thousands of resumes and execute automated hiring decisions. The company, without knowing it, has exposed sensitive personal information — home addresses, financial data, possibly Social Security numbers — to AI systems it never vetted, potentially violating FTC regulations on automated decision-making in employment. “How those vendors are evaluating and performing that automated decision making could be really disastrous for a business,” Barber said.

One-third of AI systems also process sensitive data, and the true number is likely higher

The disclosure gap alone would be concerning enough. But DataGrail’s report layers on another finding that makes the problem materially worse: 32.8% of AI systems that disclose AI capabilities also disclose at least one other high-risk activity, such as processing sensitive personal information or powering automated decision-making. Among AI systems with self-reported risk factors, 47.1% process personal data, 20.7% have the potential to power automated decision-making, 16.5% process sensitive data categories like health or financial information, and 7.5% process biometric data.

The report argues these figures almost certainly undercount actual exposure, since they reflect only what vendors have formally disclosed. Vendors could underreport access to personal data, and the inherent flexibility of AI means even good-faith vendors might not predict riskier user applications of their tools.

This has immediate regulatory implications. The CCPA’s new risk assessment rule, effective January 1, 2026, obliges businesses to conduct and document risk assessments for processing activities that present significant privacy risks. Businesses will have to submit those assessments to CalPrivacy by April 2028, with executive attestation under penalty of perjury.

Processing sensitive personal information with AI, or using AI for automated decision-making, are precisely the activities that trigger this obligation. The report also notes that 42% of companies abandoned AI initiatives in 2025, with data privacy concerns cited as a primary obstacle, a statistic it sources to S&P Global research. Privacy teams that engage early with AI projects, Barber argues, can prevent that waste by ensuring safeguards are in place before launch, with AI risk assessments serving as the right starting point.

Why consent management became 2025’s most punished privacy failure

While shadow AI is still a newer category of threat, the report makes clear that traditional privacy challenges have not eased — they have intensified. Consent management was the busiest enforcement topic of 2025. California alone publicly reported $4.3 million in CCPA consent settlements, and 2025 saw over 1,400 class action wiretapping suits driven by private firms investigating tracking pixels and session replay software.

Despite this enforcement wave, 63% of the 5,000 websites DataGrail audited still fail to comply with universal opt-out mechanisms such as the Global Privacy Control signal. While that figure represents an improvement from 75% non-compliance in 2023, the pace of improvement is slow relative to the acceleration in enforcement.

Barber pointed to the case of Todd Snyder, the menswear retailer that the California Privacy Protection Agency fined $345,178 in May 2025, as evidence that enforcement is no longer reserved for big tech. “This is a business that has two or three stores across the U.S. They have 300 employees,” he said. “They run tight margins because they’re a consumer menswear clothing store.”

The California Attorney General also reached a $2.75 million settlement with Disney over failures to honor opt-out signals, while the California Privacy Protection Agency has brought enforcement actions against PlayOn Sports and Ford — a pattern that demonstrates both the breadth and depth of regulatory activity. Among the trackers that fire even after a user sends a GPC signal, the report found that 27.1% come from Google Analytics and 43.8% are for targeted advertising via platforms like Meta and Microsoft.
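For readers unfamiliar with how the signal works: under the Global Privacy Control proposal, a browser expresses the opt-out by sending the HTTP request header Sec-GPC with the value 1. The Python sketch below shows one way a site could check for that header before deciding which trackers to load. The helper names and the "essential" purpose label are assumptions made for illustration, and deciding which trackers actually count as a sale or sharing of data is a legal judgment this code does not make.

```python
def has_gpc_signal(headers):
    """Return True when the request carries the header 'Sec-GPC: 1'.

    HTTP header names are case-insensitive, so compare in lowercase.
    """
    for name, value in headers.items():
        if name.lower() == "sec-gpc":
            return value.strip() == "1"
    return False


def trackers_to_load(headers, trackers):
    """Pick the tracker names to fire for this request.

    trackers: list of (name, purpose) pairs. The purpose labels are
    hypothetical; here only "essential" trackers survive an opt-out.
    """
    if has_gpc_signal(headers):
        return [name for name, purpose in trackers if purpose == "essential"]
    return [name for name, _ in trackers]


# Hypothetical site configuration.
site_trackers = [
    ("session_cookie", "essential"),
    ("web_analytics", "analytics"),
    ("ad_pixel", "advertising"),
]
print(trackers_to_load({"Sec-GPC": "1"}, site_trackers))
print(trackers_to_load({}, site_trackers))
```

The failures the report describes correspond to the second branch running even when the header is present.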

For users who do engage with consent banners, 48.3% click “Accept all,” while only 12.4% select “Essential only” and 2.3% customize their preferences. A full 37% simply exit the banner without making a selection. The practical takeaway: fewer than 15% of users (the 12.4% who select “Essential only” plus the 2.3% who customize, or 14.7% combined) make a conscious choice to opt out of tracking, which means consent banners present relatively low business risk when properly configured but enormous regulatory risk when they are not.

Data deletion requests surge 567% as the cost of manual processing hits $1.5 million a year

Data subject request volume hit an all-time high for the fifth consecutive year. Deletion requests have surged 567% since 2021 and now represent 87% of all data subject requests. Access requests, by contrast, have gradually declined as consumers skip visibility and reach straight for the delete button.

The cost is staggering. For a mid-sized organization receiving 5 million annual web visitors, the report estimates manual DSR management now runs approximately $1.5 million per year, based on Gartner’s estimated cost of $1,524 per manual DSR. The average cost has climbed from $238,000 in 2021 to $1.51 million in 2025 — a trajectory that makes manual processing not just inefficient but, as the report argues, “irresponsible.”
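The arithmetic behind those figures can be checked directly. The Python sketch below uses only numbers quoted above (the $1,524 per-request cost and the $238,000 and $1.51 million annual totals). The function names are illustrative, and the implied request count is a back-calculation that assumes the annual total is simply requests multiplied by the per-request cost, not a figure the report states.

```python
COST_PER_MANUAL_DSR = 1_524  # USD per manual request, the Gartner estimate cited above


def annual_manual_dsr_cost(requests_per_year):
    """Estimated yearly cost of handling every request manually."""
    return requests_per_year * COST_PER_MANUAL_DSR


def implied_requests(annual_cost):
    """Back out how many manual requests a given annual cost would represent."""
    return annual_cost / COST_PER_MANUAL_DSR


# Roughly how many requests per year a $1.51 million bill implies.
requests_2025 = round(implied_requests(1_510_000))

# How many times larger the 2025 cost is than the 2021 cost.
growth_multiple = round(1_510_000 / 238_000, 2)

print(requests_2025, growth_multiple)
```

Under that assumption, the 2025 total corresponds to roughly 990 requests a year for a site with 5 million annual visitors, and the annual cost has grown a little more than sixfold since 2021.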

Barber emphasized that these numbers reflect verified human requests with bot and spam traffic excluded, and that data broker scenarios — which will see their own massive influx of requests under California’s Delete Act — are reported separately. “That is a natural increase,” Barber told VentureBeat. “If you’ve now got 20-plus U.S. states with privacy regulation, it’s unlikely that we see a federal bill passed, even though we’ve seen one proposed. And while we don’t see federal awareness and regulation, we do see at the state level over 20 states, and that may actually increase awareness for the consumer even more.”

He added a telling detail about how businesses are responding in practice: “99% of DataGrail customers do process that deletion” even for residents of states without privacy laws, “simply because it’s too hard at this point. Discerning and even communicating to the person, ‘Hey, you live in Montana, sorry, you’re just in an unfortunate state without regulation’ — you just can’t do that.” Data brokers felt the impact most acutely, with a 398% increase in deletion requests compared to 2024 and an average of over 2,000 deletion requests handled per month.

State regulators issued $3.4 billion in privacy fines last year, and both parties want more

The regulatory landscape underpinning all of these trends has fundamentally shifted from education to punishment. Nearly half of U.S. states now have a comprehensive privacy law in effect, plus over 160 AI-specific laws. State legislatures enacted 145 AI-related laws in 2025 alone, with another thousand introduced or reworked. According to Gartner, over 50% of the U.S. population is now covered by a comprehensive state privacy law, with 24 additional states expected to pass laws within five years. States have also begun pooling their resources, with ten forming the Consortium of Privacy Regulators last year and pledging to coordinate investigations across state lines.

Barber argued that privacy enforcement is fundamentally bipartisan, which insulates it from the shifting political winds of the current administration. “Privacy overall is a pretty bipartisan issue,” he said. “It’s easy to pass privacy regulation because constituents somewhat expect privacy in their day-to-day living. If you were flying on an airline and they said, ‘Okay, this seat, if you want your privacy, you’re going to have to pay $6 more,’ you’re like, ‘I’m going to go to another airline.’ It’s an expected part of a transaction at this stage.”

He predicted that other states will replicate California’s enforcement model. “California has their enforcement division, CalPrivacy. That group has one task: to ensure enforcement of privacy throughout businesses. Is it likely that we see other states get funding and support to fund these types of groups? Highly likely. The enforcement fines — the actual payments — go back to us as constituents. That type of model, you could imagine, being very popular across the country.”

Privacy teams are losing a third of their staff just as AI governance demands explode

Perhaps the most paradoxical finding in the report is that privacy teams lost as much as 33% of their headcount last year, even as their workloads expanded across every metric the report tracks. Cisco data cited in the report shows that 90% of privacy programs expanded in 2025 due to AI, while only 12% of AI governance programs are considered mature. Meanwhile, 74% of privacy teams planned to apply AI to privacy-related tasks in 2026, according to ISACA’s State of Privacy 2026 survey.

Barber sees this as part of a broader macroeconomic pattern rather than a sign that organizations do not value privacy. “It’s actually a fascinating macro trend, and probably one you’ve seen across all functions,” he said. “Businesses are driving more efficiency in all parts of the business. Privacy teams, five years ago, we would have said, ‘Well, there’s more regulation, the volume of deletions have increased 500%, we need more humans.’ It’s become clear that AI provides capabilities that can do the work for privacy individuals.” He drew an analogy: “They might have had a design team of 20 people five years ago, now they have a design team of five, courtesy of Claude Design or Gamma or whatever the tool may be. I think that’s what we’re seeing here as well.”

DataGrail has positioned its own AI agent, Vera, launched in March 2026, as part of the answer. Vera is embedded within DataGrail’s existing platform and aims to automate privacy workflows across multiple jurisdictions. DataGrail’s platform was also named the first production-ready Model Context Protocol (MCP) server for privacy; it uses the standard created by Anthropic to let customers launch DataGrail tools from whatever application they are already working in, whether Slack, email, or Claude.

Can a vendor-produced report be trusted to diagnose the problems that vendor sells solutions for?

DataGrail is, of course, a company that directly benefits from the problems its report identifies. The company has raised a total of $84.2 million over five rounds, with its largest being a $45 million Series C in October 2022 led by Third Point Ventures. Its platform addresses precisely the data mapping, DSR automation, consent management, and risk assessment challenges the report spotlights.

Barber acknowledged the tension directly. “It’s a fair statement,” he said when asked about potential skepticism. “DataGrail doesn’t provide a service to keep DPAs up to date — that’s on a business to evaluate how they work with a vendor. What DataGrail does help to do is assessments, and automate those assessments using our AI agent, Vera, to assess that increased risk.”

He argued that the more neutral reading of the data is structural: “This is evidence to show that the DPA unfortunately is not keeping up with technology and the speed at which technology is innovating. That’s both exciting but also we need to accept that’s where we are.” The methodology does lend some credibility to this claim.

The report draws on anonymized privacy operations data from hundreds of enterprise customers, the 2,400-system AI tracking database, and the 5,000-website consent audit — sources that are at least partially independent of DataGrail’s commercial interests. And the broader findings on enforcement spending, DSR volume trends, and regulatory expansion align closely with independently published data from Gartner, Cisco, and state enforcement agencies.

The next frontier: agentic AI could spread unvetted data across entire organizations autonomously

When asked about the most important trend that did not make it into the report, Barber pointed to a next-generation risk that extends the shadow AI problem into far more dangerous territory: agentic AI workflows. Gartner predicts 40% of enterprise applications will feature task-specific AI agents by end of 2026, up from under 5% in 2025 — a pace of adoption that could rapidly outstrip the governance mechanisms companies are only now beginning to build.

“Where we go next with this research is agent processing,” Barber said. “How are agents then leveraging that information? Because the downstream ramifications would be far more concerning for a business. One particular system is using shadow AI, the business has no idea that that’s happening, and then an agent is propagating that information across a whole bunch of other places. The guardrails of you and I checking the system will be lower than maybe what we’ve seen in the past with agentic workflows.”

He framed the distinction in human terms: “The identity of an agent is different than a human. There is thought that goes into what am I about to use here, where did this information come from, how was it collected — that may not be considered in the same way for an agentic workflow. We need to solve the root of the problem, which is how are these businesses leveraging AI subprocessors. But this quickly becomes an agentic problem that could be far more concerning.”

For the enterprise privacy and security leaders absorbing this report today, the uncomfortable truth is that the foundational documents and processes they have relied on to manage vendor risk for years are decomposing in real time. The DPA is breaking down as a reliable instrument. State enforcement is accelerating on a bipartisan basis. Privacy teams are shrinking even as their mandates expand. And the next wave of agentic AI systems threatens to distribute unvetted data processing across networks of autonomous agents that operate with even less human oversight than today’s tools.

Five years ago, when DataGrail published its first trends report, deletion requests were a fraction of what they are today, only a handful of states had privacy laws on the books, and the phrase “shadow AI” did not exist. Every year since, the report has warned that the problem was getting worse. Every year, the data has proved it right. The companies that survive the next chapter will not be the ones with the biggest compliance teams or the thickest policy binders. They will be the ones that accept a disorienting new reality: in 2026, the contracts you signed may not describe the AI that is already processing your customers’ data — and by 2027, autonomous agents may be deciding what to do with it.
