Lessons learned from a year and a half of privacy red teaming

I spent about a year and a half doing privacy red teaming at a large tech company. I’m not going to name them but if you follow me it’s not hard to figure out. What I will say is that it fundamentally changed how I think about security testing, and I think there are lessons here that apply broadly to anyone doing offensive security work, especially as privacy regulation and user expectations continue to evolve.

The short version: traditional red teaming and pen testing don’t cover privacy the way a privacy-focused red team would. It’s either not the focus or an afterthought. We look for SQLi, RCE, auth bypass, data exfiltration, the loud stuff. Privacy violations are quieter, more subtle, and in many cases more impactful to the people affected. A response time difference that tells you whether an account belongs to a minor is not going to show up on a CVSS calculator, but it will show up in the news, in a regulatory action, or in a civil case.

Who’s the adversary?

One of the biggest challenges I faced coming into this role was figuring out who I was supposed to emulate. In traditional red teaming, you have threat intel reports, MITRE ATT&CK mappings, and well-documented APT groups with known TTPs. You know the adversary: a ransomware gang, a nation-state group, a financially motivated crew. There are writeups, IOCs, and frameworks to guide your approach.

Privacy threats don’t work like that. The adversary is often an individual targeting another individual. Stalkers, abusive ex-partners, harassment campaigns, doxing groups like VILE or communities around Doxbin (many of which have since been dismantled). Private investigators and freelance hackers-for-hire doing “OSINT” that’s really just unauthorized surveillance. Clandestine operators who don’t show up in CrowdStrike reports and aren’t tracked by any threat intel vendor.

These actors don’t have MITRE technique IDs. Nobody writes APT reports on a guy trying to find his ex-girlfriend’s new address through platform features. The techniques are often mundane: account enumeration, social engineering customer support, abusing people search features, exploiting the kind of oracles I mention below. But the impact on the target can be life-threatening.
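To make “mundane” concrete, here’s a minimal sketch of what account enumeration testing looks like, assuming a hypothetical password-reset endpoint (the URL, fields, and hosts are illustrative, not any real platform’s API). If anything observable differs between a registered and an unregistered address, the endpoint confirms account existence.

```python
# Hypothetical account-enumeration probe: compare what the endpoint reveals
# for an address we know is registered versus one we know is not.
import requests

def probe(email: str) -> dict:
    """Submit a password-reset request and record the observable differences."""
    resp = requests.post(
        "https://example.com/api/password-reset",  # hypothetical endpoint
        json={"email": email},
        timeout=10,
    )
    return {
        "status": resp.status_code,
        "body_length": len(resp.content),
        "elapsed_ms": resp.elapsed.total_seconds() * 1000,
    }

# If any of these observables consistently differ, the endpoint is an
# existence oracle, which is all a stalker needs to confirm a target is here.
known = probe("registered-user@example.com")
unknown = probe("no-such-user@example.com")
print({k: (known[k], unknown[k]) for k in known})
```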

This is a gap in the industry. Most threat modeling frameworks assume the adversary is after the organization’s data or infrastructure. When the adversary is after a specific user, and the platform is the attack surface rather than the target, the whole model breaks down. There aren’t many resources on how to threat model for this, and most security teams don’t consider it a priority because it doesn’t map to the risks they’re used to thinking about. Beyond data brokers, few industries handle personal data at the scale where individual-targeting-individual is a top-tier concern, and the ones that do (social platforms, communication apps, dating services) are still figuring out how to operationalize this kind of testing.

Oracles

If you’ve done any amount of security testing, you’re familiar with information disclosure vulnerabilities. Usually these get triaged as low severity: maybe a stack trace leaks internal paths, or an API returns more fields than the docs say it should. In privacy, that entire calculus changes.

An oracle in this context is any observable difference in system behavior that lets you infer something about a user that they haven’t chosen to share. The classic example: you query a user profile and the response is slightly different, maybe a field is absent, a status code varies, or the response takes a few milliseconds longer, and that difference tells you the user is under 18. That’s a youth protection violation. It doesn’t matter that no PII was “leaked” in the traditional sense. The inference itself is the problem.
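A hedged sketch of what hunting for that looks like in practice, using two test accounts you control and a hypothetical profile endpoint (nothing here is a real API):

```python
# Timing-oracle check: sample a hypothetical profile endpoint for a test
# adult account and a test minor account, and see whether latency alone
# separates them, even when the response bodies look identical.
import statistics
import time
import requests

def sample_latency(user_id: str, n: int = 50) -> float:
    """Median latency (ms) of fetching a profile, a crude timing observable."""
    times = []
    for _ in range(n):
        start = time.perf_counter()
        requests.get(f"https://example.com/api/profile/{user_id}", timeout=10)
        times.append((time.perf_counter() - start) * 1000)
    return statistics.median(times)

adult_ms = sample_latency("test-adult-account")
minor_ms = sample_latency("test-minor-account")
# A consistent gap is a finding, even though no field ever says "minor".
print(f"adult={adult_ms:.1f}ms minor={minor_ms:.1f}ms delta={minor_ms - adult_ms:.1f}ms")
```

The same comparison works for any observable: status codes, header sets, field presence, pagination behavior. The remediation is usually to make the code paths constant-time or the responses byte-identical regardless of the protected attribute.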

In traditional security, this might get filed as informational, if it gets filed at all. In privacy, depending on the framework, it can be critical. Whether a user is a minor is a protected attribute, and any system behavior that leaks that distinction is a vulnerability, full stop. Same applies to location, sexual orientation, health status, political affiliation, anything that falls under protected or sensitive categories depending on the jurisdiction.

The thing is, nobody really looks for these. Your typical pen test engagement isn’t scoped for it, and even internal red teams tend to focus on access and lateral movement rather than inference attacks. But as privacy regulation gets more teeth (GDPR, state-level privacy laws in the US, age verification requirements), the attack surface for oracles is only going to grow.

Internal access to user data

Every company of a certain size has internal tools that let employees access user data. Support tools, debugging interfaces, admin panels: they exist because people need them to do their jobs. The tension is between maintaining an engineering culture with low friction (where developers can debug production issues without filing three tickets and waiting two days) and ensuring that access to user data follows the principle of least privilege.

This is not a new problem, but it takes on a different character when you think about it from a privacy perspective rather than a pure security one. From a security standpoint, you’re worried about an attacker compromising an employee and using their access. From a privacy standpoint, you’re worried about the employee themselves: not necessarily malicious, but having access to data they don’t need, accessing data without justification, or letting data leak into contexts where it shouldn’t be.
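One pattern that narrows the gap is forcing every read through a wrapper that demands a justification and writes an audit trail. A minimal sketch with illustrative names (`AccessRequest`, `fetch_user_data`), not a description of anyone’s actual internal tooling:

```python
# Every internal read of user data carries who asked, why, and which fields,
# and is scoped to exactly those fields (least privilege) plus logged for audit.
import logging
from dataclasses import dataclass
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("user-data-access")

@dataclass
class AccessRequest:
    employee_id: str
    user_id: str
    justification: str        # e.g. a support ticket reference
    fields: list[str]

def fetch_user_data(req: AccessRequest) -> dict:
    if not req.justification.strip():
        raise PermissionError("access to user data requires a justification")
    audit_log.info(
        "%s employee=%s user=%s fields=%s reason=%s",
        datetime.now(timezone.utc).isoformat(),
        req.employee_id, req.user_id, req.fields, req.justification,
    )
    # ... actual lookup would go here, limited to req.fields only ...
    return {field: "<redacted>" for field in req.fields}
```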

The harder problem is data lineage and provenance. When data gets used in one system and then flows into another (an analytics pipeline, a machine learning model, an internal dashboard) it needs to carry its context with it. Where did this data come from? What was the user’s consent scope? Has it been properly anonymized or aggregated? Is it tagged in a way that prevents it from showing up somewhere it shouldn’t? If you’ve spent any time in a large engineering organization, you know how quickly data can drift from its original context into places nobody intended. The pipes are leaky because they were built for throughput, not containment.
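What “carrying its context” could look like in code: wrap records in provenance metadata and make every downstream sink check consent scope before accepting the data. The field names and policy below are assumptions for illustration, not a real pipeline.

```python
# Records travel with their origin, consent scope, and lineage; a sink whose
# purpose isn't covered by consent refuses the data instead of absorbing it.
from dataclasses import dataclass, field

@dataclass
class TaggedRecord:
    payload: dict
    source_system: str                                 # where the data originated
    consent_scope: set[str]                            # purposes the user agreed to
    anonymized: bool = False
    lineage: list[str] = field(default_factory=list)   # systems it has passed through

def flow_into(record: TaggedRecord, sink: str, purpose: str) -> TaggedRecord:
    """Refuse to move data into a sink whose purpose isn't covered by consent."""
    if purpose not in record.consent_scope and not record.anonymized:
        raise ValueError(f"{sink}: purpose '{purpose}' not in consent scope")
    record.lineage.append(sink)
    return record

rec = TaggedRecord({"user_id": 42, "city": "Berlin"}, "signup-service", {"product"})
flow_into(rec, "crash-debugger", "product")            # allowed
try:
    flow_into(rec, "ads-model-training", "ads")        # consent never covered ads
except ValueError as err:
    print(err)
```

The hard part isn’t the check, it’s making sure the tags survive every hop: the analytics export, the ML feature store, the dashboard someone screenshots into a deck.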

Third parties, advertisers, and the surveillance economy

Cambridge Analytica showed the world what happens when third parties misuse platform features to collect data on users. That was 2018. Since then, the advertising technology ecosystem has only gotten more sophisticated in its ability to turn ad targeting into surveillance infrastructure.

A foundational study, “ADINT: Using Ad Targeting for Surveillance on a Budget,” demonstrated that for about $1,000, you could track an individual’s location and app usage through ad platform purchases. That was 2017. Since then, the research has only gotten worse: not the quality of the research, the findings.

The ICCL’s report “America’s Hidden Security Crisis” found that real-time bidding (RTB) data from ad auctions was accessible in ways that exposed US military and intelligence personnel to foreign state surveillance. The EFF has documented how the RTB infrastructure is systematically exploited by surveillance vendors. CBP has used ad-based location data to track people without warrants. And an older study, “TRAP: Using TaRgeted Ads to Unveil Google Personal Profiles,” showed how the ad targeting parameters themselves could be reverse-engineered to reveal user profiles.

Most recently, Amnesty International revealed that Intellexa was using the programmatic advertising ecosystem to deliver its Predator spyware: malware pushed through ad networks with no click required. The ad pipeline isn’t just leaking data, it’s become a delivery mechanism for offensive tools.

From a privacy red team perspective, the question isn’t just “what data are we sharing with third parties?” It’s “what can third parties infer, aggregate, or weaponize from the data we share?” And the answer, consistently, is more than anyone intended. Third parties that haven’t been properly vetted, that have access to more data than necessary, or that can indirectly infer details about users from the data they do have, all of these are privacy vulnerabilities that don’t map neatly to a traditional security finding.

For youth, this gets even more serious. The same Cambridge Analytica-style risks apply, but with a higher bar for protection and more severe consequences. You’re not just protecting data, you’re protecting children from adults who specifically target them, the kind of people Chris Hansen would ask to take a seat. Ensuring that advertising targeting doesn’t inadvertently expose youth to inappropriate content or predatory actors, and that platform features can’t be abused to identify or locate minors, is an area where the intersection of privacy and safety gets very real, very fast.

The company as an adversary

This is the framing that most people in traditional security don’t think about, and it’s the one that probably matters the most.

What data about a user would you not want to be able to reveal if compelled by a legal order? What information, if subpoenaed in a civil dispute, a divorce proceeding, or a government investigation, would make users less likely to use the product if they knew it could be accessed?

That’s not a hypothetical. There are real cases of user data being subpoenaed in custody disputes, domestic violence situations, and civil litigation. In the US, the legal protections around this are thinner than most people assume. Internationally, the calculus gets even darker: authoritarian governments compelling platforms to hand over data on political opposition, activists, journalists, and civil society actors.

Products like WhatsApp and Messenger went end-to-end encrypted in part because of this. The best protection for user data in a compelled disclosure scenario is not having it in the first place, or not having it tied to an identifiable user. The balance is between collecting data that’s useful for the product (showing relevant content, targeting ads effectively) and not retaining it in a form that’s useful to someone with a subpoena or a court order. This is where the business model and user privacy are in genuine tension, and where the decisions that get made have consequences that go beyond compliance checkboxes.
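One concrete shape “not having it” takes is keyed pseudonymization with key rotation: events can still be joined for product purposes within a window, but once the key is destroyed there is nothing useful left to hand over. A sketch under those assumptions (the rotation scheme and retention choices are illustrative):

```python
# Events reference a keyed pseudonym instead of the raw user ID. In a real
# system the key would live in a KMS and rotate on a schedule; destroying old
# keys makes historical events impossible to re-link to a person.
import hashlib
import hmac
import os
from datetime import date

ROTATING_KEY = os.urandom(32)  # illustrative; in practice rotated, then destroyed

def pseudonymize(user_id: str) -> str:
    """Keyed hash, salted by month, so joins work within a window but
    re-identification fails once the key is gone."""
    msg = f"{user_id}:{date.today():%Y-%m}".encode()
    return hmac.new(ROTATING_KEY, msg, hashlib.sha256).hexdigest()

event = {
    "pseudonym": pseudonymize("user-12345"),
    "action": "viewed_listing",
    # no raw identifier, no precise location, no timestamp finer than the day
    "day": str(date.today()),
}
print(event)
```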

LLMs and the new attack surface

AI assistants with memory are the newest frontier for privacy risk. When I say memory, I’m really talking about RAG (retrieval-augmented generation) where the LLM pulls from a store of user-specific context to provide more personalized responses. Your AI assistant remembers your preferences, your habits, your conversations, your contacts, your location patterns. That’s the value proposition.

It’s also a massive attack surface. Prompt injection attacks against RAG-backed systems can potentially extract another user’s stored context. The LLM making API calls or taking actions “on behalf” of the user could leak private information from its memory to external services. The model itself doesn’t have to be compromised: the retrieval layer, the API integrations, and the context window management all become vectors.
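The property the retrieval layer needs is simple to state and easy to get wrong: filter memories by the authenticated user before ranking, never after. A minimal sketch, with an illustrative in-memory store rather than any specific RAG framework:

```python
# Memories are scoped to their owner before they can ever reach the prompt.
# Ranking over the global store and filtering afterwards is exactly how
# cross-user leaks happen.
from dataclasses import dataclass

@dataclass
class Memory:
    owner_id: str
    text: str

class MemoryStore:
    def __init__(self, memories: list[Memory]):
        self._memories = memories

    def retrieve(self, authenticated_user: str, query: str, k: int = 5) -> list[str]:
        owned = [m for m in self._memories if m.owner_id == authenticated_user]
        # Toy relevance ranking; a real system would use embeddings here.
        ranked = sorted(owned, key=lambda m: query.lower() in m.text.lower(), reverse=True)
        return [m.text for m in ranked[:k]]

store = MemoryStore([
    Memory("alice", "prefers vegetarian restaurants near Alexanderplatz"),
    Memory("bob", "shared his home address for deliveries"),
])
# Even if an injected prompt asks for "everything you know about bob",
# bob's memories never surface in alice's session.
print(store.retrieve(authenticated_user="alice", query="everything about bob"))
```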

The traditional security framing would look at this as an injection vulnerability or an access control issue. The privacy framing is broader: even without a vulnerability, the mere existence of this much personal context in an accessible, queryable system changes the risk profile. What happens when the system is working exactly as designed, but the design didn’t account for how much about a person can be inferred from their aggregated interactions with an AI?

Takeaways

If you’re doing security testing and you’re not thinking about privacy, you’re leaving a whole class of findings on the table. And increasingly, those are the findings that end up in the news, in regulatory actions, and in the conversations that actually change how products get built.

The tools and methodology are different. You’re looking for inference, not injection. You’re thinking about data flow, not code execution. The adversary model includes the company itself, its partners, and its legal obligations, not just external attackers. And the severity of a finding depends on who’s affected and what can be inferred, not just what’s directly exposed.

Privacy red teaming is still a young discipline, and most organizations don’t have dedicated teams doing it. If you’re in offensive security and looking for a way to expand your impact, or if you’re at an organization that handles user data at scale, it’s worth building the muscle. The traditional security playbook doesn’t cover this, and the gap is only going to get wider.

