Alex is Sprintlaw’s co-founder and principal lawyer. Alex previously worked at a top-tier firm as a lawyer specialising in technology and media contracts, and founded a digital agency which he sold in 2015.
Using data to make better decisions is a huge advantage for any Australian business. But if you’re collecting information about real people, privacy laws and community expectations need to be front and centre.
That’s where data de‑identification comes in. Done well, it can lower privacy risks and support safe analytics and sharing. Done poorly, it can create a false sense of security.
In this guide, we explain what de‑identification means under Australian law, how it differs from related concepts, practical techniques you can apply, and the governance steps we recommend so you can use data confidently and compliantly.
What Is Data De‑identification (And What Isn’t)?
In simple terms, data de‑identification is the process of removing or altering information so that an individual is no longer reasonably identifiable from the dataset, either on its own or when combined with other likely data sources.
This idea of “reasonably identifiable” is important. Whether data counts as personal information depends on context, including what other data is available, who will access it, and how it could be matched. There’s rarely a one‑size‑fits‑all answer.
De‑identification vs Pseudonymisation vs Anonymisation
- De‑identified: Direct identifiers (like names or email addresses) are removed or altered and additional steps are taken to reduce the likelihood of re‑identification. If re‑identification is still reasonably likely, it remains personal information.
- Pseudonymised: Identifiers are replaced with a code (e.g. a token or ID). If the business (or another party) holds a key or linkage table that can reverse the process, it’s generally still personal information.
- Anonymised: Irreversible and robustly de‑identified, even when combined with other data reasonably available. In practice, true anonymisation is hard and should not be assumed without expert assessment.
A common misconception is that encrypted data is de‑identified. If you control the decryption keys (or can reasonably obtain them), encryption protects confidentiality in transit or at rest, but the underlying data is still personal information.
How Australian Privacy Law Treats De‑identified Data
Australia’s Privacy Act 1988 and the Australian Privacy Principles (APPs) regulate how organisations collect, use, disclose and secure personal information. De‑identification is recognised as one way to reduce privacy risks, but the legal effect depends on the circumstances.
When Is De‑identified Data Still “Personal Information”?
If a person is reasonably identifiable (for example, the dataset is unique or could be matched with likely external sources), then it will still be treated as personal information and the APPs will apply.
Conversely, if the likelihood of re‑identification is not reasonable in the relevant context, the dataset may fall outside the definition of personal information. That assessment is dynamic - it can change as new data becomes available or as technology improves.
Key APP Touchpoints
- APP 1 (Open and transparent management): Your public‑facing privacy practices should explain whether and how you de‑identify data. Clear communication is usually handled in your published Privacy Policy and internal procedures.
- APP 6 (Use and disclosure): If you de‑identify data to use it for analytics beyond the original purpose, consider whether the use is within expectations and whether further consent is required if individuals remain reasonably identifiable.
- APP 11 (Security and retention): You must take reasonable steps to protect personal information. When it’s no longer needed for any permitted purpose, you should destroy or de‑identify it. Your approach to retention should also align with your broader data retention obligations in Australia.
Bottom line: de‑identification is a tool, not a legal magic wand. You still need a risk‑based approach to prevent re‑identification and to apply the APPs whenever individuals remain reasonably identifiable.
Practical De‑identification Techniques (With Caveats)
The right technique depends on your data, your use case and who will access the data. Often, you’ll combine methods and calibrate them to maintain useful insights while reducing risk.
Common Techniques
- Suppression: Remove high‑risk fields entirely (e.g. names, email addresses, exact street addresses). This reduces obvious linkage points.
- Generalisation: Reduce precision so that data points describe groups rather than individuals (e.g. age bands instead of exact ages, suburbs or SA3 regions instead of unit‑level addresses).
- Pseudonymisation / Tokenisation: Replace identifiers with a random token. If a linkage table exists, treat the dataset as personal information and protect it accordingly. Keep the linkage table separate with strict access controls.
- Perturbation / Noise: Add small, statistically controlled variations to data so individual records are less linkable but trends remain intact.
- Aggregation: Report totals, averages or counts for groups (e.g. cohorts, time windows) rather than disclosing record‑level data.
- Data minimisation: Remove features that are not necessary for your analysis to reduce the overall attack surface.
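To make these techniques concrete, here is a minimal Python sketch combining suppression, generalisation and tokenisation on a single record. The field names and values are purely illustrative, not a prescribed schema, and a real implementation would need proper key management and access controls around the linkage table.

```python
import secrets

# Hypothetical record - field names are illustrative only.
record = {
    "name": "Jane Citizen",
    "email": "jane@example.com",
    "age": 34,
    "suburb": "Newtown",
    "purchase_total": 240.50,
}

# Linkage table mapping tokens back to identifiers. While this exists,
# the tokenised dataset is still personal information - store the table
# separately, with strict access controls.
linkage_table = {}

def tokenise(value: str) -> str:
    """Replace an identifier with a random token (pseudonymisation)."""
    token = secrets.token_hex(8)
    linkage_table[token] = value
    return token

def age_band(age: int, width: int = 10) -> str:
    """Generalise an exact age into a band, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def deidentify(rec: dict) -> dict:
    return {
        "person_token": tokenise(rec["email"]),  # pseudonymised
        "age_band": age_band(rec["age"]),        # generalised
        "suburb": rec["suburb"],                 # kept, but a quasi-identifier - review in context
        "purchase_total": rec["purchase_total"],
        # "name" and "email" are suppressed entirely
    }

print(deidentify(record))
```

Note that because the linkage table can reverse the tokens, the output here is pseudonymised rather than anonymised - it should still be handled as personal information.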
Advanced Approaches
- K‑anonymity and l‑diversity: Ensure each record is indistinguishable from at least k‑1 others based on quasi‑identifiers, and that sensitive attributes are sufficiently diverse within those groups.
- Differential privacy: Use formal privacy budgets and noise mechanisms to provide strong mathematical guarantees about what can be learned about any individual from the output.
- Secure enclaves / safe settings: Rather than releasing data, allow approved analysts to query it in a restricted environment with auditing and disclosure controls.
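As a rough illustration of the first two approaches, the sketch below measures the k in k‑anonymity for a toy dataset and adds Laplace noise to a count query, the classic differential privacy mechanism. The records and the epsilon value are invented for the example; choosing real parameters is a job for a specialist.

```python
import math
import random
from collections import Counter

# Toy records: (age_band, postcode) are the quasi-identifiers.
records = [
    ("30-39", "2042"), ("30-39", "2042"), ("30-39", "2042"),
    ("40-49", "2000"), ("40-49", "2000"), ("40-49", "2000"),
    ("50-59", "2010"),  # a unique record - this breaks k >= 2
]

def k_anonymity(quasi_rows) -> int:
    """Smallest equivalence-class size: each record is indistinguishable
    from at least k-1 others on the quasi-identifiers."""
    return min(Counter(quasi_rows).values())

def laplace_count(true_count: int, epsilon: float) -> float:
    """Differentially private count via the Laplace mechanism.
    A count query has sensitivity 1, so the noise scale is 1/epsilon."""
    scale = 1.0 / epsilon
    u = random.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_count + noise

print("k =", k_anonymity(records))          # the unique record drives k down to 1
print("DP count:", laplace_count(100, 1.0)) # noisy answer; smaller epsilon = more noise
```

In practice you would suppress or generalise the outlier record until k reaches your target, and track a cumulative privacy budget across all differentially private queries rather than setting epsilon per query in isolation.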
Important caution: encryption, masking or hashing alone do not equal de‑identification if you can reasonably re‑link or decrypt the values. Treat those approaches as security controls unless the re‑linking path is eliminated and the context supports a conclusion that individuals are not reasonably identifiable.
Governance: How To Implement De‑identification Safely
The most effective programs combine technical methods with strong governance. Think policy, process and controls - not just one‑off transformations.
1) Set Your Policy, Scope and Roles
Document when and why your business will de‑identify data, who approves it, and how you’ll manage the residual risk over time. Align that with your published Privacy Policy, internal standards, and data lifecycle plans.
2) Assess Privacy Risks Upfront
Before releasing or using a de‑identified dataset, consider:
- Audience and access: Will this be open data, shared under contract, or used only in‑house? More open access usually requires stronger protections.
- Quasi‑identifiers: Fields like date of birth, postcode and gender can uniquely identify people when combined; adjust or remove them as needed.
- External data: What data sources are reasonably available that could be used to re‑identify your dataset (e.g. social media, public registers, previous releases)?
- Use case: Are granular, person‑level records actually necessary, or will aggregation provide enough value?
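One simple way to make the quasi‑identifier risk measurable is to check what fraction of records are unique on a given combination of fields. The snippet below is a rough internal signal only, with made‑up rows, and is no substitute for a full re‑identification assessment.

```python
from collections import Counter

# Hypothetical extract: (year_of_birth, postcode, gender) per record.
rows = [
    (1985, "2000", "F"), (1985, "2000", "F"),
    (1990, "2010", "M"), (1971, "3000", "F"),
]

def unique_fraction(quasi_rows) -> float:
    """Share of records that are the only one with their quasi-identifier
    combination - a rough signal of re-identification risk."""
    counts = Counter(quasi_rows)
    singletons = sum(1 for row in quasi_rows if counts[row] == 1)
    return singletons / len(quasi_rows)

print(f"{unique_fraction(rows):.0%} of records are unique on these fields")
```

A high fraction suggests the chosen fields need generalising or suppressing before the dataset leaves a controlled environment.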
For higher‑risk initiatives, map out a structured assessment. Many organisations use a Privacy Impact Assessment Plan to guide decisions, record mitigations and assign accountability.
3) Choose Techniques and Test for Re‑identification
Select techniques that fit your use case, then test your work. Attempt internal re‑identification (often called “penetration testing for privacy”) and have an independent reviewer challenge the results for high‑stakes releases. Adjust parameters until the residual risk is acceptable for the context.
4) Control Sharing With Contracts and Safe Settings
If you’ll share de‑identified data with partners, vendors or researchers, lock in controls through contract. A Data Processing Agreement (or equivalent clauses) can prohibit re‑identification attempts, restrict onward disclosure, and pair those restrictions with security and audit obligations. If sharing via a portal or sandbox, implement safe settings such as query limits, output checking and user training.
5) Communicate Clearly With Individuals
Be transparent about how you use, de‑identify and share information. This is usually covered in a concise Privacy Collection Notice and your Privacy Policy. If you collect data through your website or app, align the story in your Website Terms and Conditions so users know the rules for the service and how their information is handled.
6) Plan for Incidents and Continuous Improvement
No control set is perfect. Prepare for incidents and near‑misses with a practical Data Breach Response Plan, and schedule regular reviews as your datasets, technologies and business uses evolve. Re‑evaluate older releases if new linkage risks emerge.
Can You Share De‑identified Data With Third Parties?
Yes - but only with guardrails. “De‑identified” doesn’t automatically mean safe to publish widely. The residual risk depends on who will access the data, what other data they have, and what controls you’ve applied.
Safer Sharing Patterns
- In‑house analytics: Keep the data internal, restrict access to trained staff and log queries.
- Controlled sharing: Share under contract with specific partners. Use a Data Processing Agreement or data‑sharing clauses that prohibit re‑identification, constrain use to the agreed purpose and mandate security.
- Aggregated outputs: Prefer aggregated or differentially private outputs for broader sharing, especially if public release is contemplated.
Open publication of record‑level data is rarely appropriate unless strong, formally tested methods (and limited attributes) make re‑identification not reasonably likely in the real world. When in doubt, keep it controlled or aggregate the data.
Common Pitfalls To Avoid
- Assuming removal of names is enough: Quasi‑identifiers can still single people out. Combine multiple techniques and test your dataset.
- Relying on encryption alone: If you hold the keys, the data is still personal information for privacy purposes.
- Publishing high‑risk attributes: Exact timestamps, detailed locations or rare combinations can make records unique. Generalise or suppress them.
- Not updating controls: Re‑identification risks change. Review older releases and refresh your approach regularly.
- Unclear public messaging: If your policies and notices don’t explain how you de‑identify and use data, you risk complaints and loss of trust. Align your notices, Privacy Policy and Website Terms and Conditions.
A Practical Checklist To Get You Started
If you’re building or improving your de‑identification program, this streamlined checklist can help you move forward with confidence.
- Map your data and purposes: Identify what you collect, why you collect it, and where it is stored and used.
- Define your audience: Decide who will access the de‑identified data (internal only, trusted partners, or wider release).
- Run a risk assessment: Consider external data sources, uniqueness, quasi‑identifiers and potential harms. Use a structured Privacy Impact Assessment Plan for higher‑risk initiatives.
- Pick the right techniques: Combine suppression, generalisation, aggregation and (where appropriate) formal privacy methods, then test for re‑identification.
- Control access and sharing: Use contractual protections like a Data Processing Agreement, safe settings, approval workflows and audit logging.
- Align your documents: Update your collection notice, Privacy Policy and internal data retention rules to reflect your approach and broader retention requirements.
- Prepare for incidents: Maintain and test your Data Breach Response Plan and escalation pathways.
- Train your team: Provide practical training on safe handling, de‑identification parameters, and what not to do (e.g. attempts to re‑identify).
Key Takeaways
- De‑identification reduces the chance that a person is reasonably identifiable, but it’s context‑dependent and not a one‑time switch from “personal” to “not personal.”
- APP 11 expects reasonable security and, when data is no longer needed, destruction or de‑identification - supported by clear retention rules and documentation.
- Use layered techniques (suppression, generalisation, aggregation, pseudonymisation and, where needed, formal privacy methods) and test for re‑identification.
- Encryption or hashing alone does not equal de‑identification if you can reasonably reverse or link identifiers.
- If sharing de‑identified data, keep it controlled: use contracts that prohibit re‑identification, restrict purpose, and require strong security and auditing.
- Align your Privacy Policy, collection notices and website terms, and keep a living breach response plan and privacy assessment plan to support ongoing compliance.
If you would like a consultation on data de‑identification and privacy compliance in Australia, you can reach us at 1800 730 617 or team@sprintlaw.com.au for a free, no‑obligation chat.