Metadata “Leaking”: Securing Document Properties Before Production

Article summary: Metadata leakage can expose confidential information through document properties, hidden comments, and hidden PDF content. Law firms need a repeatable pre-production workflow for metadata removal that fits Texas confidentiality expectations and the agreed form of production. Such a workflow reduces accidental disclosure and keeps productions defensible.
The most uncomfortable leak in a law firm isn’t always a leaked sentence. Sometimes it’s a leaked history.
A PDF that looks final still remembers who wrote it. A “clean” Word doc can quietly carry comments, tracked changes, and the internal back-and-forth that shaped the argument. A file can reveal client names, deal context, or strategy notes without showing a single extra word on the page.
That’s why metadata removal for law firms matters. Production isn’t just about what you intend to disclose. It’s about what the file can reveal.
What “Metadata Leakage” Looks Like
Metadata leakage is when a produced file reveals information that isn’t part of the visible content.
In law firms, it often shows up in ways that feel almost unfair: the document looks correct, but the file still carries baggage.
Common examples include:
- Document properties that show author names, usernames, company names, or “last saved by”
- Hidden comments and tracked changes that reveal internal debate, edits, or strategy
- Templates and file paths that hint at internal matter structure or client names
- PDF hidden elements, like annotations, attachments, layers, form fields, or embedded objects
- Version history and timestamps that create an unintended timeline of events
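To make the first item concrete: Word, Excel, and PowerPoint files in the modern formats are ZIP packages, and their document properties typically live in small XML parts inside the package. The sketch below is a minimal illustration, not a vetted inspection tool. The function name read_office_properties is hypothetical, and the sketch assumes the properties sit in docProps/core.xml and docProps/app.xml, which is the usual layout.

```python
import zipfile
import xml.etree.ElementTree as ET

# Parts of a .docx/.xlsx/.pptx package that usually hold document properties.
# Assumption: the file follows the common Office layout.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def read_office_properties(path):
    """Return {property_name: value} for every non-empty property found."""
    found = {}
    with zipfile.ZipFile(path) as package:
        names = set(package.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(package.read(part))
            for element in root.iter():
                text = (element.text or "").strip()
                # Keep only leaf elements that actually carry a value.
                if text and len(element) == 0:
                    # Drop the "{namespace}" prefix ElementTree adds to tags.
                    found[element.tag.split("}")[-1]] = text
    return found
```

Running it on a draft typically surfaces fields such as creator and lastModifiedBy, which is exactly the information a recipient can read from the properties panel.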
This is why production should be treated like controlled data sharing, not just sending a file.
Secure sharing depends on knowing exactly what a file contains and applying the same controls every time, such as access restrictions and a standard cleaning step before anything leaves the firm.
What Lawyers Are Expected to Do
Texas doesn’t treat metadata leakage as a quirky tech issue. It treats it as a confidentiality issue.
In an ethics opinion on metadata, the State Bar of Texas Professional Ethics Committee says lawyers must take “reasonable measures” to avoid transmitting confidential information embedded in metadata to people who shouldn’t receive it, and that this can include using “reasonably available technical means” to remove metadata.
The key word is “reasonable.”
The opinion explains that what’s reasonable depends on the facts, including what steps were taken, how sensitive the embedded information is, and who the recipient is.
In other words, this isn’t a one-time checkbox. It’s a repeatable workflow that should match the risk of the document being produced.
Sometimes Metadata Is Required
The trap is that many firms swing too far in the other direction: “Just strip everything.” That can cause problems in discovery and eDiscovery workflows, where metadata may be part of the expected form of production.
The EDRM production guidance notes that productions may be delivered in different formats (native, near-native, image) and may include load files, extracted metadata, and searchable text.
This is why metadata handling can’t be improvised.
Your team needs clarity on what you’re producing and why. In some contexts, removing metadata is the right confidentiality step. In others, producing agreed metadata fields is the requirement.
The safest approach is deliberate: define the production format and metadata expectations up front, then apply the right hygiene steps for that specific scenario.
The Practical Workflow
Step 1: Decide what you’re producing (and why).
Before you “clean” anything, confirm the context.
Is this a routine share with a client, opposing counsel, or a vendor? Or is it a formal production where metadata fields are expected?
This decision determines whether you should strip metadata aggressively, preserve specific fields, or produce in an image-based format with the right load files.
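One way to keep this decision from being improvised is to write it down as an explicit rule. The sketch below is only an illustration of that idea. The names Handling and choose_handling, the labels, and the mapping itself are hypothetical, and the real answer for any matter comes from the governing agreement or order and the supervising attorney.

```python
from enum import Enum


class Handling(Enum):
    # Hypothetical labels for the three paths described in Step 1.
    STRIP_HIDDEN_DATA = "strip hidden data and properties from a copy"
    PRESERVE_AGREED_FIELDS = "produce with the agreed metadata fields"
    IMAGE_WITH_LOAD_FILE = "produce images with the right load files"


def choose_handling(formal_production, production_form=None):
    """Map the Step 1 answers to a handling path (illustrative only)."""
    if not formal_production:
        # Routine share with a client, opposing counsel, or a vendor.
        return Handling.STRIP_HIDDEN_DATA
    if production_form == "image":
        return Handling.IMAGE_WITH_LOAD_FILE
    if production_form in ("native", "near-native"):
        return Handling.PRESERVE_AGREED_FIELDS
    # A formal production with no confirmed form is a question to resolve,
    # not something to guess at.
    raise ValueError("confirm the form of production before cleaning anything")
```

Raising an error when the form of production is unknown is a deliberate choice in this sketch: it forces the question to be answered before anyone starts stripping or preserving fields.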
Step 2: Sanitize Office files using Document Inspector.
Microsoft’s Document Inspector is designed to find and remove hidden data and personal information in Word, Excel, and PowerPoint. This includes things like comments, annotations, and document properties.
Microsoft also recommends running it on a copy of the file so you don’t destroy information you still need internally.
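The "work on a copy" habit is easy to script so nobody cleans the internal original by accident. This is a minimal sketch using only the Python standard library. The function name make_production_copy and the "_production" suffix are hypothetical conventions, not part of any Microsoft tool.

```python
import shutil
from pathlib import Path


def make_production_copy(original, suffix="_production"):
    """Copy a file next to the original and return the path of the copy."""
    original = Path(original)
    copy_path = original.with_name(f"{original.stem}{suffix}{original.suffix}")
    if copy_path.exists():
        # Refuse to overwrite: an earlier copy may already have been cleaned.
        raise FileExistsError(f"{copy_path} already exists")
    shutil.copy2(original, copy_path)  # the original file is left untouched
    return copy_path
```

The copy still contains every comment and property the original had. It only becomes safe to send after Document Inspector has been run on it and the result has been checked.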
Step 3: Sanitize PDFs the right way.
PDFs can still contain hidden information even when they look final.
Adobe notes that PDFs may include hidden data such as metadata, comments, and hidden layers, and provides a Sanitize workflow in Acrobat to remove that hidden content before sharing.
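As a rough pre-check before opening Acrobat, a script can look for the PDF name tokens that usually accompany hidden content. The sketch below is a crude heuristic under stated assumptions: it scans raw bytes only, the function name scan_pdf_markers and the labels are hypothetical, and modern PDFs often store objects in compressed streams that this scan cannot see. An empty result therefore does not prove a file is clean, and a hit is a prompt to look closer, not a verdict.

```python
# PDF name tokens that often accompany hidden content. The labels are informal.
PDF_MARKERS = {
    b"/Annots": "annotations or comments",
    b"/EmbeddedFiles": "file attachments",
    b"/AcroForm": "form fields",
    b"/OCProperties": "layers (optional content)",
    b"/Metadata": "XMP metadata stream",
    b"/Author": "author entry in document info",
}


def scan_pdf_markers(data):
    """Return sorted labels for markers present in raw PDF bytes.

    Heuristic only: content inside compressed object streams is not visible
    here, so this complements a proper Sanitize pass and never replaces it.
    """
    return sorted(label for token, label in PDF_MARKERS.items() if token in data)
```

A typical use is to read the file with open(path, "rb").read() and pass the bytes in, then treat any reported label as a reason to run the Sanitize workflow and inspect the result.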
Step 4: Verify the produced file before it leaves the firm.
Don’t assume “Remove” worked. Spot-check the output like a recipient would:
- Open the file and check properties
- Confirm comments and tracked changes are gone
- Confirm PDFs don’t contain leftover annotations or hidden content
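- If you redacted, confirm it’s a true redaction: the text underneath should be gone, not just covered, so try selecting, copying, and searching for it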
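For Word files, part of this spot-check can be automated. The sketch below is a minimal example, assuming a standard .docx package in which comments live in word/comments.xml and tracked insertions and deletions appear as ins and del elements in word/document.xml. The function name find_docx_leftovers is hypothetical, and the check does not cover headers, footers, footnotes, or other kinds of tracked change, so it supplements a manual review.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document and comments parts.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def find_docx_leftovers(path):
    """Return a list of readable problems found in a .docx (empty if none)."""
    problems = []
    with zipfile.ZipFile(path) as package:
        names = package.namelist()
        if "word/comments.xml" in names:
            comments = ET.fromstring(package.read("word/comments.xml"))
            count = len(comments.findall(f"{W}comment"))
            if count:
                problems.append(f"{count} comment(s) in word/comments.xml")
        if "word/document.xml" in names:
            body = ET.fromstring(package.read("word/document.xml"))
            inserted = sum(1 for _ in body.iter(f"{W}ins"))
            deleted = sum(1 for _ in body.iter(f"{W}del"))
            if inserted or deleted:
                problems.append(
                    f"tracked changes: {inserted} insertion(s), {deleted} deletion(s)"
                )
    return problems
```

An empty list means this particular check found nothing in the main document body. Any entry in the list means the file goes back for cleaning before it is sent.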
Step 5: Preserve an internal original and produce a cleaned copy.
A safe workflow keeps the firm’s internal version intact and treats the produced version as a separate deliverable.
Back everything up, no exceptions, and work in a way that prevents one mistake from becoming a permanent loss.
Production Hygiene Is Confidentiality Hygiene
Metadata leakage is rarely dramatic. It’s usually quiet and avoidable. It happens when a firm treats production as “send the file” instead of a repeatable hygiene step.
A practical metadata workflow protects client confidentiality the same way good proofreading protects the record. It reduces accidental disclosure, keeps internal strategy out of the properties panel, and prevents last-minute scrambling before production deadlines.
If you want help standardizing metadata removal for law firms across Word and PDF workflows, contact Digital Crisis. We’ll review your current process and help your team build a clean, repeatable pre-production checklist.
Article FAQs
What is metadata leakage in legal documents?
Metadata leakage is when a produced file reveals hidden information that isn’t part of the visible content. That can include author details, comments, tracked changes, revision history, file paths, or hidden PDF elements. The document looks clean, but the file still carries confidential context.
Is metadata removal required in Texas?
Texas lawyers are expected to take reasonable measures to avoid transmitting confidential information embedded in metadata when sending electronic documents. What’s “reasonable” depends on the situation, including the sensitivity of the information and who will receive it. In discovery, metadata handling may also depend on the form of production and applicable rules.
What should never be included in a produced document?
Never include internal comments, tracked changes, hidden annotations, or document properties that reveal confidential client information. Avoid leaving author names, internal file paths, or strategy notes embedded in the file. The safest approach is to produce a cleaned copy that has been inspected and verified.
Does converting to PDF remove metadata?
Not reliably. Converting to PDF may change some properties, but PDFs can still contain metadata, comments, hidden layers, and other hidden content. If you need a clean production, you still have to inspect and sanitize the PDF before sharing.