Technology Apr 24, 2026 · 2 min read

I Built a Security Scanner That Audits PDFs Before You Send Them. Here's How. [Devlog #10]

DEV Community
by hiyoyo

All tests run on an 8-year-old MacBook Air.

You're about to email a contract. It looks clean.

But it still has your name, your machine's hostname, the original author's company, and a creation timestamp buried in the metadata. The recipient can see all of it.

Audit Report catches this before you hit send.

What it checks

Every PDF that goes through Audit Report gets scanned for:

  • Metadata leaks (Author, Creator, Producer, timestamps)
  • Hidden text layers (content invisible at normal zoom)
  • Embedded scripts or form actions
  • Non-standard objects that shouldn't be in a clean document
  • Redacted content that wasn't properly removed (pixel-level check)
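
As a rough illustration of the "embedded scripts or form actions" check, here is a minimal sketch that scans raw PDF bytes for name tokens that usually signal scripts or automatic actions. This is not Audit Report's actual code: the function name, the token list, and the byte-scanning approach are all assumptions made for the example. A real scanner would walk the parsed objects instead, because names inside compressed object streams or written with `#xx` escapes slip past a plain byte search.

```rust
/// PDF name tokens that commonly signal scripts or automatic actions.
/// Illustrative list only (an assumption, not the tool's real rule set).
const RISKY_NAMES: [&str; 6] = [
    "/JavaScript",
    "/JS",
    "/OpenAction",
    "/AA",
    "/Launch",
    "/SubmitForm",
];

/// A PDF name token ends at end of input, whitespace, or a delimiter character.
fn ends_name(byte: Option<&u8>) -> bool {
    match byte {
        None => true,
        Some(b) => *b == 0 || b.is_ascii_whitespace() || b"()<>[]{}/%".contains(b),
    }
}

/// Returns every risky name that appears as a complete token in `bytes`,
/// in the order the names are listed in `RISKY_NAMES`.
pub fn find_risky_names(bytes: &[u8]) -> Vec<&'static str> {
    RISKY_NAMES
        .iter()
        .copied()
        .filter(|name| {
            let needle = name.as_bytes();
            bytes
                .windows(needle.len())
                .enumerate()
                // Require a token boundary so "/JS" does not match "/JSONData".
                .any(|(i, w)| w == needle && ends_name(bytes.get(i + needle.len())))
        })
        .collect()
}
```

Calling `find_risky_names` on the bytes of a file gives a list of names to surface in the report. A hit means "look at this object", not "this file is malicious": `/OpenAction` is also used for harmless things like setting the initial page.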

The scanning pipeline in Rust

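// Default is derived because audit_pdf calls AuditResult::default();
// this requires RiskLevel to implement Default as well.
#[derive(Default)]
pub struct AuditResult {
    pub metadata_warnings: Vec<String>,
    pub hidden_content: Vec<HiddenLayer>,
    // The element type was lost from the published snippet;
    // SuspiciousObject is a placeholder name, not the original type.
    pub suspicious_objects: Vec<SuspiciousObject>,
    pub risk_level: RiskLevel,
}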

pub fn audit_pdf(doc: &Document) -> AuditResult {
    let mut result = AuditResult::default();

    // 1. Check Info dictionary
    if let Ok(info) = doc.get_info() {
        // CreationDate and ModDate cover the timestamps listed above.
        for key in &["Author", "Creator", "Producer", "CreationDate", "ModDate"] {
            if info.get(*key).is_some() {
                result.metadata_warnings.push(format!("{} field present", key));
            }
        }
    }

    // 2. Walk all objects for suspicious content
    for (id, object) in &doc.objects {
        if let Ok(stream) = object.as_stream() {
            if contains_hidden_text(stream) {
                result.hidden_content.push(HiddenLayer { id: *id });
            }
        }
    }

    result.risk_level = compute_risk(&result);
    result
}
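
The post doesn't show `contains_hidden_text`, so here is one way such a check could work, as a sketch under stated assumptions. PDF content streams set the text rendering mode with the `Tr` operator, and mode 3 means text is neither filled nor stroked, so it is invisible on the page. The function below (a hypothetical name) takes an already-decoded content stream and looks for the operand `3` directly followed by `Tr`. It splits on whitespace only, so it can false-positive on a literal string that happens to contain "3 Tr", and it does not catch other hiding tricks such as white-on-white text or text placed outside the page box.

```rust
/// Heuristic: true if the decoded content stream switches to text
/// rendering mode 3 (invisible text) via the `Tr` operator.
/// Assumes `content` is already decompressed.
pub fn uses_invisible_text_mode(content: &[u8]) -> bool {
    // Split the stream into whitespace-separated tokens.
    let tokens: Vec<&[u8]> = content
        .split(|b| b.is_ascii_whitespace())
        .filter(|t| !t.is_empty())
        .collect();

    // Look for the operand "3" immediately followed by the operator "Tr".
    tokens
        .windows(2)
        .any(|pair| pair[0] == &b"3"[..] && pair[1] == &b"Tr"[..])
}
```

Searchable scans normally use mode 3 for their OCR layer, so a hit here is a reason to review the page, not proof that something is being hidden.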

Output: a clean report PDF

The audit result is written out as a formatted PDF report, generated fully offline with no external libraries.

Pass: green summary, list of checked items.
Fail: red warnings, exact fields that need attention, one-click fix option.
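
To make the pass/fail split concrete, here is a small sketch of how findings could be reduced to a risk level and then to report lines. Everything in it is an assumption for illustration: the `Findings` struct, the thresholds in `compute_risk`, and the plain-text output (the real tool renders a PDF and its scoring rules aren't published in this post).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RiskLevel {
    Low,
    Medium,
    High,
}

/// Simplified stand-in for the audit result (hypothetical shape).
pub struct Findings {
    pub metadata_warnings: Vec<String>,
    pub hidden_layers: usize,
    pub suspicious_objects: usize,
}

/// Assumed scoring: hidden or suspicious content is High,
/// metadata alone is Medium, nothing found is Low.
pub fn compute_risk(f: &Findings) -> RiskLevel {
    if f.hidden_layers > 0 || f.suspicious_objects > 0 {
        RiskLevel::High
    } else if !f.metadata_warnings.is_empty() {
        RiskLevel::Medium
    } else {
        RiskLevel::Low
    }
}

/// Builds the text lines a report page could show.
pub fn summary_lines(f: &Findings) -> Vec<String> {
    if compute_risk(f) == RiskLevel::Low {
        return vec!["PASS: no issues found".to_string()];
    }
    let mut lines = vec!["FAIL: review before sending".to_string()];
    // List the exact fields that need attention.
    for warning in &f.metadata_warnings {
        lines.push(format!("  - {warning}"));
    }
    if f.hidden_layers > 0 {
        lines.push(format!("  - {} hidden text layer(s)", f.hidden_layers));
    }
    if f.suspicious_objects > 0 {
        lines.push(format!("  - {} suspicious object(s)", f.suspicious_objects));
    }
    lines
}
```

Keeping the scoring in one small function makes the thresholds easy to change without touching the scanning or rendering code.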

Why this matters

Most people check what's visible on the page. Few check the metadata, hidden layers, and embedded objects underneath. Audit Report does.

Hiyoko PDF Vault → https://hiyokoko.gumroad.com/l/HiyokoPDFVault
X → @hiyoyok

Source

This article was originally published by DEV Community and written by hiyoyo.
