PDF Malware Scanner
44 Forensic Engines. Free. Online.
PDF files are one of the most common malware delivery vectors — used in phishing campaigns, APT attacks, and exploit kits for decades. Most scanners can only find threats they have already seen. PQ PDF's 44 forensic engines detect both known and unknown threats: behavioral sandbox execution catches what a PDF does regardless of whether it has a signature, ML anomaly detection flags structurally abnormal files even with no prior example, and differential parsing exposes hidden objects whether or not they match any known exploit pattern — all free, with zero data retention.
No account. No upload limit. File deleted immediately after analysis.
PDF Threats You Can't See by Opening the File
A PDF that opens and looks normal can still be malicious. The PDF specification is complex enough that attack vectors are buried in layers of structure that no viewer surfaces to the reader. Emotet used password-protected PDF lures to deliver macro-laced Word droppers. MuddyWater (Iranian APT) relied on PDF first-stage attachments throughout 2022–2024 campaigns against government targets. APT28 (Fancy Bear) distributed CVE-2015-2545 EPS-exploit PDFs in spear-phishing operations against NATO targets. More recently, QakBot and IcedID campaigns shifted heavily toward PDF delivery after Microsoft disabled Office macros by default. These are the most common threat categories found in malicious PDFs:
eval(unescape(...)) shellcode loaders, heap spray sequences, and multi-layer obfuscated scripts that execute silently when the file opens in a vulnerable viewer.
/SubmitForm actions. Hidden fields collect data without user knowledge.
ByteRange gaps in the signature specification to hide malicious objects outside the signed byte range.
/Launch actions or /EmbeddedFile streams.
U+202E), attackers reverse filenames and URLs in a way that looks legitimate to a casual reader.
Who Should Scan a PDF Before Opening It?
PDF is one of the most common malware delivery formats in targeted attacks. According to Verizon's DBIR, email attachments account for over 90% of malware delivery, and PDF is consistently in the top two formats alongside Office macros. These are the people who scan before they open:
PQ PDF vs. VirusTotal, Hybrid Analysis, Adobe & MetaDefender
Antivirus engines answer one question: "Have we seen this before?" If a threat has been catalogued, they find it. If it hasn't — a zero-day exploit, a freshly obfuscated payload, a novel XFA FormCalc attack, a new JS shellcode loader — it passes straight through. PQ PDF answers a different question: "What does this PDF actually do?" Behavioral execution, ML anomaly detection, structural differential analysis, and entropy profiling find dangerous files whether or not any signature for them exists anywhere. Here is how the tools compare:
| Capability | PQ PDF Free · No account | VirusTotal Free (account) · Online | Hybrid Analysis Free (limited) · CrowdStrike | Adobe Acrobat Pro ~$23/month | MetaDefender OPSWAT · Paid |
|---|---|---|---|---|---|
| AV signature scanning | ✓ ClamAV 700k+ sigs | ✓ 70+ AV engines | ✓ CrowdStrike + partners | ✗ No AV scanning | ✓ 30+ AV engines |
| YARA rules (PDF-specific) | ✓ 24 custom PDF YARA rules | ⚠ Community rules, generic | ⚠ Generic YARA rules | ✗ No | ⚠ Limited, generic |
| Behavioral sandbox execution | ✓ 6 PDF renderers, isolated namespaces, strace | ⚠ General sandbox — not PDF-specific renderers | ✓ Good dynamic analysis, general sandbox | ✗ No sandbox | ⚠ Basic sandbox, limited PDF renderer coverage |
| PDF structural analysis (XRef, objects, streams) | ✓ 15 static engines built for PDF structure | ✗ AV engines scan bytes, not PDF structure | ✗ No structural PDF analysis | ✗ No structural analysis | ✗ No structural PDF analysis |
| JavaScript AST deobfuscation | ✓ Full AST deobfuscator + Acrobat API emulation | ✗ No | ⚠ Runtime observation only | ✗ No | ✗ No |
| XFA FormCalc parsing | ✓ Dedicated XFA parser engine | ✗ No | ✗ No | ✗ No | ✗ No |
| Signature forgery / Shadow Attack detection | ✓ ByteRange forensics engine | ✗ No | ✗ No | ✗ No | ✗ No |
| AcroForm exfiltration / hidden field analysis | ✓ Full field tree, SubmitForm targets, JS triggers | ✗ No | ✗ No | ✗ No | ✗ No |
| Six-parser differential comparison | ✓ MuPDF, Poppler, GS, qpdf, pdfminer, pdf.js | ✗ No | ✗ No | ✗ No | ✗ No |
| Machine learning anomaly detection | ✓ IsolationForest + RandomForest + LightGBM + SHAP | ✗ No | ✗ No | ✗ No | ✗ No |
| MITRE ATT&CK technique mapping | ✓ Every indicator mapped to technique IDs | ⚠ Some detections, not systematic | ✓ Good ATT&CK coverage | ✗ No | ⚠ Limited mapping |
| AI forensic narrative report | ✓ Self-hosted Qwen 2.5 — structured verdict + findings | ✗ No | ✗ No | ✗ No | ✗ No |
| File privacy / zero data retention | ✓ Deleted immediately, no external calls, no hashes shared | ✗ Files stored; hashes and reports are community-shared | ✗ Files stored; can be set private (paid only) | ✓ Local processing, file stays on your machine | ⚠ Enterprise tier offers private scanning |
| Offline threat intelligence | ✓ 6.4M+ indicators in local databases — zero external calls | ⚠ All queries sent to external services | ⚠ Online lookups | ✗ No threat intel | ⚠ Cloud-based lookups |
| Sanitize / clean the PDF | ✓ 9 methods: flatten-to-images, strip JS, remove XFA, PDF/A… | ✗ No | ✗ No | ✓ "Sanitize Document" removes active content | ⚠ Basic sanitization in some tiers |
| Cost | ✓ Free — no account required | ✓ Free with account (rate limited) | ✓ Free tier (limited submissions/day) | ✗ ~$23/month subscription | ✗ Paid — enterprise pricing |
The honest assessment: VirusTotal's 70+ AV engines are the best tool in existence for one specific question — "has this exact file been seen and named by the antivirus industry?" If you need community reputation across 70 vendors, use it. For everything else — detecting what a PDF does, finding zero-days, structural forensics, sanitization, MITRE ATT&CK mapping, and keeping your file private — PQ PDF does all of it, free, with no account required.
All 44 Forensic Engines Explained
Every uploaded PDF passes through 44 independent analysis engines in a single request. Each engine is orthogonal — designed to catch a different class of threat that the others might miss. Results are correlated by a 45th synthesis layer that maps compound indicators to MITRE ATT&CK techniques.
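To make one of the engines in this section more concrete, the six-parser differential comparison can be sketched roughly as follows. This is a minimal illustration, not PQ PDF's actual implementation: it assumes each parser's findings have already been collected into a plain dictionary, and the dimension key names and the `find_discrepancies` helper are hypothetical.

```python
from typing import Dict, Hashable

# The eight structural dimensions named in the text.
# The key spellings here are assumptions made for this sketch.
DIMENSIONS = (
    "page_count", "object_count", "pdf_version", "has_javascript",
    "is_encrypted", "has_acroform", "embedded_file_count", "has_open_action",
)


def find_discrepancies(
    reports: Dict[str, Dict[str, Hashable]]
) -> Dict[str, Dict[str, Hashable]]:
    """Return the dimensions on which the parsers disagree.

    `reports` maps a parser name to that parser's view of the file.
    A parser that did not report a dimension contributes None, which
    counts as a disagreement if any other parser reported a value.
    """
    discrepancies = {}
    for dim in DIMENSIONS:
        observed = {parser: report.get(dim) for parser, report in reports.items()}
        if len(set(observed.values())) > 1:
            discrepancies[dim] = observed
    return discrepancies


# Hypothetical input: two parsers agree on page count but not on JavaScript.
example = {
    "mupdf": {"page_count": 3, "has_javascript": False},
    "qpdf": {"page_count": 3, "has_javascript": True},
}
suspicious = find_discrepancies(example)
```

In this example `suspicious` contains only the `has_javascript` dimension, because that is the one place where the two views of the file differ. A real engine would also have to decide how to treat parsers that crash or time out; this sketch simply treats a missing value as a disagreement.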
/JavaScript, /JS, /Launch, /OpenAction, /AA, /EmbeddedFile, /RichMedia, /XFA, /AcroForm, heap spray constants, and shellcode sequences.
javascript:, data:, file://, and vbscript: schemes. All URLs are passed to the Threat Intelligence engine.
/Widths arrays (historic heap-overflow vector), non-embedded fonts that trigger external font lookups, and suspicious glyph name mappings.
/JBIG2Decode (CVE-2009-0658), /JBIG2Globals exploit parameters, oversized /Widths arrays, and codec parameter combinations associated with heap-overflow and memory corruption CVEs in Adobe Reader and Foxit.
qpdf --check to validate cross-reference tables and trailer dictionaries from a second, independent parser. Intentionally malformed XRef tables are a hallmark of exploit kits designed to hide objects from basic parsers.
%u9090, 0x0c0c), CVE-specific byte sequences (CVE-2009-0658, CVE-2024-41869, CVE-2024-45112), obfuscated JS loaders, XFA+script combos, Cobalt Strike beacon signatures, PowerShell encoded commands, and multi-layer encoder chains.
Pdf.Exploit.* family covering CVE-2009-0927, CVE-2009-4324, and the Exploit.PDF-JS category. A ClamAV match means the file is a confirmed known threat.
strace. Detects: network beaconing, anonymous executable memory (shellcode), shell spawning, filesystem escape attempts, and process bombs. Static analysis sees structure; this engine sees what the PDF does.
mutool), Poppler, Ghostscript, qpdf, pdfminer, and pdf.js, and cross-compares eight structural dimensions: page count, object count, PDF version, JavaScript presence, encryption status, AcroForm presence, embedded file count, and OpenAction. Discrepancies mean the file exploits parser differences to hide objects, which is the signature of broken-xref exploit staging and incremental-update attacks.
eval/unescape layers, string-split obfuscation, hexadecimal encoding, and multi-pass encoder chains. Surfaces the final deobfuscated payload for manual review.
ByteRange forensics. Detects Shadow Attacks (where a PDF displays a valid signature while concealing content outside the signed byte range) and verifies the signature covers the complete document.
/Launch actions that auto-execute embedded files on viewer interaction.
/A and /AA field events (focus, blur, keystroke, validate), hidden NoExport fields, password-type fields (credential harvesting), /SubmitForm exfiltration targets, and calculation-order chain exploitation across field objects.
%%EOF boundary and extracts per-revision metadata: author, producer, modification date, and changed/new/deleted object counts per revision. Detects author identity changes, execution vectors injected after original creation, and automated exploit staging via large final-revision object injections.
javascript: URI schemes, JavaScript triggers on click/hover, /Launch actions that spawn programs, /GoToR remote links, and /SubmitForm in annotation actions: attack vectors completely invisible to byte-level scanners.
/Names /JavaScript), /AA additional actions on page open/close/print/save, /OpenAction type classification, /DocMDP modification-prevention signatures that block sanitizers, and /Perms and /UR3 permission restriction exploitation.
exec (dynamic execution), run (file execution), token (string-to-code eval), setpagedevice (PostScript-to-system bridge). Also detects malformed /ICCBased color profiles of anomalous size, the CVE-2021-21017 class of heap buffer overflows.
/ObjStm containers, invisible to byte scanners. This engine decompresses every object stream and re-scans the content for JavaScript, Launch actions, EmbeddedFile references, and high-entropy payloads (entropy >7.5 bits) suggesting hidden encrypted content.
/J#61vaScript → /JavaScript, whitespace-split token injection, and null-byte injection in name objects. These bypass simple pattern matchers while remaining valid to the PDF renderer, a classic evasion technique found in real-world exploit kits.
/OpenAction → /AA → field actions → annotation triggers → named actions. Visualises multi-hop execution chains where a seemingly innocent trigger leads through a chain of named actions to a final exploit, invisible when examining any single action in isolation.
U+202E) that reverse displayed filenames and URLs, and zero-width joiners used to split and reassemble malicious keywords.
/JBIG2Decode + /JBIG2Globals combinations (CVE-2009-0658 class), abnormally large /Columns and /Rows values in CCITT streams, and unusual parameter combinations in /CCITTFaxDecode and /DCTDecode filters associated with historic heap overflow exploits.
app, this, util) to reveal what JavaScript does without a real viewer, catching payload assembly that requires runtime evaluation to surface.
seac operator abuse (out-of-bounds glyph lookup), stack exhaustion via deeply nested subroutine calls, and arithmetic overflow patterns in CharString arithmetic, a class of font-engine exploits affecting all major PDF viewers.
/OpenAction + embedded JavaScript + obfuscated URL + non-embedded font is a dangerous combination. The Correlation Engine awards bonus risk points (35–100) for such combinations and maps each compound pattern to MITRE ATT&CK technique IDs.
How the Risk Score Works
Each indicator detected by any engine adds points based on its severity tier. The Correlation Engine adds bonus points for dangerous indicator combinations: a single suspicious keyword is low-risk, but the combination of an /OpenAction, obfuscated JavaScript, and a known-malicious URL is definitively dangerous.
Severity tiers: Critical (+50 pts) · High (+25 pts) · Medium (+10 pts) · Low (+3 pts) — capped at 3 occurrences per indicator. Correlation Engine bonus: +35 to +100 points for dangerous compound patterns.
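As a worked illustration of the arithmetic above, here is a minimal sketch of how such a score could be computed. The point values and the cap of 3 come from the text; everything else is an assumption for illustration, including the `risk_score` function, the `(indicator, severity)` input shape, and the idea that the correlation bonus is passed in as a precomputed number. The text does not say whether the total is capped, so this sketch applies no overall cap.

```python
from collections import Counter
from typing import Iterable, Tuple

# Point values per severity tier, as stated in the text.
SEVERITY_POINTS = {"critical": 50, "high": 25, "medium": 10, "low": 3}

# Each indicator is counted at most this many times, as stated in the text.
MAX_OCCURRENCES = 3


def risk_score(findings: Iterable[Tuple[str, str]], correlation_bonus: int = 0) -> int:
    """Sum severity points with a per-indicator occurrence cap.

    `findings` is an iterable of (indicator_name, severity) pairs, one per
    detection. `correlation_bonus` stands in for the Correlation Engine's
    +35 to +100 bonus; how that bonus is chosen is not modelled here.
    """
    counts = Counter(findings)
    base = sum(
        SEVERITY_POINTS[severity] * min(occurrences, MAX_OCCURRENCES)
        for (_indicator, severity), occurrences in counts.items()
    )
    return base + correlation_bonus


# Hypothetical scan result: one indicator seen five times, two others.
findings = (
    [("OpenAction", "high")] * 5      # capped at 3 occurrences: 3 * 25 = 75
    + [("ObfuscatedJS", "critical")]  # 1 * 50 = 50
    + [("ExternalURL", "medium")] * 2 # 2 * 10 = 20
)
score_without_bonus = risk_score(findings)     # 75 + 50 + 20 = 145
score_with_bonus = risk_score(findings, 35)    # 145 + 35 = 180
```

The cap matters in the first group: five High detections of the same indicator contribute 75 points, not 125, so a file cannot reach a high score by repeating one weak signal many times.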
Your File Never Leaves Our Server
Uploading a potentially malicious PDF to an online scanner is only sensible if the scanner's security model is trustworthy. PQ PDF is designed around the principle that the scanner must be as safe to use as the file is dangerous.
Frequently Asked Questions
prlimit resource limits, AppArmor MAC policy (pqpdf-unshare profile), Linux user + mount + network + PID namespaces, and a private tmpfs mount. The behavioral sandbox adds another nested namespace with its own isolated network stack. The file is deleted immediately after analysis — no copy, hash, or metadata is retained.
CVE-specific exploits: CVE-2009-0658 (JBIG2 heap overflow), CVE-2024-41869 (use-after-free), CVE-2024-45112 (type confusion), and 20+ others via 24 custom YARA rules.
Form-based attacks: AcroForm credential harvesting, hidden fields, SubmitForm exfiltration, XFA FormCalc exploits.
Structural attacks: XRef Shadow Attacks (signature forgery), OCG layer cloaking, Unicode invisible text, polyglot files, PDF token obfuscation.
Behavioral threats: Anything that causes network beaconing, shell spawning, or executable memory allocation when the PDF is rendered — caught by the behavioral sandbox regardless of whether a static signature exists.
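Two of the structural tricks named above, PDF token obfuscation (the `/J#61vaScript` form) and invisible Unicode controls such as `U+202E`, are simple enough to sketch. The code below is an illustrative approximation under stated assumptions, not the scanner's own logic: the function names are hypothetical, the name decoder handles only the `#xx` hex escape that the PDF specification defines for name objects, and the Unicode check covers only the two code points mentioned on this page.

```python
import re
from typing import List, Tuple

# In a PDF name object, '#' followed by two hex digits encodes one byte.
_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")

# Code points mentioned in the text; a real scanner would track more.
SUSPICIOUS_CODEPOINTS = {
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
    "\u200d": "ZERO WIDTH JOINER",
}


def normalize_pdf_name(raw: bytes) -> bytes:
    """Decode #xx hex escapes so obfuscated names compare equal to plain ones."""
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def find_invisible_controls(text: str) -> List[Tuple[int, str]]:
    """Return (index, description) for each suspicious control character."""
    return [
        (index, SUSPICIOUS_CODEPOINTS[char])
        for index, char in enumerate(text)
        if char in SUSPICIOUS_CODEPOINTS
    ]


decoded = normalize_pdf_name(b"/J#61vaScript")             # b"/JavaScript"
hits = find_invisible_controls("invoice\u202efdp.exe")     # [(7, "RIGHT-TO-LEFT OVERRIDE")]
```

Normalizing names before matching is what lets a keyword check for `/JavaScript` catch the escaped spelling as well; the whitespace-split and null-byte variants described earlier would need separate handling that this sketch does not attempt.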
File deleted immediately. Zero data retained.