Are these PDF tools free to use?

Yes, all 47 PDF tools on PQ PDF are completely free to use. No account, subscription, or sign-up is required. The platform is funded by enterprise on-premise deployments, not user data.

Is my PDF data stored after processing?

No. Every operation creates one isolated temporary directory. The result streams back to your browser and the directory is deleted while the download is still in flight — there is no retention window because there is no buffer. No file content is ever logged or sent to a third-party service.
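The lifecycle described above can be sketched roughly as follows. This is an illustration under assumptions, not PQ PDF's actual code: `process_upload` and the `transform` callback are hypothetical names, and Python's `tempfile.TemporaryDirectory` stands in for whatever isolation the service really uses. In this sketch the directory is removed as soon as the last chunk has been read.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterator


def process_upload(data: bytes,
                   transform: Callable[[Path, Path], None],
                   chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the processed file in chunks from a one-off scratch directory."""
    with tempfile.TemporaryDirectory(prefix="job-") as scratch:
        src = Path(scratch) / "input.pdf"
        dst = Path(scratch) / "output.pdf"
        src.write_bytes(data)
        transform(src, dst)  # hypothetical per-tool operation
        with dst.open("rb") as fh:
            while chunk := fh.read(chunk_size):
                yield chunk
    # Leaving the `with` block deletes the directory and everything in it.
```

Because the function is a generator, a web framework can stream the chunks straight to the client without holding a second copy of the result.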

What PDF tools are available?

PQ PDF offers 45 free PDF tools across six categories:

Core Manipulation: merge, split, compress, rotate, reorder, delete/extract pages, repair, flatten, grayscale, N-up imposition, deskew
Convert: PDF to Word/Excel/PowerPoint/Images/HTML/Markdown/PDF-A/PDF-X, and Word/Excel/PowerPoint/Images/HTML to PDF
Security: 47-engine forensics scanner, AES-256 or post-quantum encryption, unlock, redact, watermark, sign/PAdES, e-signature
Annotate and Inspect: edit/annotate, OCR, compare, fill forms, extract text, camera scan, accessibility checker, colour inspector, font inspector, PDF info, tables to JSON, outline editor
Automation: workflow builder
The full PDF Editor

How does PQ PDF keep uploads secure?

Every external tool invocation passes through a four-layer sandbox: prlimit (kernel resource caps: 1.5 GB RAM, 512 MB write), AppArmor mandatory-access-control profile, Linux namespaces (isolated network, PID, mount, IPC — no internet access possible), and a tmpfs scratch mount (all I/O is in-memory and vanishes on exit). A strict per-request Content Security Policy uses fresh random nonces — no unsafe-inline, no unsafe-eval. Transport uses HTTP/3, TLS 1.3 only, with X25519MLKEM768 post-quantum key exchange.
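The per-request nonce policy mentioned above could look something like this. It is a sketch under assumptions: the function names and the exact directive set are hypothetical, and only the properties the text states (a fresh random nonce per request, no unsafe-inline, no unsafe-eval) are taken from it.

```python
import secrets


def new_nonce() -> str:
    """Return a fresh, unpredictable nonce for a single response."""
    return secrets.token_urlsafe(16)


def build_csp_header(nonce: str) -> str:
    """Build a Content-Security-Policy value that only trusts nonce-tagged code."""
    directives = [
        "default-src 'self'",
        f"script-src 'self' 'nonce-{nonce}'",  # inline scripts need this nonce
        f"style-src 'self' 'nonce-{nonce}'",
        "object-src 'none'",
        "base-uri 'none'",
    ]
    return "; ".join(directives)
```

The same nonce would then be written into the `nonce` attribute of each script and style tag in that response, and discarded afterwards.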

What is the maximum file size for PDF uploads?

Most tools accept files up to 50 MB. The forensics scanner accepts up to 10 MB (covering all known real-world malicious PDF classes). Merge operations accept multiple files with a combined size of up to 200 MB. Enterprise on-premise deployments have no file size limit.
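As a rough illustration of these limits, a pre-upload check might look like the sketch below. The tool identifiers and `check_upload` are hypothetical, the page does not say whether "MB" means 10**6 or 2**20 bytes (2**20 is assumed here), and whether merge inputs are also capped per file is not stated, so only the combined total is checked.

```python
from typing import Sequence

MB = 1024 * 1024  # assumption: binary megabytes

DEFAULT_LIMIT = 50 * MB       # most tools
FORENSICS_LIMIT = 10 * MB     # forensics scanner
MERGE_TOTAL_LIMIT = 200 * MB  # combined size for merge


def check_upload(tool: str, sizes: Sequence[int]) -> bool:
    """Return True if the file sizes (in bytes) fit the published limit for `tool`."""
    if tool == "merge":
        return sum(sizes) <= MERGE_TOTAL_LIMIT
    limit = FORENSICS_LIMIT if tool == "forensics" else DEFAULT_LIMIT
    return all(size <= limit for size in sizes)
```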

Does PQ PDF use third-party AI services?

No. All AI features run on a self-hosted Qwen 2.5 1.5B Instruct model via llama.cpp on dedicated private hardware. No data is sent to OpenAI, Anthropic, Google, or any other third-party AI provider. Your document text never leaves pqpdf.com infrastructure.

PQ PDF — 47-Engine Forensic Scanner

What Happens When You Upload a PDF

Your file doesn’t just open — it’s interrogated.

Structure Analysis

Every object, cross-reference, and stream is parsed against the PDF specification. Malformed structures, trailer chain anomalies, and embedded object nesting are flagged.
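To give a feel for the kind of signal a structure check looks at, here is a deliberately shallow sketch. It is not the scanner's method: `quick_structure_check` is a hypothetical helper that only counts a few byte-level markers, whereas a real validator would parse every object and cross-reference table.

```python
import re


def quick_structure_check(data: bytes) -> dict:
    """Collect cheap structural signals from raw PDF bytes."""
    eof_markers = data.count(b"%%EOF")
    return {
        "header_at_offset_zero": data.startswith(b"%PDF-"),
        "header_in_first_kb": b"%PDF-" in data[:1024],
        "eof_markers": eof_markers,
        "startxref_count": len(re.findall(rb"startxref\s+\d+", data)),
        # Each extra %%EOF usually marks one incremental update (a later revision).
        "incremental_updates": max(eof_markers - 1, 0),
    }
```

A file with several `%%EOF` markers is not malicious in itself, but each revision is a place where later content can override what an earlier one declared.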

JavaScript Deobfuscation

Embedded JavaScript is extracted and emulated behaviorally. XFA FormCalc, action chains, and URI/Launch triggers are individually parsed and examined.

Multi-Engine Detection

47 independent forensic engines run in parallel — YARA rules, ClamAV signatures, byte-pattern analysis, six-parser differential, and ML anomaly detection with SHAP explanations.

Sandbox Execution

Dynamic analysis in a network-isolated sandbox. No internet access, no file system writes outside the sandbox. Captures runtime behavior without risk to your system.

Threat Intelligence Matching

6.4 million offline threat indicators — hashes, domains, IPs, and known malware signatures — matched locally with no external API calls.
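Local matching of this kind can be as simple as set lookups against an indicator list held on the same machine. The sketch below assumes SHA-256 file hashes and lower-cased domain names; `match_indicators` and the indicator sets are hypothetical, and the real indicator format is not described on this page.

```python
import hashlib
from typing import Iterable, List, Set


def match_indicators(data: bytes, domains: Iterable[str],
                     bad_hashes: Set[str], bad_domains: Set[str]) -> List[str]:
    """Return the indicators that match, using only local lookups."""
    hits = []
    digest = hashlib.sha256(data).hexdigest()
    if digest in bad_hashes:
        hits.append(f"hash:{digest}")
    for domain in domains:
        if domain.lower() in bad_domains:
            hits.append(f"domain:{domain.lower()}")
    return hits
```

Nothing in this path makes a network request, which is what lets the lookup run inside an isolated sandbox.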

Multi-Axis Forensic Verdict

Findings are graded across four forensic axes — malware/exploit, document-integrity tampering, content-integrity (semantic-determinism & AI-ingestion attacks: V/AP divergence, glyph remapping, OCR poisoning, prompt injection), and neutral structure — into a clear verdict (Clean → Dangerous) with the driving axis, MITRE ATT&CK mapping, and an AI forensic report. A document-integrity or AI-ingestion attack is surfaced even when it carries no malware.
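One way to picture the multi-axis grading is to keep the worst severity per axis and report the axis that drives the overall verdict. This is a sketch, not the scanner's scoring model: the axis identifiers, the intermediate "Suspicious" level, and the rule that neutral-structure findings never raise the verdict are all assumptions made for illustration.

```python
LEVELS = ["Clean", "Suspicious", "Dangerous"]  # the middle level is hypothetical
AXES = ["malware_exploit", "document_integrity",
        "content_integrity", "neutral_structure"]


def grade(findings):
    """findings: iterable of (axis, severity), severity being an index into LEVELS."""
    worst = {axis: 0 for axis in AXES}
    for axis, severity in findings:
        worst[axis] = max(worst[axis], severity)
    # Assumption: neutral-structure findings are informational only.
    scored = {a: s for a, s in worst.items() if a != "neutral_structure"}
    driving_axis = max(scored, key=scored.get)
    level = scored[driving_axis]
    return LEVELS[level], (driving_axis if level > 0 else None)
```

For example, `grade([("content_integrity", 2)])` returns `("Dangerous", "content_integrity")` even though the malware axis is empty, which mirrors the last sentence of the paragraph above.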

If It’s Dangerous, We Fix It

One click removes everything risky.

Removes:

JavaScript
Embedded files
Action triggers (OpenAction, Launch, URI)
Hidden layers
Suspicious annotations
Non-standard object streams
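Before anything can be removed it has to be found. The sketch below counts the PDF name keys that typically introduce the items listed above. It is a detection-only illustration with hypothetical names: it does not rewrite the file, it does not look inside compressed streams, and it does not handle hex-escaped names such as `/J#53`.

```python
import re
from typing import Dict

RISKY_KEYS = {
    "javascript": rb"/(?:JavaScript|JS)\b",
    "embedded_files": rb"/EmbeddedFiles?\b",
    "open_action": rb"/OpenAction\b",
    "launch": rb"/Launch\b",
    "uri": rb"/URI\b",
    "optional_content": rb"/OCProperties\b",  # optional content groups (layers)
}


def count_risky_keys(data: bytes) -> Dict[str, int]:
    """Count occurrences of each risky name key in raw PDF bytes."""
    return {name: len(re.findall(pattern, data))
            for name, pattern in RISKY_KEYS.items()}
```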

🧼 Download Clean PDF

9 surgical sanitize modes — from light clean to full linearization rebuild

Strip JavaScript · Remove Embedded Files · Flatten Actions · Remove Hidden Layers · Flatten Forms · Remove Annotations · Metadata Wipe · Object Stream Rebuild · Full Linearization

Featured Tool

PDF Forensics Scanner Unique

A free PDF analysis tool running 47 independent forensic engines in a single pass — covering categories of semantic, integrity, and parser-level analysis most PDF scanners and malware tools don't attempt: XFA FormCalc parsing, PDF action graphs, JS behavioural emulation, and offline threat intelligence (6.4M+ indicators). Results across 24 analysis tabs with a dedicated 🤖 AI Forensic Report — Qwen 2.5 1.5B synthesises all engine outputs into a structured verdict, MITRE technique grid, and recommended actions. Fully server-side, zero retention.

Annotation Forensics · Behavioural Sandbox · ML Anomaly + SHAP · Local Threat Intel · MITRE ATT&CK · YARA & ClamAV · 🤖 AI Forensic Report · 9-Mode Sanitize

47 Forensic Engines

Analyse a PDF →

• Categories of analysis most tools don't attempt, free or commercial. 47 forensic engines · 24 analysis tabs · AI forensic report · 6.4M offline threat indicators · zero retention.

Featured Tool

Office Forensics Scanner 23 Engines

Deep forensic analysis for .docx, .xlsx, .pptx, .doc, .xls, .ppt, .xlsm, .msg, .rtf and more — 23 dedicated engines covering VBA macros, XLM formulas, OLE objects, IOCs, metadata, sandbox, YARA, ClamAV, threat intelligence, and container structure. AI forensic report with threat verdict and MITRE ATT&CK mapping. Four sanitize modes: convert to static PDF, strip macros, strip metadata, or upgrade legacy OLE2 to OOXML.

VBA Macro Detection · XLM Deobfuscation · OLE Forensics · IOC Extraction · MITRE ATT&CK · 🤖 AI Forensic Report · 4-Mode Sanitize · Strip Metadata

23 Forensic Engines

Scan an Office Doc →

• Office docs are the #1 malware delivery vector. 23 forensic engines · VBA/XLM/OLE/YARA/sandbox · AI forensic report · 4-mode sanitize · zero retention.

Featured Tool

Universal File Forensics Scanner New

Forensic analysis for any file type (images, audio, video, archives, executables, scripts, fonts, certificates, and network captures) across 23 independent engines:

File ID (magic bytes/MIME mismatch)
Entropy anomaly
Metadata forensics (EXIF/ID3)
IOC extraction
PE/ELF binary analysis
Archive inspection (zip bombs/path traversal)
Image steganography (LSB chi-square)
Script analysis (reverse shells/webshells/AMSI bypass)
XOR brute-force deobfuscation
Watermark detection (EXIF/ID3/XMP/alpha/OCR)
Six-layer isolation chamber detonation (strace · ltrace · in-memory YARA dump analysis · fake DNS+HTTP network capture · CPU/VM fingerprint masking · LibreOffice macro execution · Playwright interaction simulation)
Windows execution layer (Wine): .exe/.dll/.msi/.bat/.ps1/.vbs/.hta detonation in a Linux namespace
Real Windows micro-VM detonation (KVM/QEMU Windows 10: genuine kernel, process spawn monitoring, network capture, registry persistence; high-risk samples only)
ClamAV
YARA
Threat intel
Campaign intelligence (named clusters · family classification · activity trends · D3 graph)
Correlation
AI forensic report

Zero data retention.
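The page does not describe how the entropy-anomaly engine works, so the sketch below shows only the textbook building block such a check usually rests on: Shannon entropy over byte frequencies, in bits per byte. `shannon_entropy` is a hypothetical helper, and any alerting threshold would be a tuning choice rather than a documented value.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())
```

Plain text tends to score well below 8, while compressed or encrypted data sits close to 8, so a high-entropy region inside a format that should be mostly text is worth a closer look.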

PE & ELF Binary Analysis · Steganography (LSB) · Archive Inspection · IOC Extraction · MITRE ATT&CK · YARA & ClamAV · Watermark Detection · 🧪 Sandbox Detonation · 💻 Windows Execution (Wine) · 🖥 Real Windows Micro-VM · 🕸 Campaign Intelligence · 🤖 AI Forensic Report · All File Types

23 Forensic Engines

Scan Any File →

• Any file, any format: one scanner. 23 forensic engines · PE/ELF/archive/steganography/script/watermark/sandbox · Windows execution (Wine) · real Windows micro-VM · campaign intelligence · AI forensic report · offline threat intel · zero retention.

Featured Tool

Post-Quantum PDF Encryption Quantum-Safe

Encrypt PDFs using post-quantum algorithms: the NIST-standardised ML-KEM (Kyber), ML-DSA (Dilithium), and SLH-DSA (SPHINCS+), plus NTRU, BIKE, and HQC. The output is a portable .pqcpdf bundle resistant to quantum computer attacks. In PQC mode your plaintext never reaches our server. AES-256 also available.

Kyber / ML-KEM · NTRU · BIKE · HQC · Dilithium · SPHINCS+ · Client-Side Encrypt · Zero Server Knowledge

31 PQC Algorithms

Encrypt a PDF →

• Harvest-now, decrypt-later attacks are real. Documents whose encryption keys are protected today by RSA or elliptic-curve key exchange may be decryptable by quantum computers within a decade. Migrate to PQC now, free.

Featured Tool

Multi-Party E-Signatures Free

Send any PDF to up to 10 signers with a unique secure link each — no accounts on either side. Sequential chain-signing or parallel all-at-once. Every signature is PAdES-B cryptographically embedded and verifiable in Adobe Reader. Zero retention after signing completes.

Up to 10 Signers · Sequential or Parallel · PAdES-B Signed · Unique Secure Links · No Account Needed · Zero Retention

10 Max Signers

Send for Signing →

• DocuSign Essentials £12/mo · Adobe Acrobat Sign £17/mo · PandaDoc £15/mo. PQ PDF: free, cryptographically signed, no account required.

Featured Tool

HTML & URL → PDF Chromium

Convert any web page URL or HTML file to PDF using a full Playwright/Chromium browser — capturing modern CSS, web fonts, JavaScript-rendered content, and single-page applications exactly as they appear in a real browser. No API key, no headless-browser subscription.

Playwright / Chromium · JS-Rendered Pages · Web Fonts · Modern CSS · URL or HTML File · Single-Page Apps

Full Browser Fidelity

Convert a URL →

• PDFShift $9–99/mo · Browserless $50+/mo · Puppeteer.io $19+/mo. PQ PDF: free, full Chromium render, no API key.

Featured Tool

Permanent PDF Redaction Truly Erased

Permanently erase sensitive content — not just covered with a black box. Text-pattern mode supports multi-pattern lists with regex, case-sensitivity, and whole-word matching. Region mode lets you draw redaction areas directly on a page preview. Content is destroyed server-side; it cannot be recovered.
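The text-pattern options described above (multi-pattern lists, regex, case-sensitivity, whole-word matching) map naturally onto a compiled regular expression. The sketch below covers only that matching step, with hypothetical function names; actually destroying the matched content inside the PDF is a separate step that is not shown here.

```python
import re
from typing import Iterable, List, Tuple


def build_matcher(patterns: Iterable[str], *, regex: bool = False,
                  case_sensitive: bool = False,
                  whole_word: bool = False) -> re.Pattern:
    """Combine several patterns into one compiled matcher."""
    parts = []
    for pattern in patterns:
        body = pattern if regex else re.escape(pattern)
        if whole_word:
            body = rf"\b(?:{body})\b"
        parts.append(f"(?:{body})")
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile("|".join(parts), flags)


def find_spans(text: str, matcher: re.Pattern) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of every match in the extracted text."""
    return [m.span() for m in matcher.finditer(text)]
```

With `whole_word=True`, a pattern such as `secret` matches "Secret" but not "secrets", which avoids over-redacting longer words that merely contain the term.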

Text-Pattern Redaction · Regex Support · Draw Regions · Server-Side Erasure · GDPR / PII Safe · Unrecoverable

Zero Data Remaining

Redact a PDF →

• Adobe Acrobat Pro required for redaction (£179/yr) · iLovePDF Pro £48/yr. PQ PDF: free, truly permanent, server-side erasure, no subscription.

Threat Intelligence

Real Breaches. Every One Caught.

Twenty-five of the most damaging document-based attacks of the last decade — PDF exploits, Office macro loaders, and email-borne lures — and the engines that would have flagged each one before execution.

THREAT-001 ● Critical

Actor: Unknown / Active

Adobe Reader 0-Day (2025–2026)

Active since December 2025 — unpatched window

Est. Impact $10M+ Aggregate across finance, legal & procurement

JSFuck-obfuscated payloads inside incremental PDF updates, invisible to half the parser ecosystem. Credential theft and data exfiltration confirmed. Differential parsing discrepancy — MuPDF and Ghostscript saw nothing; Poppler and pdf.js resolved full execution chain.

DETECTED: Differential Parsing · JS AST Deobfuscation · Action Dependency Graph · Structure Validator · Revision History · OCG Layer Cloaking

THREAT-002 ● Critical

Actor: APT28 (Fancy Bear) — GRU Unit 26165

APT28 — Government & Defence PDF Credential Theft

2022–2025 — NATO, EU diplomatic targets

Est. Impact $50M+ Classified operational impact; U.S. & EU agencies

PDF-based spear-phishing delivering credential harvesters to diplomatic and defence networks. Lure documents used embedded launch actions and URI handlers to silently exfiltrate NTLM hashes. U.S. DOJ & EU CERT attributed; downstream costs likely tens of millions.

DETECTED: Pattern Scanner · URL Extractor · Threat Intelligence · Phishing Detection · Action Dependency Graph · YARA Rule Engine

THREAT-003 ● Critical

Actor: APT29 (Cozy Bear) — SVR

APT29 — Embedded Payload PDFs Targeting NGOs & Policy Groups

2021–2025 — think tanks, policy institutes

Est. Impact $20M+ Multi-million remediation across targeted organisations

Multi-stage payloads hidden in layered PDF streams. Execution deferred by renderer-version fingerprinting — files appeared clean in sandboxes running older Acrobat builds. Policy documents and internal communications exfiltrated across Atlantic-Council-tier organisations.

DETECTED: Stream Inspector · JS Behavioural Emulation · JS AST Deobfuscation · Embedded File Analysis · Physical Entropy Topology · ML Intelligence Engine

THREAT-004 ● Critical

Actor: FIN7 (Carbanak Group)

FIN7 — Retail & Hospitality PDF Malware Delivery

2020–2024 — U.S. DOJ indictments; ongoing activity

Est. Impact $1B+ U.S. DOJ: global fraud losses attributed to FIN7

Invoice-themed PDF lures with embedded DOCX objects and VBA macro loaders staging Carbanak/GRIFFON. Targeted restaurant chains, hotel groups, and point-of-sale operators. Card skimmer deployment followed within hours of initial PDF open event.

DETECTED: Embedded File Analysis · ClamAV Signature Scanner · YARA Rule Engine · Campaign Attribution · Dynamic Behavioural Sandbox · Phishing Detection

THREAT-005 ● Critical

Actor: TA505 (Clop Ransomware Group)

TA505 — Polyglot PDF/ZIP Malware Loader

2021–2024 — enterprise ransomware campaigns

Per-Victim Cost $5–40M Recovery & downtime per enterprise victim

PDF files simultaneously valid as ZIP archives (polyglot technique). Opened in Acrobat: rendered a benign invoice. Extracted and executed: deployed SDBot/FlawedGrace loader staging Clop ransomware. Bypassed attachment scanners relying on MIME type or single-parser analysis.

DETECTED: Polyglot / Embedded Binary · Differential Parsing · Stream Inspector · Trailer Chain Forensics · Physical Entropy Topology · ClamAV Signature Scanner
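A PDF/ZIP polyglot of the kind described in this entry can be spotted by asking two format questions of the same bytes instead of trusting the first answer. The sketch below is an illustration only: `looks_like_pdf_zip_polyglot` is a hypothetical helper, it uses the standard library's `zipfile.is_zipfile`, and it looks for the PDF header in the first 1024 bytes.

```python
import io
import zipfile


def looks_like_pdf_zip_polyglot(data: bytes) -> bool:
    """Return True if the bytes parse as both a PDF and a ZIP archive."""
    is_pdf = b"%PDF-" in data[:1024]
    # ZIP readers locate the archive from the end of the file, so a ZIP
    # appended after PDF content is still recognised as a valid archive.
    is_zip = zipfile.is_zipfile(io.BytesIO(data))
    return is_pdf and is_zip
```

A scanner that checks only the MIME type or only the leading magic bytes would classify such a file as a plain PDF and never look at the archive half.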

THREAT-006 ● Critical

Actor: TA542 / WizardSpider / TA570

Emotet / TrickBot / QakBot — PDF Campaign Waves

2020–2023 — global botnet distribution

Global Damage $2.5B+ Europol: Emotet alone caused hundreds of millions globally

High-volume PDF campaigns carrying macro-enabled Office loaders, credential-stealing scripts, and ransomware staging payloads. PDFs used external /URI and /Launch objects to pull second-stage payloads. Europol coordinated the Emotet takedown in 2021; TrickBot and QakBot variants continued.

DETECTED: URL Extractor · Threat Intelligence · Object Analyzer · Pattern Scanner · YARA Rule Engine · Campaign Attribution

THREAT-007 ● High

Actor: DarkHotel (Tapaoux) — suspected DPRK nexus

DarkHotel — Executive Targeting via Renderer-Specific PDF Exploits

2014–ongoing — hotel WiFi & spear-phishing

Impact Class Espionage Strategic intel theft; C-suite compromise across APAC and EU

Exploits fingerprinted the victim’s exact Acrobat version before deciding whether to fire — returning a clean PDF to sandboxes. Targeted executives checking in at high-end hotels via rogue network PDFs. Delivered keyloggers and remote access trojans to C-suite devices undetected for years.

DETECTED: JS AST Deobfuscation · JS Behavioural Emulation · Differential Parsing · CVE Pattern Matcher · Action Dependency Graph · ML Intelligence Engine

THREAT-008 ● Critical

Actor: APT41 (Double Dragon) — MSS-affiliated

APT41 — Supply-Chain PDF Payloads

2019–2024 — U.S. DOJ indictment 2020

Est. Impact $100M+ U.S. DOJ: tens of millions in theft and damages attributed

PDF payloads inserted into legitimate software update channels and vendor documentation. Multi-stage encoded payloads hidden in object streams, decoded at runtime via embedded XFA FormCalc. Targeted pharma, tech, and defence supply chains simultaneously for espionage and financial gain.

DETECTED: XFA FormCalc Parser · Object Stream Analysis · Embedded File Analysis · Signature Forensics · Physical Entropy Topology · Campaign Attribution

THREAT-009 ● High

Actor: Multiple — BEC-as-a-service groups

2021–2024 Invoice-Themed PDF Phishing Waves

Global — finance, legal, procurement verticals

Global BEC Losses $2.9B FBI IC3 2022: PDFs primary lure for BEC & payment diversion

Mass-distribution invoice PDFs containing QR codes, obfuscated redirect URIs, and credential-harvesting links disguised as DocuSign or Microsoft portals. High success rate in finance teams due to legitimate-looking layouts. Zero malware — pure social engineering amplified by PDF trust signals.

DETECTED: Phishing Detection · URL Extractor · Threat Intelligence · QR Code Analysis · Compliance Fraud Detection · Metadata Analyzer

THREAT-010 ● Critical

Actor: Ryuk / Conti / LockBit / BlackCat affiliates

2020–2024 PDF-Based Ransomware Initial Access

Global enterprise campaigns — healthcare, infrastructure, finance

Per-Incident Cost $5–100M Downtime + data loss per incident; Conti total: $150M+

PDF attachments used as the first link in a multi-stage chain: PDF → macro-enabled Office loader → Cobalt Strike beacon → ransomware deployment. Ryuk, Conti, LockBit, and BlackCat all documented this pattern. A single opened PDF in a finance inbox produced full network compromise within 48 hours in multiple documented incidents.

DETECTED: Embedded File Analysis · Dynamic Behavioural Sandbox · YARA Rule Engine · ClamAV Signature Scanner · URL Extractor · Campaign Attribution

THREAT-011 ● Critical

Actor: TA542 / MUMMY SPIDER (CrowdStrike)

Emotet — Word Macro Campaigns & Initial Access Brokering

2014–2021 (Europol takedown); resurged 2022–2024

Global Cumulative Damage $2.5B+ Ukrainian Cyber Police / FBI joint 2021 takedown data; 1.6 million infected machines in final wave

Emotet distributed Word and Excel macro-enabled documents via invoice and shipping-notice lures; embedded VBA executed silently on document open, installing a persistent loader that polled C2 for secondary payloads. It became the world’s #1 initial access broker, staging TrickBot and Ryuk ransomware infections across governments, hospitals, and enterprises in 180+ countries before Europol’s coordinated January 2021 dismantlement.

DETECTED: VBA Macro Extractor · Embedded Objects Analysis · IOC Extractor · Sandbox Detonation · YARA Rule Engine · Metadata Forensics

THREAT-012 ● Critical

Actor: WizardSpider / DEV-0193 (Microsoft)

TrickBot — Excel 4.0 & Word Macro Banking Trojan

2016–2022; 140,000+ victims since November 2020

Avg. Recovery Cost $2.7M FBI / CISA incident response data; ransom demands averaged $2M per enterprise; 140,000+ compromised systems globally

TrickBot arrived via DOCM and Excel 4.0 XLM macro attachments; macro code silently downloaded the Ostap dropper which fetched modular TrickBot components for browser MITM credential theft, Active Directory reconnaissance, and network propagation. It served as the primary staging platform for Ryuk ransomware and Conti — a single TrickBot infection routinely progressed to full enterprise ransomware within 48–72 hours.

DETECTED: VBA Macro Extractor · XLM Macro Parser · OLE Structure Analysis · IOC Extractor · Sandbox Detonation · YARA Rule Engine

THREAT-013 ● Critical

Actor: Evil Corp / INDRIK SPIDER (Maksim Yakubets)

Dridex / Evil Corp — “Enable Content” Banking Fraud

2014–2023 — 40+ countries, hundreds of financial institutions

Confirmed Illicit Proceeds $100M+ U.S. Treasury OFAC sanctions Dec 2019 & DOJ indictment; UK losses alone $30M; Evil Corp sanctioned with $5M bounty

Dridex weaponised Word macro documents with carefully crafted “Enable Content” social engineering prompts, tricking users into activating embedded VBA that silently downloaded the Dridex banking trojan. The malware intercepted online banking sessions via form-grabbing and injected fake login overlays, enabling mass wire fraud. Evil Corp later repurposed the same delivery infrastructure for ransomware campaigns (BitPaymer, WastedLocker, Hades) while operating under OFAC sanctions.

DETECTED: VBA Macro Extractor · OLE Structure Analysis · NLP Social Engineering Detection · IOC Extractor · Sandbox Detonation · Threat Intelligence

THREAT-014 ● High

Actor: Multiple criminal affiliates (GOLD MYSTIC cluster)

ZLoader — Office Macros Abusing Signed Windows Binaries

2020–2023 — Microsoft / Mandiant infrastructure disruption April 2022

Payload Downstream Cost $4M avg. ZLoader staged Egregor & Ryuk ransomware; Egregor demanded an average $4M ransom per enterprise victim

ZLoader’s document chain began with a macro-enabled Word file that downloaded a password-protected Excel workbook; a second VBA layer then abused Rundll32 and other signed Windows binaries (LOLBAS technique) to load the ZLoader DLL while bypassing AV. Malicious MSI installers bearing fraudulent Microsemi Corporation code-signing certificates provided a final evasion layer before staging Egregor and Ryuk ransomware payloads across enterprise networks.

DETECTED: VBA Macro Extractor · Embedded Objects Analysis · OLE Structure Analysis · Cryptographic Analysis · IOC Extractor · Sandbox Detonation

THREAT-015 ● Critical

Actor: TA570 / TA577 (Black Basta & Conti staging)

QakBot — Email Thread Hijacking with Office Attachments

2019–2023 — FBI Operation Duck Hunt, August 2023

FBI-Attributed Ransom Fees $58M+ DOJ press release Aug 2023; 700,000+ infected machines globally; administrators collected $58M in fees (Oct 2021–Apr 2023)

QakBot hijacked legitimate email threads, replying into real business conversations with contextually credible Word or Excel attachments that bypassed recipient suspicion. Embedded XLM macros or VBA downloaded the QakBot executable which persisted via Windows Registry, then staged Cobalt Strike beacons for Black Basta and Conti ransomware operators. The FBI’s Operation Duck Hunt (Aug 2023) dismantled its C2 infrastructure, but variants re-emerged within months under new delivery methods.

DETECTED: VBA Macro Extractor · XLM Macro Parser · OLE Structure Analysis · IOC Extractor · Sandbox Detonation · Threat Intelligence

THREAT-016 ● Critical

Actor: WizardSpider / Grim Spider

Ryuk — Word Doc Chain to Enterprise Ransomware

2018–2021 peak — hospitals, governments, media organisations

Confirmed Ransom Collected $150M+ FBI / Intel 471 through 2020; average demand $12.5M in 2019; Sopra Steria breach alone €40–50M in recovery

Ryuk reached victim networks through a three-stage Office document chain: Word macro attachment (Emotet loader) → credential-theft module (TrickBot lateral movement) → manual Ryuk deployment via reverse shell. The ransomware encrypted network shares with RSA-2048 and AES-256, demanded Bitcoin ransoms averaging $12.5M per incident at its 2019 peak, and deliberately targeted hospitals, municipal governments, and media organisations to maximise extortion pressure and limit victim options.

DETECTED: IOC Extractor · Sandbox Detonation · Cryptographic Analysis · Threat Intelligence · Relationships Analysis · Metadata Forensics

THREAT-017 ● Critical

Actor: WizardSpider / Periwinkle Tempest

Conti — Healthcare & Government Office Phishing

2020–2022 — 1,000+ confirmed victims globally

Confirmed Ransom Revenue $150M+ FBI January 2022; U.S. State Dept offered $15M reward; HSE Ireland €100M recovery; Costa Rica declared national emergency

Conti gained initial access via phishing emails with Office macro attachments that dropped TrickBot, IcedID, or BazarLoader for reconnaissance and lateral movement; Cobalt Strike beacons then enabled hands-on-keyboard operations before Conti encryptor deployment. Leaked internal communications revealed a professional 65-person RaaS operation with HR, management, and a dedicated OPSEC team — industrialised ransomware running at the scale of a mid-sized enterprise.

DETECTED: VBA Macro Extractor · NLP Social Engineering Detection · IOC Extractor · Sandbox Detonation · Threat Intelligence · Cryptographic Analysis

THREAT-018 ● Critical

Actor: LockBit Operations (LockBitSupp) — suspected Russia-origin

LockBit — Most Prolific Ransomware of the Decade

2019–2025 — regrouped within 96 hours of Operation Cronos (Feb 2024)

Confirmed Ransom Payments $120M+ CISA/FBI Advisory AA23-165A; 2,000+ victims across 120 countries; responsible for 16% of all ransomware attacks in 2024

LockBit affiliates used Office macro documents, exposed RDP endpoints, and compromised VPN credentials as initial access vectors, with an 80/20 affiliate revenue split that drove rapid RaaS adoption. The encryptor deployed AES-256 with double-extortion tactics — file encryption plus data exfiltration to publicly accessible leak sites — targeting hospitals, courts, and critical infrastructure across 120+ countries. Despite Operation Cronos’ infrastructure seizure in February 2024, LockBit resumed public operations within four days.

DETECTED: IOC Extractor · Sandbox Detonation · Cryptographic Analysis · Threat Intelligence · Metadata Forensics · Relationships Analysis

THREAT-019 ● Critical

Actor: FIN7 / GOLD NIAGARA (Carbanak Group)

FIN7 — Targeted Excel DDE Spear-Phishing

2015–2021 — DOJ indictments 2018; 3,600+ U.S. businesses breached

Payment Cards Compromised 20M+ records Mandiant / DOJ: 3,600+ breached businesses; dark-web card sales enabled an estimated $300–500M in fraudulent transactions

FIN7 sent personalised Excel spreadsheets to payroll, finance, and executive personnel, exploiting DDE (Dynamic Data Exchange) formulas to launch PowerShell commands without triggering standard macro warnings — a technique that bypassed most enterprise AV of the era. The Carbanak backdoor established C2, enabling lateral movement to POS systems and ATM networks for mass payment card harvesting across restaurant chains, hotel groups, and retail enterprises. FIN7 subsequently pivoted to ransomware operations using the same spear-phishing infrastructure.

DETECTED: XLM Macro Parser · Embedded Objects Analysis · NLP Social Engineering Detection · IOC Extractor · Sandbox Detonation · Metadata Forensics

THREAT-020 ● Critical

Actor: APTs (Chinese, Iranian, Russian-aligned) & criminal RaaS affiliates

Cobalt Strike — Office-Delivered Beacons Across Nation-State & Ransomware Ops

2019–2025 — cracked licenses sold on criminal forums for under $500

Present in Ransomware Incidents 66%+ Proofpoint: Cobalt Strike detected in the majority of enterprise ransomware incidents; cracked CS licenses democratised nation-state tooling to criminal affiliates

Cobalt Strike beacons are delivered via Office macro documents (DOCM, XLSM) containing VBA that spawns cmd.exe and fetches the beacon through encoded PowerShell, typically over HTTP/S to blend with normal web traffic. Once active, the beacon provides command-and-control, lateral movement, credential dumping via Mimikatz, and exfiltration; Iranian Lemon Sandstorm, Chinese APTs, and the Ryuk, Conti, and LockBit ransomware families all leverage it as their standard post-exploitation platform.

DETECTED: VBA Macro Extractor · OLE Structure Analysis · IOC Extractor · Sandbox Detonation · Relationships Analysis · Threat Intelligence

THREAT-021 ● High

Actor: Distributed MaaS operators (unattributed core developer)

FormBook — Credential-Stealing MaaS via DOC & PDF Lures

2016–2024 — 3rd most prevalent global malware in 2021 (Check Point)

Corporate Networks Hit (2022) 1-in-20 Check Point 2022: 5% of corporate networks worldwide; available on underground forums for $29–$59/week as full-featured MaaS

FormBook is distributed as a commodity Malware-as-a-Service via phishing emails with malicious DOCX and OLE-embedded attachments; macros or embedded objects download and execute the FormBook stealer payload, which hollows legitimate Windows processes to evade detection. The infostealer harvests browser-saved credentials, screenshots, and keystroke logs from all major browsers, uploading credential dumps to attacker C2 servers where affiliates sell the data on underground markets within hours of capture.

DETECTED: VBA Macro Extractor · OLE Structure Analysis · Embedded Objects Analysis · IOC Extractor · Sandbox Detonation · YARA Rule Engine

THREAT-022 ● High

Actor: Commodity RAT / MaaS operators ($15–$69/month)

Agent Tesla — Excel CVE Exploits & Mass Credential Harvesting

2014–2024 — continuous commodity MaaS campaigns

Daily Attack Volume (Peak) 3,000+/day FortiGuard Labs: 3,000+ blocked attacks per day at peak; harvests credentials from 55+ applications including browsers, VPN, FTP, and email clients

Agent Tesla is delivered via phishing emails with Excel attachments exploiting Office memory corruption vulnerabilities (CVE-2017-11882, CVE-2018-0802) to drop the .NET RAT executable without any macro interaction required. The payload performs keystroke logging, clipboard monitoring, periodic screenshot capture, and credential harvesting across 55+ applications, exfiltrating data via SMTP, FTP, or Telegram C2 channels. Sold for $15–$69/month on criminal forums, it remains one of the most widely deployed credential-theft tools globally.

DETECTED: XLM Macro Parser · Embedded Objects Analysis · OLE Structure Analysis · IOC Extractor · Sandbox Detonation · YARA Rule Engine

THREAT-023 ● High

Actor: Multiple Russian-nexus operators (Gozi source code leaked 2014)

Ursnif / Gozi ISFB — Word Macro Banking Trojan

2007–2022 — DOJ GozNym prosecution 2019

Confirmed Bank Theft (GozNym) $100M+ DOJ May 2019: GozNym hybrid stole $100M+ from thousands of business banking accounts across the U.S., Germany, Poland, and Georgia

Ursnif/Gozi spreads via Word documents with VBA macros that install the banking trojan through process injection, establishing persistence via Windows Registry and scheduled tasks. The trojan hooks browser processes via man-in-the-browser (MITB) techniques to intercept online banking sessions in real time, capturing two-factor authentication codes and enabling fraudulent wire transfers. Gozi’s 2014 source code leak spawned over a dozen criminal variants — ISFB, RM3, and GozNym among them — deployed globally for more than a decade.

DETECTED: VBA Macro Extractor · OLE Structure Analysis · IOC Extractor · Sandbox Detonation · Cryptographic Analysis · Threat Intelligence

THREAT-024 ● High

Actor: TA511 / MAN1 / Moskalvzapoe

Hancitor — DocuSign-Lure Office Loader for Infostealers

2013–2021 — 3,000+ campaigns tracked in 2020 alone (Cofense)

Annual Campaign Volume (2020) 3,000+/yr Cofense Intelligence: 3,000+ Hancitor campaigns in 2020; delivers Pony stealer, FickerStealer, and Cobalt Strike as secondary payloads

Hancitor masqueraded as DocuSign notifications or corporate document alerts, delivering Word macro attachments or embedded links that triggered on enable-content; the macro downloaded Hancitor which fetched Pony credential stealer and FickerStealer for immediate data exfiltration. Post-2020 campaigns evolved to deploy Cobalt Strike beacons, bridging commodity credential theft with hands-on-keyboard ransomware operations — an early model of the commodity-to-enterprise escalation pattern that became dominant across the criminal ecosystem.

DETECTED: VBA Macro Extractor · OLE Structure Analysis · NLP Social Engineering Detection · IOC Extractor · Sandbox Detonation · YARA Rule Engine

THREAT-025 ● Critical

Actor: Unattributed malvertising operators

TamperedChef — 56-Day Dormant Infostealer in Trojanised PDF Tools

Active 2025 — WithSecure & Sophos disclosure August 2025

Active Campaign Scale 50+ domains WithSecure Labs / Sophos Aug 2025: 50+ fraudulent domains; EU-heavy — Germany 15%, UK 14%, France 9%; activated August 21, 2025

TamperedChef redirects victims via malvertising to fraudulent AppSuite PDF Editor sites; the trojanised installer executes silently then remains completely dormant for ~56 days before a remote activation command triggers the infostealer. Upon activation the payload terminates all open browsers to extract saved credentials and session cookies, enumerates installed security products, and exfiltrates data to attacker infrastructure — a deliberate dormancy window designed to outlast enterprise sandbox detonation windows and initial-infection investigation periods.

DETECTED: Container Integrity · Embedded Objects Analysis · IOC Extractor · Metadata Forensics · Sandbox Detonation · Threat Intelligence


Semantic Nondeterminism — Proven in PDF

A document that means one thing to every reader is an assumption — search, AI, e-discovery and compliance all share it. We measured it failing in the world's most common format, across 24,824 real PDFs.

🧷 Latest & definitive synthesis

The PDF Is Not the Document — 24,824 PDFs, Three Corpora

One finding across an adversarial detection set, a real-world benign control, and the entire 16,971-PDF Epstein release: a PDF is a stack of representations that can disagree, and malware is only one axis.

Read the synthesis →

📊

PDF Forensics at Scale

1,572 curated + 6,281 real-world PDFs: live-malware detection and a 0.34% false-positive rate on genuinely benign documents.

Read →

🗂️

The Epstein Files, Forensically

All 16,971 DOJ Epstein PDFs through every engine: malware-clean, 100% metadata-stripped, 18.6% read differently to machines than to humans.

Read →

⚖️

Parser Disagreement

11 minimal PDFs through six parsers — every file produced a different reading.

Read →

🌀

PDF Reality Drift

One file, different semantic realities for humans, extractors, OCR, accessibility and AI.

Read →

All nine studies & the canonical definition → · Apply it: AI Document Integrity →

What do you want to do?

🔬 Check if PDF is safe · 🧼 Sanitize a PDF · ⬛ Redact sensitive info · 🛡️ Encrypt & protect · 🗜️ Make PDF smaller · 📎 Merge files together · 📝 Convert to Word · 🔎 Extract text (OCR)

Everything Else You Expect

After it’s safe, do anything you want with it. 48 tools, all free.

📎

Merge PDFs

Combine

Combine multiple PDF files into one document. Drag, reorder, and merge with a single click.

✂️

Split PDF

Split

Split a PDF into individual pages or custom page ranges. Extract exactly what you need.

🗜️

Compress PDF

Reduce

Reduce PDF file size with intelligent compression. Choose quality vs. size trade-off.

🔄

Rotate Pages

Rotate

Rotate PDF pages by 90°, 180°, or 270°. Apply to all pages, odd, even, or a custom range.

📑

Extract Pages

Extract

Pick specific pages from a PDF and save them as a new document. Supports ranges.

🗑️

Delete Pages

Delete

Click pages to mark them for removal. All remaining pages are saved into a new document.

🔀

Reorder Pages

Reorder

Drag and drop PDF pages into any order, then save as a new document.

🔧

Repair PDF

Repair

Attempt to repair and re-linearize a corrupted or malformed PDF file.

📐

Flatten PDF

Flatten

Flatten form fields and annotations into the PDF content layer for archiving.

🎨

Grayscale PDF

Grayscale

Convert a color PDF to grayscale. Ideal for print optimization and reducing file size.

📝

PDF → Word

PDF → DOCX

Convert PDF documents to fully editable Word (.docx) files.

📈

PDF to Excel

PDF → XLSX

Extract tables from PDF files and export them to an Excel spreadsheet (XLSX).

🖼️

PDF to Images

PDF → PNG/JPG

Convert PDF pages to PNG or JPEG images at your chosen DPI (72–300). Get a ZIP of all pages.

🗄️

PDF → PDF/A

Convert a PDF to archival PDF/A format for long-term storage, legal, and compliance use.

📄

Word to PDF

DOCX → PDF

Convert DOCX, DOC, ODT, RTF, and TXT documents to PDF with formatting preserved.

📊

Excel to PDF

XLSX → PDF

Convert XLSX, XLS, ODS, and CSV spreadsheets to PDF. Tables and charts preserved.

📽️

PowerPoint to PDF

PPTX → PDF

Convert PPTX, PPT, and ODP presentation files to PDF. Each slide becomes a page.

🖼️

Images to PDF

Images → PDF

Combine JPG, PNG, TIFF, WebP, BMP, and GIF images into a single PDF. Drag to reorder.

🌐

HTML to PDF

HTML → PDF

Convert HTML files or web page URLs to PDF using a full Chromium browser — captures modern CSS, web fonts, and JS-rendered content.

🔬

PDF Forensics Scanner

47 Engines

Forensic analysis of PDF files across 47 independent engines — covering categories of semantic, integrity, and parser-level analysis that most PDF scanners and malware tools don't attempt. Including: XFA FormCalc parser, PDF action dependency graph, OCG layer cloaking detection, Unicode/invisible text attacks, trailer chain forensics, codec exploit parameter validation, physical entropy topology, image steganography (LSB chi-square), PDF/A compliance fraud detection, JavaScript behavioral emulation, font CharString stack machine emulator, cross-object XRef integrity graph, DocMDP certification forensics (/P level parsing, incremental update constraint validation, /Contents structural integrity, ByteRange coverage verification, full-save rewrite detection), FieldMDP per-signature field lock analysis (ISO 32000 §12.8.2.4 — Include/Exclude action parsing, empty-fields bypass detection, incremental form-field modification detection), Value/Appearance Stream (V/AP) divergence detection (/NeedAppearances stale-AP detection, checkbox/radio V vs AS key mismatch, text/listbox/combobox AP stream text extraction with font /Encoding /Differences remap, image-based AP structural detection, blank AP stream hiding), file-level polyglot detection (JPEG+PDF, ZIP+PDF — ISO 32000 §7.5.2 header-offset exploitation), linearized PDF first-page object override detection (set-intersection OID redefinition, /Linearized /O hint-table parsing, T1036/T1027). Plus: structural integrity, byte-pattern signatures, YARA, ClamAV, ML + SHAP, dynamic sandbox, six-parser differential, phishing, MITRE ATT&CK, offline threat intelligence (6.4M+ indicators). Results across 24 analysis tabs including 🤖 AI Forensic Report (Qwen 2.5 1.5B — structured verdict, key findings, MITRE mapping, recommended actions). 9-mode surgical sanitize.
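As a rough illustration of the file-level polyglot check named above, the sketch below looks for a PDF header that does not sit at byte 0 while another format's signature leads the file. It is not the scanner's actual engine: the function name, the returned keys, and the 1024-byte search window (a leniency some readers are known to apply) are assumptions made for this example.

```python
PDF_MAGIC = b"%PDF-"

# Well-known leading signatures of other container formats.
FOREIGN_MAGIC = {
    b"\xff\xd8\xff": "JPEG",
    b"PK\x03\x04": "ZIP",
}


def polyglot_check(data: bytes, window: int = 1024) -> dict:
    """Report where the PDF header sits and whether another format leads the file.

    ASSUMPTION: the 1024-byte window and the result keys are illustrative only.
    """
    # Offset of the first "%PDF-" marker inside the window, or -1 if absent.
    offset = data[:window].find(PDF_MAGIC)
    # Name of the foreign format whose magic bytes start the file, if any.
    leading = next(
        (name for magic, name in FOREIGN_MAGIC.items() if data.startswith(magic)),
        None,
    )
    return {
        "pdf_header_offset": offset,
        "leading_format": leading,
        # Suspicious: a PDF header exists, but only after a foreign signature.
        "suspicious": offset > 0 and leading is not None,
    }
```

A file that starts with JPEG magic bytes and carries `%PDF-` a few bytes later would come back with `suspicious` set to `True`; an ordinary PDF reports offset 0 and no leading format.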

🗂️

Office Forensics Scanner

20 Engines

Forensic analysis of Word, Excel, PowerPoint, Outlook, Access, and Visio files across 20 independent engines — container integrity, cryptographic anomalies, VBA & XLM macro extraction, OLE compound structure, metadata provenance, IOC extraction (URLs, IPs, domains, hashes), ClamAV, YARA, threat intel, LibreOffice rendering, sandbox detonation, entropy, OPC/schema validation, NLP social-engineering detection, intelligent correlation, and AI forensic report. Zero data retention.

🔬

Universal File Forensics Scanner

21 Engines

Forensic analysis of all file types — images, audio, video, archives, executables, scripts, databases, fonts, network captures — across 21 independent engines: file identification (magic bytes/MIME/format mismatch), entropy & compression anomaly, metadata forensics (EXIF/ID3), IOC & string extraction, binary artifact extraction + XOR deobfuscation, PE executable analysis, ELF binary analysis, archive inspection (zip bombs/path traversal), image steganography (LSB chi-square), script & code analysis (reverse shells/webshells), watermark detection (EXIF/ID3/XMP/IPTC/alpha overlay/OCR), six-layer isolation chamber detonation (strace · ltrace · in-memory YARA dump analysis · fake DNS+HTTP network capture · CPU/VM fingerprint masking · LibreOffice macro execution · Playwright interaction simulation), ClamAV, YARA, threat intel, intelligent correlation, and AI forensic report. Zero data retention.
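The entropy check listed above can be pictured with a standard Shannon entropy calculation over a file's bytes. This is a minimal sketch of the general technique, not the scanner's implementation; the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(1/p) over the observed byte values.
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())
```

Values close to 8 bits per byte are typical of compressed or encrypted content, while plain text and padding score far lower, so an unexpected high-entropy region in an otherwise low-entropy file is worth a closer look.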

🔬

File Fingerprint Comparator

25+ Features

Upload two PDF or Office documents to compare their structural fingerprints and security profiles side by side. Scanned in parallel through all forensic engines, then diffed across 25+ features — similarity score, variant verdict, differences-first table. Detect malware variants, verify document integrity. Zero data retention.

🛡️

Protect PDF

PQC + AES-256

AES-256 password protection with permissions, or wrap your PDF in a Post-Quantum Cryptography (PQC) layer — quantum-computer-resistant encryption.

🔓

Unlock PDF

PQC + AES-256

Remove AES-256 password protection from PDFs you own, or decrypt PQC-encrypted .pqcpdf bundles using your private key or password.

⬛

Redact PDF

Privacy

Permanently remove sensitive text, names, numbers, and patterns. Server-side redaction — content is truly erased, not just covered.

💧

Add Watermark

Stamp

Stamp text watermarks on PDF pages. Control opacity, position, font size, and color.

✍️

Sign PDF

Sign & PAdES

Draw, type, or upload a signature image — or apply an invisible PAdES cryptographic digital signature (pyhanko, RSA-2048). Own certificate or auto self-signed.

🖊️

Send for E-Signature

Multi-Signer

Multi-party electronic signature workflow — add up to 10 signers, sequential or parallel order. Each signer gets a unique secure link. PAdES-B cryptographic signatures, zero retention, no account needed.

📷

PDF Scanner

Camera → PDF

Scan documents to searchable PDF using your camera or uploaded photos. Real-time edge detection, OpenCV perspective correction, CLAHE auto-enhancement, Tesseract 5 OCR. No app install, zero retention.

✏️

Edit PDF

Annotate

16 annotation tools including text, draw, eraser, shapes (with fill), highlight, whiteout, sticky notes, signatures, QR codes, and stamps. Plus form builder, bookmark editor, and per-page rotation. Changes are permanently embedded server-side.

📝

Fill PDF Form

Fill

Detect and fill interactive form fields — text inputs, checkboxes, radio buttons, and drop-downs. Optionally flatten for archiving.

🔍

Compare PDFs

Diff

Visual pixel-level diff between two PDF versions. Highlights added, removed, and changed content page by page.

📋

Extract Text

Text

Extract all text content from a PDF with layout preservation. Download as plain text.

ℹ️

PDF Info

Inspect

Inspect PDF metadata: page count, dimensions, author, title, encryption status, and more.

🔎

OCR PDF

Tesseract 5

Extract text from scanned PDFs and image-based documents using Tesseract 5 LSTM neural network OCR. Output as plain text, searchable PDF, or both.

⚙️

Workflow Builder

Automate

Chain multiple PDF operations into a single automated pipeline. Save, load, and compose named workflows. Rotate, watermark, compress, protect, and more — in any order.

📽️

PDF to PowerPoint

PDF → PPTX

Convert PDF pages to a PPTX presentation. Each page becomes a slide rendered at 150 DPI. Uses PyMuPDF and python-pptx.

🌐

PDF to HTML

PDF → HTML

Convert PDF to a self-contained HTML file with preserved text layout, fonts, and structure. Uses PyMuPDF structured HTML extraction.

📄

PDF to Markdown

PDF → MD

Convert PDF to clean Markdown using pymupdf4llm AI-powered layout analysis. Ideal for LLM workflows, RAG pipelines, and documentation sites.

🔖

Outline / Bookmark Editor

Outline

View and edit PDF bookmarks and document outline. Add, rename, reorder, and delete entries. Set heading levels and target pages.

📋

N-up / Imposition

N-up

Arrange multiple PDF pages per output sheet. 2-up, 4-up, 6-up, 8-up, 9-up, and booklet imposition for print.

📐

Auto-crop & Deskew

Auto-crop

Automatically crop white margins and fix page rotation. Uses PyMuPDF text block analysis to trim excess whitespace.

♿

Accessibility Checker

WCAG

Audit your PDF against WCAG 2.1 and PDF/UA standards. Checks tagging, language, alt text, reading order, font embedding, and more.

🔤

Font Inspector

Fonts

Inspect all fonts in a PDF: name, type, encoding, embedded status, and subset flag. Non-embedded fonts flagged in red.

🎨

Colour Inspector

CMYK

Detect RGB, CMYK, spot, and ICC colour spaces across images, vectors, and text. Flags overprint, transparency, and Total Ink Coverage over 300%.

🖨️

PDF to PDF/X

PDF/X

Convert PDF to print-ready PDF/X (X-1a, X-3, X-4) with CMYK colour conversion via Ghostscript. Fonts embedded, transparency flattened.

📊

Tables to JSON

Tables → JSON

Extract all tables from a PDF as structured JSON. Uses pdfplumber with line and text detection. Preview first table inline.

How It Works

Upload file

Select your PDF from your device. Files go directly to our processing server — nowhere else.

Process server-side

Your file is processed entirely on our server using proven open-source engines — no third-party cloud involved.

Download — file is immediately deleted

The processed result is sent straight to your browser. Both the upload and the output are wiped from the server the moment your download begins.
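The upload, process, delete cycle described above can be sketched with a throwaway directory from the Python standard library. This is a simplified illustration under stated assumptions: the function name and the `operation` callback are hypothetical, and the result is read into memory before cleanup rather than streamed while the directory is removed.

```python
import tempfile
from pathlib import Path
from typing import Callable


def process_in_scratch(upload: bytes, operation: Callable[[Path, Path], None]) -> bytes:
    """Run operation(input_path, output_path) inside a one-off directory.

    ASSUMPTION: `operation` stands in for any real PDF tool invocation.
    """
    with tempfile.TemporaryDirectory(prefix="job-") as scratch:
        src = Path(scratch) / "input.pdf"
        dst = Path(scratch) / "output.pdf"
        src.write_bytes(upload)
        operation(src, dst)
        result = dst.read_bytes()
    # Leaving the with-block deletes the directory, the upload, and the output.
    return result
```

Because cleanup is tied to leaving the `with` block, the scratch directory is removed even if the operation raises an exception.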

Automate This in Your Pipeline

Scan uploads before they reach storage. Sanitize attachments automatically. Convert documents safely at scale.

Scan uploads before storage

Block malicious PDFs at the point of ingestion — before they reach your document store, email archive, or content platform.

Sanitize attachments automatically

Strip all active content from inbound PDFs before delivery — JavaScript, embedded files, action triggers — without disrupting the document workflow.

Convert documents safely

Merge, compress, OCR, and convert at scale. 45 operations in one pipeline. On-premise deployment removes all file size and rate limits.

POST https://api.pqpdf.com/v1/{operation}

API Reference → On-Premise →

REST API available now — API-key auth, IP whitelisting, 83 operations. On-premise deployment for teams that need full infrastructure control.
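A call to the endpoint pattern above might be assembled as in the following sketch, which builds the request without sending it. Only the base URL and the use of API-key auth come from this page; the `file` form field, the `X-API-Key` header name, and the `compress` operation name used in the usage note are assumptions for illustration, so check the API reference for the real values.

```python
import urllib.request
import uuid

API_BASE = "https://api.pqpdf.com/v1"  # from the endpoint pattern on this page


def build_request(operation: str, pdf_bytes: bytes, api_key: str,
                  filename: str = "input.pdf") -> urllib.request.Request:
    """Build (but do not send) a multipart POST for one operation.

    ASSUMPTIONS: the "file" field and the "X-API-Key" header are placeholders.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=f"{API_BASE}/{operation}",
        data=body,
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "X-API-Key": api_key,  # hypothetical header name
        },
    )
```

With a valid key, `urllib.request.urlopen(build_request("compress", data, key))` would send the request; keeping construction separate from sending makes the request easy to inspect or log (minus the body) before it leaves your network.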

Designed for Zero-Trust Environments

Built with multi-engine detection principles used in malware analysis.

Zero server knowledge

Files are processed and immediately destroyed. No document content is retained, indexed, or logged at any point in the processing pipeline.

Layered detection, no single point of failure

47 independent engines means no single detection surface controls the verdict. Built with the same multi-layer approach used in professional malware analysis workflows.

Open-source forensic stack

Every engine is open-source and auditable: YARA, ClamAV, PyMuPDF, peepdf, pdfid, pdf-parser, Tesseract, and more. No proprietary black box in the analysis chain.

Offline threat intelligence

6.4M+ threat indicators matched locally — no external API calls during analysis. Your file and its metadata never leave the processing server.
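Local matching of this kind needs no network access at all. The sketch below assumes, purely for illustration, that indicators are stored as hex SHA-256 file hashes, one per line; the actual indicator types and storage format are not described on this page, and both function names are hypothetical.

```python
import hashlib
from typing import Iterable, Set


def load_indicators(lines: Iterable[str]) -> Set[str]:
    """Normalise hex SHA-256 strings (one per line) into a lookup set."""
    return {line.strip().lower() for line in lines if line.strip()}


def match_file(data: bytes, indicators: Set[str]) -> bool:
    """Return True if the file's SHA-256 digest is in the local indicator set."""
    return hashlib.sha256(data).hexdigest() in indicators
```

A set lookup is constant time on average, so even millions of indicators can be checked per file without calling an external service.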