🗂️ Office Document Forensics Scanner

Deep forensic analysis of Word, Excel, PowerPoint, Outlook, Access, and Visio files across 23 independent engines: container integrity, encryption detection, metadata provenance, OOXML relationship forensics (remote template injection), embedded payload detection (PE/ELF/scripts), VBA macro extraction using olevba · mraptor · pcodedmp, Excel 4.0 XLM/DDE chain analysis, OLE compound structure inspection, IOC extraction (URLs · IPs · domains · registry keys · base64 payloads), ClamAV antivirus, YARA rule engine, offline threat intelligence (URLhaus · MalwareBazaar · ThreatFox · FeodoTracker), LibreOffice behavioural rendering, isolation chamber detonation (unshare + strace), entropy and compression anomaly detection, OPC rule validation, OOXML schema validation, font and theme forensics, MIME/transport forensics, digital signature forensics, an NLP social engineering classifier (regex + LLM), intelligent cross-engine correlation, and an AI forensic report (Qwen 2.5 · MITRE ATT&CK · verdict · confidence). Results are presented across 27 analysis tabs. Zero data retention: the file is deleted immediately after analysis.

Multi-Engine Forensic Architecture (23 engines): container · crypto · metadata · relationships · embedded · VBA · XLM/DDE · OLE2 · IOC · ClamAV · YARA · threat intel · LibreOffice · sandbox · entropy · OPC · schema · fonts · MIME · signatures · NLP · correlation · AI

① Container & Format ID

Validates ZIP (OOXML) or OLE2 container structure, checks part relationships, content-type map anomalies, embedded file signatures, and computes SHA-256 / MD5 / SHA-1 hashes for threat intelligence correlation.
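As a rough illustration of the first step, the sketch below classifies a file by its leading magic bytes and computes the three digests named above. The function names `identify_container` and `file_hashes` are hypothetical and not taken from the scanner itself; only the magic-byte values and the `hashlib` calls are standard.

```python
import hashlib

# Well-known magic bytes: ZIP local file header (used by OOXML) and the
# OLE2 compound file signature (used by legacy .doc/.xls/.ppt).
ZIP_MAGIC = b"PK\x03\x04"
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def identify_container(data: bytes) -> str:
    """Classify a document by its leading magic bytes (hypothetical helper)."""
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


def file_hashes(data: bytes) -> dict:
    """Compute the digests typically used for threat-intelligence lookups."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }
```

A real container check would go further (part relationships, content-type map), but magic bytes plus hashes are enough to route a file to the right parser and to query a hash index.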

② Encryption Detection

Detects encryption indicators, weak cipher modes (RC4 40-bit legacy), CryptoAPI structure anomalies, and password-hash artifacts. Encrypted documents can hide malicious macros from static scanners.

③ Metadata Forensics

Extracts core document properties: author, last-saved-by, revision count, creation and modification timestamps, company, application version. Detects anti-forensic metadata tampering, mismatched authorship, and anomalous revision history patterns.

④ Relationship Forensics

Parses all OOXML .rels files to detect external relationships. Flags remote template injection (attachedTemplate), suspicious external links, IP-based URLs, UNC paths, and smb:// targets — all common initial-access techniques.
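One way such a check could work is sketched below, using only the Python standard library. It walks every `.rels` part in the ZIP and reports relationships marked `TargetMode="External"`. The function name `external_relationships` is an assumption for illustration; the scanner's actual implementation is not shown in this page.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship parts.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_relationships(data: bytes) -> list:
    """Return (rels part, relationship type suffix, target) for external links.

    Hypothetical helper. A production scanner would likely use a hardened
    XML parser rather than the stdlib one.
    """
    findings = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(zf.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    rel_type = rel.get("Type", "").rsplit("/", 1)[-1]
                    findings.append((name, rel_type, rel.get("Target", "")))
    return findings
```

A relationship whose type suffix is `attachedTemplate` and whose target is a remote URL is the classic remote template injection pattern; IP-based, UNC, and `smb://` targets could be flagged by inspecting the returned target string.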

⑤ Embedded Payload Detection

Scans document streams for embedded PE executables (MZ/PE headers), ELF binaries, shell scripts, nested archives, OLE Package objects (dropper technique), and PowerShell / certutil / mshta / regsvr32 LOLBin invocations.
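The MZ/PE header check can be sketched as follows. Matching `MZ` alone produces many false positives in arbitrary binary data, so this example also follows the `e_lfanew` field at offset 0x3C and requires a `PE\0\0` signature there. The function name `find_embedded_pe` is hypothetical, and the real engine may use different heuristics.

```python
import struct


def find_embedded_pe(data: bytes) -> list:
    """Return offsets where an MZ header points at a valid PE signature."""
    offsets = []
    pos = data.find(b"MZ")
    while pos != -1:
        # The DOS header is 0x40 bytes; e_lfanew (offset of the PE header,
        # relative to the MZ header) is a little-endian uint32 at 0x3C.
        if pos + 0x40 <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, pos + 0x3C)
            pe_at = pos + e_lfanew
            if data[pe_at:pe_at + 4] == b"PE\x00\x00":
                offsets.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return offsets
```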

⑥ VBA Macro Analysis

Runs olevba (JSON + text fallback), mraptor, and pcodedmp. Extracts VBA source, compiled p-code bytecode (executes even with stripped source), auto-exec triggers, suspicious APIs (Shell, CreateObject, URLDownloadToFile), obfuscation indicators, and embedded IOCs.

⑦ XLM / DDE Analysis

Parses Excel 4.0 (XLM) macro sheets and DDE field references — a legacy attack vector still active in modern campaigns. Extracts formula chains and EXEC() / CALL() execution paths invisible to standard VBA scanners.

⑧ OLE2 Structure Analyzer

Deep inspection of OLE compound document streams: storage tree reconstruction, directory entry enumeration, embedded PE / OLE objects, CLSID identification, sector chain integrity, and exploit-pattern byte sequences inside individual storage streams.

⑨ IOC & String Extraction

Extracts all Indicators of Compromise from raw bytes and decoded streams: HTTP/HTTPS/FTP URLs, IPv4 addresses, domain names, email addresses, Windows file paths, registry key references, PowerShell fragments, Base64-encoded payloads, and embedded hash strings.
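A minimal sketch of raw-byte IOC extraction is shown below, covering three of the indicator types listed above (URLs, IPv4 addresses, Base64 blobs). The patterns, the 40-character Base64 minimum, and the name `extract_iocs` are illustrative assumptions, not the scanner's actual rules.

```python
import base64
import re

# Illustrative patterns only; real extractors are usually far more thorough.
URL_RE = re.compile(rb"(?:https?|ftp)://[^\s\"'<>]+", re.IGNORECASE)
_OCTET = rb"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(rb"(?<![\d.])(?:" + _OCTET + rb"\.){3}" + _OCTET + rb"(?![\d.])")
B64_RE = re.compile(rb"[A-Za-z0-9+/]{40,}={0,2}")  # assumed minimum length


def extract_iocs(data: bytes) -> dict:
    """Pull URLs, IPv4 addresses, and decodable Base64 runs out of raw bytes."""
    urls = sorted({m.decode("ascii", "replace") for m in URL_RE.findall(data)})
    ips = sorted({m.decode("ascii") for m in IPV4_RE.findall(data)})
    b64 = []
    for candidate in B64_RE.findall(data):
        try:
            base64.b64decode(candidate, validate=True)
        except ValueError:  # binascii.Error is a ValueError subclass
            continue
        b64.append(candidate.decode("ascii"))
    return {"urls": urls, "ipv4": ips, "base64": b64}
```

The lookbehind and lookahead on the IPv4 pattern keep it from matching inside longer dotted numbers such as version strings or out-of-range values like `999.1.1.1`.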

⑩ ClamAV Antivirus

Runs the document through the ClamAV signature database — over 8 million malware signatures including Office macro exploits, macro droppers, and known weaponised document families. Detects encrypted documents and macro-carrying files.

⑪ YARA Rule Engine

Runs 12 curated YARA rules compiled at runtime: VBA auto-exec dropper, heavy obfuscation, XLM shell execution, DDE injection, template injection, shellcode NOP sled, LOLBin references, process injection, embedded PE, encoded PowerShell, suspicious CLSID, and external connection patterns.

⑫ Threat Intelligence

Correlates file hashes (SHA-256, MD5, SHA-1) and extracted IOCs against an offline threat intelligence database: URLhaus malicious URLs, MalwareBazaar hash index, ThreatFox IOC feed, and FeodoTracker C2 IP list.

⑬ LibreOffice Behavioural

Renders the document in LibreOffice headless mode under a timeout. Captures macro-load attempts logged to stderr, rendering failures that indicate corrupt/exploit structure, and documents that trigger errors only when opened — behavioural signals invisible to static analysis.

⑭ Isolation Chamber Detonation

Opens the document inside a fully isolated Linux namespace (unshare --net --pid --fork --mount --ipc) while strace monitors every syscall. Detects network beacon attempts, process spawning, suspicious file writes, and LOLBin execution that only manifest at open-time.
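The sketch below shows how such a detonation command line might be assembled and how the resulting strace log could be filtered. The `unshare` flags come from the description above; the strace options, the use of `soffice --headless` as the opener, and both function names are assumptions for illustration. Running `unshare` with these flags normally requires elevated privileges, so the example only builds the argument list and parses log text.

```python
import re


def build_detonation_argv(doc_path: str, trace_log: str) -> list:
    """Assemble an unshare + strace command line (hypothetical layout)."""
    namespace = ["unshare", "--net", "--pid", "--fork", "--mount", "--ipc"]
    # -f follows child processes, -o writes the trace to a file.
    tracer = ["strace", "-f", "-o", trace_log, "-e", "trace=network,process,file"]
    # Assumed opener: LibreOffice in headless mode.
    opener = ["soffice", "--headless", "--convert-to", "pdf", doc_path]
    return namespace + tracer + opener


# With -f and -o, strace prefixes each line with the PID of the traced process.
SYSCALL_RE = re.compile(r"^(?:\d+\s+)?(connect|execve)\(")


def interesting_syscalls(trace_text: str) -> list:
    """Return connect/execve calls in the order they appear in the trace."""
    hits = []
    for line in trace_text.splitlines():
        match = SYSCALL_RE.match(line)
        if match:
            hits.append(match.group(1))
    return hits
```

Because the namespace has no network, a `connect` attempt fails harmlessly but still appears in the trace, which is what makes beacon attempts observable.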

⑮ Entropy & Compression Anomaly

Computes Shannon entropy (bits/byte) for every stream and XML part inside OOXML and OLE2 containers. Flags encrypted or compressed blobs (≥7.2 bits/byte), unexpectedly high-entropy XML parts (≥6.5), and regions that evade static pattern matching through encoding. Images, fonts, and legitimate compressed streams are automatically exempted.
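For reference, Shannon entropy in bits per byte is H = -Σ p(b) · log2 p(b) over the byte values b that occur in the data. The sketch below computes it and applies the two thresholds quoted above. The function names and the extension-based XML check are assumptions, and the exemption of images, fonts, and legitimately compressed streams is deliberately left out.

```python
import math
from collections import Counter

BLOB_THRESHOLD = 7.2  # bits/byte, threshold quoted in the text
XML_THRESHOLD = 6.5   # bits/byte, threshold quoted in the text


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def classify_part(name: str, data: bytes) -> str:
    """Apply the thresholds; XML detection by extension is a simplification."""
    entropy = shannon_entropy(data)
    if name.lower().endswith((".xml", ".rels", ".vml")):
        return "high-entropy-xml" if entropy >= XML_THRESHOLD else "ok"
    return "high-entropy-blob" if entropy >= BLOB_THRESHOLD else "ok"
```

Plain XML markup tends to sit well below the 6.5 mark, while uniformly distributed bytes reach the 8.0 maximum, which is why an XML part scoring near a compressed blob stands out.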

⑯ OPC Rule Engine

Validates Open Packaging Conventions (ECMA-376 Part 2) structural rules: [Content_Types].xml exists and parses cleanly, all declared parts are present in the ZIP, no part names are duplicated, .rels files are well-formed, internal targets contain no path traversal, and no external targets are suspicious (IP addresses, UNC paths, smb://, file://). Malformed OPC is a reliable indicator of intentional weaponisation or parser confusion attacks.

⑰ OOXML Schema Validator

Checks every XML and VML part for: well-formedness violations that parsers silently recover from, XXE injection attempts (SYSTEM/PUBLIC DOCTYPE declarations), embedded null bytes used to terminate string comparisons, and oversized CDATA blobs (>50 KB) in unexpected document parts — a steganographic payload carrier technique.
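Three of these checks can be approximated on the raw bytes of a part before any XML parser touches it, as sketched below. The function name `scan_xml_part` and the finding labels are hypothetical. The null-byte check assumes UTF-8 parts (UTF-16 XML legitimately contains zero bytes), and well-formedness checking is not covered here.

```python
import re

# A DOCTYPE that references SYSTEM or PUBLIC identifiers, including inside
# an internal subset such as <!DOCTYPE r [<!ENTITY x SYSTEM "...">]>.
DOCTYPE_RE = re.compile(rb"<!DOCTYPE[^>]*\b(?:SYSTEM|PUBLIC)\b", re.IGNORECASE)
CDATA_RE = re.compile(rb"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)
CDATA_LIMIT = 50 * 1024  # the 50 KB limit quoted in the text


def scan_xml_part(raw: bytes) -> list:
    """Return a list of finding labels for one XML/VML part."""
    findings = []
    if DOCTYPE_RE.search(raw):
        findings.append("xxe-doctype")
    if b"\x00" in raw:  # assumes the part is UTF-8 encoded
        findings.append("null-byte")
    if any(len(body) > CDATA_LIMIT for body in CDATA_RE.findall(raw)):
        findings.append("oversized-cdata")
    return findings
```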

⑱ Font & Theme Forensics

Inspects embedded font files (TTF/OTF/EOT/WOFF), theme XML for suspicious content (remote URLs, UNC paths, LOLBin references), custom document properties with encoded payloads, custom XML data islands (hidden structured data stores), and external data connections (xl/connections.xml) — channels used to exfiltrate data or stage secondary payloads.

⑲ MIME / Transport Forensics

Deep analysis of .eml and .msg email files: sender/Reply-To domain mismatch (spoofing indicator), SPF/DKIM/DMARC authentication failure headers, social engineering subject patterns, executable attachment detection (PE, scripts, LNK, HTA), embedded URL extraction, X-Mailer fingerprinting, and MSG binary parsing via extract-msg with raw fallback.

⑳ Digital Signature Forensics

Extracts and analyses digital signatures from OOXML (_xmlsignatures/) and OLE2 (\x05DigitalSignature stream). Identifies signer identity, timestamp, and signing algorithm. Detects weak algorithms (SHA-1/MD5) and unsigned VBA macros inside a signed document structure, a known bypass that makes untrusted macros appear trusted to enterprise security controls.

㉑ NLP Social Engineering Classifier

Two-tier detection of social engineering language in document text. Tier 1: fast regex matching across 5 categories — urgency, impersonation (CEO/IT/government), financial fraud (wire transfer, gift cards), credential harvesting, and security alert spoofing. Tier 2: Qwen 2.5 LLM classifies intent, technique, and target persona when regex patterns fire — providing explainable AI-based social engineering detection.
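Tier 1 could look roughly like the sketch below: one compiled pattern per category, with the matched category names handed on to the LLM tier only when something fires. The five category names follow the description above, but the specific phrases and the name `tier1_categories` are invented for illustration and are far smaller than a usable pattern set.

```python
import re

# Hypothetical example phrases; a real pattern set would be much broader.
PATTERNS = {
    "urgency": re.compile(r"\b(?:urgent|immediately|act now|within 24 hours)\b", re.I),
    "impersonation": re.compile(r"\b(?:ceo|it (?:support|department)|tax authority)\b", re.I),
    "financial_fraud": re.compile(r"\b(?:wire transfer|gift cards?)\b", re.I),
    "credential_harvesting": re.compile(r"\bverify your (?:account|password)\b", re.I),
    "security_alert": re.compile(r"\baccount (?:has been )?(?:suspended|compromised)\b", re.I),
}


def tier1_categories(text: str) -> list:
    """Return the sorted names of all categories whose pattern matches."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))
```

An empty result means Tier 2 is never invoked, which keeps the expensive LLM call off the path for ordinary documents.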

㉒ Intelligent Correlation Engine

Cross-engine signal aggregation that identifies attack chains no single engine can see alone. Applies 10 correlation rules: DROPPER_CHAIN, C2_BEACON, TEMPLATE_INJECTION, ENCRYPTED_PAYLOAD, CREDENTIAL_THEFT_UNC, LIVING_OFF_LAND, TARGETED_ATTACK, SIGNATURE_BYPASS, HIGH_CONFIDENCE_MALWARE, and PARSER_CONFUSION — each with MITRE ATT&CK mapping and confidence scoring. Runs after all structural engines, before AI.
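A correlation rule of this kind can be modelled as a set of required signals plus an ATT&CK technique, firing when every required signal is present. The sketch below uses three of the rule names listed above. The signal names, the choice of required signals, and the technique mappings are assumptions made for this example, and confidence scoring is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    requires: frozenset  # signal names that must all be present
    technique: str       # MITRE ATT&CK technique ID (assumed mapping)


# Rule names come from the text; everything else here is hypothetical.
RULES = [
    Rule("TEMPLATE_INJECTION", frozenset({"external_template"}), "T1221"),
    Rule("DROPPER_CHAIN", frozenset({"vba_autoexec", "embedded_pe"}), "T1204.002"),
    Rule("CREDENTIAL_THEFT_UNC", frozenset({"unc_target"}), "T1187"),
]


def correlate(signals: set) -> list:
    """Return (rule name, technique) for every rule whose signals all fired."""
    return [(rule.name, rule.technique) for rule in RULES if rule.requires <= signals]
```

For example, an auto-exec macro alone or an embedded PE alone would not fire `DROPPER_CHAIN` in this model; only the combination does, which is the point of correlating across engines.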

🤖 AI Forensic Report

Qwen 2.5 3B analyses all engine findings and produces a structured verdict: MALICIOUS / SUSPICIOUS / LIKELY_BENIGN / CLEAN with confidence level, executive summary, attack chain narrative, MITRE ATT&CK techniques, and recommended actions. Runs last, after all other engines complete.

Supported: .docx .docm .doc .xlsx .xlsm .xlsb .xls .pptx .pptm .ppt .rtf .one .vsdx .vsdm .msg .eml .ics .mdb .accdb

🧹 Sanitize Options · 1 max safety · 3 surgical

📄 Convert to PDF (Max Safety)

✂️ Strip Macros (Surgical)

🏷️ Strip Metadata (Surgical)

🔄 Convert to OOXML (Surgical)


23 forensic engines · ClamAV · YARA · VBA/XLM · OLE · IOC · NLP · Entropy · OPC · Signatures · Threat Intel · Sandbox · Correlation · AI Report · MITRE ATT&CK · Max 10 MB


We'll email the full forensic report when the scan completes.

🏢 Free scanner: 10 MB limit — covers 99% of real-world malicious Office documents (most weaponised docs are under 2 MB). Need to scan larger files? Enterprise deployment removes all size limits.


🧹 Sanitize Document: remove active content · produce clean output · original unchanged

📄 Convert to PDF (Max Safety): Renders via LibreOffice, which destroys all macros, VBA, XLM, OLE objects, and active content with certainty. Produces a static PDF.

✂️ Strip Macros (Surgical): Removes all VBA and XLM macros while preserving document content, formatting, and structure.

🏷️ Strip Metadata (Surgical): Removes author, revision history, last-saved-by, company, and all custom properties from the document.

🔄 Convert to OOXML (Surgical): Converts legacy OLE2 formats (doc/xls/ppt) to modern OOXML, eliminating the OLE exploit surface while preserving content.
