PDF Tools Universal File Forensics Scanner

No ads. No tracking. No data sold. Ever.
Universal File Forensics Scanner
Deep forensic analysis of all file types — images, audio, video, archives, executables, scripts, databases, fonts, certificates, and network captures — across 21 independent engines: file identification (magic bytes, MIME, polyglot & extension mismatch detection), entropy & compression anomaly, metadata forensics (EXIF/GPS/ID3/container), IOC & string extraction (URLs, IPs, domains, Base64, PowerShell, reverse shell patterns), binary artifact analysis, PE executable analysis (imports, sections, anti-debug, overlay), ELF binary analysis (rootkit indicators, dangerous syscalls, UPX packing), archive inspection (zip bombs, path traversal, double-extension, nested archives), image forensics & steganography (LSB chi-square, SVG JavaScript injection, PNG chunk abuse, JPEG trailer data), script & code analysis (reverse shells, AMSI bypass, PHP webshells, obfuscation layers), document & markup forensics (HTML XSS, XXE, CSV formula injection, YAML deserialization), font forensics (embedded executables, off-spec tables, CVE patterns), certificate & key forensics (weak keys, expired certs, private key exposure), network capture forensics (C2 ports, DNS exfiltration, cleartext credentials), ClamAV antivirus, YARA rule engine (20 universal threat rules), offline threat intelligence (URLhaus · MalwareBazaar · ThreatFox · FeodoTracker), intelligent cross-engine correlation, and AI forensic report (Qwen 2.5 · MITRE ATT&CK · verdict · confidence). Zero data retention — file deleted immediately after analysis.
21-Engine Universal Forensic Architecture
file-id · entropy · metadata · ioc · strings · pe · elf · archive · image · watermark · script · document · font · certificate · network · clamav · yara · threat-intel · correlation · ai
File Identification
Magic bytes detection for 40+ file signatures, MIME type via libmagic, extension/MIME mismatch detection, polyglot file detection (e.g. executable disguised as image), SHA-256/MD5/SHA-1 hash computation, JPEG trailing data check, and file size anomaly detection. Serves as ground truth for all other engines.
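As a rough illustration of the magic-byte and extension-mismatch check described above, the following minimal Python sketch compares the type implied by the leading bytes with the type implied by the file name, and computes the three hashes. The signature table is a tiny illustrative subset, and the names `MAGIC_SIGNATURES`, `EXTENSION_TYPES`, and `identify` are hypothetical, not taken from the scanner.

```python
import hashlib

# Illustrative subset only; the scanner's own table covers 40+ signatures.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"PK\x03\x04", "zip"),
    (b"\x7fELF", "elf"),
    (b"MZ", "pe"),
]

# Hypothetical mapping from file extension to the type names used above.
EXTENSION_TYPES = {
    "png": "png", "jpg": "jpeg", "jpeg": "jpeg", "gif": "gif",
    "zip": "zip", "exe": "pe", "dll": "pe", "so": "elf",
}


def identify(data: bytes, filename: str) -> dict:
    """Compare the magic-byte type with the type implied by the extension."""
    detected = next(
        (kind for magic, kind in MAGIC_SIGNATURES if data.startswith(magic)),
        None,
    )
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    claimed = EXTENSION_TYPES.get(extension)
    return {
        "detected": detected,
        "claimed": claimed,
        # A mismatch is reported only when both sides are known.
        "mismatch": None not in (detected, claimed) and detected != claimed,
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
    }
```

With this sketch, a file named `holiday.jpg` whose content starts with `MZ` would be reported as a `pe` file claiming to be `jpeg`, which is the executable-disguised-as-image case.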
Entropy Analysis
Computes Shannon entropy (bits per byte) over the raw file data and over individual ZIP stream members. Entropy above 7.2 bits/byte is flagged as a likely encrypted or packed payload. ZIP-based formats (APK, JAR, EPUB, CBZ) are analysed stream by stream. Compressed media formats (JPEG, MP3, MP4) are exempt from this flag because high entropy is normal for them.
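The entropy check can be sketched in a few lines of Python. The 7.2 bits/byte threshold and the exempt media formats come from the description above; the function names and the type labels are assumptions made for the example.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 7.2                  # bits per byte, from the text above
EXEMPT_TYPES = {"jpeg", "mp3", "mp4"}    # compressed media, high entropy is normal


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def flag_high_entropy(data: bytes, file_type: str) -> bool:
    """Flag data as possibly packed or encrypted, skipping exempt formats."""
    if file_type in EXEMPT_TYPES:
        return False
    return shannon_entropy(data) > ENTROPY_THRESHOLD
```

A buffer containing every byte value equally often reaches the maximum of 8 bits/byte, while a run of a single repeated byte scores 0.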
Metadata Forensics
EXIF extraction for images: GPS coordinates (a privacy leak that can also identify a target), camera make/model, and software tags. ID3 tags for audio via Mutagen (embedded URLs, unusually long comments). Video container metadata. Flags GPS privacy leaks, suspicious software identifiers (steghide, OpenStego), and EXIF thumbnail size anomalies.
IOC Extraction
Regex-based extraction of Indicators of Compromise from raw bytes and decoded text: HTTP/HTTPS/FTP URLs, external IP addresses, bare domain names, UNC paths, email addresses, Base64 blobs (with inline decode and re-scan), PowerShell invocations, WScript/Shell references, cryptocurrency wallet addresses, and 40+ suspicious command keywords (mshta, certutil, mimikatz, AMSI bypass strings, etc.).
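A minimal sketch of the extract, decode, and re-scan loop for three of the indicator types (URLs, external IPv4 addresses, Base64 blobs) might look like the following. The regular expressions, the 24-character minimum blob length, and the single level of re-scan are assumptions for illustration; the scanner's actual patterns are not published here.

```python
import base64
import binascii
import ipaddress
import re

URL_RE = re.compile(rb"(?:https?|ftp)://[^\s<>\"']+", re.IGNORECASE)
IPV4_RE = re.compile(
    rb"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
# Assumed minimum length of 24 characters to limit noise.
BASE64_RE = re.compile(rb"[A-Za-z0-9+/]{24,}={0,2}")


def extract_iocs(data: bytes, depth: int = 1) -> dict:
    """Collect URLs and external IPs, decoding Base64 blobs and re-scanning them."""
    urls = {m.decode("ascii", "replace") for m in URL_RE.findall(data)}
    ips = set()
    for match in IPV4_RE.findall(data):
        text = match.decode("ascii")
        try:
            if not ipaddress.ip_address(text).is_private:
                ips.add(text)
        except ValueError:
            continue  # e.g. octets with leading zeros
    if depth > 0:
        for blob in BASE64_RE.findall(data):
            try:
                decoded = base64.b64decode(blob, validate=True)
            except (binascii.Error, ValueError):
                continue  # not valid Base64 after all
            nested = extract_iocs(decoded, depth - 1)
            urls.update(nested["urls"])
            ips.update(nested["ips"])
    return {"urls": sorted(urls), "ips": sorted(ips)}
```

The re-scan step is what surfaces a download URL hidden inside an encoded command, which a plain string search over the raw bytes would miss.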
String & Artifact Analysis
Extracts ASCII and UTF-16LE strings from binary files. Flags: Windows persistence registry keys (HKLM Run, Winlogon), suspicious process names (cmd.exe, mshta.exe, certutil.exe), dangerous Win32 APIs (VirtualAllocEx, WriteProcessMemory, CreateRemoteThread), anti-debugging calls (IsDebuggerPresent), Linux dangerous syscalls, privilege escalation, and cryptographic credentials embedded in binaries.
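String extraction for both encodings can be sketched as below. The minimum length of 4 characters is an assumption (it matches the common default of `strings`-style tools), and `DANGEROUS_APIS` holds only the API names listed above.

```python
import re

# API names taken from the description above.
DANGEROUS_APIS = {
    "VirtualAllocEx", "WriteProcessMemory",
    "CreateRemoteThread", "IsDebuggerPresent",
}


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable ASCII strings, then printable UTF-16LE strings."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # In UTF-16LE, each printable ASCII character is followed by a zero byte.
    utf16_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.decode("ascii") for m in ascii_re.findall(data)]
    found += [m.decode("utf-16-le") for m in utf16_re.findall(data)]
    return found


def flag_dangerous_apis(strings: list) -> list:
    """Return the extracted strings that name a dangerous Win32 API."""
    return sorted(set(strings) & DANGEROUS_APIS)
```

Scanning for UTF-16LE separately matters because Windows binaries often store strings as wide characters, which an ASCII-only pass would split into single bytes and discard.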
PE Executable Analysis
Full PE32/PE32+ header parsing using pefile (raw struct fallback). Detects: packed sections (UPX0, ASPack, Themida), writable+executable (W+X) sections indicating shellcode injection, dangerous import APIs (URLDownloadToFile, CreateRemoteThread, CryptGenKey), anti-debugging APIs (IsDebuggerPresent, CheckRemoteDebuggerPresent), self-deletion patterns, missing import table (packed binary), PE overlay data, and timestamp anomalies.
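The raw-struct fallback path for section checks can be illustrated with a short sketch that walks the PE section table and reports writable+executable sections and UPX-style section names. It uses only the standard `struct` module rather than pefile; the function name and the small packer-name set are assumptions for the example.

```python
import struct

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000
PACKER_SECTION_NAMES = {"UPX0", "UPX1"}  # illustrative subset


def pe_section_findings(data: bytes) -> list:
    """Return (section_name, finding) pairs for suspicious PE sections."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return []
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        return []
    if e_lfanew + 24 > len(data):
        return []
    # COFF file header: 20 bytes directly after the 4-byte PE signature.
    (_machine, section_count, _timestamp, _symbols_ptr, _symbol_count,
     optional_size, _flags) = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    table = e_lfanew + 24 + optional_size
    findings = []
    for index in range(section_count):
        offset = table + 40 * index  # each section header is 40 bytes
        if offset + 40 > len(data):
            break
        name = data[offset:offset + 8].rstrip(b"\x00").decode("ascii", "replace")
        (characteristics,) = struct.unpack_from("<I", data, offset + 36)
        writable = bool(characteristics & IMAGE_SCN_MEM_WRITE)
        executable = bool(characteristics & IMAGE_SCN_MEM_EXECUTE)
        if writable and executable:
            findings.append((name, "WRITABLE_EXECUTABLE"))
        if name in PACKER_SECTION_NAMES:
            findings.append((name, "PACKER_SECTION_NAME"))
    return findings
```

A normal `.text` section is executable but not writable, so it passes; a `UPX0` section that is both is reported twice, once per rule.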
ELF Binary Analysis
Parses ELF headers (32/64-bit, big/little endian) via raw struct. Detects: UPX packing, dangerous libc calls (execve, ptrace, mprotect, setuid), network socket APIs, rootkit indicators (LD_PRELOAD, /proc/self/mem, LD_AUDIT), suspicious packed section names, stripped symbol table (obfuscated shared object), and RPATH/RUNPATH anomalies.
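Raw-struct ELF header parsing can be sketched as follows. The sketch reads only the identification bytes plus `e_type` and `e_machine`, and searches for the three rootkit marker strings named above; the function name and return shape are assumptions for the example.

```python
import struct

# Marker strings taken from the description above.
ROOTKIT_MARKERS = (b"LD_PRELOAD", b"/proc/self/mem", b"LD_AUDIT")


def parse_elf(data: bytes):
    """Return basic ELF header facts, or None if data is not a valid ELF file."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        return None
    bits = {1: 32, 2: 64}.get(data[4])          # EI_CLASS
    byte_order = {1: "<", 2: ">"}.get(data[5])  # EI_DATA
    if bits is None or byte_order is None:
        return None
    # e_type and e_machine follow the 16-byte e_ident array.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", data, 16)
    return {
        "bits": bits,
        "endianness": "little" if byte_order == "<" else "big",
        "e_type": e_type,
        "e_machine": e_machine,
        "rootkit_markers": [m.decode() for m in ROOTKIT_MARKERS if m in data],
    }
```

Reading the byte order from `EI_DATA` before unpacking anything else is what lets one code path handle both little-endian and big-endian binaries.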
Archive Inspection
Analyses ZIP, TAR, 7Z, and RAR archives (including APK, JAR, EPUB, CBZ). Detects: zip bombs (decompression ratio >100:1 or >10,000 entries), path traversal filenames (../ attack), double-extension files (document.pdf.exe), executable or script files inside archives, password-protected archives (opaque to AV scanners), and deeply nested archive structures used to evade automated scanning.
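For ZIP archives, several of these checks need only the central directory, so nothing has to be extracted. The sketch below uses Python's standard `zipfile` module with the thresholds quoted above (ratio over 100:1, more than 10,000 entries); the finding codes and the executable-extension set are hypothetical.

```python
import io
import zipfile

MAX_RATIO = 100.0      # decompression ratio threshold from the text above
MAX_ENTRIES = 10_000   # entry-count threshold from the text above
EXECUTABLE_EXTENSIONS = {"exe", "scr", "bat", "cmd", "js", "vbs", "ps1", "sh"}


def inspect_zip(data: bytes) -> list:
    """Return (code, detail) findings without extracting any member."""
    findings = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        entries = archive.infolist()
    if len(entries) > MAX_ENTRIES:
        findings.append(("ZIP_BOMB_ENTRY_COUNT", str(len(entries))))
    compressed = sum(entry.compress_size for entry in entries)
    uncompressed = sum(entry.file_size for entry in entries)
    if compressed and uncompressed / compressed > MAX_RATIO:
        findings.append(("ZIP_BOMB_RATIO", f"{uncompressed / compressed:.0f}:1"))
    for entry in entries:
        parts = entry.filename.replace("\\", "/").split("/")
        if ".." in parts or entry.filename.startswith("/"):
            findings.append(("PATH_TRAVERSAL", entry.filename))
        pieces = parts[-1].lower().split(".")
        if len(pieces) >= 3 and pieces[-1] in EXECUTABLE_EXTENSIONS:
            findings.append(("DOUBLE_EXTENSION", entry.filename))
        if entry.flag_bits & 0x1:  # general-purpose bit 0: encrypted member
            findings.append(("PASSWORD_PROTECTED", entry.filename))
    return findings
```

Working from the declared sizes keeps the check cheap, though a hostile archive can misreport them, so a production scanner would likely also cap bytes actually read during any later extraction.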
Image Forensics & Steganography
SVG: detects embedded <script> tags, inline event handlers (onload, onclick), and base64 data URIs. PNG: chunk analysis (malicious URLs in tEXt/iTXt chunks, data after IEND). JPEG: trailing data after the End-of-Image marker. GIF: header/trailer validation, script content. BMP: size anomalies. LSB steganography: chi-square test on red-channel LSBs; a near-zero chi-square statistic indicates that the LSBs were overwritten with a hidden payload.
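To show why a near-zero statistic is suspicious, here is a minimal sketch of the classic pairs-of-values chi-square test. LSB embedding tends to equalise the counts of each value pair (2k, 2k+1), so the statistic measures how far each even value's count is from the pair's mean. This is an assumed formulation; the scanner's exact statistic and threshold are not specified, and the input is taken to be an iterable of 8-bit red-channel values already read from the image.

```python
from collections import Counter


def lsb_chi_square(channel_values) -> float:
    """Chi-square statistic over value pairs (2k, 2k+1) of an 8-bit channel."""
    counts = Counter(channel_values)
    statistic = 0.0
    for even in range(0, 256, 2):
        n_even, n_odd = counts[even], counts[even + 1]
        expected = (n_even + n_odd) / 2  # count expected if the LSB is random
        if expected > 0:
            statistic += (n_even - expected) ** 2 / expected
    return statistic
```

A channel where every pair is perfectly balanced scores 0.0, which is the pattern a fully overwritten LSB plane approaches, while natural images usually show uneven pairs and a much larger value.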
Script & Code Analysis
Handles: sh/bash/zsh, PowerShell ps1/psm1, bat/cmd, Python, Ruby, JavaScript/TypeScript, PHP, Perl, Lua, VBScript, HTA, WSF, Groovy. Detects: multi-layer obfuscation (base64_decode+eval, hex encoding, char concat, ROT13, gzip inflate), reverse shell patterns (12 variants: bash, nc, ncat, socat, Python, Perl, Ruby, PHP, PowerShell TCPClient), AMSI bypass (AmsiScanBuffer patches, Disable-Amsi), hardcoded credentials/private keys, PHP webshells, dangerous JS/Python/shell/Ruby/Perl idioms, and inline base64 decode with nested command analysis.
ClamAV Antivirus
Runs the locally installed ClamAV scanner against its full local signature database (updated via freshclam). Covers known malware families, generic shellcode patterns, trojan and ransomware heuristics, and script-based threats. Zero network calls — entirely offline. ClamAV detection is one of the strongest confirmation signals for the Correlation Engine.
YARA Rule Engine
Matches against 20 universal threat rules covering all file types: PE dropper with auto-exec & process injection, UPX packing, shellcode NOP sleds & stack pivots, reverse shell patterns (bash/nc/ncat), PowerShell downloader (DownloadString + IEX), PHP webshell (eval+user-input), archive path traversal, ransomware indicators (shadow deletion + encryption APIs), credential stealer (Mimikatz strings), cryptocurrency miners (stratum+tcp), steganography tools, keylogger APIs, network scanner, AMSI bypass, Python reverse shell, SQL injection/xp_cmdshell, and dropper/downloader chains.
Threat Intelligence
Offline lookup of extracted IOCs (URLs, IPs, domains, file hashes) against four local PostgreSQL databases: URLhaus (malware distribution URLs), MalwareBazaar (malware sample SHA-256 hashes), ThreatFox (C2 indicators — IPs, domains, URLs), and FeodoTracker (botnet C2 addresses). Zero external API calls — all queries run against locally-synced threat intelligence feeds.
Correlation Engine
Cross-engine signal aggregation running after all primary engines complete. Applies escalation rules: DROPPER_CHAIN (IOC download + execution capability), HIGH_CONFIDENCE_MALWARE (2+ of ClamAV/YARA/ThreatIntel), PACKED_MALWARE (entropy + packing + AV), POLYGLOT_ATTACK (format mismatch + IOC), STEGO_PAYLOAD (image anomaly + IOC), ZIP_BOMB (critical archive finding), C2_BEACON_SCRIPT (obfuscated script + network IOC), REVERSE_SHELL_CONFIRMED, CREDENTIAL_THEFT, ARCHIVE_DROPPER, AMSI_EVASION. De-escalation: BENIGN_CORROBORATION (all primary engines clean).
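A few of the escalation rules above can be expressed as a small rule function over per-engine boolean signals. The rule names come from the text; the signal keys (`clamav`, `high_entropy`, and so on) are hypothetical, and treating an all-false signal set as the benign case is an assumption about how "all primary engines clean" is evaluated.

```python
# Hypothetical signal keys for the three confirmation engines.
CONFIRMATION_ENGINES = ("clamav", "yara", "threat_intel")


def correlate(signals: dict) -> list:
    """Map per-engine boolean signals to correlation rule names."""
    def hit(name: str) -> bool:
        return bool(signals.get(name))

    rules = []
    if sum(hit(engine) for engine in CONFIRMATION_ENGINES) >= 2:
        rules.append("HIGH_CONFIDENCE_MALWARE")
    if hit("high_entropy") and hit("packing") and hit("clamav"):
        rules.append("PACKED_MALWARE")
    if hit("format_mismatch") and hit("ioc"):
        rules.append("POLYGLOT_ATTACK")
    if hit("image_anomaly") and hit("ioc"):
        rules.append("STEGO_PAYLOAD")
    if not any(signals.values()):
        rules.append("BENIGN_CORROBORATION")  # de-escalation: nothing fired
    return rules
```

Requiring two independent confirmations before escalating is what keeps a single noisy engine from producing a high-confidence verdict on its own.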
AI Forensic Report
Self-hosted Qwen 2.5 LLM on remotellm (OpenAI-compatible API, zero third-party AI). Receives structured JSON of all engine findings and outputs: threat verdict (MALICIOUS / SUSPICIOUS / LIKELY_BENIGN / CLEAN), confidence level (HIGH/MEDIUM/LOW), 2–4 sentence executive summary, up to 6 key findings, attack chain description, MITRE ATT&CK technique mapping (up to 6 techniques), recommended actions, and risk rating 0–100.
Document & Markup Forensics
HTML/MHT: detects inline <script> tags, event handlers, <iframe>/<object>/<embed> elements, and external form actions (credential harvesting). XML/XSLT: XXE injection (<!ENTITY SYSTEM>, php://filter, gopher://), XSLT system() code execution. CSV/TSV: formula injection (=cmd|, DDE, HYPERLINK). YAML: deserialization gadgets (!!python/object, __class__). JSON: prototype pollution (__proto__, constructor keys). RTF: OLE embedding, remote template injection, DDE fields.
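The CSV formula-injection check can be sketched with the standard `csv` module. A cell is flagged when it starts with a formula trigger character and contains one of the dangerous constructs named above. The trigger set (`=`, `+`, `-`, `@`) and the exact regular expression are assumptions for the example.

```python
import csv
import io
import re

FORMULA_PREFIXES = ("=", "+", "-", "@")  # assumed spreadsheet trigger characters
DANGEROUS_RE = re.compile(r"cmd\s*\||\bDDE\b|HYPERLINK\s*\(", re.IGNORECASE)


def find_formula_injection(text: str) -> list:
    """Return (row, column, cell) for cells that look like injected formulas."""
    findings = []
    for row_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        for col_no, cell in enumerate(row, start=1):
            value = cell.strip()
            if value.startswith(FORMULA_PREFIXES) and DANGEROUS_RE.search(value):
                findings.append((row_no, col_no, value))
    return findings
```

Requiring both conditions avoids flagging ordinary negative numbers such as `-5`, which start with a trigger character but carry no command.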
Font File Forensics
Covers TTF, OTF, WOFF, WOFF2, EOT, TTC. Checks for: embedded executable payloads (MZ/ELF signatures inside font data — font dropper technique), shellcode NOP sleds, off-spec SFNT table offsets outside file bounds (heap overflow exploit family — CVE-2011-3402 variants), embedded bitmap tables (EBDT/EBLC — attack vector), dangerous OpenType feature tags, PostScript systemdict and exec operator abuse in CFF/Type1 fonts, and unusually large font files indicating embedded payload.
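The out-of-bounds table check can be illustrated for plain SFNT containers (TTF/OTF). The sketch reads the table directory and reports any table whose offset plus length runs past the end of the file. WOFF, WOFF2, EOT, and TTC use different headers and are not handled here; the function name and finding codes are hypothetical.

```python
import struct


def check_sfnt_tables(data: bytes) -> list:
    """Return (code, detail) findings for an SFNT (TTF/OTF) table directory."""
    if len(data) < 12:
        return []
    # Offset table: sfntVersion (uint32) then numTables (uint16), big-endian.
    _version, table_count = struct.unpack_from(">IH", data, 0)
    findings = []
    for index in range(table_count):
        record = 12 + 16 * index  # each table record is 16 bytes
        if record + 16 > len(data):
            findings.append(("TRUNCATED_TABLE_DIRECTORY", str(index)))
            break
        tag, _checksum, offset, length = struct.unpack_from(">4sIII", data, record)
        if offset + length > len(data):
            findings.append(("TABLE_OUT_OF_BOUNDS", tag.decode("latin-1")))
    return findings
```

A parser that trusts such an offset reads or writes outside the font buffer, which is the class of bug the description above associates with heap overflow exploits.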
Certificate & Key Forensics
Covers PEM, CRT, CER, DER, P12/PFX, JKS, KEY, P7B. Detects: exposed private key material (RSA PKCS#1/PKCS#8, EC, DSA, OpenSSH private key blocks), expired certificates, unusually long validity periods (>10 years), weak RSA keys (<2048 bits), weak EC keys (<256 bits), deprecated DSA keys, self-signed certificates, IP-in-CN (RFC violation), suspicious CN patterns (attack toolkit fingerprints), excessive SAN count (>50 domains), and PKCS#12 bundles that lack password protection.
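Two of these checks need no certificate parser at all and can be sketched with the standard library: spotting PEM private key blocks by their header line, and applying the key-strength thresholds quoted above. The function names are hypothetical, and the key algorithm and size are assumed to have been read from the certificate by some other step.

```python
import re

# Standard PEM armour headers for private key material.
PRIVATE_KEY_RE = re.compile(
    rb"-----BEGIN ((?:RSA |EC |DSA |OPENSSH |ENCRYPTED )?PRIVATE KEY)-----"
)


def find_private_keys(data: bytes) -> list:
    """Return the PEM label of every private key block found in data."""
    return [label.decode("ascii") for label in PRIVATE_KEY_RE.findall(data)]


def is_weak_key(algorithm: str, bits: int) -> bool:
    """Apply the thresholds above: RSA < 2048, EC < 256, DSA always deprecated."""
    algorithm = algorithm.upper()
    if algorithm == "RSA":
        return bits < 2048
    if algorithm == "EC":
        return bits < 256
    return algorithm == "DSA"
```

The unprefixed `PRIVATE KEY` label is the PKCS#8 form, so the single pattern covers both PKCS#1 and PKCS#8 RSA keys as well as EC, DSA, and OpenSSH blocks.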
Network Capture Forensics
Covers PCAP, CAP, PCAPNG (via dpkt with raw struct fallback). Extracts: packet count, unique source/destination IPs, destination port distribution. Flags: connections to C2-associated ports (4444, 1337, 31337, 6667, 9090), port scan indicators (>100 unique destination ports), suspicious HTTP user-agents (python-requests, curl, Go-http-client), commands in HTTP payloads (shell/PowerShell), cleartext credentials (Basic auth, FTP USER/PASS, SMTP AUTH PLAIN), DNS exfiltration (unusually long subdomain labels), and high destination IP diversity (botnet patterns).
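Once packets have been decoded into flows, the port and DNS heuristics reduce to simple set and length checks. The sketch below assumes flows are already available as `(src_ip, dst_ip, dst_port)` tuples and skips the capture parsing itself. The port list and the 100-port scan threshold come from the text above; the 40-character DNS label limit is an assumption, since the text only says "unusually long".

```python
C2_PORTS = {4444, 1337, 31337, 6667, 9090}  # from the text above
PORT_SCAN_THRESHOLD = 100                   # unique destination ports
MAX_DNS_LABEL = 40                          # assumed cut-off for "unusually long"


def analyse_flows(flows) -> list:
    """Return (code, value) findings for (src_ip, dst_ip, dst_port) tuples."""
    ports = {dst_port for _src, _dst, dst_port in flows}
    findings = [("C2_PORT", port) for port in sorted(C2_PORTS.intersection(ports))]
    if len(ports) > PORT_SCAN_THRESHOLD:
        findings.append(("PORT_SCAN", len(ports)))
    return findings


def long_dns_labels(query_name: str, max_len: int = MAX_DNS_LABEL) -> list:
    """Return the labels of a DNS query name that exceed max_len characters."""
    return [
        label for label in query_name.rstrip(".").split(".")
        if len(label) > max_len
    ]
```

DNS labels are limited to 63 bytes, so data smuggled out through subdomains tends to show up as labels packed close to that limit, which is what the length check looks for.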
Watermark Detection
Covers images (JPG, PNG, TIFF, WebP, RAW, SVG), audio (MP3, WAV, FLAC, OGG, WMA, M4A), video (MP4, MOV, AVI, MKV, WMV), and EPUB. Detection layers: ExifTool (gold-standard metadata extraction across all formats) → Pillow EXIF/IPTC/XMP → mutagen (ID3, MP4 atoms, Vorbis comments, WMA ASF) → raw struct parsing (RIFF INFO chunks, MP4 ©cpy/cprt atoms, PNG tEXt/iTXt, FLAC VORBIS_COMMENT) → JPEG DCT coefficient analysis for invisible/frequency-domain watermarks (APP14/Digimarc, quantisation table anomalies, HF energy ratio) → alpha-channel overlay detection (semi-transparent text/logo watermarks) → OCR (Tesseract) to extract burned-in visible text watermarks from image pixels → SVG <text> extraction → vendor signature matching (Digimarc, Getty Images, Shutterstock, Adobe Stock, iStock, Reuters, AP Photo, AFP, Alamy, Dreamstime, 123RF, Depositphotos, Pond5, Corbis, WireImage, Dolby, SMPTE) → tracking/serial/licence code extraction from all metadata values → EPUB OPF manifest (dc:rights, dc:identifier, dc:creator, custom watermark metadata).
Isolation Chamber Detonation
Opens the file inside fully isolated Linux namespaces (unshare --net --pid --fork --mount --ipc) while strace monitors every syscall. Per-category detonation: images → ImageMagick convert (exposes ImageTragick-class parser CVEs); audio/video → ffprobe (exposes libavformat/libavcodec parser surface); archives → 7-Zip listing (detects zip bombs, path traversal extraction); HTML/SVG → Chromium headless (executes embedded JavaScript, detects network calls and DOM manipulation); fonts → fc-scan (exposes freetype parser surface); EPUB → unzip listing. Detects: outbound internet socket creation (definitively malicious in isolated network namespace), unexpected binary execution (parser exploit code execution), anonymous executable memory (shellcode injection pattern), writes to sensitive paths (/etc/, /root/, /home/), excessive fork/clone (process-bomb). Scripts and executables are never detonated. Resource-capped via prlimit (768 MB RAM, 64 MB output, 128 processes).
Images: JPG, PNG, GIF, BMP, TIFF, WEBP, PSD, SVG, ICO, HEIC, RAW, CR2, NEF, DNG
Audio: MP3, WAV, FLAC, OGG, AAC, M4A, WMA, OPUS, AIFF, MID
Video: MP4, AVI, MOV, MKV, WEBM, WMV, FLV, TS, 3GP, MPEG
Archives: ZIP, TAR, GZ, 7Z, RAR, ISO, APK, JAR, DEB, RPM, EPUB, CAB, CBZ
Executables: EXE, DLL, SYS, OCX, SCR, MSI, ELF, SO, COM, CPL
Scripts: SH, PS1, BAT, CMD, PY, JS, PHP, RB, LUA, VBS, HTA, WSF, PL
Data & Web: HTML, XML, JSON, YAML, CSV, SQLITE, DB, TTF, OTF, WOFF, PEM, CRT, KEY
Network: PCAP, CAP, PCAPNG
Drop Any File to Scan
Drag & drop or click · Max 50 MB
All file types supported — except PDFs and Office documents