Universal File Forensics Scanner
Deep forensic analysis of any file type except PDFs and Office documents, across 24 independent engines spanning static analysis, content forensics, dynamic sandboxing, threat intelligence, campaign tracking, and AI-driven correlation.
Static Analysis
10 engines
Magic bytes · MIME mismatch · polyglot detection · Shannon entropy · 255-key XOR brute-force over 512 KB · EXIF/GPS/ID3 extraction · PE/ELF header parsing · zip bomb & path traversal · reverse shells · HTML XSS/XXE
Content Forensics
5 engines
LSB chi-square steganography · Tesseract OCR visible watermarks · alpha-channel overlay extraction · DCT frequency watermarks · PE-in-font detection · weak RSA/EC keys · PCAP C2 port & DNS exfiltration
Dynamic Analysis
3 engines
Isolated Linux namespace · strace syscalls · ltrace library calls · in-memory YARA dump (Meterpreter/CobaltStrike) · fake DNS+HTTP capture · CPU masking · LibreOffice macros · Playwright · Wine .exe/.ps1/.hta detonation · KVM/QEMU real Windows kernel (high-risk samples only)
Threat Intelligence
3 engines
Offline ClamAV signature database · 20 universal YARA rules (PE dropper · shellcode · ransomware · C2 · AMSI bypass · credential stealer · cryptominer) · four locally-synced PostgreSQL threat feeds · zero external API calls
Campaign Intelligence
1 engine
TLSH similarity clustering across full scan history · deterministic campaign naming (e.g. PHANTOM-KRAKEN-07) · malware family classification (CobaltStrike · Meterpreter · Emotet · QakBot · Mirai · AsyncRAT · RedLine + more) · D3 force-directed graph
AI Correlation
2 engines
Cross-engine signal aggregation (11 escalation rules) · self-hosted Qwen 2.5 synthesizes all findings into verdict, confidence, executive summary, up to 6 MITRE techniques, risk score 0–100 · no third-party AI
● SYS:READY — 24-Engine Forensic Architecture
file-id • entropy • metadata • ioc • strings • pe • elf • archive • image • watermark • script • document • font • certificate • network • sandbox • wine • winvm • clamav • yara • threat-intel • scan-intel • correlation • ai
24 / 24
①
File Identification
Magic bytes detection for 40+ file signatures, MIME type via libmagic, extension/MIME mismatch detection, polyglot file detection (e.g. executable disguised as image), SHA-256/MD5/SHA-1 hash computation, JPEG trailing data check, and file size anomaly detection. Serves as ground truth for all other engines.
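A minimal sketch of the magic-byte, extension-mismatch, and JPEG trailing-data checks described above. The signature table is a small hypothetical subset (the scanner's own table of 40+ signatures is not shown here), and the function names are illustrative only.

```python
from pathlib import PurePath

# Hypothetical subset of signatures: (magic prefix, detected type, plausible extensions).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png", {".png"}),
    (b"\xff\xd8\xff", "jpeg", {".jpg", ".jpeg"}),
    (b"GIF87a", "gif", {".gif"}),
    (b"GIF89a", "gif", {".gif"}),
    (b"PK\x03\x04", "zip", {".zip", ".apk", ".jar", ".epub", ".cbz"}),
    (b"MZ", "pe", {".exe", ".dll", ".sys", ".scr"}),
    (b"\x7fELF", "elf", {".elf", ".so", ""}),
]


def identify(data: bytes):
    """Return the detected type from the leading magic bytes, or None if unknown."""
    for magic, kind, _ in SIGNATURES:
        if data.startswith(magic):
            return kind
    return None


def extension_mismatch(filename: str, data: bytes) -> bool:
    """True when the content type is recognised but the extension does not fit it."""
    kind = identify(data)
    if kind is None:
        return False
    ext = PurePath(filename).suffix.lower()
    allowed = set().union(*(exts for _, k, exts in SIGNATURES if k == kind))
    return ext not in allowed


def jpeg_trailing_bytes(data: bytes) -> int:
    """Count bytes after the last JPEG End-of-Image marker (FF D9)."""
    end = data.rfind(b"\xff\xd9")
    return 0 if end == -1 else len(data) - (end + 2)
```

For example, `extension_mismatch("photo.jpg", b"MZ\x90\x00")` reports a mismatch because the content starts with a PE header while the name claims a JPEG.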
②
Entropy Analysis
Shannon entropy per byte on raw file data and individual ZIP stream members. Flags entropy >7.2 bits/byte as encrypted or packed payload. ZIP-based formats (APK, JAR, EPUB, CBZ) are analysed stream-by-stream. Compressed media formats (JPEG, MP3, MP4) are excluded from false-positive flagging as their high entropy is normal.
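The entropy check can be sketched as follows. The 7.2 bits/byte threshold comes from the description above; the function names and the `already_compressed` flag (standing in for the JPEG/MP3/MP4 exclusion) are assumptions made for illustration.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 7.2  # bits per byte, as stated in the description


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_packed(data: bytes, already_compressed: bool = False) -> bool:
    """Flag high-entropy data unless the format is expected to be compressed."""
    if already_compressed:  # e.g. JPEG, MP3, MP4: high entropy is normal
        return False
    return shannon_entropy(data) > ENTROPY_THRESHOLD
```

A buffer of one repeated byte scores 0 bits/byte, while a buffer in which all 256 byte values occur equally often scores the maximum of 8.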
③
Metadata Forensics
EXIF extraction for images (GPS coordinates — privacy leak and target identification, camera make/model, software tags). ID3/Mutagen tags for audio (embedded URLs, unusually long comments). Video container metadata. Flags GPS privacy leaks, suspicious software identifiers (steghide, OpenStego), and EXIF thumbnail size anomalies.
④
IOC Extraction
Regex-based extraction of Indicators of Compromise from raw bytes and decoded text: HTTP/HTTPS/FTP URLs, external IP addresses, bare domain names, UNC paths, email addresses, Base64 blobs (with inline decode and re-scan), PowerShell invocations, WScript/Shell references, cryptocurrency wallet addresses, and 40+ suspicious command keywords (mshta, certutil, mimikatz, AMSI bypass strings, etc.).
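A simplified sketch of the regex extraction with inline Base64 decode and re-scan. The patterns below are illustrative assumptions that cover only URLs, IPv4 addresses, and Base64 blobs; they are not the scanner's actual rules, and "external" is approximated here with the standard library's `is_private` check.

```python
import base64
import binascii
import ipaddress
import re

URL_RE = re.compile(rb"(?:https?|ftp)://[^\s\"'<>]+", re.IGNORECASE)
IPV4_RE = re.compile(
    rb"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
B64_RE = re.compile(rb"[A-Za-z0-9+/]{24,}={0,2}")


def _is_external(ip: str) -> bool:
    """Treat anything outside private/reserved ranges as external."""
    try:
        return not ipaddress.ip_address(ip).is_private
    except ValueError:
        return False


def extract_iocs(data: bytes, depth: int = 1) -> dict[str, set[str]]:
    """Collect URLs and external IPs, then decode Base64 blobs and scan them too."""
    found = {
        "urls": {m.decode("ascii", "replace") for m in URL_RE.findall(data)},
        "ips": {ip for ip in (m.decode("ascii") for m in IPV4_RE.findall(data))
                if _is_external(ip)},
    }
    if depth > 0:
        for blob in B64_RE.findall(data):
            try:
                decoded = base64.b64decode(blob, validate=True)
            except (binascii.Error, ValueError):
                continue  # not valid Base64 after all
            nested = extract_iocs(decoded, depth - 1)
            found["urls"] |= nested["urls"]
            found["ips"] |= nested["ips"]
    return found
```

The `depth` parameter bounds how many nested decode layers are followed, so a blob that decodes to further Base64 cannot cause unbounded recursion.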
⑤
String & Artifact Analysis
Extracts ASCII and UTF-16LE strings from binary files. Flags: Windows persistence registry keys (HKLM Run, Winlogon), suspicious process names (cmd.exe, mshta.exe, certutil.exe), dangerous Win32 APIs (VirtualAllocEx, WriteProcessMemory, CreateRemoteThread), anti-debugging calls (IsDebuggerPresent), dangerous Linux syscalls, privilege escalation, and cryptographic credentials embedded in binaries.
⑥
PE Executable Analysis
Full PE32/PE32+ header parsing using pefile (raw struct fallback). Detects: packed sections (UPX0, ASPack, Themida), writable+executable (W+X) sections indicating shellcode injection, dangerous import APIs (URLDownloadToFile, CreateRemoteThread, CryptGenKey), anti-debugging APIs (IsDebuggerPresent, CheckRemoteDebuggerPresent), self-deletion patterns, missing import table (packed binary), PE overlay data, and timestamp anomalies.
⑦
ELF Binary Analysis
Parses ELF headers (32/64-bit, big/little endian) via raw struct. Detects: UPX packing, dangerous libc calls (execve, ptrace, mprotect, setuid), network socket APIs, rootkit indicators (LD_PRELOAD, /proc/self/mem, LD_AUDIT), suspicious packed section names, stripped symbol table (obfuscated shared object), and RPATH/RUNPATH anomalies.
⑧
Archive Inspection
Analyses ZIP, TAR, 7Z, and RAR archives (including APK, JAR, EPUB, CBZ). Detects: zip bombs (decompression ratio >100:1 or >10,000 entries), path traversal filenames (../ attack), double-extension files (document.pdf.exe), executable or script files inside archives, password-protected archives (opaque to AV scanners), and deeply nested archive structures used to evade automated scanning.
⑨
Image Forensics & Steganography
SVG: detects embedded <script> tags, inline event handlers (onload, onclick), and base64 data URIs. PNG: chunk analysis (malicious URLs in tEXt/iTXt chunks, data after IEND). JPEG: trailing data after End-of-Image marker. GIF: header/trailer validation, script content. BMP: size anomalies. LSB steganography: chi-square test on red-channel LSBs; a near-zero chi-square statistic indicates LSBs overwritten with a hidden payload.
⑩
Script & Code Analysis
Handles: sh/bash/zsh, PowerShell ps1/psm1, bat/cmd, Python, Ruby, JavaScript/TypeScript, PHP, Perl, Lua, VBScript, HTA, WSF, Groovy. Detects: multi-layer obfuscation (base64_decode+eval, hex encoding, char concat, ROT13, gzip inflate), reverse shell patterns (12 variants: bash, nc, ncat, socat, Python, Perl, Ruby, PHP, PowerShell TCPClient), AMSI bypass (AmsiScanBuffer patches, Disable-Amsi), hardcoded credentials/private keys, PHP webshells, dangerous JS/Python/shell/Ruby/Perl idioms, and inline base64 decode with nested command analysis.
⑪
ClamAV Antivirus
Runs the locally installed ClamAV scanner against its full local signature database (updated via freshclam). Covers known malware families, generic shellcode patterns, trojan and ransomware heuristics, and script-based threats. Zero network calls — entirely offline. ClamAV detection is one of the strongest confirmation signals for the Correlation Engine.
⑫
YARA Rule Engine
Matches against 20 universal threat rules covering all file types: PE dropper with auto-exec & process injection, UPX packing, shellcode NOP sleds & stack pivots, reverse shell patterns (bash/nc/ncat), PowerShell downloader (DownloadString + IEX), PHP webshell (eval+user-input), archive path traversal, ransomware indicators (shadow deletion + encryption APIs), credential stealer (Mimikatz strings), cryptocurrency miners (stratum+tcp), steganography tools, keylogger APIs, network scanner, AMSI bypass, Python reverse shell, SQL injection/xp_cmdshell, and dropper/downloader chains.
⑬
Threat Intelligence
Offline lookup of extracted IOCs (URLs, IPs, domains, file hashes) against four local PostgreSQL databases: URLhaus (malware distribution URLs), MalwareBazaar (malware sample SHA-256 hashes), ThreatFox (C2 indicators — IPs, domains, URLs), and FeodoTracker (botnet C2 addresses). Zero external API calls — all queries run against locally-synced threat intelligence feeds.
⑭
Correlation Engine
Cross-engine signal aggregation running after all primary engines complete. Applies escalation rules: DROPPER_CHAIN (IOC download + execution capability), HIGH_CONFIDENCE_MALWARE (2+ of ClamAV/YARA/ThreatIntel), PACKED_MALWARE (entropy + packing + AV), POLYGLOT_ATTACK (format mismatch + IOC), STEGO_PAYLOAD (image anomaly + IOC), ZIP_BOMB (critical archive finding), C2_BEACON_SCRIPT (obfuscated script + network IOC), REVERSE_SHELL_CONFIRMED, CREDENTIAL_THEFT, ARCHIVE_DROPPER, AMSI_EVASION. De-escalation: BENIGN_CORROBORATION (all primary engines clean).
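The escalation logic could look roughly like the sketch below, which implements three of the eleven rules plus the de-escalation rule. The `Signals` fields are a hypothetical flattening of the per-engine results; only the rule names and their stated conditions come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    # Hypothetical flattened view of what the primary engines reported.
    clamav_hit: bool = False
    yara_hit: bool = False
    threat_intel_hit: bool = False
    download_ioc: bool = False
    can_execute: bool = False
    high_entropy: bool = False
    packer_detected: bool = False


def correlate(s: Signals) -> list[str]:
    """Return the names of the correlation rules that fire for these signals."""
    rules = []
    confirmations = sum([s.clamav_hit, s.yara_hit, s.threat_intel_hit])
    if s.download_ioc and s.can_execute:
        rules.append("DROPPER_CHAIN")  # IOC download + execution capability
    if confirmations >= 2:
        rules.append("HIGH_CONFIDENCE_MALWARE")  # 2+ of ClamAV/YARA/ThreatIntel
    if s.high_entropy and s.packer_detected and s.clamav_hit:
        rules.append("PACKED_MALWARE")  # entropy + packing + AV
    if not rules and not any(vars(s).values()):
        rules.append("BENIGN_CORROBORATION")  # every engine came back clean
    return rules
```

Note that a single weak signal, such as high entropy alone, triggers neither escalation nor de-escalation in this sketch.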
⑮
AI Forensic Report
Self-hosted Qwen 2.5 LLM on remotellm (OpenAI-compatible API, zero third-party AI). Receives structured JSON of all engine findings and outputs: threat verdict (MALICIOUS / SUSPICIOUS / LIKELY_BENIGN / CLEAN), confidence level (HIGH/MEDIUM/LOW), 2–4 sentence executive summary, up to 6 key findings, attack chain description, MITRE ATT&CK technique mapping (up to 6 techniques), recommended actions, and risk rating 0–100.
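Because the report is produced by an LLM, a consumer would plausibly validate it before display. The sketch below checks the constraints listed above; the JSON field names (`verdict`, `confidence`, `key_findings`, `mitre_techniques`, `risk_score`) are assumptions, since the actual schema is not shown here.

```python
import json

VERDICTS = {"MALICIOUS", "SUSPICIOUS", "LIKELY_BENIGN", "CLEAN"}
CONFIDENCE_LEVELS = {"HIGH", "MEDIUM", "LOW"}


def validate_report(raw: str) -> dict:
    """Parse the model output and reject anything outside the documented limits."""
    report = json.loads(raw)
    if report.get("verdict") not in VERDICTS:
        raise ValueError("unknown verdict")
    if report.get("confidence") not in CONFIDENCE_LEVELS:
        raise ValueError("unknown confidence level")
    if len(report.get("key_findings", [])) > 6:
        raise ValueError("more than 6 key findings")
    if len(report.get("mitre_techniques", [])) > 6:
        raise ValueError("more than 6 MITRE techniques")
    risk = report.get("risk_score")
    if isinstance(risk, bool) or not isinstance(risk, int) or not 0 <= risk <= 100:
        raise ValueError("risk score must be an integer from 0 to 100")
    return report
```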
⑯
Document & Markup Forensics
HTML/MHT: detects inline <script> tags, event handlers, <iframe>/<object>/<embed> elements, and external form actions (credential harvesting). XML/XSLT: XXE injection (<!ENTITY SYSTEM>, php://filter, gopher://), XSLT system() code execution. CSV/TSV: formula injection (=cmd|, DDE, HYPERLINK). YAML: deserialization gadgets (!!python/object, __class__). JSON: prototype pollution (__proto__, constructor keys). RTF: OLE embedding, remote template injection, DDE fields.
⑰
Font File Forensics
Covers TTF, OTF, WOFF, WOFF2, EOT, TTC. Checks for: embedded executable payloads (MZ/ELF signatures inside font data, the font dropper technique), shellcode NOP sleds, off-spec SFNT table offsets outside file bounds (heap overflow exploit family, CVE-2011-3402 variants), embedded bitmap tables (EBDT/EBLC, a known attack vector), dangerous OpenType feature tags, PostScript systemdict and exec operator abuse in CFF/Type1 fonts, and unusually large font files indicating embedded payload.
⑱
Certificate & Key Forensics
Covers PEM, CRT, CER, DER, P12/PFX, JKS, KEY, P7B. Detects: exposed private key material (RSA PKCS#1/PKCS#8, EC, DSA, OpenSSH private key blocks), expired certificates, unusually long validity periods (>10 years), weak RSA keys (<2048 bits), weak EC keys (<256 bits), deprecated DSA keys, self-signed certificates, IP-in-CN (RFC violation), suspicious CN patterns (attack toolkit fingerprints), excessive SAN count (>50 domains), and PKCS#12 bundles that lack password protection.
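Detecting exposed private key material in PEM-encoded input can be sketched with a single regular expression over the standard PEM header lines. This covers only the private-key check; key-size and validity checks would need an X.509 parser and are not shown. The function and label names are illustrative.

```python
import re

# Standard PEM headers for unencrypted private keys.
PRIVATE_KEY_RE = re.compile(rb"-----BEGIN (RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----")

# Map the optional header prefix to a readable key type.
LABELS = {
    b"RSA ": "RSA PKCS#1",
    b"EC ": "EC",
    b"DSA ": "DSA",
    b"OPENSSH ": "OpenSSH",
    None: "PKCS#8",  # bare "BEGIN PRIVATE KEY" header
}


def find_private_keys(data: bytes) -> list[str]:
    """Return the type of every private key block found, in file order."""
    return [LABELS[m.group(1)] for m in PRIVATE_KEY_RE.finditer(data)]
```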
⑲
Network Capture Forensics
Covers PCAP, CAP, PCAPNG (via dpkt with raw struct fallback). Extracts: packet count, unique source/destination IPs, destination port distribution. Flags: connections to C2-associated ports (4444, 1337, 31337, 6667, 9090), port scan indicators (>100 unique destination ports), suspicious HTTP user-agents (python-requests, curl, Go-http-client), commands in HTTP payloads (shell/PowerShell), cleartext credentials (Basic auth, FTP USER/PASS, SMTP AUTH PLAIN), DNS exfiltration (unusually long subdomain labels), and high destination IP diversity (botnet patterns).
⑳
Watermark Detection
Covers images (JPG, PNG, TIFF, WebP, RAW, SVG), audio (MP3, WAV, FLAC, OGG, WMA, M4A), video (MP4, MOV, AVI, MKV, WMV), and EPUB. Detection layers: ExifTool (gold-standard metadata extraction across all formats) → Pillow EXIF/IPTC/XMP → mutagen (ID3, MP4 atoms, Vorbis comments, WMA ASF) → raw struct parsing (RIFF INFO chunks, MP4 ©cpy/cprt atoms, PNG tEXt/iTXt, FLAC VORBIS_COMMENT) → JPEG DCT coefficient analysis for invisible/frequency-domain watermarks (APP14/Digimarc, quantisation table anomalies, HF energy ratio) → alpha-channel overlay detection (semi-transparent text/logo watermarks) → OCR (Tesseract) to extract burned-in visible text watermarks from image pixels → SVG <text> extraction → vendor signature matching (Digimarc, Getty Images, Shutterstock, Adobe Stock, iStock, Reuters, AP Photo, AFP, Alamy, Dreamstime, 123RF, Depositphotos, Pond5, Corbis, WireImage, Dolby, SMPTE) → tracking/serial/licence code extraction from all metadata values → EPUB OPF manifest (dc:rights, dc:identifier, dc:creator, custom watermark metadata).
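As one concrete instance of the raw struct parsing layer, the sketch below walks PNG chunks and collects tEXt keyword/value pairs, where fields such as Copyright typically live. It is an illustrative assumption of how that layer might work: it does not verify CRCs or handle iTXt, and `make_chunk` exists only to build sample input.

```python
import struct
import zlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def png_text_chunks(data: bytes) -> dict[str, str]:
    """Return keyword -> text for every tEXt chunk (CRC is not verified)."""
    if not data.startswith(PNG_MAGIC):
        raise ValueError("not a PNG file")
    out = {}
    pos = len(PNG_MAGIC)
    while pos + 12 <= len(data):  # 4 length + 4 type + 4 CRC at minimum
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt" and b"\x00" in body:
            key, _, value = body.partition(b"\x00")
            out[key.decode("latin-1")] = value.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length
    return out


def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Build a well-formed PNG chunk (helper for constructing sample data)."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)
```

Calling `png_text_chunks` on a file whose tEXt chunk holds `Copyright\x00(c) Example Stock` yields a dictionary with a single `Copyright` entry.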
㉑
Isolation Chamber Detonation
Six-layer dynamic analysis inside fully isolated Linux namespaces (unshare --net --pid --fork --mount --ipc).
Pass 1 (strace): syscall-level tracing (network beacons, shellcode mmap, unexpected process spawns, sensitive path writes, anti-sandbox env probing via /proc/1/cgroup/DMI/ptrace).
Pass 2 (ltrace): library-level tracing (system, popen, execv, dlopen, SSL_connect, getenv sandbox fingerprinting).
Pass 3 (memory dump analysis): polls /proc/<pid>/maps during detonation, dumps every anonymous rwxp region, and YARA-scans for PE/ELF payloads, NOP sleds, GetPC shellcode, PEB-walk API resolution, Meterpreter, CobaltStrike beacons, and reverse-shell syscall sequences. This catches fileless malware that never touches disk.
Pass 4 (fake network capture): DNS server resolves all queries to 127.0.0.1; HTTP/443/8080 listeners intercept every request, confirming C2 or download intent without permitting outbound traffic.
Pass 5 (CPU/VM fingerprint masking): /proc/cpuinfo bind-mounted with a realistic Intel i7-8700K profile (no hypervisor flag) to defeat VM-detection evasion; Faketime (LD_PRELOAD) freezes clock at 2023-06-15 09:30:00 to defeat sleep/time-bomb staging.
Pass 6 (application-layer detonation): Office macro documents (.doc/.docm/.xls/.xlsm/.xlsb/.ppt/.pptm/.odt) opened via LibreOffice headless with macro security disabled; HTML/SVG via Playwright + Chromium with mouse interaction simulation (click, scroll, button/link activation, deferred-JS wait).
Filesystem diff and process tree recorded for every pass. Per-category detonators: images → ImageMagick; audio/video → ffprobe; archives → 7-Zip; Office macros → LibreOffice; HTML/SVG → Playwright + Chromium; fonts → fc-scan; EPUB → unzip. Scripts and executables are never detonated. Resource-capped via prlimit (768 MB RAM, 64 MB output, 512 processes).
㉒
Wine Execution Layer
Windows PE executables and scripts (.exe/.dll/.msi/.bat/.ps1/.vbs/.hta/.wsf) detonated inside Wine 9.0 running in an isolated Linux namespace with the same six-layer instrumentation as the isolation chamber: strace syscalls, ltrace library calls, memory YARA (PE/shellcode/Meterpreter/CobaltStrike), fake DNS+HTTP network capture, CPU/VM fingerprint masking, and faketime time-warp. Captures registry writes, process spawns, network connection attempts, and in-memory payloads — all without modifying the host system.
㉓
Real Windows Micro-VM
KVM/QEMU Windows 10 micro-VM with a genuine Windows kernel, not emulation. COW QCOW2 overlay per scan (base image never modified); Skylake-Client-v4 CPU with the -hypervisor flag (hypervisor bit cleared, which defeats hypervisor detection); no network adapter (prevents real C2). File injected via QEMU Guest Agent (QGA) protocol over Unix socket. Pre-staged PowerShell detonation agent: monitors process spawns, 6×5 s netstat snapshots (ESTABLISHED/SYN_SENT), registry diff (Run/RunOnce/Winlogon/Services), and filesystem writes to Public/Temp/AppData. 30 s instrumented execution. Triggered only for high-risk samples (risk_score ≥ 60, Windows executable extension).
㉔
Cross-Scan Intelligence
A TLSH fuzzy hash is computed for every scan and compared against all previous scans; samples within a TLSH distance of 50 are treated as structurally/behaviourally related. Each malware cluster is assigned a deterministic campaign name (e.g. PHANTOM-KRAKEN-07) seeded from the SHA-256 of its cluster ID. Classifies malware family (CobaltStrike · Meterpreter · Emotet · QakBot · Mirai · AsyncRAT · RedLine · Ransomware · Cryptominer · Shellcode) via YARA rule name overlap, ClamAV hit strings, and MITRE TTP intersection. 90-day activity trends per campaign. Every scan is stored in the PostgreSQL history for future correlation.
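Deterministic naming seeded from a SHA-256 digest can be sketched as below. The word lists and the way digest bytes are mapped to name parts are hypothetical; only the ADJECTIVE-NOUN-NN shape and the SHA-256 seeding come from the description above.

```python
import hashlib

# Hypothetical vocabularies; the scanner's real word lists are not shown here.
ADJECTIVES = ["PHANTOM", "CRIMSON", "SILENT", "IRON", "HOLLOW", "FROZEN", "RABID", "VELVET"]
CREATURES = ["KRAKEN", "HYDRA", "WRAITH", "MANTIS", "VIPER", "GOLEM", "RAVEN", "JACKAL"]


def campaign_name(cluster_id: str) -> str:
    """Derive a stable, human-readable name from the cluster ID."""
    digest = hashlib.sha256(cluster_id.encode("utf-8")).digest()
    adjective = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    creature = CREATURES[digest[1] % len(CREATURES)]
    number = digest[2] % 100
    return f"{adjective}-{creature}-{number:02d}"
```

Because the name depends only on the cluster ID, rescanning or restarting the service yields the same label for the same cluster without storing a name table.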
Images
JPG · PNG · GIF · BMP · TIFF · WEBP · PSD · SVG · ICO · HEIC · RAW · CR2 · NEF · DNG
Audio
MP3 · WAV · FLAC · OGG · AAC · M4A · WMA · OPUS · AIFF · MID
Video
MP4 · AVI · MOV · MKV · WEBM · WMV · FLV · TS · 3GP · MPEG
Archives
ZIP · TAR · GZ · 7Z · RAR · ISO · APK · JAR · DEB · RPM · EPUB · CAB · CBZ
Executables
EXE · DLL · SYS · OCX · SCR · MSI · ELF · SO · COM · CPL
Scripts
SH · PS1 · BAT · CMD · PY · JS · PHP · RB · LUA · VBS · HTA · WSF · PL
Data & Web
HTML · XML · JSON · YAML · CSV · SQLITE · DB · TTF · OTF · WOFF · PEM · CRT · KEY
Network
PCAP · CAP · PCAPNG
Initialize Scan Target
Drop file or click to load · Max 50 MB
All file types supported except PDFs and Office documents — use their dedicated scanners
24 engines
zero retention
offline AI
real Windows VM
MITRE ATT&CK
🕸 Full Scan Graph
Visualize all scan history, malware clusters, and C2 infrastructure overlap across every file scanned.
Open Intelligence Graph →