🔬

Universal File Forensics Scanner

Deep forensic analysis of any file type except PDFs and Office documents, across 24 independent engines spanning static analysis, content forensics, dynamic sandboxing, threat intelligence, campaign tracking, and AI-driven correlation.

Static Analysis

10 engines

File ID · Entropy · Metadata · IOC · Strings + XOR · PE · ELF · Archive · Script · Document

Magic bytes · MIME mismatch · polyglot detection · Shannon entropy · 255-key XOR brute-force over 512 KB · EXIF/GPS/ID3 extraction · PE/ELF header parsing · zip bomb & path traversal · reverse shells · HTML XSS/XXE

Content Forensics

5 engines

Image + Stego · Watermark · Font · Certificate · Network Capture

LSB chi-square steganography · Tesseract OCR visible watermarks · alpha-channel overlay extraction · DCT frequency watermarks · PE-in-font detection · weak RSA/EC keys · PCAP C2 port & DNS exfiltration

Dynamic Analysis

3 engines

6-layer Sandbox · Wine 9.0 · Windows 10 VM

Isolated Linux namespace · strace syscalls · ltrace library calls · in-memory YARA dump (Meterpreter/CobaltStrike) · fake DNS+HTTP capture · CPU masking · LibreOffice macros · Playwright — Wine .exe/.ps1/.hta detonation — KVM/QEMU real Windows kernel (high-risk samples only)

Threat Intelligence

3 engines

ClamAV · YARA (20 rules) · URLhaus · MalwareBazaar · ThreatFox · FeodoTracker

Offline ClamAV signature database · 20 universal YARA rules (PE dropper · shellcode · ransomware · C2 · AMSI bypass · credential stealer · cryptominer) · four locally-synced PostgreSQL threat feeds · zero external API calls

Campaign Intelligence

1 engine

TLSH fuzzy hash · Named clusters · Family classification · 90-day trends · D3 graph

TLSH similarity clustering across full scan history · deterministic campaign naming (e.g. PHANTOM-KRAKEN-07) · malware family classification (CobaltStrike · Meterpreter · Emotet · QakBot · Mirai · AsyncRAT · RedLine + more) · D3 force-directed graph

AI Correlation

2 engines

Qwen 2.5 (local LLM) · MITRE ATT&CK · Cross-engine correlation

Cross-engine signal aggregation (11 escalation rules) · self-hosted Qwen 2.5 synthesizes all findings into verdict, confidence, executive summary, up to 6 MITRE techniques, risk score 0–100 · no third-party AI

● Zero data retention — file deleted immediately after analysis ● Offline AI — Qwen 2.5 on local hardware, never leaves the server ● All threat intelligence synced locally — zero external API calls during scanning

● SYS:READY — 24-Engine Forensic Architecture file-id • entropy • metadata • ioc • strings • pe • elf • archive • image • watermark • script • document • font • certificate • network • sandbox • wine • winvm • clamav • yara • threat-intel • scan-intel • correlation • ai 24 / 24

① File Identification

Magic bytes detection for 40+ file signatures, MIME type via libmagic, extension/MIME mismatch detection, polyglot file detection (e.g. executable disguised as image), SHA-256/MD5/SHA-1 hash computation, JPEG trailing data check, and file size anomaly detection. Serves as ground truth for all other engines.
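The magic-byte and extension-mismatch checks described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library: the signature table is a hypothetical five-entry subset (not the scanner's 40+ signatures), and the names `MAGIC_SIGNATURES`, `EXT_ALIASES`, and `identify` are assumptions made for the example.

```python
import hashlib

# Hypothetical, abbreviated signature table (illustration only).
MAGIC_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpg",
    b"MZ": "exe",
    b"\x7fELF": "elf",
    b"PK\x03\x04": "zip",
}

# Extensions that should be treated as equivalent to a detected type (assumed).
EXT_ALIASES = {"jpeg": "jpg"}


def identify(data: bytes, filename: str) -> dict:
    """Detect the type from leading magic bytes and compare it to the extension."""
    detected = next(
        (kind for magic, kind in MAGIC_SIGNATURES.items() if data.startswith(magic)),
        None,
    )
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    ext = EXT_ALIASES.get(ext, ext)
    return {
        "detected": detected,
        "extension": ext,
        # A mismatch is only reported when the magic bytes were recognised.
        "mismatch": detected is not None and ext != detected,
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
    }
```

For instance, `identify(b"MZ\x90\x00", "holiday.jpg")` reports a detected type of `exe` with `mismatch` set, which is the "executable disguised as image" case.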

② Entropy Analysis

Shannon entropy per byte on raw file data and individual ZIP stream members. Flags entropy >7.2 bits/byte as encrypted or packed payload. ZIP-based formats (APK, JAR, EPUB, CBZ) are analysed stream-by-stream. Compressed media formats (JPEG, MP3, MP4) are excluded from false-positive flagging as their high entropy is normal.
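As a rough sketch of the entropy check, Shannon entropy in bits per byte can be computed from the byte histogram and compared against the 7.2 threshold quoted above. The function names are assumptions for illustration, and the per-format exclusions (JPEG, MP3, MP4) are not modelled here.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 7.2  # bits per byte, as quoted in the description above


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of the data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_packed_or_encrypted(data: bytes) -> bool:
    """Flag data whose entropy exceeds the threshold."""
    return shannon_entropy(data) > ENTROPY_THRESHOLD
```

A buffer with a single repeated byte scores 0 bits per byte, while a buffer containing every byte value equally often scores the maximum of 8.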

③ Metadata Forensics

EXIF extraction for images: GPS coordinates (a privacy leak that can also aid target identification), camera make/model, and software tags. ID3/Mutagen tags for audio (embedded URLs, unusually long comments). Video container metadata. Flags GPS privacy leaks, suspicious software identifiers (steghide, OpenStego), and EXIF thumbnail size anomalies.

④ IOC Extraction

Regex-based extraction of Indicators of Compromise from raw bytes and decoded text: HTTP/HTTPS/FTP URLs, external IP addresses, bare domain names, UNC paths, email addresses, Base64 blobs (with inline decode and re-scan), PowerShell invocations, WScript/Shell references, cryptocurrency wallet addresses, and 40+ suspicious command keywords (mshta, certutil, mimikatz, AMSI bypass strings, etc.).
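The "Base64 blobs with inline decode and re-scan" step might look something like the sketch below. The regular expressions are simplified assumptions covering only URLs and IPv4 addresses, the recursion depth of one is arbitrary, and filtering of private address ranges is left out.

```python
import base64
import binascii
import re

# Simplified, assumed patterns; a real rule set would be much broader.
URL_RE = re.compile(rb"(?:https?|ftp)://[^\s\"'<>]+", re.IGNORECASE)
IPV4_RE = re.compile(rb"\b(?:\d{1,3}\.){3}\d{1,3}\b")
B64_RE = re.compile(rb"[A-Za-z0-9+/]{24,}={0,2}")


def extract_iocs(data: bytes, depth: int = 1) -> dict:
    """Extract URLs and IPv4 addresses, re-scanning decoded Base64 blobs."""
    urls = {m.decode("ascii", "replace") for m in URL_RE.findall(data)}
    ips = {m.decode("ascii") for m in IPV4_RE.findall(data)}
    if depth > 0:
        for blob in B64_RE.findall(data):
            try:
                decoded = base64.b64decode(blob, validate=True)
            except (binascii.Error, ValueError):
                continue  # not valid Base64 after all
            nested = extract_iocs(decoded, depth - 1)
            urls |= nested["urls"]
            ips |= nested["ips"]
    return {"urls": urls, "ips": ips}
```

The re-scan is what surfaces a download URL hidden inside an encoded PowerShell argument, which a single pass over the raw bytes would miss.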

⑤ String & Artifact Analysis

Extracts ASCII and UTF-16LE strings from binary files. Flags: Windows persistence registry keys (HKLM Run, Winlogon), suspicious process names (cmd.exe, mshta.exe, certutil.exe), dangerous Win32 APIs (VirtualAllocEx, WriteProcessMemory, CreateRemoteThread), anti-debugging calls (IsDebuggerPresent), Linux dangerous syscalls, privilege escalation, and cryptographic credentials embedded in binaries.

⑥ PE Executable Analysis

Full PE32/PE32+ header parsing using pefile (raw struct fallback). Detects: packed sections (UPX0, ASPack, Themida), writable+executable (W+X) sections indicating shellcode injection, dangerous import APIs (URLDownloadToFile, CreateRemoteThread, CryptGenKey), anti-debugging APIs (IsDebuggerPresent, CheckRemoteDebuggerPresent), self-deletion patterns, missing import table (packed binary), PE overlay data, and timestamp anomalies.
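The writable+executable section check reduces to a bit test on each section's Characteristics field. The sketch below assumes the section names and flags have already been read from the section table (for example with pefile); the flag values are the standard PE constants, while the packer-name list and function names are illustrative assumptions.

```python
# Standard PE section characteristic flags.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000

# Assumed, abbreviated list of packer section names.
PACKER_SECTION_NAMES = {"UPX0", "UPX1", ".aspack", ".themida"}


def is_writable_executable(characteristics: int) -> bool:
    """True when a section is mapped both writable and executable."""
    mask = IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_WRITE
    return (characteristics & mask) == mask


def flag_sections(sections: list[tuple[str, int]]) -> list[str]:
    """Return findings for (name, characteristics) pairs from a section table."""
    findings = []
    for name, characteristics in sections:
        if is_writable_executable(characteristics):
            findings.append(f"WX_SECTION:{name}")
        if name in PACKER_SECTION_NAMES:
            findings.append(f"PACKER_SECTION:{name}")
    return findings
```

A typical `.text` section (`0x60000020`) passes cleanly, whereas a UPX-style section with `0xE0000080` is flagged on both counts.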

⑦ ELF Binary Analysis

Parses ELF headers (32/64-bit, big/little endian) via raw struct. Detects: UPX packing, dangerous libc calls (execve, ptrace, mprotect, setuid), network socket APIs, rootkit indicators (LD_PRELOAD, /proc/self/mem, LD_AUDIT), suspicious packed section names, stripped symbol table (obfuscated shared object), and RPATH/RUNPATH anomalies.
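Parsing the ELF identification bytes "via raw struct" could be approached as below. This is a minimal sketch that reads only the class, byte order, type, and machine fields; the function name and returned keys are assumptions, and none of the packing or rootkit checks are shown.

```python
import struct


def parse_elf_ident(data: bytes) -> dict:
    """Read class, endianness, e_type and e_machine from an ELF header."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(data[4])       # EI_CLASS
    endian = {1: "<", 2: ">"}.get(data[5])   # EI_DATA
    if bits is None or endian is None:
        raise ValueError("invalid EI_CLASS or EI_DATA")
    # e_type and e_machine are 16-bit fields at offsets 16 and 18
    # in both the 32-bit and 64-bit header layouts.
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    return {
        "bits": bits,
        "endianness": "little" if endian == "<" else "big",
        "e_type": e_type,
        "e_machine": e_machine,
    }
```

The byte-order flag has to be read first because every multi-byte field that follows is encoded in that order.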

⑧ Archive Inspection

Analyses ZIP, TAR, 7Z, and RAR archives (including APK, JAR, EPUB, CBZ). Detects: zip bombs (decompression ratio >100:1 or >10,000 entries), path traversal filenames (../ attack), double-extension files (document.pdf.exe), executable or script files inside archives, password-protected archives (opaque to AV scanners), and deeply nested archive structures used to evade automated scanning.
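For ZIP archives, the bomb, traversal, and double-extension checks can be sketched with the standard `zipfile` module by inspecting the central directory without extracting anything. The thresholds follow the figures quoted above; the extension sets, finding labels, and the `inspect_zip` name are assumptions for illustration.

```python
import io
import zipfile

MAX_RATIO = 100        # decompression ratio threshold quoted above
MAX_ENTRIES = 10_000   # entry-count threshold quoted above
EXEC_EXTS = {"exe", "scr", "bat", "cmd", "ps1", "vbs", "js", "sh"}  # assumed
DECOY_EXTS = {"pdf", "doc", "docx", "jpg", "png", "txt"}            # assumed


def inspect_zip(data: bytes) -> list[str]:
    """Return findings based on ZIP metadata only; nothing is extracted."""
    findings = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > MAX_ENTRIES:
            findings.append("ZIP_BOMB_ENTRIES")
        compressed = sum(i.compress_size for i in infos)
        uncompressed = sum(i.file_size for i in infos)
        if compressed and uncompressed / compressed > MAX_RATIO:
            findings.append("ZIP_BOMB_RATIO")
        for info in infos:
            name = info.filename
            parts = name.replace("\\", "/").split("/")
            if ".." in parts or name.startswith("/"):
                findings.append(f"PATH_TRAVERSAL:{name}")
            pieces = parts[-1].lower().split(".")
            if len(pieces) >= 3 and pieces[-1] in EXEC_EXTS and pieces[-2] in DECOY_EXTS:
                findings.append(f"DOUBLE_EXTENSION:{name}")
            if info.flag_bits & 0x1:  # general-purpose bit 0 marks encryption
                findings.append(f"ENCRYPTED:{name}")
    return findings
```

Working from the declared sizes keeps the check cheap and safe, although a hostile archive can misreport them, so a production check would also need to cap bytes actually read during extraction.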

⑨ Image Forensics & Steganography

SVG: detects embedded <script> tags, inline event handlers (onload, onclick), and base64 data URIs. PNG: chunk analysis for malicious URLs in tEXt/iTXt chunks and data after IEND. JPEG: trailing data after the End-of-Image marker. GIF: header/trailer validation, script content. BMP: size anomalies. LSB steganography: chi-square test on red-channel LSBs; a near-zero chi-square statistic indicates that the LSBs have been overwritten with a hidden payload.
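One common form of the chi-square LSB test is the "pairs of values" statistic: LSB embedding tends to equalise the counts of each value pair (2k, 2k+1), which drives the statistic toward zero. The sketch below assumes the red-channel samples have already been decoded into integers from 0 to 255; whether the scanner uses exactly this formulation is not stated, and no decision threshold is implied.

```python
from collections import Counter
from typing import Iterable


def pov_chi_square(samples: Iterable[int]) -> float:
    """Chi-square statistic over pairs of values (2k, 2k+1) for 8-bit samples.

    Values near zero mean each pair is almost perfectly balanced, which is
    what overwriting the LSBs with random-looking payload bits tends to cause.
    """
    counts = Counter(samples)
    chi = 0.0
    for even in range(0, 256, 2):
        observed = counts[even]
        expected = (counts[even] + counts[even + 1]) / 2
        if expected > 0:
            chi += (observed - expected) ** 2 / expected
    return chi
```

Perfectly balanced pairs give a statistic of 0, while a channel that only ever uses the even value of a pair gives a large one.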

⑩ Script & Code Analysis

Handles: sh/bash/zsh, PowerShell ps1/psm1, bat/cmd, Python, Ruby, JavaScript/TypeScript, PHP, Perl, Lua, VBScript, HTA, WSF, Groovy. Detects: multi-layer obfuscation (base64_decode+eval, hex encoding, char concat, ROT13, gzip inflate), reverse shell patterns (12 variants: bash, nc, ncat, socat, Python, Perl, Ruby, PHP, PowerShell TCPClient), AMSI bypass (AmsiScanBuffer patches, Disable-Amsi), hardcoded credentials/private keys, PHP webshells, dangerous JS/Python/shell/Ruby/Perl idioms, and inline base64 decode with nested command analysis.

⑪ ClamAV Antivirus

Runs the locally installed ClamAV scanner against its full local signature database (updated via freshclam). Covers known malware families, generic shellcode patterns, trojan and ransomware heuristics, and script-based threats. Zero network calls — entirely offline. ClamAV detection is one of the strongest confirmation signals for the Correlation Engine.

⑫ YARA Rule Engine

Matches against 20 universal threat rules covering all file types: PE dropper with auto-exec & process injection, UPX packing, shellcode NOP sleds & stack pivots, reverse shell patterns (bash/nc/ncat), PowerShell downloader (DownloadString + IEX), PHP webshell (eval+user-input), archive path traversal, ransomware indicators (shadow deletion + encryption APIs), credential stealer (Mimikatz strings), cryptocurrency miners (stratum+tcp), steganography tools, keylogger APIs, network scanner, AMSI bypass, Python reverse shell, SQL injection/xp_cmdshell, and dropper/downloader chains.

⑬ Threat Intelligence

Offline lookup of extracted IOCs (URLs, IPs, domains, file hashes) against four local PostgreSQL databases: URLhaus (malware distribution URLs), MalwareBazaar (malware sample SHA-256 hashes), ThreatFox (C2 indicators — IPs, domains, URLs), and FeodoTracker (botnet C2 addresses). Zero external API calls — all queries run against locally-synced threat intelligence feeds.

⑭ Correlation Engine

Cross-engine signal aggregation running after all primary engines complete. Applies escalation rules: DROPPER_CHAIN (IOC download + execution capability), HIGH_CONFIDENCE_MALWARE (2+ of ClamAV/YARA/ThreatIntel), PACKED_MALWARE (entropy + packing + AV), POLYGLOT_ATTACK (format mismatch + IOC), STEGO_PAYLOAD (image anomaly + IOC), ZIP_BOMB (critical archive finding), C2_BEACON_SCRIPT (obfuscated script + network IOC), REVERSE_SHELL_CONFIRMED, CREDENTIAL_THEFT, ARCHIVE_DROPPER, AMSI_EVASION. De-escalation: BENIGN_CORROBORATION (all primary engines clean).
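A handful of the escalation rules above can be expressed as simple boolean combinations over per-engine signals. The sketch below covers only four escalation rules plus the de-escalation rule; the signal keys (`clamav_hit`, `ioc_download`, and so on) are hypothetical names chosen for the example, not the engine's actual schema.

```python
def correlate(signals: dict) -> list[str]:
    """Apply a subset of the escalation rules to boolean engine signals."""
    get = lambda key: bool(signals.get(key))
    rules = []

    # HIGH_CONFIDENCE_MALWARE: at least two of ClamAV / YARA / threat intel agree.
    confirmations = sum(get(k) for k in ("clamav_hit", "yara_hit", "threat_intel_hit"))
    if confirmations >= 2:
        rules.append("HIGH_CONFIDENCE_MALWARE")

    # DROPPER_CHAIN: a download IOC combined with execution capability.
    if get("ioc_download") and get("execution_capability"):
        rules.append("DROPPER_CHAIN")

    # PACKED_MALWARE: entropy + packing + AV detection.
    if get("high_entropy") and get("packer_detected") and get("clamav_hit"):
        rules.append("PACKED_MALWARE")

    # POLYGLOT_ATTACK: format mismatch together with any IOC.
    if get("format_mismatch") and get("ioc_present"):
        rules.append("POLYGLOT_ATTACK")

    # BENIGN_CORROBORATION: de-escalate only when every signal is clean.
    if not any(bool(v) for v in signals.values()):
        rules.append("BENIGN_CORROBORATION")
    return rules
```

Keeping each rule as an independent condition means several can fire for one sample, which matches the idea of aggregating signals rather than picking a single label.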

⑮ AI Forensic Report

Self-hosted Qwen 2.5 LLM on remotellm (OpenAI-compatible API, zero third-party AI). Receives structured JSON of all engine findings and outputs: threat verdict (MALICIOUS / SUSPICIOUS / LIKELY_BENIGN / CLEAN), confidence level (HIGH/MEDIUM/LOW), 2–4 sentence executive summary, up to 6 key findings, attack chain description, MITRE ATT&CK technique mapping (up to 6 techniques), recommended actions, and risk rating 0–100.

⑯ Document & Markup Forensics

HTML/MHT: detects inline <script> tags, event handlers, <iframe>/<object>/<embed> elements, and external form actions (credential harvesting). XML/XSLT: XXE injection (<!ENTITY SYSTEM>, php://filter, gopher://), XSLT system() code execution. CSV/TSV: formula injection (=cmd|, DDE, HYPERLINK). YAML: deserialization gadgets (!!python/object, __class__). JSON: prototype pollution (__proto__, constructor keys). RTF: OLE embedding, remote template injection, DDE fields.

⑰ Font File Forensics

Covers TTF, OTF, WOFF, WOFF2, EOT, TTC. Checks for: embedded executable payloads (MZ/ELF signatures inside font data — font dropper technique), shellcode NOP sleds, off-spec SFNT table offsets outside file bounds (heap overflow exploit family — CVE-2011-3402 variants), embedded bitmap tables (EBDT/EBLC — attack vector), dangerous OpenType feature tags, PostScript systemdict and exec operator abuse in CFF/Type1 fonts, and unusually large font files indicating embedded payload.

⑱ Certificate & Key Forensics

Covers PEM, CRT, CER, DER, P12/PFX, JKS, KEY, P7B. Detects: exposed private key material (RSA PKCS#1/PKCS#8, EC, DSA, OpenSSH private key blocks), expired certificates, unusually long validity periods (>10 years), weak RSA keys (<2048 bits), weak EC keys (<256 bits), deprecated DSA keys, self-signed certificates, IP-in-CN (RFC violation), suspicious CN patterns (attack toolkit fingerprints), excessive SAN count (>50 domains), and PKCS#12 bundles that lack password protection.

⑲ Network Capture Forensics

Covers PCAP, CAP, PCAPNG (via dpkt with raw struct fallback). Extracts: packet count, unique source/destination IPs, destination port distribution. Flags: connections to C2-associated ports (4444, 1337, 31337, 6667, 9090), port scan indicators (>100 unique destination ports), suspicious HTTP user-agents (python-requests, curl, Go-http-client), commands in HTTP payloads (shell/PowerShell), cleartext credentials (Basic auth, FTP USER/PASS, SMTP AUTH PLAIN), DNS exfiltration (unusually long subdomain labels), and high destination IP diversity (botnet patterns).
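Once packets have been parsed into flows, several of the flags above are simple set and length checks. The sketch below assumes the capture has already been reduced to `(src_ip, dst_ip, dst_port)` tuples and a list of DNS query names (for example by dpkt). The C2 port list and the 100-port threshold come from the description; the 40-character label limit and the per-source grouping are assumptions, since the description only says "unusually long" and does not say how ports are grouped.

```python
from collections import defaultdict

C2_PORTS = {4444, 1337, 31337, 6667, 9090}
PORT_SCAN_THRESHOLD = 100   # unique destination ports
MAX_DNS_LABEL_LEN = 40      # assumed cut-off for "unusually long" labels


def analyse_flows(flows, dns_queries):
    """Flag C2 ports, port scans and long DNS labels in pre-parsed traffic."""
    findings = []
    ports_by_source = defaultdict(set)
    for src, dst, dport in flows:
        ports_by_source[src].add(dport)
        if dport in C2_PORTS:
            findings.append(("C2_PORT", dst, dport))
    for src, ports in ports_by_source.items():
        if len(ports) > PORT_SCAN_THRESHOLD:
            findings.append(("PORT_SCAN", src, len(ports)))
    for name in dns_queries:
        if any(len(label) > MAX_DNS_LABEL_LEN for label in name.split(".")):
            findings.append(("DNS_EXFIL", name, None))
    return findings
```

Grouping ports by source address avoids flagging a busy capture in which many hosts each contact a few ports.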

⑳ Watermark Detection

Covers images (JPG, PNG, TIFF, WebP, RAW, SVG), audio (MP3, WAV, FLAC, OGG, WMA, M4A), video (MP4, MOV, AVI, MKV, WMV), and EPUB. Detection layers: ExifTool (gold-standard metadata extraction across all formats) → Pillow EXIF/IPTC/XMP → mutagen (ID3, MP4 atoms, Vorbis comments, WMA ASF) → raw struct parsing (RIFF INFO chunks, MP4 ©cpy/cprt atoms, PNG tEXt/iTXt, FLAC VORBIS_COMMENT) → JPEG DCT coefficient analysis for invisible/frequency-domain watermarks (APP14/Digimarc, quantisation table anomalies, HF energy ratio) → alpha-channel overlay detection (semi-transparent text/logo watermarks) → OCR (Tesseract) to extract burned-in visible text watermarks from image pixels → SVG <text> extraction → vendor signature matching (Digimarc, Getty Images, Shutterstock, Adobe Stock, iStock, Reuters, AP Photo, AFP, Alamy, Dreamstime, 123RF, Depositphotos, Pond5, Corbis, WireImage, Dolby, SMPTE) → tracking/serial/licence code extraction from all metadata values → EPUB OPF manifest (dc:rights, dc:identifier, dc:creator, custom watermark metadata).

㉑ Isolation Chamber Detonation

Six-layer dynamic analysis inside fully isolated Linux namespaces (unshare --net --pid --fork --mount --ipc).

Pass 1, strace: syscall-level tracing (network beacons, shellcode mmap, unexpected process spawns, sensitive path writes, anti-sandbox env probing via /proc/1/cgroup/DMI/ptrace).

Pass 2, ltrace: library-level tracing (system, popen, execv, dlopen, SSL_connect, getenv sandbox fingerprinting).

Pass 3, memory dump analysis: polls /proc/<pid>/maps during detonation, dumps every anonymous rwxp region, and YARA-scans for PE/ELF payloads, NOP sleds, GetPC shellcode, PEB-walk API resolution, Meterpreter, CobaltStrike beacons, and reverse-shell syscall sequences. This catches fileless malware that never touches disk.

Pass 4, fake network capture: DNS server resolves all queries to 127.0.0.1; HTTP/443/8080 listeners intercept every request, confirming C2 or download intent without permitting outbound traffic.

Pass 5, CPU/VM fingerprint masking: /proc/cpuinfo bind-mounted with a realistic Intel i7-8700K profile (no hypervisor flag) to defeat VM-detection evasion; Faketime (LD_PRELOAD) freezes clock at 2023-06-15 09:30:00 to defeat sleep/time-bomb staging.

Pass 6, application-layer detonation: Office macro documents (.doc/.docm/.xls/.xlsm/.xlsb/.ppt/.pptm/.odt) opened via LibreOffice headless with macro security disabled; HTML/SVG via Playwright + Chromium with mouse interaction simulation (click, scroll, button/link activation, deferred-JS wait).

Filesystem diff and process tree recorded for every pass.

Per-category detonators: images → ImageMagick; audio/video → ffprobe; archives → 7-Zip; Office macros → LibreOffice; HTML/SVG → Playwright + Chromium; fonts → fc-scan; EPUB → unzip. Scripts and executables are never detonated.

Resource-capped via prlimit (768 MB RAM, 64 MB output, 512 processes).
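The region-selection step of the memory dump pass (Pass 3) can be illustrated by parsing the text of `/proc/<pid>/maps` and keeping only mappings that are `rwxp` and have no backing pathname. This sketch covers only the selection logic; reading the bytes from `/proc/<pid>/mem` and the YARA scan are omitted, and the function name is an assumption.

```python
def anonymous_rwx_regions(maps_text: str) -> list[tuple[int, int]]:
    """Return (start, end) address pairs for anonymous rwxp mappings.

    Each maps line has the form:
        start-end perms offset dev inode [pathname]
    Pseudo-paths such as [heap] or [stack] count as named and are skipped.
    """
    regions = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # malformed or empty line
        address_range, perms = fields[0], fields[1]
        pathname = fields[5].strip() if len(fields) > 5 else ""
        if perms == "rwxp" and not pathname:
            start, end = (int(part, 16) for part in address_range.split("-"))
            regions.append((start, end))
    return regions
```

Anonymous memory that is simultaneously writable and executable is rare in well-behaved programs, which is why it is a useful place to look for unpacked or injected code.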

㉒ Wine Execution Layer

Windows PE executables and scripts (.exe/.dll/.msi/.bat/.ps1/.vbs/.hta/.wsf) detonated inside Wine 9.0 running in an isolated Linux namespace with the same six-layer instrumentation as the isolation chamber: strace syscalls, ltrace library calls, memory YARA (PE/shellcode/Meterpreter/CobaltStrike), fake DNS+HTTP network capture, CPU/VM fingerprint masking, and faketime time-warp. Captures registry writes, process spawns, network connection attempts, and in-memory payloads — all without modifying the host system.

㉓ Real Windows Micro-VM

KVM/QEMU Windows 10 micro-VM running a genuine Windows kernel rather than an emulation layer. A copy-on-write QCOW2 overlay is created per scan (the base image is never modified); the CPU model is Skylake-Client-v4 with the hypervisor CPUID flag cleared (-hypervisor) to defeat hypervisor detection; no network adapter is attached, which prevents real C2 traffic. The file is injected via the QEMU Guest Agent (QGA) protocol over a Unix socket. A pre-staged PowerShell detonation agent monitors process spawns, 6×5 s netstat snapshots (ESTABLISHED/SYN_SENT), registry diff (Run/RunOnce/Winlogon/Services), and filesystem writes to Public/Temp/AppData. Instrumented execution lasts 30 s. Triggered only for high-risk samples (risk_score ≥ 60, Windows executable extension).

㉔ Cross-Scan Intelligence

TLSH fuzzy hash computed for every scan and compared against all previous scans; matches within a TLSH distance score of 50 identify structurally/behaviourally related samples. Each malware cluster is assigned a deterministic campaign name (e.g. PHANTOM-KRAKEN-07) derived from the SHA-256 of its cluster ID. Classifies malware family (CobaltStrike · Meterpreter · Emotet · QakBot · Mirai · AsyncRAT · RedLine · Ransomware · Cryptominer · Shellcode) via YARA rule name overlap, ClamAV hit strings, and MITRE TTP intersection. 90-day activity trends per campaign. Every scan stored to PostgreSQL history for future correlation.
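Deterministic campaign naming can be sketched by hashing the cluster ID and using the first digest bytes as indices into fixed word lists. The word lists and the byte-to-index mapping below are hypothetical; only the general idea (a name seeded from the SHA-256 of the cluster ID, in the ADJECTIVE-CREATURE-NN shape) comes from the description.

```python
import hashlib

# Hypothetical word lists; the real vocabulary is not documented here.
ADJECTIVES = ("PHANTOM", "CRIMSON", "SILENT", "IRON", "HOLLOW", "AMBER")
CREATURES = ("KRAKEN", "VIPER", "RAVEN", "HYDRA", "MANTIS", "JACKAL")


def campaign_name(cluster_id: str) -> str:
    """Derive a stable ADJECTIVE-CREATURE-NN name from a cluster identifier."""
    digest = hashlib.sha256(cluster_id.encode("utf-8")).digest()
    adjective = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    creature = CREATURES[digest[1] % len(CREATURES)]
    serial = digest[2] % 100
    return f"{adjective}-{creature}-{serial:02d}"
```

Because the name depends only on the cluster ID, the same cluster gets the same label on every scan without storing a name table, at the cost of occasional collisions between different clusters.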

Images: JPG PNG GIF BMP TIFF WEBP PSD SVG ICO HEIC RAW CR2 NEF DNG

Audio: MP3 WAV FLAC OGG AAC M4A WMA OPUS AIFF MID

Video: MP4 AVI MOV MKV WEBM WMV FLV TS 3GP MPEG

Archives: ZIP TAR GZ 7Z RAR ISO APK JAR DEB RPM EPUB CAB CBZ

Executables: EXE DLL SYS OCX SCR MSI ELF SO COM CPL

Scripts: SH PS1 BAT CMD PY JS PHP RB LUA VBS HTA WSF PL

Data & Web: HTML XML JSON YAML CSV SQLITE DB TTF OTF WOFF PEM CRT KEY

Network: PCAP CAP PCAPNG

🔬

Initialize Scan Target

Drop file or click to load · Max 50 MB
All file types supported except PDFs and Office documents — use their dedicated scanners

24 engines · zero retention · offline AI · real Windows VM · MITRE ATT&CK

🕸 Full Scan Graph

Visualize all scan history, malware clusters, and C2 infrastructure overlap across every file scanned.
