Signature / binary composition analysis.
Identify open-source components where there is no manifest to read — inside compiled binaries, archives, container layers, and renamed or vendored source — by content fingerprint, then correlate them with the same vulnerability + license stack as the rest of a scan.
Why
Manifest-based SCA only sees what a lockfile declares. Vendored source trees, statically linked C/C++, repackaged JARs, and stripped binaries carry no manifest, yet they can still contain known-vulnerable OSS. Signature analysis recovers those components from content alone.
How it works
- File SHA-256 — exact match of a file against a known component release (confidence 1.0).
- Structural codeprint — a normalised, comment-stripped token-shingle hash that survives reformatting and whitespace changes.
- Fuzzy snippet shingles — Jaccard overlap of feature hashes, so a modified copy of a component still matches partially.
- Embedded metadata — pom.properties (JARs), .nuspec (nupkg), wheel METADATA, and binary format hints (ELF/PE/Mach-O, Go build IDs, OpenSSL strings).
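The structural codeprint and fuzzy shingle layers can be sketched in a few lines of Node. This is a minimal illustration, not dpndncY's actual implementation: the function names (`tokenize`, `shingles`, `codeprint`, `jaccard`), the window size of 4 tokens, and the C-style comment stripping are all assumptions made for the example.

```javascript
'use strict';
const crypto = require('node:crypto');

// Strip comments and split the source into tokens.
// Assumption: C-style comments only; a real normaliser would be language-aware
// and would not touch comment markers that appear inside string literals.
function tokenize(source) {
  const noComments = source
    .replace(/\/\*[\s\S]*?\*\//g, ' ') // block comments
    .replace(/\/\/[^\n]*/g, ' ');      // line comments
  // Identifiers and numbers become one token each; any other
  // non-space character is a token of its own.
  return noComments.match(/[A-Za-z_$][\w$]*|\d+|[^\s\w]/g) || [];
}

// Hash every window of k consecutive tokens (a "shingle").
function shingles(tokens, k = 4) {
  const out = new Set();
  for (let i = 0; i + k <= tokens.length; i++) {
    const win = tokens.slice(i, i + k).join(' ');
    out.add(crypto.createHash('sha256').update(win).digest('hex').slice(0, 16));
  }
  return out;
}

// Codeprint: a single hash over the sorted shingle set, so whitespace,
// line breaks, and comments do not change the result.
function codeprint(source) {
  const sorted = [...shingles(tokenize(source))].sort().join('');
  return crypto.createHash('sha256').update(sorted).digest('hex');
}

// Jaccard overlap |A ∩ B| / |A ∪ B| of two shingle sets.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  for (const h of a) if (b.has(h)) shared++;
  return shared / (a.size + b.size - shared);
}

const original = 'int add(int a, int b) { return a + b; } // sum';
const reformatted = 'int add(int a,\n        int b)\n{\n  /* sum */ return a + b;\n}';
const modified = 'int add(int a, int b) { log(a); return a + b; }';

// Reformatting leaves the codeprint unchanged.
console.log(codeprint(original) === codeprint(reformatted)); // true

// A modified copy still overlaps partially (score strictly between 0 and 1).
const score = jaccard(shingles(tokenize(original)), shingles(tokenize(modified)));
console.log(score.toFixed(2));
```

In this sketch the exact-match layer would simply be a SHA-256 over the raw file bytes. The codeprint tolerates formatting changes only, while the Jaccard score degrades gradually as a copy drifts from the original.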
Matches resolve to a package URL (purl), then flow into the normal OSV/NVD/GHSA correlation, EPSS/KEV/ExploitDB prioritisation, and license-obligation stack.
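As an illustration of the embedded-metadata path, the coordinates in a JAR's pom.properties map directly onto a Maven purl of the form `pkg:maven/<groupId>/<artifactId>@<version>`. The helper names below (`parseProperties`, `mavenPurl`) are hypothetical and the parser handles only simple `key=value` lines; it is a sketch of the idea, not the product's resolver.

```javascript
'use strict';

// Parse the key=value lines of a pom.properties file, skipping '#' comments.
// Assumption: no line continuations or escaped characters.
function parseProperties(text) {
  const props = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue;
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue;
    props[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return props;
}

// Build a Maven purl, or return null when a coordinate is missing.
function mavenPurl(props) {
  const { groupId, artifactId, version } = props;
  if (!groupId || !artifactId || !version) return null;
  const enc = encodeURIComponent;
  return `pkg:maven/${enc(groupId)}/${enc(artifactId)}@${enc(version)}`;
}

const pomProperties = [
  '#Generated by Maven',
  'groupId=org.apache.commons',
  'artifactId=commons-lang3',
  'version=3.12.0',
].join('\n');

console.log(mavenPurl(parseProperties(pomProperties)));
// pkg:maven/org.apache.commons/commons-lang3@3.12.0
```

Once a match is expressed as a purl, it can be looked up in vulnerability and license data like any manifest-declared dependency.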
Scan a directory or artifact
# fingerprint + match an extracted tree / unpacked artifact against the corpus
curl -s -X POST https://your-host/api/artifacts/match \
-H "Authorization: Bearer $TOKEN" \
-F "path=@./build/output"
The component corpus
Matching is performed against a signature corpus — a fingerprint database of OSS releases. dpndncY ships a reproducible corpus builder; point it at a set of release archives (or your own approved-component mirror) to extend it.
node scripts/build-signature-corpus.js --from /path/to/releases --limit 500
/data: build and extend it offline, no registry access required. See Air-gapped install.
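Conceptually, the exact-match layer of a signature corpus is an index from file hash to the release it belongs to. The sketch below uses a hypothetical in-memory `Map` and invented helper names (`buildCorpus`, `matchFile`); the real corpus format and what `build-signature-corpus.js` writes are not described here, so treat this purely as an illustration of the lookup.

```javascript
'use strict';
const crypto = require('node:crypto');

const sha256 = (buf) => crypto.createHash('sha256').update(buf).digest('hex');

// Hypothetical corpus: file SHA-256 -> purl of the release that contains it.
// `releases` is an array of { purl, files }, where files are Buffers or strings.
function buildCorpus(releases) {
  const index = new Map();
  for (const { purl, files } of releases) {
    for (const content of files) index.set(sha256(content), purl);
  }
  return index;
}

// Exact-match lookup: a hit carries confidence 1.0, a miss returns null.
function matchFile(index, content) {
  const purl = index.get(sha256(content));
  return purl ? { purl, confidence: 1.0 } : null;
}

// Example data is made up for the sketch.
const corpus = buildCorpus([
  { purl: 'pkg:generic/examplelib@1.2.3', files: ['void example(void) {}\n'] },
]);

console.log(matchFile(corpus, 'void example(void) {}\n'));
console.log(matchFile(corpus, 'void example(void) { /* patched */ }\n')); // null
```

A file that differs by even one byte misses this index, which is where the codeprint and fuzzy shingle layers would take over.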