Agent skill supply-chain testing

Use this workflow during authorized assessments of AI-agent platforms, internal skill registries, curated skill bundles, and marketplace scanner pipelines. It turns the Trail of Bits June 2026 skill-distribution research into a reusable validation path without publishing ready-to-run malicious payloads.

Operator value

Agent skills are a hybrid supply-chain artifact: natural-language instructions, helper scripts, packaged data, and sometimes compiled or archived content. A scanner's pass/fail verdict alone does not prove that a skill is safe or that a marketplace has meaningful review coverage.

Durable test targets:

Marketplace or registry upload review.
Skill ZIP import APIs.
Git-repository-based skill ingestion.
Agent harnesses that execute helper scripts referenced from SKILL.md.
Scanner integrations that combine static patterns, LLM review, package policy, or third-party verdicts.

Inputs

Collect these before testing:

Written authorization for skill upload/import tests.
A disposable test organization, workspace, and agent identity.
The skill packaging rules: allowed file extensions, size limits, symlink handling, archive nesting, and executable-bit handling.
The scanner decision model: block/warn/allow, where verdicts appear, and whether maintainers can override them.
Network and filesystem egress controls for the agent runtime.
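
One way to ground the packaging-rule questions above is to inventory a test archive locally before submitting it, then compare that inventory with what the platform reports after unpacking. The following is a minimal sketch using only the Python standard library. The function name `inventory` and the suffix list used to flag nested archives are illustrative assumptions, and symlink and executable-bit detection only works when the archive records Unix mode bits.

```python
import io
import stat
import zipfile
from pathlib import PurePosixPath

# Assumed suffix list for spotting nested archives by name only (heuristic).
ARCHIVE_SUFFIXES = {".zip", ".tar", ".gz", ".tgz", ".7z"}


def inventory(archive_bytes: bytes) -> list[dict]:
    """List each ZIP entry with the attributes packaging rules usually key on."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for info in zf.infolist():
            # Upper 16 bits hold the Unix mode when the archiver recorded one.
            mode = info.external_attr >> 16
            suffix = PurePosixPath(info.filename).suffix.lower()
            rows.append({
                "path": info.filename,
                "extension": suffix,
                "size": info.file_size,
                "is_symlink": stat.S_ISLNK(mode),
                "is_executable": bool(mode & 0o111),
                "is_nested_archive": suffix in ARCHIVE_SUFFIXES,
            })
    return rows
```

Diffing this inventory against the platform's unpacked tree shows which entries were dropped, renamed, or normalized during import.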

Containment

Use inert canary strings and local-only callbacks in lab infrastructure. Do not include credential theft, destructive commands, persistence, or external exfiltration in marketplace submissions.

Test matrix

Boundary	What to vary	Evidence to capture
Prompt review	Long padding before risky instructions; instructions hidden in non-obvious sections; conflicting benign summaries	Scanner transcript, truncation indicators, final verdict
File-type policy	Plain text, Markdown, JSON/YAML config, office documents, bytecode, archives, nested archives	Accepted/rejected file list and unpacked artifact tree
Helper execution	`SKILL.md` references to scripts, shell snippets, package managers, interpreters, or generated commands	Agent plan, command preview, execution log
Indirection	Instructions stored outside `SKILL.md`, referenced by filename, config key, or generated task	Whether scanner follows the reference chain
Binary/compiled content	Precompiled bytecode, minified bundles, wasm, native extensions, or generated lockfiles	Whether scanner decompiles, hashes, or ignores content
UI review	Very large files, long scroll regions, collapsed sections, misleading filenames	Reviewer-visible screenshot and raw source diff
Verdict consistency	Re-submit equivalent artifacts with renamed files, reordered content, or harmless formatting changes	Per-submission verdict deltas
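
The verdict-consistency row can be tracked with a small comparison helper. This is a sketch under assumptions: the tuple layout `(case_id, variant, verdict)` and the verdict strings are hypothetical, and real platforms will expose verdicts in their own format.

```python
from collections import defaultdict


def verdict_deltas(submissions):
    """Return only the test cases whose equivalent variants got different verdicts.

    `submissions` is an iterable of (case_id, variant, verdict) tuples, where
    every variant of a case is meant to be semantically equivalent.
    """
    by_case = defaultdict(dict)
    for case_id, variant, verdict in submissions:
        by_case[case_id][variant] = verdict
    return {
        case_id: variants
        for case_id, variants in by_case.items()
        if len(set(variants.values())) > 1
    }


# Hypothetical results: the renamed variant of case-002 changed the verdict.
results = [
    ("case-001", "original", "allow"),
    ("case-001", "reordered", "allow"),
    ("case-002", "original", "warn"),
    ("case-002", "renamed", "allow"),
]
inconsistent = verdict_deltas(results)
```

Any case that appears in the result is a candidate finding, since a formatting-only change should not move the verdict.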

Workflow

Mirror the ingestion path. Test the exact route real users use: direct ZIP upload, Git repository import, marketplace submission, or internal registry publish.
Start with a benign control skill. Confirm that a minimal skill imports, scans, and runs as expected. Save the raw artifact, scanner output, and agent execution trace.
Map package-policy enforcement. Submit harmless artifacts that vary extension, size, nesting, and executable bits. The goal is to learn what reaches the scanner and what reaches the agent runtime.
Probe scanner visibility. Place a unique inert canary in each location (SKILL.md, helper script, config file, document XML, compiled artifact metadata) and verify which canaries appear in scanner findings or reviewer UI.
Probe indirection handling. Reference secondary files from SKILL.md and record whether scanners follow the chain or only review top-level Markdown.
Probe truncation and parser gaps. Use large but harmless padding, deeply nested structures, or verbose generated files to determine whether scanner summaries omit tail content.
Exercise runtime boundaries in a sandbox. If execution is permitted, make helper scripts print a local canary and current working directory only. Capture whether the agent asks for approval, previews commands, or runs automatically.
Report scanner bypasses as review-coverage failures. The finding is stronger when it shows a scanner verdict mismatch: the raw artifact contains a clearly labeled inert policy violation, but the platform marks it safe or hides the relevant content from reviewers.
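
The scanner-visibility and truncation steps above can be sketched as a builder that places one unique inert canary per file and, optionally, harmless padding ahead of the `SKILL.md` canary. The file layout, the `build_canary_skill` name, and the per-location case-ID suffixes are assumptions for illustration. A real platform may require front matter or a different directory structure, and the helper script here contains only a comment.

```python
import io
import json
import zipfile

CANARY_PREFIX = "SKILLZ_CANARY_DO_NOT_EXECUTE_"


def build_canary_skill(case_id: str, padding_lines: int = 0) -> tuple[bytes, dict[str, str]]:
    """Return (zip_bytes, placement); placement maps each path to its canary."""
    placement = {
        path: f"{CANARY_PREFIX}{case_id}-{slot}"
        for slot, path in [
            ("skillmd", "SKILL.md"),
            ("helper", "scripts/helper.py"),
            ("config", "config/settings.json"),
        ]
    }
    # Harmless filler placed before the SKILL.md canary to test tail truncation.
    padding = "".join(f"Filler line {i}.\n" for i in range(padding_lines))
    files = {
        "SKILL.md": f"# Control skill\n\n{padding}Inert marker: {placement['SKILL.md']}\n",
        # The helper is a comment only, so nothing runs if the agent executes it.
        "scripts/helper.py": f"# Inert marker: {placement['scripts/helper.py']}\n",
        "config/settings.json": json.dumps({"marker": placement["config/settings.json"]}),
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, content in files.items():
            zf.writestr(path, content)
    return buffer.getvalue(), placement
```

Keeping the placement map next to the artifact makes it straightforward to check later which canaries surfaced in scanner findings or the reviewer UI.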

Safe canary patterns

Prefer canaries that prove reachability without collecting secrets:

SKILLZ_CANARY_DO_NOT_EXECUTE_<case-id>
SKILLZ_POLICY_VIOLATION_MARKER_<case-id>
SKILLZ_LOCAL_RUNTIME_MARKER_<case-id>

For runtime tests, use local-only commands such as printing the marker, interpreter version, and working directory. Avoid environment dumps, token paths, outbound HTTP, shell reverse connections, or destructive filesystem writes.
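
A runtime helper along these lines might look like the sketch below. The case ID `case-001` is a placeholder, and the script deliberately reads nothing beyond the interpreter version and working directory.

```python
import os
import platform

# Placeholder case ID; substitute the ID assigned to the test case.
MARKER = "SKILLZ_LOCAL_RUNTIME_MARKER_case-001"


def runtime_report(marker: str = MARKER) -> str:
    """Build the three-line local report: marker, interpreter version, cwd."""
    return "\n".join([
        f"marker={marker}",
        f"python={platform.python_version()}",
        f"cwd={os.getcwd()}",
    ])


if __name__ == "__main__":
    # Prints to stdout only: no network, no environment dump, no file writes.
    print(runtime_report())
```

If the marker line appears in the agent's execution log, the helper ran; whether an approval prompt or command preview preceded it is the evidence to record.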

Reporting checklist

Include:

Artifact hash and submission timestamp.
Upload/import path and scanner name/version if visible.
Allowed file tree after platform unpacking or normalization.
Exact verdict text and screenshots of reviewer UI.
Canary placement map and whether each canary appeared in scanner output.
Runtime evidence limited to benign local markers.
Clear impact statement, for example: users can install or run a skill containing scanner-invisible instructions or code-like content.
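
Several checklist items (artifact hash, timestamp, canary coverage) can be captured in one record per submission. The sketch below assumes the scanner output is available as plain text and that a simple substring match is enough to decide whether a canary surfaced; the field names and the `evidence_record` function are illustrative.

```python
import hashlib
from datetime import datetime, timezone


def evidence_record(artifact, placement, scanner_output, submitted_at=None):
    """Summarize one submission: hash, UTC timestamp, per-path canary coverage.

    `placement` maps each file path to the canary placed in it. Pass
    `submitted_at` (an aware datetime) if the submission time was captured
    earlier; otherwise the current UTC time is used.
    """
    when = submitted_at or datetime.now(timezone.utc)
    return {
        "sha256": hashlib.sha256(artifact).hexdigest(),
        "submitted_at": when.isoformat(timespec="seconds"),
        # True means the scanner output mentioned the canary for that path.
        "canary_coverage": {
            path: canary in scanner_output for path, canary in placement.items()
        },
    }
```

Paths whose coverage value is false are the locations the scanner did not report on, which is the core of a review-coverage finding.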

Sources

Trail of Bits, "The sorry state of skill distribution" (June 3, 2026): https://blog.trailofbits.com/2026/06/03/the-sorry-state-of-skill-distribution/
Trail of Bits Blog RSS: https://blog.trailofbits.com/feed/