---
title: "Auditing Your Dependencies"
date: 2026-02-17
description: "A practical guide to building a security pipeline for third-party code, including LLM-assisted review."
tags: ["security","supply-chain","llms","devops"]
readingTime: "8 min read"
url: https://alexmoening.com/dev-thoughts/auditing-your-dependencies.html
markdownUrl: https://alexmoening.com/dev-thoughts/auditing-your-dependencies.md
---

# Auditing Your Dependencies

Self-hosting is only half the solution. You also need to verify what you're hosting.

This is Part 2 of a series on supply chain security. [Part 1](supply-chain-roulette.html) covers why CDN embeds are risky and tells the story of a real supply chain attack I witnessed at Akamai.

### Implementing a Security Pipeline

Five layers of verification before code reaches production.

#### 1. Fetch with Version Pinning

```bash
#!/bin/bash
set -euo pipefail

# Pin versions explicitly - no @latest in production
P5_VERSION="1.7.0"
mkdir -p public/lib
# -f makes curl fail on HTTP errors instead of saving an error page as the library
curl -fsSL "https://cdnjs.cloudflare.com/ajax/libs/p5.js/${P5_VERSION}/p5.min.js" \
    -o "public/lib/p5.min.js"

# Generate SRI hash for integrity verification
SRI_HASH=$(openssl dgst -sha384 -binary "public/lib/p5.min.js" | openssl base64 -A)
echo "sha384-${SRI_HASH}" > "public/lib/p5.min.js.sri"
```
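A stored SRI value is only useful if something re-checks it later. The following is a minimal sketch of that check, assuming the `sha384-<base64>` format written by the script above; `verify_sri` is a hypothetical helper name, not part of any existing tool.

```bash
# Sketch: re-check a file against a stored SRI value ("sha384-<base64>").
# verify_sri is a hypothetical helper name used for illustration.
verify_sri() {
    local file="$1" sri_file="$2" expected actual
    # Strip the trailing newline that echo wrote into the .sri file
    expected=$(tr -d '[:space:]' < "$sri_file")
    actual="sha384-$(openssl dgst -sha384 -binary "$file" | openssl base64 -A)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file matches $sri_file"
    else
        echo "MISMATCH: $file" >&2
        echo "  expected: $expected" >&2
        echo "  actual:   $actual" >&2
        return 1
    fi
}
```

Usage might look like `verify_sri public/lib/p5.min.js public/lib/p5.min.js.sri || exit 1`. The same value goes into the `integrity` attribute of the script tag, so the browser repeats the check at load time.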

#### 2. Baseline Comparison

Store a SHA256 hash of each library when you first audit it. On subsequent fetches, compare:

```bash
# First audit: create baseline
mkdir -p .security-baselines
shasum -a 256 p5.min.js > .security-baselines/p5.min.js.sha256

# Later fetches: detect changes
current_hash=$(shasum -a 256 p5.min.js | awk '{print $1}')
# The baseline file holds "<hash>  <filename>", so keep only the hash field
baseline_hash=$(awk '{print $1}' .security-baselines/p5.min.js.sha256)

if [ "$current_hash" != "$baseline_hash" ]; then
    echo "WARNING: Library content has changed!"
    # Require manual review before proceeding
    exit 1
fi
```
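With more than one library, the same comparison can run in a loop. This is a sketch under an assumed layout: each baseline lives at `<baseline_dir>/<filename>.sha256` with the hash as its first field. The function names `sha256_of` and `check_baselines` are hypothetical.

```bash
# Sketch: compare every .js file in a directory against its stored baseline.
# Assumed layout: <baseline_dir>/<filename>.sha256, hash in the first field.
sha256_of() {
    # sha256sum is common on Linux; shasum ships with macOS
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        shasum -a 256 "$1" | awk '{print $1}'
    fi
}

check_baselines() {
    local lib_dir="$1" baseline_dir="$2" status=0 file name baseline
    for file in "$lib_dir"/*.js; do
        [ -e "$file" ] || continue   # the directory had no .js files
        name=$(basename "$file")
        baseline="$baseline_dir/$name.sha256"
        if [ ! -f "$baseline" ]; then
            echo "NEW: $name has no baseline; audit it, then record one"
            status=1
        elif [ "$(sha256_of "$file")" != "$(awk '{print $1; exit}' "$baseline")" ]; then
            echo "CHANGED: $name differs from its audited baseline"
            status=1
        else
            echo "OK: $name"
        fi
    done
    return "$status"
}
```

Treating a missing baseline as a failure is a deliberate choice in this sketch: a library nobody has audited should not pass just because there is nothing to compare it with.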

#### 3. Static Pattern Scanning

```bash
library_file="public/lib/p5.min.js"

# High-risk patterns
HIGH_RISK_PATTERNS=(
    'document\.write\s*\('              # Injects raw markup into the page
    'eval\s*\([^)]*\$'                  # eval with dynamic input
    'new\s+Function\s*\([^)]*\+'        # Function constructor
    'document\.createElement.*script.*src\s*='  # Dynamic script injection
)

for pattern in "${HIGH_RISK_PATTERNS[@]}"; do
    if grep -qE "$pattern" "$library_file"; then
        echo "HIGH RISK: Found pattern '$pattern'"
    fi
done
```
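The loop above prints a warning but still exits successfully, and on a minified one-line file it cannot show where the match is. One possible refinement, sketched here, prints the matched text itself and returns non-zero so a calling script can stop. `scan_library` is a hypothetical helper; the patterns are passed in as arguments.

```bash
# Sketch: show what matched and return non-zero when anything did.
# scan_library is a hypothetical helper; usage: scan_library FILE PATTERN...
scan_library() {
    local file="$1" pattern matches hits=0
    shift
    for pattern in "$@"; do
        # -o prints only the matched text, which stays readable for minified files
        if matches=$(grep -oE -e "$pattern" "$file"); then
            echo "HIGH RISK: pattern '$pattern' matched in $file:"
            printf '%s\n' "$matches" | sort -u | sed 's/^/    /'
            hits=$((hits + 1))
        fi
    done
    # Succeed only if no pattern matched
    [ "$hits" -eq 0 ]
}
```

With the array from the previous block this could be called as `scan_library "$library_file" "${HIGH_RISK_PATTERNS[@]}" || exit 1`. Expect matches in legitimate libraries too; the output is a prompt for review, not a verdict.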

#### 4. Domain Extraction

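```bash
# Extract URLs and check against allowlist ($file is the library under audit)
ALLOWED_DOMAINS=("github.com" "jsdelivr.net")

is_allowed() {  # true if the URL's host is an allowed domain or a subdomain of one
    local host="${1#*://}" domain
    for domain in "${ALLOWED_DOMAINS[@]}"; do
        [[ "$host" == "$domain" || "$host" == *".$domain" ]] && return 0
    done
    return 1
}

urls=$(grep -oE 'https?://[a-zA-Z0-9][a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' "$file" | sort -u)
for url in $urls; do
    if ! is_allowed "$url"; then
        echo "WARNING: Unknown domain in library: $url"
    fi
done
```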

#### 5. Deploy Gate

```bash
# In deploy.sh - fail fast if audit fails
echo "Running security audit..."
./scripts/security-audit-libs.sh || exit 1

# Only proceed if audit passes
aws s3 sync public/ s3://mybucket/
```
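The contents of `security-audit-libs.sh` are not shown here, so what follows is only a guess at its overall shape: run every check, report each result, and exit non-zero if any check failed. `run_checks` and the check names in the usage note are hypothetical.

```bash
# Sketch of an audit runner: run every check, then fail if any of them failed.
# run_checks is a hypothetical helper; each argument is a command or function name.
run_checks() {
    local failed=0 check
    for check in "$@"; do
        if "$check"; then
            echo "PASS: $check"
        else
            echo "FAIL: $check" >&2
            failed=$((failed + 1))
        fi
    done
    echo "$failed of $# checks failed"
    [ "$failed" -eq 0 ]
}
```

A script could end with something like `run_checks check_hashes check_patterns check_domains`, where each name is a function wrapping one of the earlier layers. Running all checks before failing, rather than stopping at the first, lets a reviewer see every problem in a single run.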

### The LLM Experiment

LLMs can help spot suspicious patterns, but don't trust the hype. Real-world false positive rates are high.

There's been a lot of excitement about using LLMs to detect malicious code. Some papers claim 99% precision. **Be skeptical.** A 2025 study on project-scale vulnerability detection found real-world false positive rates of **63-97%** depending on the model. [1] A separate ICSE 2025 paper found that many benchmark datasets have poor label accuracy, meaning prior claims were evaluated on flawed data. [2]

| Claim | Reality |
|-------|---------|
| "99% precision" | On curated benchmarks only; production FP rates much higher |
| "LLMs understand obfuscation" | Partially true, but adversarial evasion bypasses most detection |
| "Drop-in replacement for SAST" | No. High cost (300K-300M tokens/project), inconsistent outputs |

#### Where LLMs Actually Help

Despite the limitations, LLMs add value as *one layer* in a defense-in-depth strategy:

| Use Case | Why It Works |
|----------|--------------|
| Explaining suspicious code | LLMs describe what obfuscated code does in plain English |
| Triaging SAST findings | Helps filter false positives from static analysis |
| Identifying data flows | Traces data from sources to sinks |

#### A Practical Prompt

This prompt follows [Anthropic's best practices](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/claude-4-best-practices): explicit instructions, XML structure, and uncertainty handling.

```xml
<role>
You are a security analyst reviewing third-party JavaScript for supply chain risks.
Your job is to help a human reviewer understand the code faster, not to render verdicts.
</role>

<task>
Analyze the JavaScript code provided. Do NOT assign a malware verdict or confidence score.
Instead, describe what the code does so a human can make an informed decision.
</task>

<analysis_structure>
1. BEHAVIOR: What does this code do? (plain English, 2-3 sentences)
2. DATA SOURCES: What does it read? (files, env vars, DOM, cookies)
3. DATA SINKS: Where does data go? (network, file writes, exec, eval)
4. SUSPICIOUS PATTERNS: What looks unusual?
5. COMPARISON: What would a malicious version look like?
</analysis_structure>

<uncertainty_handling>
If you cannot determine what code does, say "Unable to determine" and explain why.
Do not guess or speculate.
</uncertainty_handling>
```
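To use the prompt, it has to be combined with the code under review. A rough sketch follows, assuming the prompt is saved to a file; the helper name `build_review_prompt`, the `<code>` wrapper tag, and the 200,000-byte cap are all assumptions made for illustration. Sending the result to a model is left out, since that depends on the provider and client you use.

```bash
# Sketch: combine a saved prompt file with the JavaScript under review.
# build_review_prompt, the <code> wrapper, and the default cap are assumptions.
build_review_prompt() {
    local prompt_file="$1" js_file="$2" max_bytes="${3:-200000}"
    cat "$prompt_file"
    printf '\n<code file="%s">\n' "$(basename "$js_file")"
    # Cap the input size; larger libraries would need to be split and reviewed in parts
    head -c "$max_bytes" "$js_file"
    printf '\n</code>\n'
}
```

For example, `build_review_prompt prompt.txt public/lib/p5.min.js > review-input.txt` would write the combined text to a file for whichever client you use. Truncation silently hides the rest of the file from the model, so a real pipeline should chunk large libraries rather than rely on the cap.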

#### Defense in Depth

LLMs are *one layer*, not the whole stack:

| Layer | Tool | Catches |
|-------|------|---------|
| 1 | `Hash verification` | Any change from audited baseline |
| 2 | `SAST / pattern scan` | Known bad patterns, CVEs |
| 3 | `LLM triage` | Explains suspicious code in plain English |
| 4 | `Human review` | Final judgment on context and intent |

Skip a layer, and you're back to playing roulette.

### PCI DSS 4.0 Implications

If you handle payments, script integrity is now regulation.

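PCI DSS 4.0 requirements 6.4.3 and 11.6.1, mandatory since March 31, 2025, call for an inventory of payment page scripts (with authorization and integrity assurance for each) and a change- and tamper-detection mechanism for payment pages. Self-hosting with integrity verification helps address both requirements.

### Getting Started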

Six steps from CDN-dependent to supply-chain secure.

| Step | Action |
|------|--------|
| 1 | Inventory your embeds: search codebase for external script tags |
| 2 | Download and pin versions: no `@latest` in production |
| 3 | Generate SRI hashes for integrity verification |
| 4 | Create SHA256 baselines for change detection |
| 5 | Add audit to CI/CD: verify before every deploy |
| 6 | Set quarterly reminders for security patch reviews |
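Step 1 can start with a simple search. The sketch below lists script tags whose `src` points at another origin (`http://`, `https://`, or protocol-relative `//`). `find_external_scripts` is a hypothetical helper, and searching only `.html` and `.htm` files is an assumption; templates and JavaScript that builds script tags at runtime need their own search.

```bash
# Sketch: list <script> tags that load from an external origin.
# find_external_scripts is a hypothetical helper; the extensions are an assumption.
find_external_scripts() {
    local root="${1:-.}"
    # Output format is file:line:matched-tag-prefix, one line per external embed
    find "$root" -type f \( -name '*.html' -o -name '*.htm' \) \
        -exec grep -HnoiE "<script[^>]*[[:space:]]src=[\"']?(https?:)?//[^\"' >]+" {} +
}
```

Run it as `find_external_scripts public/` and treat the output as the starting inventory. The exit status is not meaningful here, because `find` may report failure when `grep` matches nothing in some of the files.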

### Resources

#### Implementation

| Resource | Topic |
|----------|-------|
| MDN | Subresource Integrity |
| OWASP | Secrets Management Cheat Sheet |
| GitHub Docs | OIDC Federation with AWS |
| PCI SSC | PCI DSS 4.0 Requirements |

#### LLM Research

| Ref | Paper | Key Finding |
|-----|-------|-------------|
| [1] | LLM-based Vulnerability Detection at Project Scale | 63-97% FP rates in production |
| [2] | PrimeVul (ICSE 2025) | Benchmark datasets have poor label accuracy |
| [3] | CoTDeceptor | Adversarial evasion bypasses 14/15 categories |

[Back to Part 1: Supply Chain Roulette](supply-chain-roulette.html)

*Copyright 2026 Alex Moening. Opinions expressed are my own.*