brightdata/supply-chain-sankey

Bright Data      🤝      AWS

Supply Chain Sankey

AI-Powered Supply Chain Discovery & Visualization




An autonomous AI agent that discovers real-world supply chain relationships for any company using Bright Data web intelligence and Amazon Bedrock LLMs, then renders the results as an interactive Sankey diagram — all deployed serverlessly on AWS Bedrock AgentCore.


How It Works · Architecture · Deploy · Project Structure




How It Works

[Image: pipeline overview]

The agent runs a 7-step pipeline to go from a company name to a fully verified supply chain graph:

| Step | Node | What Happens |
|------|------|--------------|
| 1 | Plan Queries | LLM generates 6–7 targeted search queries for the company (upstream suppliers, downstream distributors) |
| 2 | Search | Parallel web searches via Bright Data; discovers ~50+ relevant URLs |
| 3 | Scrape | Parallel URL scraping via Bright Data Web Unlocker; bypasses anti-bot measures and extracts content |
| 4 | Reflect | LLM compresses each page into compact, high-signal evidence (~900 chars) with signal scoring |
| 5 | Parse Evidence | LLM classifies evidence and extracts counterparties, confidence scores, and relationship types |
| 6 | Build Edges | Deterministically constructs supplier/distributor edges from parsed counterparties |
| 7 | Build Sankey | Generates the final Sankey JSON: nodes with signed tiers, links with confidence, full metadata |
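Step 6 is described as deterministic, so it can be illustrated without any LLM call. The following is a minimal sketch, not the repository's implementation: the `Counterparty` shape, the `build_edges` name, and the rule of keeping the highest confidence when the same pair appears twice are all assumptions made for illustration. The output fields follow the link format shown under Sankey Output Format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counterparty:
    """One organization extracted from evidence (hypothetical shape)."""
    name: str
    role: str          # "supplier" or "distributor"
    confidence: float  # 0.0 to 1.0
    source_url: str


def build_edges(root: str, counterparties: list[Counterparty]) -> list[dict]:
    """Turn parsed counterparties into directed edges, merging duplicates."""
    merged: dict[tuple[str, str], dict] = {}
    for cp in counterparties:
        if cp.role == "supplier":
            # Suppliers feed into the root company (tier -1, upstream).
            source, target, tier, direction = cp.name, root, -1, "upstream"
        elif cp.role == "distributor":
            # Distributors receive from the root company (tier +1, downstream).
            source, target, tier, direction = root, cp.name, 1, "downstream"
        else:
            continue  # unknown roles produce no edge
        edge = merged.setdefault((source, target), {
            "source": source, "target": target, "value": 1, "tier": tier,
            "direction": direction, "relationship_type": cp.role,
            "confidence": 0.0, "evidence_urls": [],
        })
        # Assumed merge rule: keep the strongest confidence, collect all URLs.
        edge["confidence"] = max(edge["confidence"], cp.confidence)
        if cp.source_url not in edge["evidence_urls"]:
            edge["evidence_urls"].append(cp.source_url)
    return list(merged.values())
```

Because the function uses no randomness and no model calls, the same parsed evidence always yields the same edges, which is what makes this step easy to test in isolation.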

Key Capabilities

| Feature | Description |
|---------|-------------|
| Bidirectional Discovery | Discovers both upstream (suppliers) and downstream (distributors) in parallel |
| Evidence-Backed Edges | Every relationship links back to source URLs with confidence scores |
| Signal Scoring | LLM-based quality scoring filters noise before expensive classification |
| Fan-Out Parallelism | LangGraph Send() pattern runs searches and scrapes concurrently |
| Automatic Retry | Exponential backoff for rate limits, fallback to sequential on token overflow |
| Node Expansion | Click any leaf node to expand deeper tiers of the supply chain |
| Entity Filtering | AI classifier removes generic labels ("Suppliers", "Partners") and keeps real organization names |
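The Automatic Retry row mentions exponential backoff for rate limits. A minimal sketch of that pattern is shown below. It is an illustration only: `RateLimitError`, `with_backoff`, and the delay parameters are hypothetical names and values, not taken from the repository, and the real agent may detect throttling differently.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Placeholder for whatever throttling error the real client raises."""


def with_backoff(call: Callable[[], T], max_attempts: int = 5,
                 base_delay: float = 1.0, max_delay: float = 30.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry `call` on RateLimitError, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Waits grow 1s, 2s, 4s, ... and are capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter keeps parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the waits can be replaced in tests. In normal use it defaults to `time.sleep`.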


Architecture

[Image: architecture diagram]

LangGraph Pipeline Detail

[Image: LangGraph pipeline diagram]


Prerequisites

| Requirement | Details |
|-------------|---------|
| AWS Account | With Bedrock model access enabled for us.amazon.nova-2-lite-v1:0 |
| AWS CLI | Configured with credentials (aws configure) |
| AgentCore CLI | pip install bedrock-agentcore-starter-toolkit |
| AWS SAM CLI | Install the SAM CLI |
| Bright Data Account | With a Web Unlocker zone (sign up at Bright Data) |
| Node.js 18+ | For the Next.js frontend |
| Python 3.11 | For the backend agent |
| Supabase Project (optional) | For server-side rate limiting (create a project at Supabase) |


Deployment

Deployment has three stages: AgentCore Runtime (the AI agent), Lambda (the proxy), and Frontend (the UI).


Stage 1 — Deploy the Agent to AgentCore Runtime

backend/  ──▶  AgentCore Runtime (serverless, auto-scaling)

# 1. Navigate to the backend directory
cd backend

# 2. Install dependencies locally (for development/testing)
pip install -r requirements.txt

# 3. Configure AgentCore (first time only)
#    This creates the IAM role, ECR repo, and .bedrock_agentcore.yaml
agentcore configure --entrypoint agentcore_app.py

# 4. Deploy to AgentCore Runtime with environment variables
agentcore deploy \
  --env BRIGHT_DATA_API_KEY=<your-bright-data-api-key> \
  --env BRIGHT_DATA_ZONE=web_unlocker1 \
  --env MODEL_ID=us.amazon.nova-2-lite-v1:0 \
  --env AWS_REGION=us-east-1

# 5. Verify deployment
agentcore status

# 6. Test the agent directly
agentcore invoke '{"company": "Tesla", "direction": "upstream"}'

Note: The deploy command packages your code, builds it via AWS CodeBuild, and creates a serverless endpoint. Copy the Agent Runtime ARN from the output — you'll need it in Stage 2.


Stage 2 — Deploy the Lambda Proxy

infra/template.yaml  ──▶  Lambda Function + Function URL

The Lambda function acts as an HTTP proxy between the frontend and AgentCore Runtime.

# 1. Navigate to the infra directory
cd infra

# 2. Build the SAM application
sam build

# 3. Deploy with guided prompts (first time)
sam deploy --guided

During the guided deployment, you'll be prompted for:

| Parameter | Value |
|-----------|-------|
| Stack Name | supplychain-sankey-lambda |
| Region | us-east-1 |
| AgentRuntimeArn | The ARN from Stage 1 (e.g., arn:aws:bedrock-agentcore:us-east-1:ACCOUNT_ID:runtime/AGENT_ID) |

# For subsequent deployments (after samconfig.toml exists)
sam build && sam deploy

Output: The stack outputs a Function URL — this is the HTTPS endpoint your frontend will call. Copy it for Stage 3.


What the SAM Template Creates

| Resource | Configuration |
|----------|---------------|
| Lambda Function | Python 3.11, ARM64, 256MB RAM, 300s timeout |
| Function URL | Public HTTPS, CORS enabled, POST only, streaming responses |
| IAM Policy | bedrock-agentcore:InvokeAgentRuntime on the specified ARN |

Stage 3 — Deploy the Frontend

frontend/  ──▶  Next.js App (Vercel, EC2, or any Node.js host)

# 1. Navigate to the frontend directory
cd frontend

# 2. Install dependencies
npm install

# 3. Create environment file from template
cp .env.example .env.local

# 4. Edit .env.local with your values:
#    LAMBDA_URL=<Function URL from Stage 2>
#    SUPABASE_URL=<your Supabase project URL>         (optional, for rate limiting)
#    SUPABASE_SERVICE_ROLE_KEY=<your service role key> (optional, for rate limiting)

# 5. Run in development mode
npm run dev

# 6. Or build for production
npm run build && npm start

Note: The LAMBDA_URL is server-side only and never exposed to the browser. The Next.js API route at /api/supply-chain acts as a proxy.


Supabase Rate Limiting (Optional)

If you want per-IP rate limiting, create a Supabase table:

CREATE TABLE rate_limits (
  id BIGSERIAL PRIMARY KEY,
  ip_address TEXT NOT NULL,
  request_type TEXT NOT NULL,      -- 'search' | 'expand_downstream'
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_rate_limits_ip ON rate_limits (ip_address);

Default limits: 1 search + 1 expansion per IP (configurable in lib/rate-limiter.ts).
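The limit check itself is simple counting over rows of the table above. The sketch below shows that logic in Python for illustration only. The actual limiter lives in lib/rate-limiter.ts and queries Supabase. The `LIMITS` mapping, the `is_allowed` name, and the absence of a time window are assumptions, since the README does not say whether limits reset.

```python
from collections import Counter

# Assumed defaults, mirroring "1 search + 1 expansion per IP".
LIMITS = {"search": 1, "expand_downstream": 1}


def is_allowed(rows: list[dict], ip_address: str, request_type: str) -> bool:
    """Return True if this IP is still under its limit for request_type.

    `rows` stands in for the contents of the rate_limits table; each row
    needs the ip_address and request_type columns.
    """
    limit = LIMITS.get(request_type)
    if limit is None:
        return False  # unknown request types are rejected
    used = Counter((r["ip_address"], r["request_type"]) for r in rows)
    return used[(ip_address, request_type)] < limit
```

Searches and expansions are counted separately, so an IP that has used its one search can still perform one expansion.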



Sankey Output Format

The agent returns a JSON structure optimized for D3 Sankey rendering:

{
  "sankey": {
    "nodes": [
      { "id": "Apple Inc.", "name": "Apple Inc.", "tier": 0, "evidence": [...] },
      { "id": "TSMC",       "name": "TSMC",       "tier": -1, "evidence": [...] },
      { "id": "Best Buy",   "name": "Best Buy",   "tier": 1,  "evidence": [...] }
    ],
    "links": [
      {
        "source": "TSMC",
        "target": "Apple Inc.",
        "value": 1,
        "tier": -1,
        "direction": "upstream",
        "relationship_type": "supplier",
        "status": "confirmed",         // confirmed | unconfirmed | rejected
        "confidence": 0.82,
        "evidence_urls": ["https://..."]
      }
    ],
    "metadata": {
      "total_nodes": 14,
      "total_links": 16,
      "upstream_nodes": 8,
      "downstream_nodes": 5,
      "confirmed_edges": 10,
      "unconfirmed_edges": 5,
      "rejected_edges": 1,
      "total_urls_discovered": 42,
      "high_grade_urls": 12
    }
  }
}

Tier semantics: tier < 0 = upstream suppliers, tier 0 = root company, tier > 0 = downstream distributors.

Edge status: confirmed (strong multi-source evidence), unconfirmed (plausible, needs corroboration), rejected (contradicted or unsupported).
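The tier and status rules above are enough to recompute most of the metadata block from the nodes and links. The sketch below is a hypothetical helper for consumers of the JSON (for example, as a sanity check on a response), not code from the repository. The URL counters are omitted because they cannot be derived from nodes and links alone.

```python
def summarize(sankey: dict) -> dict:
    """Recompute node and edge counts from a payload's "sankey" object."""
    nodes, links = sankey["nodes"], sankey["links"]
    by_status = {"confirmed": 0, "unconfirmed": 0, "rejected": 0}
    for link in links:
        # A status outside the three documented values raises KeyError.
        by_status[link["status"]] += 1
    return {
        "total_nodes": len(nodes),
        "total_links": len(links),
        # tier < 0 is upstream, tier > 0 is downstream, tier 0 is the root.
        "upstream_nodes": sum(1 for n in nodes if n["tier"] < 0),
        "downstream_nodes": sum(1 for n in nodes if n["tier"] > 0),
        "confirmed_edges": by_status["confirmed"],
        "unconfirmed_edges": by_status["unconfirmed"],
        "rejected_edges": by_status["rejected"],
    }
```

Applied to the sample metadata above, the numbers agree: 8 upstream nodes, 5 downstream nodes, and the root make 14 nodes, and 10 + 5 + 1 edges make 16 links.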



Environment Variables Reference

Backend (AgentCore Runtime)

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| BRIGHT_DATA_API_KEY | Yes | | Bright Data API token for search & scraping |
| BRIGHT_DATA_ZONE | No | web_unlocker1 | Bright Data Web Unlocker zone name |
| MODEL_ID | No | us.amazon.nova-2-lite-v1:0 | Amazon Bedrock model ID for LLM calls |
| AWS_REGION | No | us-east-1 | AWS region for Bedrock API calls |
| ENABLE_EDGE_VERIFICATION | No | 0 | Set to 1 to enable LLM edge verification step |
| MAX_SCRAPE_URLS | No | 10 | Maximum URLs to scrape per query batch |
| MIN_SIGNAL_SCORE | No | 0.05 | Minimum signal score to process evidence |
| SUPPLYCHAIN_LOG_LEVEL | No | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
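As an illustration of how these variables and defaults fit together, the sketch below reads them into a typed settings object. `BackendSettings` and `load_settings` are hypothetical names. The backend may parse its environment differently, but the variable names and defaults are the ones listed in the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class BackendSettings:
    bright_data_api_key: str
    bright_data_zone: str
    model_id: str
    aws_region: str
    enable_edge_verification: bool
    max_scrape_urls: int
    min_signal_score: float
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> BackendSettings:
    """Read backend settings from `env`, applying the documented defaults."""
    api_key = env.get("BRIGHT_DATA_API_KEY", "")
    if not api_key:
        # The only variable marked as required.
        raise ValueError("BRIGHT_DATA_API_KEY is required")
    return BackendSettings(
        bright_data_api_key=api_key,
        bright_data_zone=env.get("BRIGHT_DATA_ZONE", "web_unlocker1"),
        model_id=env.get("MODEL_ID", "us.amazon.nova-2-lite-v1:0"),
        aws_region=env.get("AWS_REGION", "us-east-1"),
        # "1" enables the optional verification step; anything else disables it.
        enable_edge_verification=env.get("ENABLE_EDGE_VERIFICATION", "0") == "1",
        max_scrape_urls=int(env.get("MAX_SCRAPE_URLS", "10")),
        min_signal_score=float(env.get("MIN_SIGNAL_SCORE", "0.05")),
        log_level=env.get("SUPPLYCHAIN_LOG_LEVEL", "INFO"),
    )
```

Passing a plain dictionary instead of `os.environ` makes it easy to check a configuration before deploying it.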

Frontend (Next.js)

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| LAMBDA_URL | Yes | | Lambda Function URL from Stage 2 deployment |
| SUPABASE_URL | No | | Supabase project URL (for rate limiting) |
| SUPABASE_SERVICE_ROLE_KEY | No | | Supabase service role key (for rate limiting) |

Lambda (SAM Parameter)

| Parameter | Required | Description |
|-----------|----------|-------------|
| AgentRuntimeArn | Yes | ARN of the deployed AgentCore Runtime agent |


Local Development

# Run the backend locally (FastAPI dev server on port 8000)
cd backend
pip install -r requirements.txt
cp .env.example .env  # Fill in your Bright Data API key
python dev_server.py

# Test locally
curl -X POST http://localhost:8000/invoke \
  -H "Content-Type: application/json" \
  -d '{"company": "Tesla", "direction": "upstream"}'

# Run the frontend locally (Next.js dev server on port 3000)
cd frontend
npm install
cp .env.example .env.local  # Fill in Lambda URL (or http://localhost:8000 for local backend)
npm run dev
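The curl test against the local backend can also be issued from Python using only the standard library. This is a minimal sketch: the helper names are hypothetical, the /invoke path and payload fields are copied from the curl command above, and the response is returned as raw text because the README does not specify whether the dev server streams its output.

```python
import json
import urllib.request


def build_invoke_request(company: str, direction: str = "upstream",
                         base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the same POST request as the curl example."""
    body = json.dumps({"company": company, "direction": direction}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/invoke",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(company: str, direction: str = "upstream", timeout: float = 300.0) -> str:
    """Send the request; requires dev_server.py to be running locally."""
    request = build_invoke_request(company, direction)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `invoke("Tesla")` with the dev server running should return the same body as the curl command. The 300-second timeout is borrowed from the Lambda configuration and is only a guess at how long a full run can take.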


Technology Stack

| Layer | Technology | Purpose |
|-------|------------|---------|
| Agent Runtime | AWS Bedrock AgentCore | Serverless agent hosting with session isolation |
| Agent Framework | LangGraph | State machine orchestration with fan-out parallelism |
| LLM | Amazon Bedrock (Nova 2 Lite) | Query planning, evidence reflection, classification |
| Web Intelligence | Bright Data Web Unlocker | Anti-bot bypass, CAPTCHA solving, web scraping |
| Lambda Proxy | AWS Lambda + Function URL | HTTP gateway to AgentCore Runtime |
| Infrastructure | AWS SAM | Infrastructure as Code for Lambda deployment |
| Frontend | Next.js 16 + React 19 | Server-side rendering, API routes |
| Visualization | D3.js + d3-sankey | Interactive Sankey diagram rendering |
| Styling | Tailwind CSS 4 + shadcn/ui | Dark theme UI components |
| Rate Limiting | Supabase (optional) | Per-IP server-side request tracking |

