File Intake API

This page will help you get started with Intake API.

What is the File Intake API?

The File Intake API helps you upload files to Stairwell for analysis. Stairwell processes the files and can run them in its built-in sandbox (a safe testing environment). This process is useful if you want to analyze files for threats automatically.

How Does It Work?

Uploading files to Stairwell happens in two steps:
Preflight Check: Checks if Stairwell already has the file. If yes, you’re done!
Upload: If Stairwell hasn’t seen the file before, it provides a special link (upload URL) for you to send the file.

Before you start, make sure you have:

Asset ID: A unique identifier for the place where the file will be uploaded (you can use DefaultAsset or create a new one via Stairwell’s UI).
File Path: Where the file is located on your computer.
SHA256: A unique digital fingerprint of the file (optional but recommended).

Step 1: Preflight Check

The preflight check determines if the file is new or already in the system. To perform the check:
Copy and customize the code below:

curl --request POST \
  --url https://http.intake.app.stairwell.com/v2021.05/upload \
  --header 'Content-Type: application/json' \
  --data '{
    "asset": { "id": "<YOUR_ASSET_ID>" },
    "files": [{
      "filePath": "<YOUR_FILE_PATH>",
      "expected_attributes": {
        "identifiers": [{ "sha256": "<YOUR_FILE_SHA256>" }]
      },
      "origin": { "unspecified": {} }
    }]
  }'

Replace:
<YOUR_ASSET_ID>: The asset ID you’re using.
<YOUR_FILE_PATH>: The file’s location (e.g., C:\Users\frank\test.db).
<YOUR_FILE_SHA256>: File’s SHA256, if available.

Additional params

filesys_create_time: Timestamp of when the file was created on your system.
Format: "YYYY-MM-DDTHH:MM:SSZ"
Example: "2024-11-20T12:00:00Z"
filesys_last_modified: Timestamp of when the file was last changed on your system.
Example: "2024-11-19T10:30:00Z"
filesys_last_accessed: Timestamp of when the file was last opened or accessed.
Example: "2024-11-20T11:45:00Z"
origin: Specifies where the file came from (e.g., the web). If the file originated from the web, include additional fields like:

  • unspecified: If no origin metadata is included, or if origin: {"unspecified": {}} is added to the request, then the origin will be set to unspecified
  • web: Mark of the web! This can be used to specify from where the file was downloaded from by including the following in your request. NOTE: referrer-url, zone-id, and host-url are completely optional, you can just have origin: {"web": {}} to indicate that the file was acquired via the internet
    detonation_plan: Plan to detonate file after file is ingested. If this param is not included, it will by default be set to DETONATION_PLAN_UNSPECIFIED. If a file needs to be detonated, this can be set to DETONATE.
"origin": {
  "web": {
    "referrer-url": "testing.com",
    "host-url": "host.com",
    "zone-id": 1,
  }
}

Understand the Response:

If the response says NO_ACTION_ALREADY_EXISTS, stop here—the file is already in the system.
If the response says UPLOAD, the response should look something like this

{
  "fileActions": [
    {
      "filePath": "<filepath>",
      "expectedAttributes": {
        "identifiers": [
          {
            "sha256": "<sha256>"
          }
        ]
      },
      "uploadUrl": "<upload_url>",
      "fileField": "file",
      "method": "POST",
      "fields": {
        "key": "<key>",
        "policy": "<policy>",
        "x-goog-algorithm": "GOOG4-RSA-SHA256",
        "x-goog-credential": "[email protected]/20241120/auto/storage/goog4_request",
        "x-goog-date": "20241120T202146Z",
        "x-goog-meta-asset-id": "AAAAAA-BBBBBB-CCCCCC-DDDDDDDD",
        "x-goog-meta-file-detonate": "DETONATION_PLAN_UNSPECIFIED",
        "x-goog-meta-file-format": "RAW",
        "x-goog-meta-file-path": "/home/test/filepath",
        "x-goog-meta-sha256": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "x-goog-signature": "67cfc4a5d1dd07adec819e309657bd6ac0c1b7db3a81822fdc3bad8342056cad3b928fa48db24bf3aeda9ff3afbc246c51274519c10d33e3720281e7ecdfcd1c85d33cf6c46b8448c14b0de473c9a73cd3a6e970ec5acf075617520a2b7c1216f1d9b0627b54bd28105e0db273a148acc3a196f16f2d7a82227b7369d850717a0d4724394d5e76d4ff26cc93c4a191ac738176447c17156e0312f67edb85eec30194da4534fea403c2cc731a6ffcaac188dc637d9273cfb33e0d573818044d481f00672edf70676dc6d3179850b3eca30a2f21073407b5d94b309e776bd0de2b149db3de4b59c1bd883b0d65195cf377c7c03fb8f1daa7259d4f0bccd155a812"
      },
      "headers": {
        "sha256": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      },
      "action": "UPLOAD"
    }
  ]
}

This means that your environment has not yet seen the file and so the file will have to be uploaded. The form fields must be included and match the preflight response when uploading the file or else the file upload will fail.

Step 2: Upload the File

If the file needs to be uploaded, follow these steps:
Use the upload URL from the preflight response. All of the form fields need to be included in this request or else the request will fail.

curl --request POST \
  --url <UPLOAD_URL> \
  --header 'Content-Type: multipart/form-data' \
  --form "key=<key>" \
  --form "policy=<policy>" \
  --form "x-goog-algorithm=GOOG4-RSA-SHA256" \
  --form "x-goog-credential=intake-http@stairwell-prod.iam.gserviceaccount.com/20241120/auto/storage/goog4_request" \
  --form "x-goog-date=20241120T202146Z" \
  --form "x-goog-meta-asset-id=AAAAAA-BBBBBB-CCCCCC-DDDDDDDD" \
  --form "x-goog-meta-file-detonate=DETONATION_PLAN_UNSPECIFIED" \
  --form "x-goog-meta-file-format=RAW" \
  --form "x-goog-meta-file-path=/home/test/filepath" \
  --form "x-goog-meta-sha256=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" \
  --form "x-goog-signature=67cfc4a5d1dd07adec819e309657bd6ac0c1b7db3a81822fdc3bad8342056cad3b928fa48db24bf3aeda9ff3afbc246c51274519c10d33e3720281e7ecdfcd1c85d33cf6c46b8448c14b0de473c9a73cd3a6e970ec5acf075617520a2b7c1216f1d9b0627b54bd28105e0db273a148acc3a196f16f2d7a82227b7369d850717a0d4724394d5e76d4ff26cc93c4a191ac738176447c17156e0312f67edb85eec30194da4534fea403c2cc731a6ffcaac188dc637d9273cfb33e0d573818044d481f00672edf70676dc6d3179850b3eca30a2f21073407b5d94b309e776bd0de2b149db3de4b59c1bd883b0d65195cf377c7c03fb8f1daa7259d4f0bccd155a812" \
  --form 'file=@<YOUR_FILE_PATH>'

Replace values (like <UPLOAD_URL>, <YOUR_FILE_PATH>, etc.) with those from the preflight response.

Important: The file field must appear last in the multipart form data. If placed earlier, GCS may return misleading errors (e.g., InvalidArgument or Cannot create buckets using a POST).

Tip: Include a filename parameter in the Content-Disposition header for the file field, even if generic (e.g., filename="file")

After running this, the system responds with a success message (no content, just an HTTP status code 204).

Bash Script Example

#!/usr/bin/env bash
#
# Stairwell Intake API - Reference file upload script
#
# This script demonstrates the full two-step upload flow:
#   1. Call the Stairwell Intake API "upload" endpoint with file metadata
#   2. If the response says the file must be uploaded, perform the actual upload
#
# The script can either:
#   - Create a 20 MB random file and upload it, or
#   - Upload an existing file you specify
#
# REQUIREMENTS
#   - bash
#   - curl
#   - jq
#   - sha256sum (Linux) or shasum (macOS)
#
# USAGE
#   chmod +x stairwell_upload_example.sh
#
#   # 1) Create a random 20 MB file and upload it
#   ./stairwell_upload_example.sh <ASSET_ID>
#
#   # 2) Upload an existing file
#   ./stairwell_upload_example.sh <ASSET_ID> /path/to/file.bin
#
# ENV OVERRIDES
#   INTAKE_API_URL  - override the default Intake API URL if needed
#
# NOTES
#   - The script prints what it is doing at each step
#   - The script exits nonzero on error
#

set -euo pipefail

INTAKE_API_URL="${INTAKE_API_URL:-https://http.intake.app.stairwell.com/v2021.05/upload}"

usage() {
  cat <<EOF
Usage:
  $0 <ASSET_ID> [FILE_PATH]

Examples:
  # Create a random 20 MB file and upload it
  $0 my-test-asset

  # Upload an existing file
  $0 ASSET_ID_HERE /path/to/file.bin

Environment:
  INTAKE_API_URL  Override the default Intake API URL (currently: ${INTAKE_API_URL})
EOF
}

if [[ $# -lt 1 || $# -gt 2 ]]; then
  usage
  exit 1
fi

ASSET_ID="$1"
FILE_PATH="${2:-}"

# Basic dependency checks
for cmd in curl jq; do
  if ! command -v "${cmd}" >/dev/null 2>&1; then
    echo "Error: '${cmd}' is required on PATH" >&2
    exit 1
  fi
done

# Create a temporary 20 MB random file if the user did not provide one
TMPDIR=""
if [[ -z "${FILE_PATH}" ]]; then
  TMPDIR="$(mktemp -d)"
  FILE_PATH="${TMPDIR}/random_20mb.bin"
  echo "Creating random 20 MB file at: ${FILE_PATH}"
  dd if=/dev/urandom of="${FILE_PATH}" bs=1M count=20 status=none
else
  if [[ ! -f "${FILE_PATH}" ]]; then
    echo "Error: file not found: ${FILE_PATH}" >&2
    exit 1
  fi
fi

# Compute SHA256 in a cross platform way
if command -v sha256sum >/dev/null 2>&1; then
  SHA256_SUM="$(sha256sum "${FILE_PATH}" | awk '{print $1}')"
elif command -v shasum >/dev/null 2>&1; then
  SHA256_SUM="$(shasum -a 256 "${FILE_PATH}" | awk '{print $1}')"
else
  echo "Error: no sha256 tool found (sha256sum or shasum is required)" >&2
  exit 1
fi

echo "Using asset ID: ${ASSET_ID}"
echo "File path:      ${FILE_PATH}"
echo "SHA256:         ${SHA256_SUM}"
echo

#
# STEP 1 - Preflight metadata call to Intake API
#
# This tells Stairwell:
#   - which asset this file belongs to
#   - what path the file has on the asset
#   - what SHA256 we expect
#
# The response tells us whether the file already exists or must be uploaded,
# and if it must be uploaded, it returns a pre signed URL, HTTP method,
# form fields, and optional headers.
#

METADATA_PAYLOAD="$(cat <<EOF
{
  "asset": { "id": "${ASSET_ID}" },
  "files": [{
    "filePath": "${FILE_PATH}",
    "expected_attributes": {
      "identifiers": [{ "sha256": "${SHA256_SUM}" }]
    },
    "origin": { "unspecified": {} }
  }]
}
EOF
)"

echo "Calling Intake API preflight at: ${INTAKE_API_URL}"
PREFLIGHT_RESP="$(
  curl -sS -f \
    --request POST \
    --url "${INTAKE_API_URL}" \
    --header 'Content-Type: application/json' \
    --data "${METADATA_PAYLOAD}"
)"

echo "Preflight response received"
# Uncomment to inspect full response for debugging
# echo "${PREFLIGHT_RESP}" | jq '.'

ACTION="$(echo "${PREFLIGHT_RESP}" | jq -r '.fileActions[0].action')"

if [[ "${ACTION}" == "null" || -z "${ACTION}" ]]; then
  echo "Error: unexpected preflight response, missing fileActions[0].action" >&2
  echo "${PREFLIGHT_RESP}" | jq '.'
  exit 1
fi

echo "Preflight action: ${ACTION}"

if [[ "${ACTION}" == "NO_ACTION_ALREADY_EXISTS" ]]; then
  echo "Result: file already exists in Stairwell, no upload required."
  # Clean up temporary file if we created one
  if [[ -n "${TMPDIR}" ]]; then
    rm -rf "${TMPDIR}"
  fi
  exit 0
fi

if [[ "${ACTION}" != "UPLOAD" ]]; then
  echo "Error: unsupported preflight action: ${ACTION}" >&2
  echo "${PREFLIGHT_RESP}" | jq '.'
  if [[ -n "${TMPDIR}" ]]; then
    rm -rf "${TMPDIR}"
  fi
  exit 1
fi

#
# STEP 2 - Perform the actual file upload
#
# Stairwell returns:
#   - uploadUrl  - where to upload the file
#   - method     - HTTP method to use (typically POST or PUT)
#   - fields     - form fields for multipart upload (if needed)
#   - headers    - optional extra HTTP headers
#

UPLOAD_URL="$(echo "${PREFLIGHT_RESP}" | jq -r '.fileActions[0].uploadUrl')"
METHOD="$(echo "${PREFLIGHT_RESP}" | jq -r '.fileActions[0].method')"

if [[ "${UPLOAD_URL}" == "null" || -z "${UPLOAD_URL}" ]]; then
  echo "Error: missing uploadUrl in preflight response" >&2
  echo "${PREFLIGHT_RESP}" | jq '.'
  if [[ -n "${TMPDIR}" ]]; then
    rm -rf "${TMPDIR}"
  fi
  exit 1
fi

if [[ "${METHOD}" == "null" || -z "${METHOD}" ]]; then
  echo "Error: missing method in preflight response" >&2
  echo "${PREFLIGHT_RESP}" | jq '.'
  if [[ -n "${TMPDIR}" ]]; then
    rm -rf "${TMPDIR}"
  fi
  exit 1
fi

echo "Upload URL: ${UPLOAD_URL}"
echo "HTTP method: ${METHOD}"

# Build form fields from .fields, if present
FORM_ARGS=()
FIELD_KEYS=()

if echo "${PREFLIGHT_RESP}" | jq -e '.fileActions[0].fields' >/dev/null 2>&1; then
  mapfile -t FIELD_KEYS < <(echo "${PREFLIGHT_RESP}" | jq -r '.fileActions[0].fields | keys[]')
fi

for k in "${FIELD_KEYS[@]}"; do
  v="$(echo "${PREFLIGHT_RESP}" | jq -r ".fileActions[0].fields[\"${k}\"]")"
  FORM_ARGS+=(--form "${k}=${v}")
done

# Build optional headers from .headers, if present
HEADER_ARGS=()
HEADER_KEYS=()

if echo "${PREFLIGHT_RESP}" | jq -e '.fileActions[0].headers' >/dev/null 2>&1; then
  mapfile -t HEADER_KEYS < <(echo "${PREFLIGHT_RESP}" | jq -r '.fileActions[0].headers | keys[]')
fi

for k in "${HEADER_KEYS[@]}"; do
  v="$(echo "${PREFLIGHT_RESP}" | jq -r ".fileActions[0].headers[\"${k}\"]")"
  HEADER_ARGS+=(-H "${k}: ${v}")
done

# The actual file part (multipart form) - this must typically be last
FORM_ARGS+=(--form "file=@${FILE_PATH}")

echo
echo "Uploading file to storage endpoint..."
curl -sS -f \
  --request "${METHOD}" \
  --url "${UPLOAD_URL}" \
  "${HEADER_ARGS[@]}" \
  "${FORM_ARGS[@]}"

echo
echo "Upload complete."
echo "Asset ID: ${ASSET_ID}"
echo "File:     ${FILE_PATH}"
echo "SHA256:   ${SHA256_SUM}"

# Cleanup if we created a temporary file
if [[ -n "${TMPDIR}" ]]; then
  rm -rf "${TMPDIR}"
fi

Python Script Example

#!/usr/bin/env python3
"""
Stairwell Intake API reference file upload script (Python)

This script demonstrates the full two step upload flow:

  1) Call the Stairwell Intake API "upload" endpoint with file metadata
  2) If the response indicates an upload is required, upload the file to the
     provided pre signed URL with the specified method, fields, and headers

It can either:
  - Create a 20 MB random file and upload it, or
  - Upload an existing file you specify

Requirements:
  - Python 3.8+
  - requests  (pip install requests)

Usage:
  python stairwell_upload_example.py <ASSET_ID> [FILE_PATH]

Examples:
  # Create a random 20 MB file and upload it
  python stairwell_upload_example.py <ASSET_ID>

  # Upload an existing file
  python stairwell_upload_example.py <ASSET_ID> /path/to/file.bin

Environment:
  INTAKE_API_URL   Override the default Intake API URL
                   (default: https://http.intake.app.stairwell.com/v2021.05/upload)
"""

import argparse
import hashlib
import json
import os
import sys
import tempfile
from typing import Dict, Any, Optional

import requests


DEFAULT_INTAKE_API_URL = os.environ.get(
    "INTAKE_API_URL",
    "https://http.intake.app.stairwell.com/v2021.05/upload",
)


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Stairwell Intake API reference file upload client"
    )
    parser.add_argument(
        "asset_id",
        help="Asset ID to associate with the file",
    )
    parser.add_argument(
        "file_path",
        nargs="?",
        help="Optional path to existing file. If omitted, a random 20 MB file is created",
    )
    return parser.parse_args()


def create_random_file_20mb() -> str:
    """Create a temporary 20 MB file filled with random bytes and return its path."""
    temp_dir = tempfile.mkdtemp(prefix="stairwell-upload-")
    file_path = os.path.join(temp_dir, "random_20mb.bin")

    print(f"Creating random 20 MB file at: {file_path}")
    chunk_size = 1024 * 1024  # 1 MB
    total_mb = 20

    with open(file_path, "wb") as f:
        for _ in range(total_mb):
            f.write(os.urandom(chunk_size))

    return file_path


def compute_sha256(file_path: str) -> str:
    """Compute SHA256 for the given file."""
    h = hashlib.sha256()
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def build_metadata_payload(asset_id: str, file_path: str, sha256: str) -> Dict[str, Any]:
    """Build the JSON payload for the Intake API preflight call."""
    return {
        "asset": {"id": asset_id},
        "files": [
            {
                "filePath": file_path,
                "expected_attributes": {
                    "identifiers": [
                        {"sha256": sha256},
                    ]
                },
                "origin": {"unspecified": {}},
            }
        ],
    }


def call_intake_preflight(
    intake_url: str, payload: Dict[str, Any]
) -> Dict[str, Any]:
    """Call the Intake API preflight endpoint and return the parsed JSON response."""
    print(f"Calling Intake API preflight at: {intake_url}")
    resp = requests.post(
        intake_url,
        headers={"Content-Type": "application/json"},
        data=json.dumps(payload),
        timeout=30,
    )
    resp.raise_for_status()
    result = resp.json()
    print("Preflight response received")
    return result


def extract_file_action(preflight: Dict[str, Any]) -> Dict[str, Any]:
    """Extract fileActions[0] from preflight response and perform basic validation."""
    try:
        file_actions = preflight["fileActions"]
        if not file_actions:
            raise KeyError("fileActions is empty")
        action = file_actions[0]
    except (KeyError, IndexError) as e:
        raise RuntimeError(
            f"Unexpected preflight response, missing fileActions[0]: {e}"
        ) from e

    if "action" not in action or not action["action"]:
        raise RuntimeError(
            "Unexpected preflight response, missing fileActions[0].action"
        )

    return action


def upload_file(
    action: Dict[str, Any],
    file_path: str,
) -> None:
    """
    Perform the actual file upload to the pre signed URL.

    Handles:
      - POST with multipart form fields and file
      - PUT with raw body (fields ignored, headers applied)
    """
    upload_url = action.get("uploadUrl")
    method = action.get("method", "POST").upper()
    fields: Dict[str, str] = action.get("fields") or {}
    headers: Dict[str, str] = action.get("headers") or {}

    if not upload_url:
        raise RuntimeError("Preflight action missing uploadUrl")

    print(f"Upload URL: {upload_url}")
    print(f"HTTP method: {method}")

    if method == "POST":
        # Multipart form upload, typical for S3 style POST with fields
        form_data = fields.copy()
        print("Uploading file with multipart POST")
        with open(file_path, "rb") as f:
            files = {"file": (os.path.basename(file_path), f)}
            resp = requests.post(
                upload_url,
                data=form_data,
                files=files,
                headers=headers,
                timeout=300,
            )
    elif method == "PUT":
        # Raw PUT upload, typical for presigned PUT URLs
        print("Uploading file with raw PUT")
        with open(file_path, "rb") as f:
            resp = requests.put(
                upload_url,
                data=f,
                headers=headers,
                timeout=300,
            )
    else:
        raise RuntimeError(f"Unsupported upload method: {method}")

    # Raise for any non 2xx code
    resp.raise_for_status()
    print("Upload complete")


def main() -> None:
    args = parse_args()

    asset_id = args.asset_id
    user_file_path: Optional[str] = args.file_path

    temp_dir: Optional[str] = None

    if user_file_path is None:
        # Create random 20 MB file
        file_path = create_random_file_20mb()
        temp_dir = os.path.dirname(file_path)
    else:
        file_path = user_file_path
        if not os.path.isfile(file_path):
            print(f"Error: file not found: {file_path}", file=sys.stderr)
            sys.exit(1)

    sha256 = compute_sha256(file_path)

    print(f"Using asset ID: {asset_id}")
    print(f"File path:      {file_path}")
    print(f"SHA256:         {sha256}")
    print()

    # Step 1: preflight
    payload = build_metadata_payload(asset_id, file_path, sha256)
    preflight = call_intake_preflight(DEFAULT_INTAKE_API_URL, payload)
    try:
        action = extract_file_action(preflight)
    except RuntimeError as e:
        print(f"Error: {e}", file=sys.stderr)
        print("Full preflight response:")
        print(json.dumps(preflight, indent=2))
        if temp_dir:
            try:
                # Best effort cleanup
                for fn in os.listdir(temp_dir):
                    os.remove(os.path.join(temp_dir, fn))
                os.rmdir(temp_dir)
            except OSError:
                pass
        sys.exit(1)

    action_type = action["action"]
    print(f"Preflight action: {action_type}")

    if action_type == "NO_ACTION_ALREADY_EXISTS":
        print("Result: file already exists in Stairwell, no upload required.")
        # Cleanup temporary file if we created one
        if temp_dir:
            try:
                for fn in os.listdir(temp_dir):
                    os.remove(os.path.join(temp_dir, fn))
                os.rmdir(temp_dir)
            except OSError:
                pass
        return

    if action_type != "UPLOAD":
        print(f"Error: unsupported preflight action: {action_type}", file=sys.stderr)
        print("Full preflight response:")
        print(json.dumps(preflight, indent=2))
        if temp_dir:
            try:
                for fn in os.listdir(temp_dir):
                    os.remove(os.path.join(temp_dir, fn))
                os.rmdir(temp_dir)
            except OSError:
                pass
        sys.exit(1)

    # Step 2: actual upload
    try:
        upload_file(action, file_path)
    finally:
        if temp_dir:
            try:
                for fn in os.listdir(temp_dir):
                    os.remove(os.path.join(temp_dir, fn))
                os.rmdir(temp_dir)
            except OSError:
                pass

    print()
    print("Summary")
    print(f"  Asset ID: {asset_id}")
    print(f"  File:     {file_path}")
    print(f"  SHA256:   {sha256}")


if __name__ == "__main__":
    main()