iscc-web

A microservice for generating International Standard Content Codes (ISCC, ISO 24138) for media files.

Try it without installing anything

A public instance runs at web.iscc.io - upload a file in the demo frontend or explore the interactive API documentation at web.iscc.io/docs.

Introduction

iscc-web is a Python ASGI application built on BlackSheep that wraps the iscc-sdk behind a REST API. Upload a media file and get back its ISCC with rich metadata; the same service extracts and embeds metadata, decodes existing ISCCs, and generates granular text fingerprints for similarity search.

Key features:

  • ISCC generation - upload any media file, receive its composite ISCC-CODE with ISCC-UNITs and technical metadata
  • Metadata extraction & embedding - read embedded metadata, or embed ISCC metadata into a copy of the file and reprocess its ISCC
  • ISCC decoding - decompose any ISCC into human-readable units via the explain endpoint
  • Plain-text simprints - granular similarity fingerprints compatible with iscc-search
  • Automatic expiry - uploaded files are deleted after a configurable timeout (default 1 hour)
  • Private by default - file downloads and deletes are restricted to the original uploader
  • Interactive API docs - OpenAPI 3 spec with a Stoplight Elements UI served at /docs
  • Demo frontend - Vue 3 single-page app for uploading files, decoding ISCCs, embedding metadata, and comparing codes

All endpoints live under the /api/v1 base path:

| Endpoint | Purpose |
|---|---|
| POST /iscc | Upload a file and generate its ISCC |
| GET /iscc/{media_id} | Retrieve a previously generated ISCC |
| POST /media | Upload a file without ISCC processing |
| GET/DELETE /media/{media_id} | Download or delete an uploaded file |
| GET /metadata/{media_id} | Extract embedded metadata |
| POST /metadata/{media_id} | Embed metadata and reprocess the ISCC |
| GET /explain/{iscc} | Decompose an ISCC into its units |
| POST /simprint | Generate granular simprints from plain text |

See the REST API reference for request and response details.
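As a rough illustration of how the base path and the endpoint paths combine, the following sketch builds full request URLs using only the Python standard library. The helper names `endpoint_url` and `media_url` are hypothetical and not part of iscc-web, and the host is assumed to be a local instance on port 8000.

```python
from urllib.parse import quote, urlencode

# Base path shared by all endpoints, as stated above.
API_PREFIX = "/api/v1"


def endpoint_url(host: str, path: str, **query: str) -> str:
    """Join host, base path and endpoint path; append query parameters if given.

    Hypothetical helper for illustration only.
    """
    url = host.rstrip("/") + API_PREFIX + path
    if query:
        url += "?" + urlencode(query)
    return url


def media_url(host: str, media_id: str) -> str:
    """URL for GET/DELETE /media/{media_id}; the id is percent-encoded defensively."""
    return endpoint_url(host, "/media/" + quote(media_id, safe=""))


# Assumed local instance; replace with your own host.
print(endpoint_url("http://localhost:8000", "/iscc"))
```

The same helper can serve the other rows of the table by swapping the path segment.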

Experimental features

The following features are not part of ISO 24138, and their algorithms may change before their v1.0 release:

  • POST /api/v1/iscc?semantic=true generates a Semantic-Code ISCC-UNIT for text (iscc-sct) and image (iscc-sci) content that becomes part of the composite ISCC-CODE - 5 units (Meta, Semantic, Content, Data, Instance) instead of 4. The resulting ISCC-CODE is not a standard ISO 24138 identifier. Off by default.
  • POST /api/v1/iscc includes granular simprint features in the features field by default for text content; with semantic=true, semantic simprints are included as well. Simprints are 256-bit fingerprints whose offsets and sizes are measured in UTF-8 bytes, not characters. Opt out per request with ?granular=false.
  • POST /api/v1/simprint generates granular simprints from plain text - byte-identical to iscc-search's local simprint generation, so search services can delegate text processing to this service.
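Because simprint offsets and sizes count UTF-8 bytes rather than characters, a client that wants the text chunk behind a simprint has to slice the encoded bytes. The sketch below shows the difference; `slice_utf8` is a hypothetical helper, and it assumes the offset is measured from the start of the same plain text that was fingerprinted.

```python
def slice_utf8(text: str, offset: int, size: int) -> str:
    """Return the part of `text` covered by a UTF-8 byte offset and byte size.

    Hypothetical helper; assumes offset and size fall on character boundaries.
    """
    raw = text.encode("utf-8")
    return raw[offset:offset + size].decode("utf-8")


text = "naïve café"

# "naïve " takes 7 bytes (ï is 2 bytes), and "café" takes 5 bytes (é is 2 bytes).
by_bytes = slice_utf8(text, 7, 5)

# Slicing the str directly counts characters and lands one character too late.
by_chars = text[7:12]

print(by_bytes, "|", by_chars)
```

For pure ASCII text both approaches agree, which is why this mismatch tends to surface only with accented or non-Latin content.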

Quick start

Run the published Docker image:

docker run -p 8000:8000 ghcr.io/iscc/iscc-web:main

The :main tag tracks the main branch; immutable semver tags are published on releases. The image ships with all content processing tools and semantic-code models pre-installed.

Running from source requires uv, which provisions the required Python version automatically:

git clone https://github.com/iscc/iscc-web.git
cd iscc-web
uv sync
uv run iscc-web

The service is now available at http://localhost:8000. Generate your first ISCC by sending the raw file bytes (not multipart form data) with the base64-encoded filename in the X-Upload-Filename header:

curl -X POST http://localhost:8000/api/v1/iscc \
    -H "X-Upload-Filename: $(printf 'image.jpg' | base64)" \
    -H "Content-Type: application/octet-stream" \
    --data-binary @image.jpg

The response is an ISCC metadata JSON object with the iscc code, its units, and technical metadata. Continue with the getting started tutorial for a full walk-through.
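The same upload can be prepared from Python with the standard library. This is a minimal sketch, not an official client: `build_iscc_request` is a hypothetical helper, the host is assumed to be the local instance from above, and standard (not URL-safe) base64 is assumed because that is what the `base64` command in the curl example produces.

```python
import base64
import urllib.request

# Assumed local instance started via the quick start above.
API_BASE = "http://localhost:8000/api/v1"


def build_iscc_request(
    filename: str, data: bytes, base_url: str = API_BASE
) -> urllib.request.Request:
    """Prepare a POST /iscc request carrying the raw file bytes as the body.

    Hypothetical helper for illustration only.
    """
    # The filename travels base64-encoded in the X-Upload-Filename header.
    encoded_name = base64.b64encode(filename.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/iscc",
        data=data,  # raw bytes, not multipart form data
        method="POST",
        headers={
            "X-Upload-Filename": encoded_name,
            "Content-Type": "application/octet-stream",
        },
    )
```

To send it, read the file in binary mode, pass the bytes to `build_iscc_request`, and hand the result to `urllib.request.urlopen`; the JSON body of the response can then be parsed with `json.load` and the code read from its `iscc` field.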

Documentation

Development & Contributing - Dev setup, testing, and contribution guidelines.

Source code on GitHub