CollectorVision Part 4: Running It Locally — Python Library and REST Server
Part 4 of the CollectorVision series. Part 1 has the overview.
This post covers the practical side: installing the library, running it against a webcam, and the REST server. Part 6 goes deeper into the deployment split — when to run everything locally versus offloading catalog lookup to a server.
Installing
```shell
pip install git+https://github.com/HanClinto/CollectorVision.git
```
Or with uv:
```shell
uv pip install git+https://github.com/HanClinto/CollectorVision.git
```
Requires Python 3.10+. The only ML dependency is onnxruntime. Model weights are bundled with the package. No PyPI release yet — that's coming.
Quickstart
```python
import cv2
import collector_vision as cvg

catalog = cvg.Catalog.load("hf://HanClinto/milo/scryfall-mtg")

image = cv2.imread("my_card.jpg")
detection = cvg.NeuralCornerDetector().detect(image)
crop = detection.dewarp(image)
emb = catalog.embedder.embed(crop)
score, card_id = catalog.search(emb)[0]
print(card_id, score)
```
The catalog loads from HuggingFace on first run (~29 MB) and is cached after that.
Live video
For a webcam feed, call detect() on each frame and use the sharpness gate to skip blurry frames. Accumulate scores across frames before committing to an answer.
```python
cap = cv2.VideoCapture(0)
detector = cvg.NeuralCornerDetector()
score_map = {}

while True:
    ret, frame = cap.read()
    if not ret:
        break
    detection = detector.detect(frame)
    if detection.card_present and detection.sharpness > 0.10:
        crop = detection.dewarp(frame)
        emb = catalog.embedder.embed(crop)
        for score, card_id in catalog.search(emb, top_k=5):
            score_map[card_id] = score_map.get(card_id, 0.0) + score
    if score_map:
        best_id = max(score_map, key=score_map.get)
        if score_map[best_id] > 3.5:
            print("Confirmed:", best_id)
            score_map.clear()
```
The threshold of 3.5 is roughly equivalent to the same card winning across four consecutive frames. You can tune this.
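As a back-of-envelope check on that equivalence (the per-frame score of roughly 0.9 is my assumption for a correct top-1 hit, not a number from the library):

```python
PER_FRAME_SCORE = 0.9  # assumed typical top-1 score for the right card
THRESHOLD = 3.5

total, frames = 0.0, 0
while total <= THRESHOLD:
    total += PER_FRAME_SCORE
    frames += 1

print(frames)  # crosses 3.5 on the 4th consecutive win
```

Lowering the threshold confirms faster but is more likely to lock onto a wrong match from a single noisy frame.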
REST server
For cases where you want to call the pipeline from a phone app, a web client, or a script over HTTP, there's a FastAPI server in examples/server/.
Install server dependencies:
```shell
pip install "git+https://github.com/HanClinto/CollectorVision.git[server]"
```
Start it:
```shell
python examples/server/server.py --hfd HanClinto/milo scryfall-mtg
```
Identify a card by uploading an image:
```shell
curl -X POST http://localhost:8000/identify/upload \
  -F "file=@my_card.jpg"
```
The response looks like:
```json
{
  "card_present": true,
  "card_id": "7286819f-6c57-4503-898c-528786ad86e9",
  "confidence": 0.934,
  "embedding": [0.023, -0.041, ...]
}
```
The embedding field is included in every response. That's used by the rolling buffer feature — clients can send back their recent frame embeddings and the server will average them before searching, which helps with noise. The server is stateless; the client owns the history.
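Client-side, the rolling buffer is just bookkeeping over the embedding arrays from recent responses. A minimal sketch, assuming embeddings arrive as plain lists of floats; the window size and the re-normalization step here are my choices, not the library's:

```python
import math
from collections import deque

WINDOW = 8  # hypothetical: how many recent frames to keep

buffer = deque(maxlen=WINDOW)

def push(embedding):
    """Record the `embedding` field from an /identify/upload response."""
    buffer.append(embedding)

def averaged():
    """Mean of the buffered embeddings, re-normalized to unit length."""
    dim = len(buffer[0])
    mean = [sum(e[i] for e in buffer) / len(buffer) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0
    return [x / norm for x in mean]
```

The averaged vector is what a client would send back for the server-side search; the exact request shape for that lives in the server's own docs, so it isn't guessed at here.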
Pre-cropped images
If you already have a clean card crop — from a flatbed scanner, for instance — you can skip detection entirely:
```python
from PIL import Image

crop = Image.open("clean_crop.jpg")
emb = catalog.embedder.embed(crop)
hits = catalog.search(emb)
```
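For a whole folder of scans, the same two calls go in a loop. A small sketch; the embed/search lines assume the `catalog` object from the quickstart, so they're shown commented out and only the file-gathering helper runs on its own:

```python
from pathlib import Path

def scan_paths(folder):
    """Image files in a scans folder, in a stable order."""
    exts = {".jpg", ".jpeg", ".png"}
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() in exts)

# for path in scan_paths("scans/"):
#     crop = Image.open(path)
#     score, card_id = catalog.search(catalog.embedder.embed(crop))[0]
#     print(path.name, card_id, score)
```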
Swapping components
The library is built around protocols, not subclassing. Anything that implements detect(image) -> DetectionResult works as a corner detector. The repo has an example using OpenCV Canny edges in examples/advanced/custom_pipeline.py.
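For a sense of what "protocols, not subclassing" means in practice, here's a hypothetical sketch of the contract; the real protocol and DetectionResult live in the library, and only the method shape matters:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class CornerDetector(Protocol):
    """Hypothetical stand-in for the library's detector contract."""
    def detect(self, image): ...

class MyEdgeDetector:
    """No base class needed: a matching detect() is enough."""
    def detect(self, image):
        # A real implementation would return a DetectionResult here.
        raise NotImplementedError

print(isinstance(MyEdgeDetector(), CornerDetector))  # True
```

Because there's no inheritance requirement, a detector written against OpenCV (like the Canny example in the repo) plugs in without importing anything from the library's class hierarchy.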