GPU-Accelerated Colony Detection#

Set up and use deep-learning-based colony detectors (SAM2, micro-sam) with GPU acceleration.

Installation#

The two GPU detectors have different packaging constraints:

Detector           Package(s) needed          Available via                       CUDA-capable?
Sam2Detector       torch, torchvision, sam2   PyPI (ships in phenotypic[torch])   Yes (Linux + CUDA)
MicroSamDetector   micro_sam (+ torch)        conda-forge only, not on PyPI       CPU by default; user-managed CUDA possible

PhenoTypic itself is distributed via PyPI and managed with uv. micro_sam is not published on PyPI, so it is not included in any phenotypic extra. Users who need MicroSamDetector must install micro_sam themselves; the recipe below uses pixi for that.

Installing Sam2Detector (PyPI-only)#

On Linux or macOS:

uv add "phenotypic[torch]"          # torch + torchvision + sam2
# or, inside a uv-managed project:
uv sync --extra torch

The torch extra is not available on Windows: sam2 requires the CUDA compiler (nvcc) to build and has no pre-built Windows wheels. Use WSL2 (Ubuntu) instead.
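On Windows, a one-time WSL2 setup from an administrator PowerShell looks like this (after the restart, the uv commands above work as on native Linux):

wsl --install -d Ubuntu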

Enabling micro_sam (optional, self-service)#

micro_sam is only published on conda-forge. Because PhenoTypic does not own your environment, we recommend managing the combined stack in your own project with pixi, which speaks both conda-forge and PyPI in a single lockfile. Create a pixi.toml in your project (not in PhenoTypic):

[project]
name = "my-phenotyping-project"
channels = ["conda-forge"]
platforms = ["osx-arm64", "linux-64", "win-64"]

[pypi-dependencies]
phenotypic = "*"
# Or, while developing against a local checkout:
# phenotypic = { path = "../PhenoTypic", editable = true }

[dependencies]
micro_sam = "*"

Then:

pixi install
pixi run python -m phenotypic pipeline.json /plates/ /output/

Because conda's micro_sam pulls in a CPU-only conda build of PyTorch, combining it with Sam2Detector's CUDA wheels in the same environment requires extra care (the conda torch will typically shadow the pip-installed one). Keep SAM2 and micro-sam work in separate environments if you need both with GPU acceleration.
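If both stacks must live in one pixi project anyway, one option is pixi's optional environments, which fence micro_sam off behind a feature (a sketch; the feature and environment names are arbitrary):

[feature.microsam.dependencies]
micro_sam = "*"

[environments]
microsam = ["microsam"]

Commands then run in the micro_sam environment with pixi run -e microsam ..., while the default environment keeps the CUDA wheels untouched.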

MicroSamDetector is importable from phenotypic.nn even when micro_sam is missing; the ImportError is deferred to the first apply() call and points back at these instructions.
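In practice the failure surfaces only when inference is attempted. A sketch (that construction itself also succeeds without micro_sam is an assumption here; the documented behaviour is the deferred apply() error):

from phenotypic.nn import MicroSamDetector   # import always succeeds

detector = MicroSamDetector()
result = detector.apply(image)   # ImportError raised here if micro_sam is absent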

Alternative: pip + conda#

If you already manage your environment with conda:

pip install phenotypic                        # base (or phenotypic[torch] on non-Windows)
conda install -c conda-forge micro_sam        # adds MicroSamDetector support

Downloading Model Checkpoints#

Both SAM2 and micro-sam download checkpoints automatically on first use. However, on SLURM clusters the compute nodes often lack internet access, so you should pre-download checkpoints on a login node before submitting jobs.

SAM2 checkpoints#

# Download the default (tiny) SAM2 checkpoint
python -m phenotypic.nn download

# Download a specific size
python -m phenotypic.nn download --model-type sam2 --model-size large

# Download all SAM2 sizes at once
python -m phenotypic.nn download --model-type sam2 --all

# Force re-download even if cached
python -m phenotypic.nn download --model-type sam2 --model-size tiny --force

SAM2 checkpoints are stored in the torch.hub cache directory (~/.cache/torch/hub/checkpoints/ by default). Set the TORCH_HOME environment variable to change this location.
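For example, to keep checkpoints on a shared filesystem (the path below is illustrative):

export TORCH_HOME=/shared/models/torch   # checkpoints land in $TORCH_HOME/hub/checkpoints/
python -m phenotypic.nn download --model-type sam2 --model-size tiny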

Available SAM2 sizes: tiny (~39 MB), small, base_plus, large (~900 MB).

micro-sam checkpoints#

# Download the default (vit_b_lm) micro-sam model
python -m phenotypic.nn download --model-type microsam

# Download a specific model
python -m phenotypic.nn download --model-type microsam --model-name vit_l_lm

# Download all micro-sam models
python -m phenotypic.nn download --model-type microsam --all

micro-sam stores checkpoints via platformdirs. Set MICROSAM_CACHEDIR to override the cache location.
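For example (illustrative path):

export MICROSAM_CACHEDIR=/shared/models/microsam
python -m phenotypic.nn download --model-type microsam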

SLURM pre-caching workflow#

On a cluster, download models on the login node first:

# On the login node (has internet access)
python -m phenotypic.nn download --model-type sam2 --model-size tiny
python -m phenotypic.nn download --model-type microsam --model-name vit_b_lm

# Verify they are cached
python -m phenotypic.nn list

# Now submit SLURM jobs -- compute nodes will use the cached checkpoints
python -m phenotypic pipeline.json /plates/ /output/

Using Sam2Detector#

Sam2Detector wraps Meta’s SAM2 automatic mask generator. It lays a grid of prompt points over the RGB image, predicts masks at each point, filters by quality, and assembles a labelled object map.

from phenotypic.nn import Sam2Detector

# Basic usage with default parameters
detector = Sam2Detector()

# Tuned for dense plates with small colonies
detector = Sam2Detector(
    model_size="small",
    points_per_side=48,
    pred_iou_thresh=0.6,
    min_mask_region_area=200,
)

# Apply to an image (downloads checkpoint on first use)
result = detector.apply(image)
print(result.num_objects)

Parameter tuning for colony detection#

  • points_per_side (default 32): Controls the density of the prompt grid. Use 16 for large, well-separated colonies. Increase to 48–64 for dense plates with many small colonies. Higher values increase inference time quadratically.

  • pred_iou_thresh (default 0.7): Minimum predicted IoU for keeping a mask. Raise to 0.85–0.95 for conservative detection (fewer false positives); lower to 0.5 to catch faint or ambiguous colonies.

  • stability_score_thresh (default 0.92): Filters masks by boundary stability. Higher values keep only masks with crisp edges.

  • min_mask_region_area (default 100): Minimum mask area in pixels. Increase to suppress agar texture, dust, and other small artefacts that SAM2 segments as objects. Typical range: 50–500 depending on image resolution.

  • model_size (default "tiny"): "tiny" is fastest and sufficient for most colony plates. Use "large" for maximum mask quality on publication figures.
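Putting these together, an illustrative starting configuration for sparse plates with large, well-separated colonies (the values are suggestions to tune from, not tested defaults):

from phenotypic.nn import Sam2Detector

detector = Sam2Detector(
    model_size="tiny",            # fastest; adequate for most plates
    points_per_side=16,           # coarse prompt grid suits large colonies
    pred_iou_thresh=0.9,          # conservative: fewer false positives
    stability_score_thresh=0.95,  # keep only crisp-edged masks
    min_mask_region_area=500,     # suppress agar texture and dust
)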

Using MicroSamDetector#

MicroSamDetector uses SAM models finetuned on large-scale microscopy datasets. It is particularly effective for brightfield and darkfield microscopy images of agar plates.

from phenotypic.nn import MicroSamDetector

# Default: ViT-Base light microscopy model
detector = MicroSamDetector()

# Use the larger model for higher accuracy
detector = MicroSamDetector(model_type="vit_l_lm")

result = detector.apply(image)

Model selection#

Light microscopy models (recommended for agar plate imaging):

  • "vit_t_lm" – ViT-Tiny, fastest, good for rapid screening

  • "vit_b_lm" – ViT-Base (default), best speed/accuracy trade-off

  • "vit_l_lm" – ViT-Large, highest accuracy, most VRAM

Electron microscopy models (for organelle segmentation):

  • "vit_b_em_organelles" – ViT-Base

  • "vit_l_em_organelles" – ViT-Large

Base SAM checkpoints (without microscopy finetuning):

  • "vit_t", "vit_b", "vit_l", "vit_h"
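All of these are selected through the same model_type parameter; for example:

from phenotypic.nn import MicroSamDetector

# Electron-microscopy model for organelle segmentation
detector = MicroSamDetector(model_type="vit_b_em_organelles")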

Pipeline Integration#

GPU detectors work like any other PhenoTypic operation in a pipeline:

import phenotypic as pht
from phenotypic.nn import Sam2Detector
from phenotypic.measure import SizeMeasurer

pipeline = pht.ImagePipeline(
    ops=[Sam2Detector(model_size="tiny", points_per_side=32)],
    measurer=SizeMeasurer(),
    name="sam2_colony_pipeline",
)

# Run the pipeline
results = pipeline.operate([image])
df = pipeline.measure([image])

JSON serialization#

Pipelines containing GPU detectors can be saved and loaded just like any other pipeline. The detector parameters are serialized; the model weights are not (they are re-downloaded or loaded from cache when needed):

# Save
pipeline.to_json("sam2_pipeline.json")

# Load -- works without torch installed (model loads lazily on apply)
restored = pht.ImagePipeline.from_json("sam2_pipeline.json")

Internal state (attributes prefixed with _, such as the loaded model) is excluded from serialization. The model is rebuilt transparently on the next call to apply.
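Because the saved pipeline is ordinary JSON, this is easy to verify by inspecting the file (the exact schema is internal, so treat this as illustrative):

import json

with open("sam2_pipeline.json") as fh:
    spec = json.load(fh)

# Constructor parameters appear in the spec; tensors and model state do not.
print(json.dumps(spec, indent=2))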

SLURM Deployment#

When a pipeline contains a GpuDetector operation (either Sam2Detector or MicroSamDetector), the CLI automatically adapts:

Local execution: Forces sequential processing (n_jobs=1) to avoid multiple workers competing for the same GPU.

SLURM execution: Automatically adds --gpus-per-node=1 to the SLURM job if GPU resources were not explicitly requested.

# GPU resources are auto-requested when the pipeline contains a GpuDetector
python -m phenotypic sam2_pipeline.json /plates/ /output/

# Override with explicit SLURM GPU arguments
python -m phenotypic sam2_pipeline.json /plates/ /output/ \
    --slurm-args slurm_gpus_per_node=2 \
    --slurm-args slurm_partition=gpu

Pre-cache checkpoints on the login node before submitting (see “Downloading Model Checkpoints” above).

Device Selection#

Both detectors accept a device parameter that controls where inference runs.

Automatic detection (default)#

With device="auto" (the default), PhenoTypic probes accelerators in priority order:

  1. CUDA – NVIDIA GPUs

  2. MPS – Apple Silicon (macOS)

  3. XPU – Intel GPUs

  4. HPU – Habana Gaudi accelerators

If none is found, a RuntimeError is raised.
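A minimal sketch of that probing order using standard torch availability checks (the resolve_device() utility described below is the real implementation; this is illustrative only):

import torch

def probe_accelerator() -> str:
    # Mirrors the documented priority: CUDA > MPS > XPU > HPU.
    if torch.cuda.is_available():
        return "cuda"
    if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        return "mps"
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    if hasattr(torch, "hpu") and torch.hpu.is_available():
        return "hpu"
    raise RuntimeError("No accelerator available")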

Explicit device#

# Force a specific device
Sam2Detector(device="cuda")   # NVIDIA GPU
Sam2Detector(device="mps")    # Apple Silicon
Sam2Detector(device="xpu")    # Intel GPU
Sam2Detector(device="cpu")    # CPU (very slow, but always available)

When an explicit accelerator is requested but unavailable, a RuntimeError is raised with a descriptive message.

resolve_device() utility#

The device resolution logic is available as a standalone function for custom workflows:

from phenotypic.nn._checkpoint_manager import resolve_device

device = resolve_device("auto")           # raises if no accelerator
device = resolve_device("auto", allow_cpu=True)  # falls back to CPU with warning

Listing and Clearing Models#

List cached checkpoints#

python -m phenotypic.nn list

This prints a table showing all cached SAM2 and micro-sam checkpoints with their file sizes and paths.

Clear cached checkpoints#

# Clear all cached checkpoints (prompts for confirmation)
python -m phenotypic.nn clear

# Clear only SAM2 checkpoints
python -m phenotypic.nn clear --model-type sam2

# Clear only micro-sam checkpoints
python -m phenotypic.nn clear --model-type microsam

Troubleshooting#

ImportError: Sam2Detector requires the sam2 package#

PyTorch and the model packages are not installed. Install the torch extra:

uv add "phenotypic[torch]"

(Linux/macOS only — sam2 is not packaged for Windows.)

ImportError: MicroSamDetector requires the micro_sam package#

micro_sam is conda-only and must be installed separately. See Enabling micro_sam (optional, self-service) above.

RuntimeError: No accelerator available#

No GPU was detected. Options:

  • Ensure your GPU drivers and CUDA toolkit are installed correctly.

  • On macOS with Apple Silicon, ensure PyTorch >= 2.0 with MPS support.

  • Pass device="cpu" to force CPU inference (very slow):

Sam2Detector(device="cpu")

RuntimeError: device='cuda' requested but CUDA is not available#

CUDA was explicitly requested but is not available. Check:

  • nvidia-smi shows your GPU.

  • PyTorch was installed with CUDA support (torch.cuda.is_available() returns True).

  • On SLURM, the job was submitted to a GPU partition.
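The first two can be checked quickly from the job's environment:

nvidia-smi
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"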

Out of memory (OOM) errors#

SAM models require significant GPU memory. To reduce VRAM usage:

  • Use a smaller model: Sam2Detector(model_size="tiny") instead of "large".

  • Use MicroSamDetector(model_type="vit_t_lm") for the smallest micro-sam model.

  • Reduce points_per_side (e.g., 16 instead of 32) to generate fewer candidate masks.

  • Process smaller images or downscale before detection.
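For the last point, a sketch of downscaling with scikit-image before detection (rgb stands in for your plate image as a NumPy array; how a PhenoTypic image is constructed from the result is not shown here):

import numpy as np
from skimage.transform import rescale

rgb = np.zeros((2048, 2048, 3), dtype=np.uint8)   # placeholder for your plate image

# Halve each spatial dimension; rescale returns floats in [0, 1]
small = rescale(rgb, 0.5, channel_axis=-1, anti_aliasing=True)
small = (small * 255).astype(np.uint8)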

Checkpoint not found on SLURM compute nodes#

Compute nodes often lack internet access. Pre-download checkpoints on the login node:

python -m phenotypic.nn download --model-type sam2 --model-size tiny
python -m phenotypic.nn download --model-type microsam --model-name vit_b_lm
python -m phenotypic.nn list  # verify

Ensure TORCH_HOME and MICROSAM_CACHEDIR (if customised) point to a shared filesystem accessible from compute nodes.