GPU-Accelerated Colony Detection#
Set up and use deep-learning-based colony detectors (SAM2, micro-sam) with GPU acceleration.
Installation#
The two GPU detectors have different packaging constraints:
| Detector | Package(s) needed | Available via | CUDA-capable? |
|---|---|---|---|
| Sam2Detector | torch, torchvision, sam2 | PyPI (ships in the phenotypic[torch] extra) | Yes — Linux + CUDA |
| MicroSamDetector | micro_sam | conda-forge only, not on PyPI | CPU by default; user-managed CUDA possible |
PhenoTypic itself is distributed via PyPI and managed with uv. micro_sam
is not published on PyPI, so it is not included in any phenotypic
extra. Users who need MicroSamDetector must install micro_sam
themselves; the recipe below uses pixi for that.
Installing Sam2Detector (PyPI-only)#
On Linux or macOS:
uv add "phenotypic[torch]" # torch + torchvision + sam2
# or, inside a uv-managed project:
uv sync --extra torch
The torch extra is not available on Windows — sam2 requires CUDA nvcc
and has no pre-built Windows wheels. Use WSL2 (Ubuntu) instead.
Enabling micro_sam (optional, self-service)#
micro_sam is only published on conda-forge. Because PhenoTypic does not
own your environment, we recommend managing the combined stack in your own
project with pixi, which speaks both conda-forge and
PyPI in a single lockfile. Create a pixi.toml in your project (not in
PhenoTypic):
[project]
name = "my-phenotyping-project"
channels = ["conda-forge"]
platforms = ["osx-arm64", "linux-64", "win-64"]
[pypi-dependencies]
phenotypic = "*"
# Or, while developing against a local checkout:
# phenotypic = { path = "../PhenoTypic", editable = true }
[dependencies]
micro_sam = "*"
Then:
pixi install
pixi run python -m phenotypic pipeline.json /plates/ /output/
Because the conda-forge micro_sam package depends on a CPU-only conda build of PyTorch, combining it with Sam2Detector’s CUDA wheels in the same environment requires extra care (the conda torch will typically win). Keep SAM2 and micro-sam work in separate environments if you need GPU acceleration for both.
MicroSamDetector is importable from phenotypic.nn even when
micro_sam is missing; the ImportError is deferred to the first
apply() call and points back at these instructions.
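The deferred-import behaviour follows a common pattern: the heavy optional dependency is imported on first use rather than at module import time. A minimal standalone sketch of that pattern (the class name and error message here are illustrative, not PhenoTypic’s actual implementation):

```python
import importlib


class LazyBackendDetector:
    """Illustrative detector whose heavy backend is imported lazily.

    Importing this class always succeeds; a missing backend only
    surfaces as an ImportError on the first apply() call.
    """

    backend_module = "micro_sam"  # optional conda-forge dependency

    def __init__(self):
        self._backend = None  # populated on first apply()

    def apply(self, image):
        if self._backend is None:
            try:
                self._backend = importlib.import_module(self.backend_module)
            except ImportError as err:
                raise ImportError(
                    f"{type(self).__name__} requires {self.backend_module!r}; "
                    "see the installation instructions above."
                ) from err
        # ... run segmentation with self._backend ...
```

This keeps the import cheap and lets pipelines be constructed and serialized on machines where the backend is absent.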
Alternative: pip + conda#
If you already manage your environment with conda:
pip install phenotypic # base (or phenotypic[torch] on non-Windows)
conda install -c conda-forge micro_sam # adds MicroSamDetector support
Downloading Model Checkpoints#
Both SAM2 and micro-sam download checkpoints automatically on first use. However, on SLURM clusters the compute nodes often lack internet access, so you should pre-download checkpoints on a login node before submitting jobs.
SAM2 checkpoints#
# Download the default (tiny) SAM2 checkpoint
python -m phenotypic.nn download
# Download a specific size
python -m phenotypic.nn download --model-type sam2 --model-size large
# Download all SAM2 sizes at once
python -m phenotypic.nn download --model-type sam2 --all
# Force re-download even if cached
python -m phenotypic.nn download --model-type sam2 --model-size tiny --force
SAM2 checkpoints are stored in the torch.hub cache directory
(~/.cache/torch/hub/checkpoints/ by default). Set the TORCH_HOME
environment variable to change this location.
Available SAM2 sizes: tiny (~39 MB), small, base_plus, large (~900 MB).
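The cache location lookup can be sketched as follows (a simplified mirror of torch.hub’s resolution logic, not the actual torch code):

```python
import os
from pathlib import Path


def sam2_checkpoint_dir() -> Path:
    """Where SAM2 checkpoints land: $TORCH_HOME/hub/checkpoints,
    falling back to ~/.cache/torch/hub/checkpoints (torch.hub also
    honours $XDG_CACHE_HOME for the cache root)."""
    torch_home = os.environ.get("TORCH_HOME")
    if not torch_home:
        cache_root = os.environ.get(
            "XDG_CACHE_HOME", os.path.join(os.path.expanduser("~"), ".cache")
        )
        torch_home = os.path.join(cache_root, "torch")
    return Path(torch_home) / "hub" / "checkpoints"
```

Pointing TORCH_HOME at a shared filesystem is the usual way to make one download visible to every cluster node.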
micro-sam checkpoints#
# Download the default (vit_b_lm) micro-sam model
python -m phenotypic.nn download --model-type microsam
# Download a specific model
python -m phenotypic.nn download --model-type microsam --model-name vit_l_lm
# Download all micro-sam models
python -m phenotypic.nn download --model-type microsam --all
micro-sam stores checkpoints via platformdirs. Set MICROSAM_CACHEDIR to
override the cache location.
SLURM pre-caching workflow#
On a cluster, download models on the login node first:
# On the login node (has internet access)
python -m phenotypic.nn download --model-type sam2 --model-size tiny
python -m phenotypic.nn download --model-type microsam --model-name vit_b_lm
# Verify they are cached
python -m phenotypic.nn list
# Now submit SLURM jobs -- compute nodes will use the cached checkpoints
python -m phenotypic pipeline.json /plates/ /output/
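For batch submission, a minimal sbatch script might look like the following (job name, time limit, and paths are assumptions for illustration):

```shell
#!/bin/bash
#SBATCH --job-name=colony-detect
#SBATCH --gpus-per-node=1
#SBATCH --time=02:00:00

# Checkpoints were pre-downloaded on the login node, so the compute
# node reads them from the shared cache instead of the network.
python -m phenotypic pipeline.json /plates/ /output/
```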
Using Sam2Detector#
Sam2Detector wraps Meta’s SAM2 automatic mask generator. It lays a grid of
prompt points over the RGB image, predicts masks at each point, filters by
quality, and assembles a labelled object map.
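The final assembly step can be sketched as follows (a simplified stand-in for the real implementation; masks_to_labels is not part of the PhenoTypic API):

```python
import numpy as np


def masks_to_labels(masks: list) -> np.ndarray:
    """Combine binary masks into a single labelled object map.

    Larger masks are painted first so that smaller, overlapping
    objects keep their own labels.
    """
    labels = np.zeros_like(masks[0], dtype=np.int32)
    ordered = sorted(masks, key=lambda m: int(m.sum()), reverse=True)
    for i, mask in enumerate(ordered, start=1):
        labels[mask] = i
    return labels
```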
from phenotypic.nn import Sam2Detector
# Basic usage with default parameters
detector = Sam2Detector()
# Tuned for dense plates with small colonies
detector = Sam2Detector(
model_size="small",
points_per_side=48,
pred_iou_thresh=0.6,
min_mask_region_area=200,
)
# Apply to an image (downloads checkpoint on first use)
result = detector.apply(image)
print(result.num_objects)
Parameter tuning for colony detection#
points_per_side (default 32): Controls the density of the prompt grid. Use 16 for large, well-separated colonies. Increase to 48–64 for dense plates with many small colonies. Higher values increase inference time quadratically.
pred_iou_thresh (default 0.7): Minimum predicted IoU for keeping a mask. Raise to 0.85–0.95 for conservative detection (fewer false positives); lower to 0.5 to catch faint or ambiguous colonies.
stability_score_thresh (default 0.92): Filters masks by boundary stability. Higher values keep only masks with crisp edges.
min_mask_region_area (default 100): Minimum mask area in pixels. Increase to suppress agar texture, dust, and other small artefacts that SAM2 segments as objects. Typical range: 50–500 depending on image resolution.
model_size (default "tiny"): "tiny" is fastest and sufficient for most colony plates. Use "large" for maximum mask quality on publication figures.
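Because the prompts form a points_per_side × points_per_side grid, inference cost scales with the square of this value. A sketch of the grid construction, mirroring how SAM-style automatic mask generators lay out prompts (this helper is for illustration, not part of the PhenoTypic API):

```python
import numpy as np


def build_point_grid(points_per_side: int) -> np.ndarray:
    """Evenly spaced prompt points in normalised [0, 1] coordinates."""
    offset = 1 / (2 * points_per_side)
    coords = np.linspace(offset, 1 - offset, points_per_side)
    xx, yy = np.meshgrid(coords, coords)
    return np.stack([xx.ravel(), yy.ravel()], axis=-1)  # shape (n*n, 2)
```

Doubling points_per_side from 32 to 64 quadruples the prompt count (1024 to 4096), which is why dense grids are best reserved for crowded plates.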
Using MicroSamDetector#
MicroSamDetector uses SAM models finetuned on large-scale microscopy
datasets. It is particularly effective for brightfield and darkfield
microscopy images of agar plates.
from phenotypic.nn import MicroSamDetector
# Default: ViT-Base light microscopy model
detector = MicroSamDetector()
# Use the larger model for higher accuracy
detector = MicroSamDetector(model_type="vit_l_lm")
result = detector.apply(image)
Model selection#
Light microscopy models (recommended for agar plate imaging):
"vit_t_lm"– ViT-Tiny, fastest, good for rapid screening"vit_b_lm"– ViT-Base (default), best speed/accuracy trade-off"vit_l_lm"– ViT-Large, highest accuracy, most VRAM
Electron microscopy models (for organelle segmentation):
"vit_b_em_organelles"– ViT-Base"vit_l_em_organelles"– ViT-Large
Base SAM checkpoints (without microscopy finetuning):
"vit_t","vit_b","vit_l","vit_h"
Pipeline Integration#
GPU detectors work like any other PhenoTypic operation in a pipeline:
import phenotypic as pht
from phenotypic.nn import Sam2Detector
from phenotypic.measure import SizeMeasurer
pipeline = pht.ImagePipeline(
ops=[Sam2Detector(model_size="tiny", points_per_side=32)],
measurer=SizeMeasurer(),
name="sam2_colony_pipeline",
)
# Run the pipeline
results = pipeline.operate([image])
df = pipeline.measure([image])
JSON serialization#
Pipelines containing GPU detectors can be saved and loaded just like any other pipeline. The detector parameters are serialized; the model weights are not (they are re-downloaded or loaded from cache when needed):
# Save
pipeline.to_json("sam2_pipeline.json")
# Load -- works without torch installed (model loads lazily on apply)
restored = pht.ImagePipeline.from_json("sam2_pipeline.json")
Internal state (attributes prefixed with _, such as the loaded model) is
excluded from serialization. The model is rebuilt transparently on the next
call to apply.
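The exclusion rule amounts to filtering instance attributes by name. A minimal sketch (serializable_params and ToyDetector are illustrative, not PhenoTypic’s actual serializer):

```python
def serializable_params(obj) -> dict:
    """Keep only public attributes; underscore-prefixed state such as
    a loaded model is dropped and rebuilt lazily after loading."""
    return {
        name: value
        for name, value in vars(obj).items()
        if not name.startswith("_")
    }


class ToyDetector:
    def __init__(self):
        self.model_size = "tiny"
        self.points_per_side = 32
        self._model = object()  # loaded weights: never serialized
```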
SLURM Deployment#
When a pipeline contains a GpuDetector operation (either Sam2Detector or
MicroSamDetector), the CLI automatically adapts:
Local execution: Forces sequential processing (n_jobs=1) to avoid
multiple workers competing for the same GPU.
SLURM execution: Automatically adds --gpus-per-node=1 to the SLURM
job if GPU resources were not explicitly requested.
# GPU resources are auto-requested when the pipeline contains a GpuDetector
python -m phenotypic sam2_pipeline.json /plates/ /output/
# Override with explicit SLURM GPU arguments
python -m phenotypic sam2_pipeline.json /plates/ /output/ \
--slurm-args slurm_gpus_per_node=2 \
--slurm-args slurm_partition=gpu
Pre-cache checkpoints on the login node before submitting (see “Downloading Model Checkpoints” above).
Device Selection#
Both detectors accept a device parameter that controls where inference runs.
Automatic detection (default)#
With device="auto" (the default), PhenoTypic probes accelerators in priority
order:
CUDA – NVIDIA GPUs
MPS – Apple Silicon (macOS)
XPU – Intel GPUs
HPU – Habana Gaudi accelerators
If none is found, a RuntimeError is raised.
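The priority probe can be sketched generically as follows. With torch installed, the availability checks would be torch.cuda.is_available(), torch.backends.mps.is_available(), and so on; this standalone version stubs them out for illustration:

```python
def first_available_device(probes, allow_cpu=False) -> str:
    """Return the first device whose availability check passes,
    in priority order; optionally fall back to CPU."""
    for name, is_available in probes:
        if is_available():
            return name
    if allow_cpu:
        return "cpu"
    raise RuntimeError("No accelerator available")


# Probes listed in the priority order above (all stubbed out here).
probes = [
    ("cuda", lambda: False),
    ("mps", lambda: False),
    ("xpu", lambda: False),
    ("hpu", lambda: False),
]
```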
Explicit device#
# Force a specific device
Sam2Detector(device="cuda") # NVIDIA GPU
Sam2Detector(device="mps") # Apple Silicon
Sam2Detector(device="xpu") # Intel GPU
Sam2Detector(device="cpu") # CPU (very slow, but always available)
When an explicit accelerator is requested but unavailable, a RuntimeError
is raised with a descriptive message.
resolve_device() utility#
The device resolution logic is available as a standalone function for custom workflows:
from phenotypic.nn._checkpoint_manager import resolve_device
device = resolve_device("auto") # raises if no accelerator
device = resolve_device("auto", allow_cpu=True) # falls back to CPU with warning
Listing and Clearing Models#
List cached checkpoints#
python -m phenotypic.nn list
This prints a table showing all cached SAM2 and micro-sam checkpoints with their file sizes and paths.
Clear cached checkpoints#
# Clear all cached checkpoints (prompts for confirmation)
python -m phenotypic.nn clear
# Clear only SAM2 checkpoints
python -m phenotypic.nn clear --model-type sam2
# Clear only micro-sam checkpoints
python -m phenotypic.nn clear --model-type microsam
Troubleshooting#
ImportError: Sam2Detector requires the sam2 package#
PyTorch and the model packages are not installed. Install the torch extra:
uv add "phenotypic[torch]"
(Linux/macOS only — sam2 is not packaged for Windows.)
ImportError: MicroSamDetector requires the micro_sam package#
micro_sam is conda-only and must be installed separately. See “Enabling micro_sam (optional, self-service)” above.
RuntimeError: No accelerator available#
No GPU was detected. Options:
Ensure your GPU drivers and CUDA toolkit are installed correctly.
On macOS with Apple Silicon, ensure PyTorch >= 2.0 with MPS support.
Pass device="cpu" to force CPU inference (very slow):
Sam2Detector(device="cpu")
RuntimeError: device='cuda' requested but CUDA is not available#
CUDA was explicitly requested but is not available. Check:
nvidia-smi shows your GPU.
PyTorch was installed with CUDA support (torch.cuda.is_available() returns True).
On SLURM, the job was submitted to a GPU partition.
Out of memory (OOM) errors#
SAM models require significant GPU memory. To reduce VRAM usage:
Use a smaller model: Sam2Detector(model_size="tiny") instead of "large".
Use MicroSamDetector(model_type="vit_t_lm") for the smallest micro-sam model.
Reduce points_per_side (e.g., 16 instead of 32) to generate fewer candidate masks.
Process smaller images or downscale before detection.
Checkpoint not found on SLURM compute nodes#
Compute nodes often lack internet access. Pre-download checkpoints on the login node:
python -m phenotypic.nn download --model-type sam2 --model-size tiny
python -m phenotypic.nn download --model-type microsam --model-name vit_b_lm
python -m phenotypic.nn list # verify
Ensure TORCH_HOME and MICROSAM_CACHEDIR (if customised) point to a
shared filesystem accessible from compute nodes.