Tutorial 7: Measuring and Exporting#

Detection tells you where colonies are. Measurement tells you what they are — how big, how round, how bright. In this tutorial you will add measurements to a pipeline, extract a DataFrame of colony features, and export the results for downstream analysis.

What you will learn:

  1. Add measurement operations to a pipeline

  2. Use pipeline.apply_and_measure() to get a DataFrame

  3. Understand the output columns

  4. Export to CSV and Parquet

Imports#

[1]:
import phenotypic as pht
from phenotypic.data import load_yeast_plate
from phenotypic.enhance import GaussianBlur, CLAHE
from phenotypic.detect import OtsuDetector
from phenotypic.measure import MeasureSize, MeasureShape, MeasureIntensity

Build a Pipeline with Measurements#

The meas parameter accepts a list of measurement operations. Each one extracts a different set of features from the detected colonies.

[2]:
plate = load_yeast_plate()

pipeline = pht.ImagePipeline(
    ops=[GaussianBlur(sigma=2.0), CLAHE(clip_limit=0.01), OtsuDetector()],
    meas=[MeasureSize(), MeasureShape(), MeasureIntensity()],
)

Apply and Measure#

.apply_and_measure() runs the full pipeline (enhance → detect → measure) and returns a pandas DataFrame with one row per detected colony.

[3]:
df = pipeline.apply_and_measure(plate)
print(f"Measured {len(df)} colonies across {df.shape[1]} features")
df.head()
Measured 9 colonies across 50 features
[3]:
Metadata_FileSuffix Metadata_BitDepth Metadata_ImageType Metadata_ImageName ObjectLabel Bbox_CenterRR Bbox_CenterCC Bbox_MinRR Bbox_MinCC Bbox_MaxRR ... Intensity_MaximumIntensity Intensity_MeanIntensity Intensity_MedianIntensity Intensity_StandardDeviationIntensity Intensity_CoefficientVarianceIntensity Intensity_LowerQuartileIntensity Intensity_UpperQuartileIntensity Intensity_InterquartileRangeIntensity Intensity_Density Intensity_ConvexDensity
0 .png 8 GridImage RhodotorulaYeastCenterCrop 1 189.366746 190.568832 101 108 279 ... 0.686547 0.585904 0.595740 0.042424 0.072411 0.562494 0.616740 0.054246 0.585904 24.268837
1 .png 8 GridImage RhodotorulaYeastCenterCrop 2 180.748935 1003.205256 107 930 258 ... 0.695336 0.591194 0.600607 0.043489 0.073566 0.577300 0.618635 0.041335 0.591194 21.350965
2 .png 8 GridImage RhodotorulaYeastCenterCrop 3 179.459997 1409.055778 107 1338 253 ... 0.672282 0.575144 0.582211 0.037242 0.064757 0.559643 0.599775 0.040132 0.575144 20.379085
3 .png 8 GridImage RhodotorulaYeastCenterCrop 4 184.220957 598.966905 116 534 251 ... 0.659846 0.565152 0.574650 0.041081 0.072696 0.552847 0.590872 0.038024 0.565152 18.114211
4 .png 8 GridImage RhodotorulaYeastCenterCrop 5 164.000000 1074.000000 164 1074 165 ... 0.417128 0.417128 0.417128 0.000000 0.000000 0.417128 0.417128 0.000000 0.417128 NaN

5 rows × 50 columns

Explore the Columns#

Each measurement operation contributes its own set of columns. Let’s see what we got.

[4]:
print("All columns:")
for col in df.columns:
    print(f"  {col}")
All columns:
  Metadata_FileSuffix
  Metadata_BitDepth
  Metadata_ImageType
  Metadata_ImageName
  ObjectLabel
  Bbox_CenterRR
  Bbox_CenterCC
  Bbox_MinRR
  Bbox_MinCC
  Bbox_MaxRR
  Bbox_MaxCC
  Bbox_IntensityWeightedCenterRR
  Bbox_IntensityWeightedCenterCC
  Bbox_DistWeightedCenterRR
  Bbox_DistWeightedCenterCC
  Grid_RowNum
  Grid_ColNum
  Grid_RowMajorIdx
  Grid_ColMajorIdx
  Size_Area
  Size_IntegratedIntensity
  Shape_Area
  Shape_Perimeter
  Shape_Circularity
  Shape_ConvexArea
  Shape_MedianRadius
  Shape_MeanRadius
  Shape_MaxRadius
  Shape_MinFeretDiameter
  Shape_MaxFeretDiameter
  Shape_Eccentricity
  Shape_Solidity
  Shape_Extent
  Shape_BboxArea
  Shape_MajorAxisLength
  Shape_MinorAxisLength
  Shape_Compactness
  Shape_Orientation
  Intensity_IntegratedIntensity
  Intensity_MinimumIntensity
  Intensity_MaximumIntensity
  Intensity_MeanIntensity
  Intensity_MedianIntensity
  Intensity_StandardDeviationIntensity
  Intensity_CoefficientVarianceIntensity
  Intensity_LowerQuartileIntensity
  Intensity_UpperQuartileIntensity
  Intensity_InterquartileRangeIntensity
  Intensity_Density
  Intensity_ConvexDensity
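
With this many columns, it can help to group the names by their prefix. The sketch below works on a hand-copied subset of the names printed above and assumes that the text before the first underscore identifies the source of a column. That convention is inferred from the listing, not a documented guarantee, and group_by_prefix is a hypothetical helper, not part of PhenoTypic.

```python
from collections import defaultdict

# A subset of the column names printed above, used here as plain strings.
columns = [
    "Metadata_ImageName",
    "ObjectLabel",
    "Bbox_CenterRR",
    "Grid_RowNum",
    "Size_Area",
    "Size_IntegratedIntensity",
    "Shape_Area",
    "Shape_Circularity",
    "Intensity_MeanIntensity",
    "Intensity_MedianIntensity",
]


def group_by_prefix(names):
    """Group column names by the text before the first underscore."""
    groups = defaultdict(list)
    for name in names:
        prefix, sep, _ = name.partition("_")
        # Names without an underscore (e.g. ObjectLabel) get their own bucket.
        groups[prefix if sep else "(no prefix)"].append(name)
    return dict(groups)


groups = group_by_prefix(columns)
for prefix, names in groups.items():
    print(f"{prefix}: {len(names)} column(s)")
```

On the real table you could pass df.columns instead of the hand-written list.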

Here are the highlights from each measurement:

MeasureSize:

  • Size_Area — colony size in pixels

  • Size_IntegratedIntensity — sum of grayscale pixel values

MeasureShape:

  • Shape_Circularity — how round the colony is (1.0 = perfect circle)

  • Shape_Solidity — ratio of colony area to convex hull area

  • Shape_Eccentricity — elongation (0 = circular, approaching 1 = elongated)

  • Shape_MajorAxisLength / Shape_MinorAxisLength — fitted ellipse axes

MeasureIntensity:

  • Intensity_MeanIntensity / Intensity_MedianIntensity — average colony brightness

  • Intensity_StandardDeviationIntensity — variation within the colony

  • Intensity_MinimumIntensity / Intensity_MaximumIntensity — intensity extremes
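
To build some intuition for Shape_Circularity, the sketch below evaluates the common isoperimetric definition, 4πA / P², on a few ideal shapes. This tutorial does not show the formula PhenoTypic uses, so treat the definition here as an assumption. The circularity function is a hypothetical helper, and values measured on pixel images will differ from these ideal ones because perimeters on a pixel grid are only estimates.

```python
import math


def circularity(area, perimeter):
    """Isoperimetric quotient 4*pi*A / P**2 (assumed definition)."""
    return 4.0 * math.pi * area / perimeter ** 2


r = 10.0  # circle radius
s = 10.0  # square side; the rectangle below is 2s by s

circle = circularity(math.pi * r ** 2, 2.0 * math.pi * r)
square = circularity(s ** 2, 4.0 * s)
rectangle = circularity(2.0 * s * s, 6.0 * s)

print(f"circle:    {circle:.4f}")
print(f"square:    {square:.4f}")
print(f"rectangle: {rectangle:.4f}")
```

Under this definition the circle scores highest, and the score drops as the outline becomes less compact.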

Quick Statistics#

Since the result is a standard pandas DataFrame, you can use all the usual pandas methods to explore it.

[5]:
df[["Size_Area", "Shape_Circularity", "Intensity_MeanIntensity"]].describe()
[5]:
          Size_Area  Shape_Circularity  Intensity_MeanIntensity
count      9.000000           7.000000                 9.000000
mean   13022.000000           0.645563                 0.550580
std     7822.634291           0.067261                 0.066774
min        1.000000           0.510927                 0.417128
25%    13265.000000           0.637557                 0.565152
50%    15974.000000           0.654747                 0.575144
75%    17006.000000           0.674367                 0.585904
max    22337.000000           0.729419                 0.611042
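
Two details in this summary are worth a second look: the minimum Size_Area is a single pixel, and Shape_Circularity has a count of 7 rather than 9. One plausible reading is that a couple of detected objects are specks rather than colonies and have no defined circularity. The sketch below shows one way to drop such rows with ordinary pandas filtering. The table is a small stand-in with illustrative values, and MIN_AREA_PX is a hypothetical threshold you would tune for your own images.

```python
import numpy as np
import pandas as pd

# Illustrative values only: a tiny stand-in for the real measurement table.
df = pd.DataFrame(
    {
        "ObjectLabel": [1, 2, 3, 4, 5],
        "Size_Area": [17006, 15974, 13265, 22337, 1],
        "Shape_Circularity": [0.654747, 0.674367, 0.637557, 0.729419, np.nan],
    }
)

MIN_AREA_PX = 50  # hypothetical threshold; tune it for your images

# Keep objects that are large enough and have a defined circularity.
is_colony = (df["Size_Area"] >= MIN_AREA_PX) & df["Shape_Circularity"].notna()
colonies = df[is_colony].reset_index(drop=True)

print(f"Kept {len(colonies)} of {len(df)} objects")
```

Filtering before computing statistics keeps a single-pixel object from pulling down the mean and inflating the standard deviation.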

Export to CSV#

For sharing with collaborators or importing into spreadsheet software, export to CSV.

[6]:
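# index=False leaves out the pandas row index; ObjectLabel already identifies each colony
df.to_csv("colony_measurements.csv", index=False)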
print("Saved to colony_measurements.csv")
Saved to colony_measurements.csv
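
One detail to be aware of when writing CSV: by default, DataFrame.to_csv() also writes the row index, which comes back as an extra unnamed column when the file is read again. The sketch below shows the difference using an in-memory buffer instead of a file. The small table reuses a few values from the output above as a stand-in for the full result.

```python
import io

import pandas as pd

# A small stand-in for the measurement table.
df = pd.DataFrame(
    {
        "ObjectLabel": [1, 2, 3],
        "Intensity_MeanIntensity": [0.585904, 0.591194, 0.575144],
    }
)

# Round-trip through CSV text, with and without the row index.
with_index = pd.read_csv(io.StringIO(df.to_csv()))
without_index = pd.read_csv(io.StringIO(df.to_csv(index=False)))

print(list(with_index.columns))
print(list(without_index.columns))
```

Passing index=False keeps the exported file limited to the measurement columns, which is usually what spreadsheet users expect.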

Export to Parquet#

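For large datasets, Parquet is usually the better choice: it is a compressed, columnar format that preserves column types and typically loads faster than CSV. Note that DataFrame.to_parquet() needs a Parquet engine such as pyarrow or fastparquet to be installed.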

[7]:
df.to_parquet("colony_measurements.parquet")
print("Saved to colony_measurements.parquet")
Saved to colony_measurements.parquet
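
To make "preserves column types" concrete, the sketch below round-trips a small table through both formats and compares the dtype of one column. The uint8 dtype for Metadata_BitDepth is an assumption chosen for illustration, not necessarily what PhenoTypic produces. The Parquet half is skipped if no Parquet engine is installed.

```python
import io
import os
import tempfile

import pandas as pd

# Stand-in table; the uint8 dtype is an assumption for illustration.
df = pd.DataFrame(
    {
        "Metadata_BitDepth": pd.Series([8, 8, 8], dtype="uint8"),
        "ObjectLabel": [1, 2, 3],
        "Intensity_MeanIntensity": [0.585904, 0.591194, 0.575144],
    }
)

# CSV stores plain text, so the reader has to infer every column type again.
csv_back = pd.read_csv(io.StringIO(df.to_csv(index=False)))
print("original:     ", df["Metadata_BitDepth"].dtype)
print("after CSV:    ", csv_back["Metadata_BitDepth"].dtype)

# Parquet stores the schema with the data (needs pyarrow or fastparquet).
parquet_back = None
try:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "colonies.parquet")
        df.to_parquet(path)
        parquet_back = pd.read_parquet(path)
except ImportError:
    print("No Parquet engine installed; skipping the Parquet half")
else:
    print("after Parquet:", parquet_back["Metadata_BitDepth"].dtype)
```

The values survive both round trips; the difference is that the CSV reader has to infer each type again, while Parquet reads it from the stored schema.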

Clean Up#

[8]:
import os
os.remove("colony_measurements.csv")
os.remove("colony_measurements.parquet")

Summary#

You have extracted colony features and exported them for analysis:

  • meas=[MeasureSize(), MeasureShape(), MeasureIntensity()]: add measurements to a pipeline

  • pipeline.apply_and_measure(plate): run the full pipeline and get a DataFrame

  • .to_csv() / .to_parquet(): export for downstream tools

The result is a standard pandas DataFrame, so you can filter, group, plot, and analyze it with any tool in the Python ecosystem.

Next up: Tutorial 8: Using Prefab Pipelines — discover PhenoTypic’s pre-built pipelines for common organisms and plate types.