Tutorial 7: Measuring and Exporting

Detection tells you where colonies are. Measurement tells you what they are — how big, how round, how bright. In this tutorial you will add measurements to a pipeline, extract a DataFrame of colony features, and export the results for downstream analysis.

What you will learn:

Add measurement operations to a pipeline
Use pipeline.apply_and_measure() to get a DataFrame
Understand the output columns
Export to CSV and Parquet

Imports

[1]:

import phenotypic as pht
from phenotypic.data import load_yeast_plate
from phenotypic.enhance import GaussianBlur, CLAHE
from phenotypic.detect import OtsuDetector
from phenotypic.measure import MeasureSize, MeasureShape, MeasureIntensity

Build a Pipeline with Measurements

The meas parameter accepts a list of measurement operations. Each one extracts a different set of features from the detected colonies.

[2]:

plate = load_yeast_plate()

pipeline = pht.ImagePipeline(
    ops=[GaussianBlur(sigma=2.0), CLAHE(clip_limit=0.01), OtsuDetector()],
    meas=[MeasureSize(), MeasureShape(), MeasureIntensity()],
)

Apply and Measure

.apply_and_measure() runs the full pipeline (enhance → detect → measure) and returns a pandas DataFrame with one row per detected colony.

[3]:

df = pipeline.apply_and_measure(plate)
print(f"Measured {len(df)} colonies across {df.shape[1]} features")
df.head()

Measured 9 colonies across 50 features

[3]:

	Metadata_FileSuffix	Metadata_BitDepth	Metadata_ImageType	Metadata_ImageName	ObjectLabel	Bbox_CenterRR	Bbox_CenterCC	Bbox_MinRR	Bbox_MinCC	Bbox_MaxRR	...	Intensity_MaximumIntensity	Intensity_MeanIntensity	Intensity_MedianIntensity	Intensity_StandardDeviationIntensity	Intensity_CoefficientVarianceIntensity	Intensity_LowerQuartileIntensity	Intensity_UpperQuartileIntensity	Intensity_InterquartileRangeIntensity	Intensity_Density	Intensity_ConvexDensity
0	.png	8	GridImage	RhodotorulaYeastCenterCrop	1	189.366746	190.568832	101	108	279	...	0.686547	0.585904	0.595740	0.042424	0.072411	0.562494	0.616740	0.054246	0.585904	24.268837
1	.png	8	GridImage	RhodotorulaYeastCenterCrop	2	180.748935	1003.205256	107	930	258	...	0.695336	0.591194	0.600607	0.043489	0.073566	0.577300	0.618635	0.041335	0.591194	21.350965
2	.png	8	GridImage	RhodotorulaYeastCenterCrop	3	179.459997	1409.055778	107	1338	253	...	0.672282	0.575144	0.582211	0.037242	0.064757	0.559643	0.599775	0.040132	0.575144	20.379085
3	.png	8	GridImage	RhodotorulaYeastCenterCrop	4	184.220957	598.966905	116	534	251	...	0.659846	0.565152	0.574650	0.041081	0.072696	0.552847	0.590872	0.038024	0.565152	18.114211
4	.png	8	GridImage	RhodotorulaYeastCenterCrop	5	164.000000	1074.000000	164	1074	165	...	0.417128	0.417128	0.417128	0.000000	0.000000	0.417128	0.417128	0.000000	0.417128	NaN

5 rows × 50 columns

Explore the Columns

Each measurement operation contributes its own set of columns. Let’s see what we got.

[4]:

print("All columns:")
for col in df.columns:
    print(f"  {col}")

All columns:
  Metadata_FileSuffix
  Metadata_BitDepth
  Metadata_ImageType
  Metadata_ImageName
  ObjectLabel
  Bbox_CenterRR
  Bbox_CenterCC
  Bbox_MinRR
  Bbox_MinCC
  Bbox_MaxRR
  Bbox_MaxCC
  Bbox_IntensityWeightedCenterRR
  Bbox_IntensityWeightedCenterCC
  Bbox_DistWeightedCenterRR
  Bbox_DistWeightedCenterCC
  Grid_RowNum
  Grid_ColNum
  Grid_RowMajorIdx
  Grid_ColMajorIdx
  Size_Area
  Size_IntegratedIntensity
  Shape_Area
  Shape_Perimeter
  Shape_Circularity
  Shape_ConvexArea
  Shape_MedianRadius
  Shape_MeanRadius
  Shape_MaxRadius
  Shape_MinFeretDiameter
  Shape_MaxFeretDiameter
  Shape_Eccentricity
  Shape_Solidity
  Shape_Extent
  Shape_BboxArea
  Shape_MajorAxisLength
  Shape_MinorAxisLength
  Shape_Compactness
  Shape_Orientation
  Intensity_IntegratedIntensity
  Intensity_MinimumIntensity
  Intensity_MaximumIntensity
  Intensity_MeanIntensity
  Intensity_MedianIntensity
  Intensity_StandardDeviationIntensity
  Intensity_CoefficientVarianceIntensity
  Intensity_LowerQuartileIntensity
  Intensity_UpperQuartileIntensity
  Intensity_InterquartileRangeIntensity
  Intensity_Density
  Intensity_ConvexDensity

Here are the highlights from each measurement:

MeasureSize:

Size_Area — colony size in pixels
Size_IntegratedIntensity — sum of grayscale pixel values

MeasureShape:

Shape_Circularity — how round the colony is (1.0 = perfect circle)
Shape_Solidity — ratio of colony area to convex hull area
Shape_Eccentricity — elongation (0 = circular, approaching 1 = elongated)
Shape_MajorAxisLength / Shape_MinorAxisLength — fitted ellipse axes

MeasureIntensity:

Intensity_MeanIntensity / Intensity_MedianIntensity — average colony brightness
Intensity_StandardDeviationIntensity — variation within the colony
Intensity_MinimumIntensity / Intensity_MaximumIntensity — intensity extremes
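
To build intuition for the shape scores, the sketch below computes circularity for two ideal shapes. It assumes the common definition 4πA/P² (area A, perimeter P); this tutorial does not state PhenoTypic's exact formula, so it may differ, and the helper name circularity is purely illustrative.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """Common definition 4*pi*A / P**2 (assumed, not confirmed for PhenoTypic)."""
    return 4.0 * math.pi * area / perimeter ** 2


# Ideal circle of radius 10: A = pi * r^2, P = 2 * pi * r
r = 10.0
circle_score = circularity(math.pi * r ** 2, 2.0 * math.pi * r)

# Ideal square of side 10: A = s^2, P = 4 * s
s = 10.0
square_score = circularity(s ** 2, 4.0 * s)

print(f"circle: {circle_score:.3f}")  # circle: 1.000
print(f"square: {square_score:.3f}")  # square: 0.785
```

Under this definition, jagged or pixelated outlines inflate the perimeter, so real colonies tend to score below 1.0 even when they look round.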

Quick Statistics

Since the result is a standard pandas DataFrame, you can use all the usual pandas methods to explore it.

[5]:

df[["Size_Area", "Shape_Circularity", "Intensity_MeanIntensity"]].describe()

[5]:

	Size_Area	Shape_Circularity	Intensity_MeanIntensity
count	9.000000	7.000000	9.000000
mean	13022.000000	0.645563	0.550580
std	7822.634291	0.067261	0.066774
min	1.000000	0.510927	0.417128
25%	13265.000000	0.637557	0.565152
50%	15974.000000	0.654747	0.575144
75%	17006.000000	0.674367	0.585904
max	22337.000000	0.729419	0.611042
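
The summary hints at a spurious detection: the minimum Size_Area is 1 pixel, and Shape_Circularity has a count of 7 rather than 9 because describe() counts only non-missing values. One way to handle this, sketched below, is to drop objects below an area threshold before analysis. The toy DataFrame and the min_area value of 50 pixels are assumptions made for illustration; choose a threshold that suits your own plates.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the measurement table (values are illustrative only)
toy = pd.DataFrame({
    "ObjectLabel": [1, 2, 3, 4],
    "Size_Area": [14000.0, 16000.0, 1.0, 17000.0],
    "Shape_Circularity": [0.65, 0.70, np.nan, 0.63],
})

# Count missing values per column; describe() leaves them out of its counts
missing = toy.isna().sum()

# Keep only objects at or above a hypothetical minimum area (in pixels)
min_area = 50
filtered = toy[toy["Size_Area"] >= min_area].reset_index(drop=True)

print(f"Missing circularity values: {missing['Shape_Circularity']}")
print(f"Objects kept after filtering: {len(filtered)} of {len(toy)}")
```

Filtering on area first removes the single-pixel object in this toy table, along with its undefined shape measurement.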

Export to CSV

For sharing with collaborators or importing into spreadsheet software, export to CSV.

[6]:

df.to_csv("colony_measurements.csv")
print("Saved to colony_measurements.csv")

Saved to colony_measurements.csv
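
One detail worth knowing: by default, DataFrame.to_csv() also writes the row index as an unnamed first column, which comes back as Unnamed: 0 when the file is read again. The sketch below uses a toy DataFrame and an in-memory buffer (both assumptions for illustration) to show the difference that index=False makes.

```python
import io

import pandas as pd

toy = pd.DataFrame({"ObjectLabel": [1, 2], "Size_Area": [14000, 16000]})

# Default: the index is written as an extra, unnamed column
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
cols_default = list(pd.read_csv(with_index).columns)

# index=False: only the data columns are written
no_index = io.StringIO()
toy.to_csv(no_index, index=False)
no_index.seek(0)
cols_clean = list(pd.read_csv(no_index).columns)

print(cols_default)  # ['Unnamed: 0', 'ObjectLabel', 'Size_Area']
print(cols_clean)    # ['ObjectLabel', 'Size_Area']
```

Since the measurement table already carries ObjectLabel, passing index=False is usually enough to keep the exported file tidy.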

Export to Parquet

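For large datasets, Parquet is usually the better choice: it is a compressed, columnar format that preserves column types and typically loads faster than CSV. Note that DataFrame.to_parquet() needs a Parquet engine such as pyarrow or fastparquet to be installed.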

[7]:

df.to_parquet("colony_measurements.parquet")
print("Saved to colony_measurements.parquet")

Saved to colony_measurements.parquet
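
To illustrate what preserving column types means in practice, the sketch below round-trips a small int8 column through CSV and through Parquet. The toy DataFrame is an assumption for illustration, and the Parquet half only runs if an engine such as pyarrow is installed.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

toy = pd.DataFrame({"Metadata_BitDepth": pd.Series([8, 8, 16], dtype="int8")})

# CSV stores plain text, so the compact int8 type is re-inferred as int64
csv_buf = io.StringIO()
toy.to_csv(csv_buf, index=False)
csv_buf.seek(0)
csv_dtype = pd.read_csv(csv_buf)["Metadata_BitDepth"].dtype

# Parquet stores the schema alongside the data, so int8 should survive
try:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "toy.parquet"
        toy.to_parquet(path)
        parquet_dtype = pd.read_parquet(path)["Metadata_BitDepth"].dtype
except ImportError:
    parquet_dtype = None  # no Parquet engine available in this environment

print(f"CSV round trip:     {csv_dtype}")
print(f"Parquet round trip: {parquet_dtype}")
```

Writing into a temporary directory also means there is nothing to delete afterwards.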

Clean Up

[8]:

import os
os.remove("colony_measurements.csv")
os.remove("colony_measurements.parquet")

Summary

You have extracted colony features and exported them for analysis:

meas=[MeasureSize(), MeasureShape(), MeasureIntensity()]: add measurements to a pipeline
pipeline.apply_and_measure(plate): run the full pipeline and get a DataFrame
.to_csv() / .to_parquet(): export for downstream tools

The result is a standard pandas DataFrame, so you can filter, group, plot, and analyze it with any tool in the Python ecosystem.

Next up: Tutorial 8: Using Prefab Pipelines — discover PhenoTypic’s pre-built pipelines for common organisms and plate types.