Tutorial 7: Measuring and Exporting#
Detection tells you where colonies are. Measurement tells you what they are — how big, how round, how bright. In this tutorial you will add measurements to a pipeline, extract a DataFrame of colony features, and export the results for downstream analysis.
What you will learn:
Add measurement operations to a pipeline
Use pipeline.apply_and_measure() to get a DataFrame
Understand the output columns
Export to CSV and Parquet
Imports#
[1]:
import phenotypic as pht
from phenotypic.data import load_yeast_plate
from phenotypic.enhance import GaussianBlur, CLAHE
from phenotypic.detect import OtsuDetector
from phenotypic.measure import MeasureSize, MeasureShape, MeasureIntensity
Build a Pipeline with Measurements#
The meas parameter accepts a list of measurement operations. Each one extracts a different set of features from the detected colonies.
[2]:
plate = load_yeast_plate()
pipeline = pht.ImagePipeline(
    ops=[GaussianBlur(sigma=2.0), CLAHE(clip_limit=0.01), OtsuDetector()],
    meas=[MeasureSize(), MeasureShape(), MeasureIntensity()],
)
Apply and Measure#
.apply_and_measure() runs the full pipeline (enhance → detect → measure) and returns a pandas DataFrame with one row per detected colony.
[3]:
df = pipeline.apply_and_measure(plate)
print(f"Measured {len(df)} colonies across {df.shape[1]} features")
df.head()
Measured 9 colonies across 50 features
[3]:
| | Metadata_FileSuffix | Metadata_BitDepth | Metadata_ImageType | Metadata_ImageName | ObjectLabel | Bbox_CenterRR | Bbox_CenterCC | Bbox_MinRR | Bbox_MinCC | Bbox_MaxRR | ... | Intensity_MaximumIntensity | Intensity_MeanIntensity | Intensity_MedianIntensity | Intensity_StandardDeviationIntensity | Intensity_CoefficientVarianceIntensity | Intensity_LowerQuartileIntensity | Intensity_UpperQuartileIntensity | Intensity_InterquartileRangeIntensity | Intensity_Density | Intensity_ConvexDensity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | .png | 8 | GridImage | RhodotorulaYeastCenterCrop | 1 | 189.366746 | 190.568832 | 101 | 108 | 279 | ... | 0.686547 | 0.585904 | 0.595740 | 0.042424 | 0.072411 | 0.562494 | 0.616740 | 0.054246 | 0.585904 | 24.268837 |
| 1 | .png | 8 | GridImage | RhodotorulaYeastCenterCrop | 2 | 180.748935 | 1003.205256 | 107 | 930 | 258 | ... | 0.695336 | 0.591194 | 0.600607 | 0.043489 | 0.073566 | 0.577300 | 0.618635 | 0.041335 | 0.591194 | 21.350965 |
| 2 | .png | 8 | GridImage | RhodotorulaYeastCenterCrop | 3 | 179.459997 | 1409.055778 | 107 | 1338 | 253 | ... | 0.672282 | 0.575144 | 0.582211 | 0.037242 | 0.064757 | 0.559643 | 0.599775 | 0.040132 | 0.575144 | 20.379085 |
| 3 | .png | 8 | GridImage | RhodotorulaYeastCenterCrop | 4 | 184.220957 | 598.966905 | 116 | 534 | 251 | ... | 0.659846 | 0.565152 | 0.574650 | 0.041081 | 0.072696 | 0.552847 | 0.590872 | 0.038024 | 0.565152 | 18.114211 |
| 4 | .png | 8 | GridImage | RhodotorulaYeastCenterCrop | 5 | 164.000000 | 1074.000000 | 164 | 1074 | 165 | ... | 0.417128 | 0.417128 | 0.417128 | 0.000000 | 0.000000 | 0.417128 | 0.417128 | 0.000000 | 0.417128 | NaN |
5 rows × 50 columns
Explore the Columns#
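Each measurement operation contributes its own set of columns, prefixed with its family name (Size_, Shape_, Intensity_). The Metadata_, Bbox_, and Grid_ columns and ObjectLabel appear alongside them, so the 50 columns reported above include identifying information as well as measured features. Let’s see what we got.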
[4]:
print("All columns:")
for col in df.columns:
    print(f" {col}")
All columns:
Metadata_FileSuffix
Metadata_BitDepth
Metadata_ImageType
Metadata_ImageName
ObjectLabel
Bbox_CenterRR
Bbox_CenterCC
Bbox_MinRR
Bbox_MinCC
Bbox_MaxRR
Bbox_MaxCC
Bbox_IntensityWeightedCenterRR
Bbox_IntensityWeightedCenterCC
Bbox_DistWeightedCenterRR
Bbox_DistWeightedCenterCC
Grid_RowNum
Grid_ColNum
Grid_RowMajorIdx
Grid_ColMajorIdx
Size_Area
Size_IntegratedIntensity
Shape_Area
Shape_Perimeter
Shape_Circularity
Shape_ConvexArea
Shape_MedianRadius
Shape_MeanRadius
Shape_MaxRadius
Shape_MinFeretDiameter
Shape_MaxFeretDiameter
Shape_Eccentricity
Shape_Solidity
Shape_Extent
Shape_BboxArea
Shape_MajorAxisLength
Shape_MinorAxisLength
Shape_Compactness
Shape_Orientation
Intensity_IntegratedIntensity
Intensity_MinimumIntensity
Intensity_MaximumIntensity
Intensity_MeanIntensity
Intensity_MedianIntensity
Intensity_StandardDeviationIntensity
Intensity_CoefficientVarianceIntensity
Intensity_LowerQuartileIntensity
Intensity_UpperQuartileIntensity
Intensity_InterquartileRangeIntensity
Intensity_Density
Intensity_ConvexDensity
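The prefix before the first underscore identifies the family each column belongs to. As a minimal sketch, the tally below groups a handful of names copied from the listing above; the helper name `family` and the sample list are illustrative, and with a real result `list(df.columns)` would take the place of the hand-written list.

```python
from collections import Counter

# A small sample of column names copied from the listing above.
# With a real result, use: sample_columns = list(df.columns)
sample_columns = [
    "Metadata_ImageName", "ObjectLabel", "Bbox_CenterRR", "Grid_RowNum",
    "Size_Area", "Shape_Area", "Shape_Circularity", "Shape_Solidity",
    "Intensity_MeanIntensity", "Intensity_MedianIntensity",
]


def family(column_name):
    """Return the text before the first underscore (hypothetical helper)."""
    return column_name.split("_", 1)[0]


# Count how many columns each family contributes.
family_counts = Counter(family(name) for name in sample_columns)

# Collect every column belonging to one family.
shape_columns = [name for name in sample_columns if family(name) == "Shape"]

for name, count in family_counts.items():
    print(f"{name}: {count}")
```

Note that the listing contains both Size_Area and Shape_Area, so it is safer to select columns by their full name than by the part after the underscore.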
Here are the highlights from each measurement:
MeasureSize:
Size_Area: colony size in pixels
Size_IntegratedIntensity: sum of grayscale pixel values
MeasureShape:
Shape_Circularity: how round the colony is (1.0 = perfect circle)
Shape_Solidity: ratio of colony area to convex hull area
Shape_Eccentricity: elongation (0 = circular, approaching 1 = elongated)
Shape_MajorAxisLength / Shape_MinorAxisLength: fitted ellipse axes
MeasureIntensity:
Intensity_MeanIntensity / Intensity_MedianIntensity: average colony brightness
Intensity_StandardDeviationIntensity: variation within the colony
Intensity_MinimumIntensity / Intensity_MaximumIntensity: intensity extremes
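To build intuition for these descriptors, here is a small sketch using their common textbook definitions. It is an assumption that PhenoTypic computes circularity as 4πA/P² and eccentricity from the fitted ellipse axes; the function names are illustrative and not part of the library. The last two checks use values from the first row of the table above, which are consistent with a coefficient of variation of std / mean and an interquartile range of upper quartile minus lower quartile.

```python
import math


def circularity(area, perimeter):
    """Common definition 4*pi*A / P**2: 1.0 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2


def solidity(area, convex_area):
    """Ratio of object area to the area of its convex hull."""
    return area / convex_area


def eccentricity(major_axis, minor_axis):
    """Ellipse eccentricity: 0 for a circle, approaching 1 when elongated."""
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)


# Ideal shapes: a circle of radius 10 and a square of side 2.
r = 10.0
circle_score = circularity(math.pi * r ** 2, 2.0 * math.pi * r)
square_score = circularity(4.0, 8.0)  # equals pi / 4

# An ellipse ten times longer than it is wide is strongly elongated.
elongated = eccentricity(10.0, 1.0)

# Values copied from the first row of the table above.
mean_intensity, std_intensity = 0.585904, 0.042424
lower_quartile, upper_quartile = 0.562494, 0.616740
cv = std_intensity / mean_intensity    # compare Intensity_CoefficientVarianceIntensity
iqr = upper_quartile - lower_quartile  # compare Intensity_InterquartileRangeIntensity

print(f"circle={circle_score:.3f} square={square_score:.3f} ellipse={elongated:.3f}")
print(f"cv={cv:.4f} iqr={iqr:.6f}")
```

Measured circularity on pixel data is usually lower than these ideal values, because a digitized outline is longer than the smooth curve it approximates.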
Quick Statistics#
Since the result is a standard pandas DataFrame, you can use all the usual pandas methods to explore it.
[5]:
df[["Size_Area", "Shape_Circularity", "Intensity_MeanIntensity"]].describe()
[5]:
| | Size_Area | Shape_Circularity | Intensity_MeanIntensity |
|---|---|---|---|
| count | 9.000000 | 7.000000 | 9.000000 |
| mean | 13022.000000 | 0.645563 | 0.550580 |
| std | 7822.634291 | 0.067261 | 0.066774 |
| min | 1.000000 | 0.510927 | 0.417128 |
| 25% | 13265.000000 | 0.637557 | 0.565152 |
| 50% | 15974.000000 | 0.654747 | 0.575144 |
| 75% | 17006.000000 | 0.674367 | 0.585904 |
| max | 22337.000000 | 0.729419 | 0.611042 |
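Two details in this summary deserve a second look. Shape_Circularity has a count of 7 while the other columns have 9, because describe() skips NaN values, and the minimum Size_Area of 1 pixel suggests that at least one detection is noise rather than a colony. A minimal sketch of checking for both is shown below; the DataFrame holds illustrative values rather than the tutorial's actual measurements, and the MIN_AREA_PX threshold is a hypothetical choice that would need tuning for a real plate.

```python
import pandas as pd

# Illustrative values only; not the tutorial's actual measurements.
df = pd.DataFrame({
    "ObjectLabel": [1, 2, 3, 4, 5],
    "Size_Area": [17006, 15974, 13265, 22337, 1],
    "Shape_Circularity": [0.67, 0.65, 0.64, 0.73, float("nan")],
})

# Count objects whose circularity could not be computed.
n_missing = int(df["Shape_Circularity"].isna().sum())

# Hypothetical threshold: treat anything smaller as detection noise.
MIN_AREA_PX = 50
colonies = df[df["Size_Area"] >= MIN_AREA_PX]

# Summary statistics on the filtered objects only.
stats = colonies[["Size_Area", "Shape_Circularity"]].describe()

print(f"{n_missing} object(s) without circularity")
print(f"{len(colonies)} of {len(df)} objects kept")
print(stats)
```

Filtering before summarizing keeps a single stray pixel from dragging down the minimum and inflating the standard deviation.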
Export to CSV#
For sharing with collaborators or importing into spreadsheet software, export to CSV.
[6]:
df.to_csv("colony_measurements.csv")
print("Saved to colony_measurements.csv")
Saved to colony_measurements.csv
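One thing to watch when reading the file back: to_csv() writes the DataFrame index as an unnamed first column by default. The sketch below shows the effect and two ways to handle it, using an in-memory buffer instead of a file and a small DataFrame of illustrative values.

```python
import io

import pandas as pd

# Illustrative values only; not the tutorial's actual measurements.
df = pd.DataFrame({
    "ObjectLabel": [1, 2, 3],
    "Size_Area": [17006, 15974, 1],
    "Intensity_MeanIntensity": [0.585904, 0.591194, 0.417128],
})

# Default export: the index is written as an unnamed first column.
buf = io.StringIO()
df.to_csv(buf)

# Reading it back naively turns the index into a regular column.
buf.seek(0)
naive = pd.read_csv(buf)

# Option 1: tell read_csv that the first column is the index.
buf.seek(0)
restored = pd.read_csv(buf, index_col=0)

# Option 2: leave the index out when writing.
no_index = io.StringIO()
df.to_csv(no_index, index=False)
no_index.seek(0)
clean = pd.read_csv(no_index)

print(list(naive.columns))
print(list(restored.columns))
print(list(clean.columns))
```

Since ObjectLabel already identifies each colony, writing with index=False is often the simpler choice when the file is headed for a spreadsheet.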
Export to Parquet#
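For large datasets, Parquet is more efficient: it is a compressed, columnar format that preserves column types and typically loads faster than CSV. Note that DataFrame.to_parquet() needs a Parquet engine such as pyarrow or fastparquet to be installed.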
[7]:
df.to_parquet("colony_measurements.parquet")
print("Saved to colony_measurements.parquet")
Saved to colony_measurements.parquet
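To see what preserving column types means in practice, the sketch below round-trips a small DataFrame through both formats using in-memory buffers. The values and the narrow integer dtypes are illustrative assumptions, chosen only to make the difference visible. The Parquet step is wrapped in a try block because it raises ImportError when no Parquet engine is installed.

```python
import io

import pandas as pd

# Illustrative values with deliberately narrow integer dtypes.
df = pd.DataFrame({
    "Metadata_BitDepth": pd.Series([8, 8, 8], dtype="int8"),
    "Size_Area": pd.Series([17006, 15974, 1], dtype="int32"),
    "Intensity_MeanIntensity": [0.585904, 0.591194, 0.417128],
})

# CSV stores text only, so dtypes are re-inferred on load.
csv_buf = io.StringIO()
df.to_csv(csv_buf, index=False)
csv_buf.seek(0)
from_csv = pd.read_csv(csv_buf)

# Parquet stores the schema alongside the data.
try:
    pq_buf = io.BytesIO()
    df.to_parquet(pq_buf)
    pq_buf.seek(0)
    from_parquet = pd.read_parquet(pq_buf)
except ImportError:
    from_parquet = None  # no Parquet engine (pyarrow or fastparquet) installed

print("original:", df.dtypes.astype(str).to_dict())
print("from CSV:", from_csv.dtypes.astype(str).to_dict())
if from_parquet is not None:
    print("from Parquet:", from_parquet.dtypes.astype(str).to_dict())
```

After the CSV round trip the integer columns come back as the default 64-bit integer type, whereas the Parquet copy is expected to keep the original dtypes.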
Clean Up#
[8]:
import os
os.remove("colony_measurements.csv")
os.remove("colony_measurements.parquet")
Summary#
You have extracted colony features and exported them for analysis:
``meas=[MeasureSize(), MeasureShape(), MeasureIntensity()]`` — add measurements to a pipeline
``pipeline.apply_and_measure(plate)`` — run the full pipeline and get a DataFrame
``.to_csv()`` / ``.to_parquet()`` — export for downstream tools
The result is a standard pandas DataFrame, so you can filter, group, plot, and analyze it with any tool in the Python ecosystem.
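As one example of such downstream analysis, the sketch below groups colonies by plate row and averages their area. The Grid_RowNum and Size_Area column names come from the listing earlier in this tutorial; the values, including the zero-based row numbers, are illustrative assumptions.

```python
import pandas as pd

# Illustrative values only; row numbering is assumed to start at 0.
df = pd.DataFrame({
    "Grid_RowNum": [0, 0, 1, 1],
    "Size_Area": [100, 200, 300, 500],
})

# Mean colony area for each plate row.
mean_area_by_row = df.groupby("Grid_RowNum")["Size_Area"].mean()

# Rows ranked from largest to smallest mean area.
ranked = mean_area_by_row.sort_values(ascending=False)

print(mean_area_by_row)
print("Row with the largest colonies on average:", ranked.index[0])
```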
Next up: Tutorial 8: Using Prefab Pipelines — discover PhenoTypic’s pre-built pipelines for common organisms and plate types.