Segmentation Lab
Image segmentation is a core technique in image analysis and computer vision, enabling us to divide an image or 3D volume into distinct, meaningful regions. This process is essential for tasks such as identifying anatomical structures in medical scans, detecting objects in autonomous driving, or analyzing materials in industrial inspection. By isolating relevant components from complex data, segmentation serves as a foundation for quantitative analysis, visualization, and decision-making in scientific and engineering workflows.
This repository offers a guided, hands-on approach to building a 3D image segmentation pipeline from scratch in Python.
Working with a simple synthetic volumetric dataset, learners can focus on understanding the algorithmic workflow without the complexity of real-world data. Through a series of notebooks, you'll progress step by step through the essential stages of an image processing pipeline, culminating in a custom implementation of segmentation using classical methods such as thresholding and marker-based watershed.
Finally, the custom-built pipeline is compared against a highly optimized library implementation that leverages compiled C code for performance, illustrating the trade-offs between educational clarity and production-ready efficiency. This approach not only deepens understanding of the underlying algorithms but also highlights best practices for reproducible, scalable image processing.
This exemplar was developed at Imperial College London by David Büchner in collaboration with Aurash Karimi from Research Software Engineering and Jianliang Gao from Research Computing & Data Science at the Early Career Researcher Institute.
Learning Outcomes
After completing this exemplar, students will:
- Understand and implement core 3D segmentation algorithms: build thresholding and watershed methods from first principles and apply them to synthetic volumetric datasets.
- Create and manipulate synthetic 3D data: design simple volumetric structures that make segmentation workflows and algorithmic logic easier to internalize.
- Visualize segmentation results effectively: use Python tools (NumPy, Matplotlib) to explore interactive slice views and 3D voxel renderings.
- Analyze and compare segmentation approaches: evaluate the trade-offs between custom-built pipelines and optimized library implementations (SciPy, scikit-image) for performance and scalability.
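As a minimal taste of the first two outcomes, global thresholding of a 3D NumPy array takes only a comparison. This is an illustrative sketch with made-up names, not the exemplar's actual code; the notebooks go further and derive Otsu's method for choosing the threshold automatically:

```python
import numpy as np

def global_threshold(volume: np.ndarray, t: float) -> np.ndarray:
    """Return a boolean mask: True where voxel intensity exceeds t."""
    return volume > t

# Tiny synthetic 3D volume: dim background (0.1) with one bright cube (0.9)
vol = np.full((8, 8, 8), 0.1)
vol[2:6, 2:6, 2:6] = 0.9

mask = global_threshold(vol, 0.5)
print(mask.sum())  # 4*4*4 = 64 foreground voxels
```

The entire operation is a single vectorized comparison, which is why thresholding is the natural first stage of the pipeline before the more involved distance-transform and watershed steps.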
Target Audience
Students and researchers in materials science, biomedical engineering, and computational imaging who want hands-on experience with 3D segmentation and practical insights into building image processing pipelines from scratch.
Prerequisites
Academic
- Basic proficiency in Python, including functions, loops, and file handling.
- Familiarity with NumPy for numerical array operations (see the *Numerical Computing in Python with NumPy & SciPy* exemplar).
- Introductory understanding of image processing concepts (e.g., grayscale intensity).
System
- Python 3.11+, Anaconda, 2 GB disk space
Getting Started
This project is organized as a series of Jupyter notebooks that guide you through the 3D segmentation pipeline:
Stepwise Learning (Notebooks 01–06)
- 01_creating_synthetic_3dimage.ipynb to 06_quantitative_analysis.ipynb: These six notebooks take you through the pipeline step by step. Each notebook focuses on a key stage, from creating synthetic volumetric data to implementing thresholding and watershed segmentation methods.
- Work through them in order to build a solid understanding of each component before combining them into a full workflow.
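For a flavour of the first notebook, a synthetic volume can be built directly with NumPy. The sketch below is illustrative only (function and variable names are mine, not the exemplar's): it creates two slightly overlapping spheres, the classic test case that thresholding alone cannot separate but marker-based watershed can.

```python
import numpy as np

def sphere_volume(shape, centers, radius):
    """Boolean 3D volume containing spheres of the given radius."""
    zz, yy, xx = np.indices(shape)
    vol = np.zeros(shape, dtype=bool)
    for cz, cy, cx in centers:
        vol |= (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    return vol

# Two slightly overlapping spheres along the x-axis: thresholding sees
# a single blob, so the pipeline later needs seed points and watershed
# to split it into two labels.
vol = sphere_volume((32, 32, 32), centers=[(16, 16, 11), (16, 16, 21)], radius=7)
print(vol.shape, int(vol.sum()))
```

Overlapping objects like these are exactly why the middle notebooks introduce the distance transform and seed-point detection before watershed segmentation.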
Full Pipeline Comparison (Notebook 07)
- 07_complete_pipeline.ipynb: This notebook brings everything together. It runs the complete segmentation pipeline in one place, showing:
- Your custom implementation built in previous steps.
- A library-based implementation using highly optimized Python modules (leveraging compiled C code).
- This side-by-side comparison illustrates the performance and design trade-offs between educational clarity and production-ready solutions.
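The kind of comparison notebook 07 makes can be illustrated on a single pipeline stage. The hypothetical sketch below (not the exemplar's code) pits a brute-force Euclidean distance transform against SciPy's compiled `scipy.ndimage.distance_transform_edt`: both compute the same values, but the compiled version is dramatically faster on realistically sized volumes.

```python
import time
import numpy as np
from scipy import ndimage

def edt_naive(mask: np.ndarray) -> np.ndarray:
    """Brute-force Euclidean distance transform: for every foreground
    voxel, search all background voxels for the nearest one.
    O(n_fg * n_bg) -- educational clarity only."""
    fg = np.argwhere(mask)
    bg = np.argwhere(~mask)
    out = np.zeros(mask.shape)
    for idx in fg:
        out[tuple(idx)] = np.sqrt(((bg - idx) ** 2).sum(axis=1).min())
    return out

# Small foreground cube inside a 12^3 volume
mask = np.zeros((12, 12, 12), dtype=bool)
mask[3:9, 3:9, 3:9] = True

t0 = time.perf_counter(); d_naive = edt_naive(mask); t_naive = time.perf_counter() - t0
t0 = time.perf_counter(); d_scipy = ndimage.distance_transform_edt(mask); t_scipy = time.perf_counter() - t0

print(np.allclose(d_naive, d_scipy))  # both compute the exact EDT
print(f"naive: {t_naive:.4f}s, scipy: {t_scipy:.4f}s")
```

Even at this toy size the compiled routine wins comfortably; the gap widens rapidly with volume size, which is the trade-off notebook 07 quantifies across the whole pipeline.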
Disciplinary Background
This exemplar grew out of my research on medical and micro-CT imaging of packed bed adsorbers for CO₂ capture. Imaging allows us to see how much and how fast materials absorb CO₂, connecting 3D structure to material performance.
When I started, I relied heavily on ready-made libraries like scikit-image and commercial software. They worked, but I often felt like a "black box" was doing the thinking for me. To truly understand what was happening, and to customize algorithms for my data, I needed to go deeper. Re-implementing methods like thresholding and watershed from scratch was a turning point. It gave me clarity on how segmentation works, why certain choices matter, and what assumptions are baked into high-level tools.
That's the experience this exemplar aims to share. By building a segmentation pipeline step by step, you'll gain not just technical skills, but a deeper intuition for the algorithms that power modern imaging workflows. Whether you work in materials science, biomedical engineering, or computational imaging, this understanding will help you move beyond "black box" solutions and make informed, creative decisions with your data.
Software Tools
This exemplar uses the following tools and libraries:
- Python – the core programming language for building the pipeline.
- NumPy – for numerical array operations and synthetic data generation.
- Matplotlib – for 2D and 3D visualization.
- SciPy – for scientific computing utilities used in image processing workflows.
- scikit-image – for optimized image processing implementations used as a comparison baseline.
Project Structure
Overview of code organisation and structure.
```
├── docs/                    # Documentation files
├── images/                  # Figures and visual outputs
├── notebooks/               # Jupyter notebooks for stepwise learning
│   ├── 01_creating_synthetic_3dimage.ipynb
│   ├── 02_thresholding.ipynb
│   ├── 03_distance_transform.ipynb
│   ├── 04_find_seedpoints.ipynb
│   ├── 05_watershed_segmentation.ipynb
│   ├── 06_quantitative_analysis.ipynb
│   └── 07_complete_pipeline.ipynb
├── src/                     # Source code for pipeline components
│   ├── data/                # Data handling
│   ├── image_processing/    # Core image processing modules
│   │   ├── analytical_information.py
│   │   ├── distance_transform.py
│   │   ├── local_extrema.py
│   │   ├── otsu_method.py
│   │   └── watershed_segmentation.py
│   ├── shape_creation.py    # Synthetic data creation
│   └── visualisation.py     # Visualisation
└── utils/                   # Helper functions and utilities
```
- notebooks for tutorials and exercises
- src for core code, potentially divided into further modules
- data within src for datasets
- docs for documentation

Best Practice Notes
- Modular Code Design: Each processing step is implemented as a separate Python function or module in `src/image_processing/`. This promotes clarity, reusability, and easier debugging.
- Notebook Organization: The six stepwise notebooks (01–06) build progressively, with exercises included to reinforce learning. The final notebook (07) integrates all steps for comparison with optimized libraries.
- Documentation and Comments: Functions include inline comments explaining the algorithmic logic. Each notebook starts with an overview of its objectives and ends with optional exercises.
- Environment Management: A `requirements.txt` file is provided to ensure consistent dependencies across systems. Using virtual environments (e.g., `venv` or `conda`) is recommended.
Estimated Time
| Task | Estimated Time |
|---|---|
| Reading background and setup instructions | 30 minutes |
| Stepwise notebooks (01–06) | |
| 01 – Create synthetic 3D data | 15 minutes |
| 02 – Apply thresholding | 30 minutes |
| 03 – Compute distance transform | 45 minutes |
| 04 – Find seed points | 15 minutes |
| 05 – Implement watershed segmentation | 60 minutes |
| 06 – Quantitative analysis (optional) | 15 minutes |
| Full pipeline integration (Notebook 07) | 1 hour |
Additional Resources
- Borgefors, G. (1986). "Distance transformations in digital images." Computer Vision, Graphics, and Image Processing, 34(3), 344-371.
- Wikipedia:
- https://en.wikipedia.org/wiki/Distance_transform#Chamfer_distance_transform
- https://en.wikipedia.org/wiki/Watershed_(image_processing)
- https://en.wikipedia.org/wiki/Otsu%27s_method
- Stack Overflow: https://stackoverflow.com/questions/53678520/speed-up-computation-for-distance-transform-on-image-in-python
- https://www.slingacademy.com/article/understanding-numpy-roll-function-6-examples/
- Chityala, R., & Pudipeddi, S. (2020). *Image Processing and Acquisition Using Python* (2nd ed.). Boca Raton: Chapman & Hall/CRC.
- Original paper: Otsu, N. (1979). "A threshold selection method from gray-level histograms." IEEE Transactions on Systems, Man, and Cybernetics, 9(1), 62-66.
Licence
This project is licensed under the BSD-3-Clause license.