EuBI-Bridge

Perform parallelised conversion of image data collections to OME-Zarr using EuBI-Bridge

On the terminal browse into the directory named example_images:

cd /path/to/data/example_images

Activate the conda environment ngff_workshop:

conda activate ngff_workshop

Configure the memory limit and display the resulting configuration:

eubi configure_cluster --memory_limit 5GB
eubi show_config

Unary Conversion

Given a dataset structured as follows:

📂 multichannel_timeseries
├── 📄 Channel1-T0001.tif
├── 📄 Channel1-T0002.tif
├── 📄 Channel1-T0003.tif
├── 📄 Channel1-T0004.tif
├── 📄 Channel2-T0001.tif
├── 📄 Channel2-T0002.tif
├── 📄 Channel2-T0003.tif
└── 📄 Channel2-T0004.tif

To convert each TIFF into a separate OME-Zarr container (unary conversion):

eubi to_zarr multichannel_timeseries multichannel_timeseries_zarr

This produces:

📂 multichannel_timeseries_zarr
├── 📄 Channel1-T0001.zarr
├── 📄 Channel1-T0002.zarr
├── 📄 Channel1-T0003.zarr
├── 📄 Channel1-T0004.zarr
├── 📄 Channel2-T0001.zarr
├── 📄 Channel2-T0002.zarr
├── 📄 Channel2-T0003.zarr
└── 📄 Channel2-T0004.zarr
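
Since unary conversion should yield one container per input file, a quick count can confirm that nothing was skipped. The following is a minimal sketch using only standard shell tools; the helper names count_entries and check_unary are hypothetical and not part of EuBI-Bridge, and the check assumes the inputs end in .tif and the outputs in .zarr, as in the listing above.

```bash
# Hypothetical helpers (not part of EuBI-Bridge): compare input and output counts.

count_entries() {
  # $1 = directory, $2 = name suffix such as .tif or .zarr
  find "$1" -mindepth 1 -maxdepth 1 -name "*$2" | wc -l | tr -d ' '
}

check_unary() {
  # $1 = input directory with .tif files, $2 = output directory with .zarr containers
  n_in=$(count_entries "$1" .tif)
  n_out=$(count_entries "$2" .zarr)
  if [ "$n_in" -eq "$n_out" ]; then
    echo "OK: $n_in inputs, $n_out outputs"
  else
    echo "MISMATCH: $n_in inputs, $n_out outputs"
    return 1
  fi
}

# Usage after the conversion above:
#   check_unary multichannel_timeseries multichannel_timeseries_zarr
```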

Aggregative Conversion (Concatenation Along Dimensions)

To concatenate images along specific dimensions, EuBI-Bridge needs to be informed of file patterns that specify image dimensions. For this example, the file pattern for the channel dimension is Channel, which is followed by the channel index, and the file pattern for the time dimension is T, which is followed by the time index.

For concatenation along the time dimension:

eubi to_zarr multichannel_timeseries multichannel_timeseries_concat-t_zarr --channel_tag Channel --time_tag T --concatenation_axes t

Output:

📂 multichannel_timeseries_concat-t_zarr
├── 📄 Channel1-T_tset.zarr
└── 📄 Channel2-T_tset.zarr

Important note: if --channel_tag were not provided, the tool would not be aware of the multiple channels in the dataset and would try to concatenate all images into a single one-channel OME-Zarr. Therefore, when performing an aggregative conversion, every dimension present in the input file names must be specified via its respective tag, even if the data are not concatenated along that dimension.
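
To make the role of the tags concrete, the sketch below extracts the channel and time indices from the example file names with sed. It only illustrates the idea of a tag followed by an index; it is not how EuBI-Bridge parses names internally, and the helper name describe_name is hypothetical.

```bash
# Illustration only: show which channel and time index each file name carries.
# describe_name is a hypothetical helper, not an EuBI-Bridge command.

describe_name() {
  # $1 = file name containing "Channel<index>" and "T<index>"
  printf '%s\n' "$1" | sed -E 's/.*Channel([0-9]+).*T([0-9]+).*/channel=\1 time=\2/'
}

for f in Channel1-T0001.tif Channel2-T0003.tif; do
  describe_name "$f"
done
```

Files that share a channel index but differ in time index are the ones that end up together when concatenating along t.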

For multidimensional concatenation (channel + time):

eubi to_zarr multichannel_timeseries multichannel_timeseries_concat-ct_zarr --channel_tag Channel --time_tag T --concatenation_axes ct

Note that both axes are specified with the argument --concatenation_axes ct.

Output:

📂 multichannel_timeseries_concat-ct_zarr
└── 📄 Channel_cset-T_tset.zarr
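
When trying several axis combinations, it can help to generate the commands from one template so that the output names stay consistent. The sketch below only prints the two commands used in this section; build_concat_cmd is a hypothetical helper, and it uses no flags beyond those shown above.

```bash
# Hypothetical helper: print the aggregative conversion command for a given
# input directory and set of concatenation axes (t or ct in this tutorial).

build_concat_cmd() {
  # $1 = input directory, $2 = concatenation axes
  echo "eubi to_zarr $1 ${1}_concat-${2}_zarr --channel_tag Channel --time_tag T --concatenation_axes $2"
}

for axes in t ct; do
  build_concat_cmd multichannel_timeseries "$axes"
done
```

The printed lines can be reviewed first and then pasted into the terminal.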

Handling Nested Directories

For datasets stored in nested directories such as:

📂 multichannel_timeseries_nested
├── 📁 Channel1
│    ├── 📄 T0001.tif
│    ├── 📄 T0002.tif
│    ├── 📄 T0003.tif
│    └── 📄 T0004.tif
└── 📁 Channel2
     ├── 📄 T0001.tif
     ├── 📄 T0002.tif
     ├── 📄 T0003.tif
     └── 📄 T0004.tif

EuBI-Bridge automatically detects the nested structure. For multidimensional concatenation:

eubi to_zarr multichannel_timeseries_nested multichannel_timeseries_nested_concat-ct_zarr --channel_tag Channel --time_tag T --concatenation_axes ct

Output:

📂 multichannel_timeseries_nested_concat-ct_zarr
└── 📄 Channel_cset-T_tset.zarr

Selective Data Conversion Using Wildcards

To process only specific files, wildcards can be used. For example, to convert only the LSM images from the pff directory:

eubi to_zarr "pff/*.lsm" lsm_to_zarr

Note: When using wildcards, the input path must be enclosed in quotes, as in the example above, so that the shell does not expand the pattern before it reaches eubi.

Output:

📂 lsm_to_zarr
├── 📁 FtsZ2-1_GFP_KO2-1_no10G.zarr
└── 📁 FtsZ2-1_GFP_KO2-1_no16G.zarr
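
The effect of the quotes can be seen with plain shell commands. In the sketch below, the hypothetical helper count_args reports how many arguments it receives: an unquoted wildcard is expanded by the shell into one argument per matching file, whereas a quoted wildcard arrives as a single pattern, which is presumably what eubi expects as its input path.

```bash
# Illustration of shell wildcard expansion; count_args is a hypothetical helper.

count_args() {
  # Print the number of arguments received
  echo "$#"
}

scratch=$(mktemp -d)                 # temporary directory with two dummy files
touch "$scratch/a.lsm" "$scratch/b.lsm"

count_args "$scratch"/*.lsm          # unquoted wildcard: prints 2
count_args "$scratch/*.lsm"          # quoted wildcard: prints 1

rm -r "$scratch"
```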

Handling Categorical Dimension Patterns

For datasets where channel names are categorical such as in:

📂 blueredchannels_timeseries_nested
├── 📁 Blue
│    ├── 📄 T0001.tif
│    ├── 📄 T0002.tif
│    ├── 📄 T0003.tif
│    └── 📄 T0004.tif
└── 📁 Red
    ├── 📄 T0001.tif
    ├── 📄 T0002.tif
    ├── 📄 T0003.tif
    └── 📄 T0004.tif

The same kind of command can be used, with the categorical channel names passed to --channel_tag as a comma-separated list:

eubi to_zarr blueredchannels_timeseries_nested blueredchannels_timeseries_nested_concat-ct_zarr --channel_tag Blue,Red --time_tag T --concatenation_axes ct

Output:

📂 blueredchannels_timeseries_nested_concat-ct_zarr
└── 📄 BlueRed_cset-T_tset.zarr
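
With many categorical channels, typing the comma-separated list by hand is error-prone. As a minimal sketch, assuming each channel lives in its own top-level subdirectory as above, the list can be derived from the folder names; channel_tags_from_dirs is a hypothetical helper, not an EuBI-Bridge command.

```bash
# Hypothetical helper: join the names of the top-level subdirectories with commas.

channel_tags_from_dirs() {
  # $1 = dataset directory whose subdirectories are named after the channels
  tags=""
  for d in "$1"/*/; do
    [ -d "$d" ] || continue          # skip the unexpanded pattern if nothing matches
    name=$(basename "$d")
    tags="${tags:+$tags,}$name"
  done
  echo "$tags"
}

# Usage, with the dataset above:
#   channel_tags_from_dirs blueredchannels_timeseries_nested
```

The resulting string can then be passed as the value of --channel_tag.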

Extraction of Single Series from a Multi-series Dataset

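To convert only one series from a multi-series file, specify it with --series. The following command converts series 21 of the LIF file pff/17_03_18.lif:

eubi to_zarr pff/17_03_18.lif lif_series_to_zarr --series 21 --no_distributed True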

Output:

📂 lif_series_to_zarr
└── 📁 17_03_18.lif-17_03_18_FtsZ2-2_no11.zarr