batch_crop_and_clip.py User Guide
Use scripts/batch_crop_and_clip.py to automate two steps across many subvolumes:
- Run analyze_snapshot.py on non-overlapping crop boxes of a snapshot.
- For each crop, run batch_clip.py across multiple slab origins inside that crop.
Outputs are organized per crop and per slab so results do not overwrite one another.
What it does
- Tiles a user-specified volume with fixed-size crop boxes (skips partial edge tiles).
- For each crop: calls analyze_snapshot.py with your smoothing, nsig, and export options.
- If --dump-filament-manifolds is provided, it also exports the filament-manifold VTKs for each crop.
- If --dump-clusters is provided, it exports cluster critical points (maxima) as VTP/VTU for each crop, which is the primary cluster representation used downstream.
- Within each crop: marches slab origins along z (or whichever axis you choose in batch_clip.py, default z) with a fixed step and thickness, calling batch_clip.py to produce 3D/2D clips and summary stats.
- Handles unit conversion between input snapshot units and output units (e.g., kpc/h to mpc/h).
- Skips crops that contain no particles (does not abort the batch).
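The tiling rule above (fixed-size boxes, partial edge tiles skipped) can be illustrated with a minimal sketch. The function names tile_axis and tile_volume are hypothetical and are not taken from the driver; the sketch only assumes that tiles start at the lower bound of each range and that any remainder smaller than the crop size is dropped.

```python
from itertools import product


def tile_axis(lo, hi, size):
    """Return (start, end) pairs for the full-size tiles that fit in [lo, hi]."""
    if size <= 0:
        raise ValueError("tile size must be positive")
    n_full = int((hi - lo) // size)  # a partial tile at the upper edge is dropped
    return [(lo + i * size, lo + (i + 1) * size) for i in range(n_full)]


def tile_volume(x_range, y_range, z_range, crop_size):
    """Enumerate non-overlapping crop boxes as ((x0, x1), (y0, y1), (z0, z1))."""
    axes = [
        tile_axis(lo, hi, size)
        for (lo, hi), size in zip((x_range, y_range, z_range), crop_size)
    ]
    return list(product(*axes))


# Illustrative values only: the z-range leaves a 50000-deep remainder that is skipped.
boxes = tile_volume(
    (0, 1_000_000), (0, 1_000_000), (0, 250_000),
    (500_000, 500_000, 100_000),
)
print(len(boxes))  # 2 * 2 * 2 = 8 full tiles
print(boxes[0])
```

Running a check like this before launching a batch shows how many crops a given combination of ranges and crop size would produce.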
Requirements
- Python with access to scripts/analyze_snapshot.py and scripts/batch_clip.py.
- ParaView pvpython on PATH for the batch_clip.py calls.
- DisPerSE binaries available to analyze_snapshot.py (as usual).
- Per-crop topology stats CSVs are generated automatically via ndtopo_stats.py.
To keep pvpython on PATH whenever you activate the conda environment:
mkdir -p "$CONDA_PREFIX/etc/conda/activate.d" "$CONDA_PREFIX/etc/conda/deactivate.d"
cat > "$CONDA_PREFIX/etc/conda/activate.d/paraview_path.sh" <<'EOF'
export _PV_OLD_PATH="$PATH"
export PATH="/usr/bin:/bin:/usr/sbin:/sbin:/Applications/ParaView-6.0.1.app/Contents/bin:$PATH"
EOF
cat > "$CONDA_PREFIX/etc/conda/deactivate.d/paraview_path.sh" <<'EOF'
export PATH="$_PV_OLD_PATH"
unset _PV_OLD_PATH
EOF
CLI
python scripts/batch_crop_and_clip.py \
--snapshot <snapshot.hdf5> \
--output-root <OUTPUT_ROOT> \
--crop-size DX DY DZ \
--x-range XMIN XMAX --y-range YMIN YMAX --z-range ZMIN ZMAX \
--slab-step <STEP> --slab-thickness <THICK> \
[--mse-nsig 5.0] [--dump-manifolds JE1a] [--dump-filament-manifolds JE2a] [--dump-clusters JE3a] [--dump-arcs U] \
[--netconv-smooth 20] [--skelconv-smooth 20] \
[--stride 1] [--delaunay-btype periodic] \
[--resample-dims NX NY NZ] [--scalar-name log_field_value] \
[--png-percentile-range PLOW PHIGH] \
[--png-transparent|--no-png-transparent] \
[--png-align-composite|--no-png-align-composite] \
[--write-per-point-csv] \
[--write-topology-scalars-csv|--no-write-topology-scalars-csv] \
[--skip-slabs] \
[--input-unit kpc/h] [--output-unit mpc/h] \
[--analyze-script path/to/analyze_snapshot.py] \
[--clip-script path/to/batch_clip.py]
- Crop sizes/ranges and slab params use the same units as --input-unit (default kpc/h). Outputs and slab origins passed to batch_clip.py are converted to --output-unit (default mpc/h).
- --slab-step sets spacing between slab origins; --slab-thickness sets the clip depth for each slab.
- --resample-dims sets the grid for density resampling inside batch_clip.py (default 500 500 100).
- --png-percentile-range clamps PNG coloring to the given percentile range when batch_clip.py renders images (e.g., 1 99).
- --png-transparent controls PNG alpha; use --no-png-transparent to force the background color to render.
- --png-align-composite aligns composite overlays to the density bounds; disable with --no-png-align-composite if overlays are already aligned.
- --write-per-point-csv adds a per-point topology CSV (one row per Delaunay ID) next to the per-crop stats file.
- --skip-slabs runs only analyze_snapshot.py and topology stats for each crop, skipping batch_clip.py.
- Smoothing tags for output filenames come from --netconv-smooth and --skelconv-smooth.
- ndtopo_stats.py supports --delaunay-id-field, --walls-id-field, --filaments-id-field, and --*-cell-mode if you need to adjust how IDs are matched.
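How slab origins and unit conversion interact can be sketched roughly as below. This is not the driver's code: the conversion table TO_KPC_H, the helper names, and the rule that a slab must fit entirely inside the crop are all assumptions made for illustration.

```python
# Assumed conversion factors to kpc/h; the driver's own table may differ.
TO_KPC_H = {"kpc/h": 1.0, "mpc/h": 1000.0}


def convert(value, input_unit, output_unit):
    """Convert a length between the two units listed in TO_KPC_H."""
    return value * TO_KPC_H[input_unit] / TO_KPC_H[output_unit]


def slab_origins(z0, z1, step, thickness):
    """Origins z0, z0 + step, ... whose slab [origin, origin + thickness] fits in [z0, z1].

    Requiring the whole slab to fit is an assumption of this sketch.
    """
    if step <= 0 or thickness <= 0:
        raise ValueError("step and thickness must be positive")
    origins = []
    i = 0
    while z0 + i * step + thickness <= z1:
        origins.append(z0 + i * step)
        i += 1
    return origins


# Illustrative values in input units (kpc/h), then converted to output units (mpc/h).
origins_kpc = slab_origins(0, 100_000, step=25_000, thickness=25_000)
origins_mpc = [convert(z, "kpc/h", "mpc/h") for z in origins_kpc]
print(origins_kpc)  # [0, 25000, 50000, 75000]
print(origins_mpc)  # [0.0, 25.0, 50.0, 75.0]
```

Setting the step equal to the thickness gives contiguous slabs; a smaller step gives overlapping slabs, and a larger step leaves gaps.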
Outputs and layout
Under --output-root, each crop gets its own folder:
output-root/
crop_x<x0>-<x1>_y<y0>-<y1>_z<z0>-<z1>/
<crop_prefix>_*.vtu/vtp/... # analyze_snapshot outputs
<crop_prefix>_topology_stats.csv # ndtopo_stats results for this crop
<crop_prefix>_topology_stats_topology_scalars.csv # topology-scalars stats (optional)
<crop_prefix>_topology_points.csv # per-point topology rows (optional)
slab_z<origin>/
<prefix>_walls_3d.vtu
<prefix>_filaments_3d.vtp
<prefix>_density_3d.vtu / .vti / averaged vti
<prefix>_walls.vtu / <prefix>_filaments.vtp / <prefix>_walls_filaments.vtm
<prefix>_clusters.vtp (if cluster critical points were provided)
<prefix>_summary_stats.csv
PNGs if batch_clip was called with --save-pngs (the driver passes it by default; PNG options such as `--png-dpi`, `--png-colormap`, `--png-log-range`, `--png-background`, `--png-transparent`, `--png-align-composite`, `--png-hide-orientation-axes`, `--png-lighting`, and `--composite-filaments-source` are forwarded)
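Given this layout, the per-slab summary CSVs can be gathered for downstream analysis with a short script. The sketch below assumes only the folder and file patterns shown in the tree (crop_x..._y..._z... folders, slab_z<origin> subfolders, and files ending in _summary_stats.csv); the function name find_slab_summaries and the demo folder names are hypothetical.

```python
import tempfile
from pathlib import Path


def find_slab_summaries(output_root):
    """Map each crop folder name to its sorted list of slab summary CSV paths."""
    root = Path(output_root)
    found = {}
    for crop_dir in sorted(root.glob("crop_x*_y*_z*")):
        if not crop_dir.is_dir():
            continue
        # Assumes slabs were cut along z, matching the slab_z<origin> pattern.
        found[crop_dir.name] = sorted(crop_dir.glob("slab_z*/*_summary_stats.csv"))
    return found


# Demo on a throwaway directory tree with made-up names.
with tempfile.TemporaryDirectory() as tmp:
    slab = Path(tmp) / "crop_x0-500_y0-500_z0-100" / "slab_z0"
    slab.mkdir(parents=True)
    (slab / "demo_summary_stats.csv").write_text("placeholder\n")
    summaries = find_slab_summaries(tmp)
    demo_counts = {name: len(paths) for name, paths in summaries.items()}

print(demo_counts)  # {'crop_x0-500_y0-500_z0-100': 1}
```

Crops processed with --skip-slabs have no slab folders and therefore map to an empty list.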
Naming notes:
- Manifolds: <crop_prefix>_sX_manifolds_<TAG>_S###.vtu (persistence tag + smoothing).
- Filament manifolds (optional): <crop_prefix>_sX_filament_manifolds_<TAG>_S###.vtu (persistence tag + smoothing).
- Cluster critical points (recommended): <crop_prefix>_cluster_critpoints_<TAG>_S000.vtp and .vtu.
- Filaments: <crop_prefix>_sX_arcs_<ARC>_S###.vtp (persistence tag + smoothing).
- Delaunay: <crop_prefix>_delaunay_S###.vtu (smoothing tag).
- Unsmoothed conversions (e.g., via ndtopo_stats.py --write-vtk) use _S000.
- DisPerSE concatenates multiple manifold tags into one file (e.g., JE1a2a). Run per-tag if you need separate manifold outputs.
Crop and slab prefixes embed coordinates (and slab origin) so runs do not collide.
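Because the crop folder name embeds its bounds, downstream scripts can recover them without re-reading the snapshot. The following is a minimal sketch, assuming the crop_x<x0>-<x1>_y<y0>-<y1>_z<z0>-<z1> pattern with non-negative plain decimal numbers; the exact number formatting the driver writes is not documented here, so the regular expression may need adjusting. The name parse_crop_name is hypothetical.

```python
import re

# Assumption: coordinates are non-negative and written as plain decimals.
_NUM = r"\d+(?:\.\d+)?"
CROP_RE = re.compile(
    rf"^crop_x(?P<x0>{_NUM})-(?P<x1>{_NUM})"
    rf"_y(?P<y0>{_NUM})-(?P<y1>{_NUM})"
    rf"_z(?P<z0>{_NUM})-(?P<z1>{_NUM})$"
)


def parse_crop_name(name):
    """Return the crop bounds as floats, or None if the name does not match."""
    match = CROP_RE.match(name)
    if match is None:
        return None
    return {key: float(value) for key, value in match.groupdict().items()}


bounds = parse_crop_name("crop_x0-500_y500-1000_z0-100")
print(bounds)
# {'x0': 0.0, 'x1': 500.0, 'y0': 500.0, 'y1': 1000.0, 'z0': 0.0, 'z1': 100.0}
```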
Examples
Run a batch over 500000×500000×100000 kpc/h tiles (500×500×100 mpc/h after conversion), stepping slabs every 10 (mpc/h after conversion):
python scripts/batch_crop_and_clip.py \
--snapshot data/snap_010.hdf5 \
--output-root outputs/quijote_batches \
--crop-size 500000 500000 100000 \
--x-range 0 1000000 --y-range 0 1000000 --z-range 0 200000 \
--slab-step 10 --slab-thickness 10 \
--mse-nsig 5.0 \
--dump-manifolds JE1a --dump-filament-manifolds JE2a --dump-clusters JE3a --dump-arcs U \
--netconv-smooth 20 --skelconv-smooth 20 \
--resample-dims 500 500 100 \
--scalar-name log_field_value
More Examples
python scripts/batch_crop_and_clip.py \
--snapshot data/snapdir_004 \
--output-root outputs/quijote_batches_004_w_clusters_points_4_5 \
--crop-size 500000 500000 100000 \
--mse-nsig 4.5 \
--x-range 0 1000000 --y-range 0 1000000 --z-range 0 1000000 \
--slab-step 5 --slab-thickness 5 \
--dump-manifolds J1a --dump-arcs U \
--dump-filament-manifolds J2a \
--dump-clusters J3a \
--netconv-smooth 20 --skelconv-smooth 20 \
--png-percentile-range 0 99 \
--write-per-point-csv \
--no-png-lighting \
--no-png-transparent \
--export-delaunay-points \
--skip-slabs
Tips
- If a crop contains no particles, the script logs “[skip] Crop box contains no particles.” and moves on.
- Ensure pvpython is resolvable from the environment; otherwise set PATH accordingly or adjust the --clip-script invocation to include its full path.
- Keep --input-unit/--output-unit consistent with your snapshot and downstream expectations; the driver handles the z-origin conversion for slabs automatically.
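A quick pre-flight check for the pvpython tip can be written with the standard library. This is a sketch that is not part of the driver; resolve_tool is a hypothetical helper built on shutil.which, which searches the current PATH.

```python
import shutil


def resolve_tool(name="pvpython"):
    """Return the absolute path of `name` if it is found on PATH, else None."""
    return shutil.which(name)


path = resolve_tool()
if path is None:
    print("pvpython not found on PATH; activate the environment or extend PATH first.")
else:
    print(f"pvpython found at {path}")
```

Running this inside the activated conda environment confirms that the activate.d hook took effect before a long batch is started.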