analyze_snapshot.py User Guide

Use scripts/analyze_snapshot.py to run the full DisPerSE topology extraction pipeline on a Gadget/Quijote HDF5 snapshot.

What it does

  1. Builds an NDfield catalog (optionally cropped/decimated).
  2. Runs delaunay_3D to reconstruct the Delaunay tessellation (with optional block decomposition).
  3. Runs mse to compute the Morse–Smale complex and dump manifolds/filaments.
  4. Converts DisPerSE outputs to VTK-family formats (VTU/VTP/etc.) via netconv/skelconv.
  5. For cropped runs, rebases coordinates to the crop origin before the periodic tessellation, then shifts the VTK outputs back to global coordinates.
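
The rebase and shift-back in step 5 can be pictured with the minimal sketch below. It is not the script's code: the helper names and the sample crop corner are hypothetical, and it assumes the crop origin is the minimum corner of the crop box.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]


def rebase_to_crop_origin(points: Iterable[Point], crop_min: Point) -> List[Point]:
    """Subtract the crop-box minimum corner so the sub-volume starts at (0, 0, 0)."""
    ox, oy, oz = crop_min
    return [(x - ox, y - oy, z - oz) for x, y, z in points]


def shift_to_global(points: Iterable[Point], crop_min: Point) -> List[Point]:
    """Add the crop-box minimum corner back to recover snapshot coordinates."""
    ox, oy, oz = crop_min
    return [(x + ox, y + oy, z + oz) for x, y, z in points]


# Hypothetical crop corner and particles, in kpc/h.
crop_min = (100000.0, 200000.0, 0.0)
particles = [(150000.0, 250000.0, 5000.0), (100000.0, 200000.0, 0.0)]

local = rebase_to_crop_origin(particles, crop_min)   # coordinates seen by the tessellation
restored = shift_to_global(local, crop_min)          # coordinates written to VTK
```

The round trip leaves positions unchanged, so the VTK outputs of a cropped run can be overlaid on full-snapshot outputs.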

Requirements

  • A Python environment with h5py, hdf5plugin, and numpy (the disperse conda environment).
  • DisPerSE binaries (delaunay_3D, mse, netconv, skelconv) on PATH, or pointed to via --disperse-bin-dir.
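
Before a long run, it can help to confirm that the four binaries are reachable. The sketch below is a standalone check, not part of analyze_snapshot.py; the function name locate_disperse_tools is hypothetical, and its bin_dir argument only mimics what --disperse-bin-dir is described as doing.

```python
import shutil
from pathlib import Path
from typing import Dict, Optional

DISPERSE_TOOLS = ("delaunay_3D", "mse", "netconv", "skelconv")


def locate_disperse_tools(bin_dir: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None when it cannot be found."""
    found: Dict[str, Optional[str]] = {}
    for tool in DISPERSE_TOOLS:
        if bin_dir is not None:
            # Look only inside the given directory.
            candidate = Path(bin_dir) / tool
            found[tool] = str(candidate) if candidate.is_file() else None
        else:
            # Fall back to a PATH lookup.
            found[tool] = shutil.which(tool)
    return found


missing = [name for name, path in locate_disperse_tools().items() if path is None]
if missing:
    print("Missing DisPerSE binaries:", ", ".join(missing))
```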

CLI

Basic Inputs

Argument                 Description
--input PATH             HDF5 snapshot (Quijote-style). Defaults to data/snap_010.hdf5.
--output-dir DIR         Folder for all generated artifacts. Created automatically.
--output-prefix NAME     Basename for files inside output-dir. Defaults to a name derived from the input file name.
--parttype NAME          Particle group to use (default PartType1).
--coords-input PATH      Skip snapshot reading and reuse an existing NDfield (ANDFIELD COORDS).
--network-input PATH     Reuse an existing Delaunay NDnet (skip delaunay_3D).
--manifolds-input PATH   Reuse an existing manifolds NDnet (skip mse).

Sampling Controls

  • --target-count N — approximate number of particles to keep; the script derives the stride from it.
  • --stride N — keep every Nth particle (overrides --target-count).
  • --chunk-size N — number of particles streamed per read (default 2M).
  • --crop-box xmin ymin zmin xmax ymax zmax — restricts the analysis to a rectangular sub-volume in the snapshot’s input units (kpc/h by default).
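
How a stride could follow from --target-count, and how cropping and decimation might combine, is sketched below. The exact rule used by the script is not documented here, so the ceiling-division formula, the inclusive box bounds, and the crop-then-decimate order are all assumptions; the function names are hypothetical.

```python
import math
from typing import Iterable, Iterator, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def stride_for_target(total_particles: int, target_count: int) -> int:
    """Assumed rule: smallest stride that keeps at most target_count particles."""
    if target_count <= 0:
        raise ValueError("target_count must be positive")
    return max(1, math.ceil(total_particles / target_count))


def in_crop_box(point: Point, box: Sequence[float]) -> bool:
    """Box order follows --crop-box: xmin ymin zmin xmax ymax zmax (bounds inclusive here)."""
    xmin, ymin, zmin, xmax, ymax, zmax = box
    x, y, z = point
    return xmin <= x <= xmax and ymin <= y <= ymax and zmin <= z <= zmax


def sample(points: Iterable[Point], box: Optional[Sequence[float]], stride: int) -> Iterator[Point]:
    """Crop first, then keep every stride-th surviving particle (assumed order)."""
    kept = (p for p in points if box is None or in_crop_box(p, box))
    for index, point in enumerate(kept):
        if index % stride == 0:
            yield point


# Hypothetical particle count (512**3) thinned to roughly two million.
stride = stride_for_target(512 ** 3, 2_000_000)
```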

Units

  • --input-unit {kpc/h,mpc/h} — units of the coordinates in the --input snapshot (kpc/h by default).
  • --output-unit {kpc/h,mpc/h} — units written to the downstream NDfield and VTK files.
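
The two unit options differ by a factor of 1000 (1 Mpc/h = 1000 kpc/h). A minimal sketch of the conversion, with hypothetical helper names, not taken from the script:

```python
from typing import List, Sequence

# Length of one unit expressed in kpc/h.
KPC_PER_UNIT = {"kpc/h": 1.0, "mpc/h": 1000.0}


def convert_length(value: float, input_unit: str, output_unit: str) -> float:
    """Convert a single length from input_unit to output_unit."""
    return value * KPC_PER_UNIT[input_unit] / KPC_PER_UNIT[output_unit]


def convert_box(box: Sequence[float], input_unit: str, output_unit: str) -> List[float]:
    """Convert all six --crop-box values at once."""
    return [convert_length(v, input_unit, output_unit) for v in box]


# The crop box from the cropped sub-volume example, expressed in Mpc/h.
box_mpc = convert_box([0, 0, 0, 500000, 500000, 100000], "kpc/h", "mpc/h")
```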

Delaunay Stage

  • --periodic — enforce periodic boundary conditions (crops are rebased to the origin internally; VTK outputs are shifted back). Warning: this flag produces artifacts; use --delaunay-btype instead.
  • --delaunay-blocks NCHUNKS NTHREADS — activate DisPerSE’s block decomposition to reduce memory; the script automatically switches to the gathered _G.NDnet. Warning: do not use; this option currently fails with an error.
  • --delaunay-btype {mirror,periodic,smooth,void} — Delaunay boundary extrapolation.
  • --export-delaunay — convert the raw Delaunay NDnet via netconv.
    • --delaunay-format ... — output format for the above (defaults to VTU).
    • --delaunay-smooth N — smoothing iterations passed to netconv.
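
For orientation, the sketch below assembles a delaunay_3D command line from the recommended boundary option. How analyze_snapshot.py builds its own command is not shown in this guide, so the function and its arguments are hypothetical; -outDir, -outName, and -btype are options of the delaunay_3D binary.

```python
from typing import List, Optional

BOUNDARY_TYPES = ("mirror", "periodic", "smooth", "void")


def delaunay_command(ndfield: str, out_dir: str, out_name: str,
                     btype: Optional[str] = None,
                     binary: str = "delaunay_3D") -> List[str]:
    """Build an argv list for delaunay_3D; nothing is executed here."""
    cmd = [binary, ndfield, "-outDir", out_dir, "-outName", out_name]
    if btype is not None:
        if btype not in BOUNDARY_TYPES:
            raise ValueError(f"unknown boundary type: {btype}")
        cmd += ["-btype", btype]
    return cmd


# Hypothetical file names, for illustration only.
cmd = delaunay_command("coords.AND", "outputs/snap_010", "snap_010", btype="periodic")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.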

Morse–Smale / Manifolds

  • --mse-nsig s1 [s2 ...] — persistence thresholds in sigma units (default 3.5).
  • --persistence-cut c1 [c2 ...] — absolute persistence thresholds (override --mse-nsig).
  • --mse-threads N — OpenMP threads for mse.
  • --mse-vertex-as-minima — forward -vertexAsMinima.
  • --dump-manifolds SPEC — descriptor for mse -dumpManifolds (default JD1d).
  • --dump-filament-manifolds TAG — optional second manifolds tag (e.g., JE2a) to dump filament manifolds; runs mse a second time.
  • --dump-clusters TAG — export cluster critical points (maxima) using the given tag (e.g., JE3a).
    • The pipeline exports cluster critical points from the skeleton as a VTP/VTU point set. This is the primary cluster representation used downstream because DisPerSE’s 3‑manifold exports are volumetric and often empty.
    • Cluster log_field_value is computed as ln(field_value) for consistency with the other structures.
  • --dump-cluster-manifolds TAG — deprecated alias for --dump-clusters (kept for backward compatibility).
  • --dump-arcs CUID — one or more -dumpArcs descriptors (e.g., U, CUD). Repeatable.
  • Note: DisPerSE concatenates multiple manifold tags into one output (e.g., JE1a2a). To dump one additional tag separately, use --dump-filament-manifolds; if you need more, run separate passes.
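
The options above map onto mse flags roughly as in the sketch below. It handles a single threshold only and is an illustration, not the script's code: the function name and defaults are assumptions drawn from this guide, while -nsig, -cut, -nthreads, -vertexAsMinima, -dumpManifolds, and -dumpArcs are mse options.

```python
from typing import List, Optional, Sequence


def mse_command(network: str, out_dir: str, out_name: str,
                nsig: float = 3.5, cut: Optional[float] = None,
                manifolds: str = "JD1d", arcs: Sequence[str] = (),
                threads: Optional[int] = None,
                vertex_as_minima: bool = False,
                binary: str = "mse") -> List[str]:
    """Build an argv list for one mse pass; nothing is executed here."""
    cmd = [binary, network, "-outDir", out_dir, "-outName", out_name]
    # An absolute cut overrides the sigma threshold, mirroring --persistence-cut.
    if cut is not None:
        cmd += ["-cut", str(cut)]
    else:
        cmd += ["-nsig", str(nsig)]
    if threads is not None:
        cmd += ["-nthreads", str(threads)]
    if vertex_as_minima:
        cmd.append("-vertexAsMinima")
    cmd += ["-dumpManifolds", manifolds]
    # One -dumpArcs flag per descriptor, mirroring the repeatable --dump-arcs.
    for descriptor in arcs:
        cmd += ["-dumpArcs", descriptor]
    return cmd


# Hypothetical file names, for illustration only.
cmd = mse_command("snap_010.NDnet", "outputs", "snap_010",
                  nsig=3.0, manifolds="JE1a", arcs=["U"])
print(" ".join(cmd))
```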

Conversion / Output Formats

  • --netconv-format ... — how to convert manifolds (VTU/PLY/NDnet/etc.).
  • --netconv-smooth N — smoothing iterations for netconv.
  • --skip-netconv — leave manifolds in NDnet form.
  • --skel-input TAG=PATH — provide external NDskl files for conversion.
  • --skelconv-format ... — skeleton conversion output (default VTP).
  • --skelconv-smooth N — smoothing iterations for skelconv.
  • --skip-skelconv — skip skeleton conversion.
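
A --skel-input value has the form TAG=PATH. The sketch below shows one way to split it and to assemble a netconv or skelconv call; the helper names are hypothetical and the script's own parsing may differ. The -to, -smooth, and -outDir flags are options of both converters.

```python
from typing import List, Tuple


def parse_skel_input(spec: str) -> Tuple[str, str]:
    """Split 'TAG=PATH' at the first '=' and reject incomplete values."""
    tag, sep, path = spec.partition("=")
    if not sep or not tag or not path:
        raise ValueError(f"expected TAG=PATH, got {spec!r}")
    return tag, path


def convert_command(binary: str, source: str, out_dir: str,
                    fmt: str, smooth: int = 0) -> List[str]:
    """Build an argv list for netconv or skelconv; nothing is executed here."""
    cmd = [binary, source, "-outDir", out_dir, "-to", fmt]
    if smooth > 0:
        cmd += ["-smooth", str(smooth)]
    return cmd


# Values taken from the conversion-only example in this guide.
tag, path = parse_skel_input("U=outputs/snap_010/snap_010.U.NDskl")
cmd = convert_command("skelconv", path, "outputs/snap_010", "vtp", smooth=5)
```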

Workflow Control

  • --stop-after {ndfield,delaunay,mse} — halt after the named stage.
  • --keep-ndfield — keep (rather than delete) the intermediate coords_stride*.AND.
  • --disperse-bin-dir DIR — directory containing DisPerSE binaries (delaunay_3D, mse, netconv, skelconv).

Examples

Full Snapshot Pipeline

python scripts/analyze_snapshot.py \
  --input data/snap_010.hdf5 \
  --output-dir outputs/snap_010_thinned \
  --target-count 2_000_000 \
  --delaunay-btype periodic \
  --export-delaunay \
  --mse-nsig 3.0 \
  --dump-manifolds JE1a \
  --dump-filament-manifolds JE2a \
  --dump-clusters JE3a \
  --dump-arcs U \
  --netconv-smooth 20 \
  --skelconv-smooth 20

Cropped Sub-volume

python scripts/analyze_snapshot.py \
  --input data/snap_010.hdf5 \
  --output-dir outputs/snap_010_subbox \
  --crop-box 0 0 0 500000 500000 100000 \
  --stride 1 \
  --delaunay-btype periodic \
  --export-delaunay \
  --mse-nsig 3.0 \
  --dump-manifolds JE1a \
  --dump-arcs U \
  --netconv-smooth 20 \
  --skelconv-smooth 20

Conversion-only (reuse existing NDnets)

python scripts/analyze_snapshot.py \
  --output-dir outputs/snap_010 \
  --manifolds-input outputs/snap_010/snap_010_manifolds_JD1d.NDnet \
  --skel-input U=outputs/snap_010/snap_010.U.NDskl \
  --netconv-format vtu \
  --skelconv-format vtp \
  --skelconv-smooth 5

Notes

Naming conventions (VTK/NDnet/NDskl)

  • Delaunay: <prefix>_delaunay_S###.vtu (or _S000 for unsmoothed conversions).
  • Manifolds (walls): <prefix>_sX_manifolds_<TAG>_S###.vtu (NDnet: <prefix>_sX_manifolds_<TAG>.NDnet).
  • Filament manifolds (optional): <prefix>_sX_filament_manifolds_<TAG>_S###.vtu.
  • Cluster critical points (recommended): <prefix>_cluster_critpoints_<TAG>_S000.vtp and .vtu.
  • Skeletons: <prefix>_sX_arcs_<TAG>_S###.vtp (NDskl: <prefix>_sX_arcs_<TAG>.NDskl). A matching .vtu is also written for convenience.
  • Persistence tags (sX) precede the role/tag; smoothing (S###) is appended by converters.
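
The patterns above can be reproduced with a small helper when locating outputs from another script. This is a sketch under assumptions: the function name is hypothetical, and the persistence tag is passed as an opaque string ("s3" below is only a placeholder, since the exact form of sX is not spelled out here).

```python
from typing import Optional


def vtk_name(prefix: str, role: str, tag: Optional[str] = None,
             persistence: Optional[str] = None, smooth: int = 0,
             ext: str = "vtu") -> str:
    """Join prefix, optional persistence tag, role, and optional TAG, then append _S###."""
    parts = [prefix, persistence, role, tag]
    stem = "_".join(part for part in parts if part)
    # Smoothing count is zero-padded to three digits; 0 means unsmoothed.
    return f"{stem}_S{smooth:03d}.{ext}"


print(vtk_name("snap_010", "delaunay"))
print(vtk_name("snap_010", "manifolds", "JE1a", persistence="s3", smooth=20))
print(vtk_name("snap_010", "arcs", "U", persistence="s3", smooth=20, ext="vtp"))
print(vtk_name("snap_010", "cluster_critpoints", "JE3a", ext="vtp"))
```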

Pipeline stages

  1. NDfield Creation — reads the snapshot (or uses --coords-input), applies optional cropping and decimation, writes an ANDFIELD COORDS file. --stop-after ndfield lets you inspect this stage alone.
  2. Delaunay Tessellation — runs delaunay_3D on the NDfield to produce an NDnet. Pass --stop-after delaunay to examine the tessellation.
  3. Morse–Smale Analysis — runs mse on the NDnet, applying persistence thresholds and dumping manifolds/filaments. Reuse an earlier result with --manifolds-input, or stop here with --stop-after mse.
  4. Conversion — converts manifolds (via netconv) and skeletons (via skelconv) into VTK formats for visualization.
  5. Summary & Cleanup — prints a recap of all produced files. Use --keep-ndfield to retain the coordinate catalog; otherwise it is deleted automatically.
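
The effect of --stop-after on the stage order can be summarized with the sketch below. It is a simplified model, not the script's control flow: the stage label "convert" and the function name are hypothetical, and the reuse options (--coords-input, --network-input, --manifolds-input) are left out.

```python
from typing import List, Optional

# Stage order as listed above; "convert" is a label chosen for this sketch.
STAGES = ("ndfield", "delaunay", "mse", "convert")
STOP_CHOICES = ("ndfield", "delaunay", "mse")


def planned_stages(stop_after: Optional[str] = None) -> List[str]:
    """Return the stages that would run for a given --stop-after value."""
    if stop_after is None:
        return list(STAGES)
    if stop_after not in STOP_CHOICES:
        raise ValueError(f"--stop-after must be one of {STOP_CHOICES}")
    return list(STAGES[: STAGES.index(stop_after) + 1])


print(planned_stages("delaunay"))
```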