Split `positions_from_delta()` into individual functions #413

ntessore · 2024-11-11T22:32:40Z

Add your issue here

The current positions_from_delta() function carries out a large number of steps:

Apply bias model to given delta map.
Convert biased delta to mean number count in each map pixel.
Apply a visibility map.
Compute observed number counts from a distribution (e.g., Poisson).
Accumulate the galaxies in the number count map into batches of a given size (e.g., 1_000_000)
Sample random positions for each galaxy within its pixel.

This structure is very rigid, and makes it difficult to implement things such as #136.

We should split this up into individual functions, so that they can be mixed and matched with different implementations:

for i, delta in enumerate(matter):
    # mean number count in each pixel
    nbar = glass.nbar_from_density(arcmin2=ngal[i], nside=nside)
    # galaxy density from linear bias model, could be a complicated model
    gal = nbar * (1 + beff[i] * delta)
    # apply visibility map for this shell, could be a complicated model
    gal *= vmap[i]
    # Poisson-sample galaxy indices from map in batches
    for ipix in glass.batched_poisson(gal, rng=rng):
        # randomly sample galaxy positions within each pixel
        lon, lat = glass.random_healpix_positions(nside, ipix)

This needs a bit of experimentation for multi-dimensional arrays, but in principle seems feasible.

The text was updated successfully, but these errors were encountered:

paddyroddy · 2025-02-17T20:49:25Z

We should split this up into individual functions, so that they can be mixed and matched with different implementations:

for i, delta in enumerate(matter):
# mean number count in each pixel
nbar = glass.nbar_from_density(arcmin2=ngal[i], nside=nside)
# galaxy density from linear bias model, could be a complicated model
gal = nbar * (1 + beff[i] * delta)
# apply visibility map for this shell, could be a complicated model
gal *= vmap[i]
# Poisson-sample galaxy indices from map in batches
for ipix in glass.batched_poisson(gal, rng=rng):
# randomly sample galaxy positions within each pixel
lon, lat = glass.random_healpix_positions(nside, ipix)

This needs a bit of experimentation for multi-dimensional arrays, but in principle seems feasible.

Just to clarify on this, are you saying this is how you envisage the function being split up? Been trying to map the current function to this in my head, and it's not entirely clear to me.

ntessore · 2025-02-17T21:05:40Z

Just to clarify on this, are you saying this is how you envisage the function being split up? Been trying to map the current function to this in my head, and it's not entirely clear to me.

Yes, something along those lines. The functional split in the current positions_from_delta() is roughly this:

glass/glass/points.py

Lines 229 to 240 in fa012ac

    
           # broadcast inputs to common shape of extra dimensions 
        
           inputs: list[tuple[float | NDArray[np.float64], int]] = [(ngal, 0), (delta, 1)] 
        
           if bias is not None: 
        
               inputs.append((bias, 0)) 
        
           if vis is not None: 
        
               inputs.append((vis, 1)) 
        
           dims, *rest = glass.core.array.broadcast_leading_axes(*inputs) 
        
           ngal, delta, *rest = rest 
        
           if bias is not None: 
        
               bias, *rest = rest 
        
           if vis is not None: 
        
               vis, *rest = rest

Figure out how many "populations" of objects we are dealing with, by broadcasting all inputs together. This is probably going to be the most tricky part to generalise neatly when not doing all subsequent steps in one function.

glass/glass/points.py

Lines 242 to 243 in fa012ac

    
           # iterate the leading dimensions 
        
           for k in np.ndindex(dims):

This loops over the leading dimensions ("populations") and processes each "population" independently. This will need some thought on how to do it without having this big loop over everything. For the time being, imagine we are dealing with just one population of objects.

glass/glass/points.py

Lines 244 to 249 in fa012ac

    
           # compute density contrast from bias model, or copy 
        
           n = ( 
        
               np.copy(delta[k]) 
        
               if bias is None 
        
               else bias_model_callable(delta[k], bias[k]) 
        
           )

Applies the bias model to delta.

glass/glass/points.py

Lines 251 to 257 in fa012ac

    
           # remove monopole if asked to 
        
           if remove_monopole: 
        
               n -= np.mean(n, keepdims=True) 
        
           # turn into number count, modifying the array in place 
        
           n += 1 
        
           n *= ARCMIN2_SPHERE / n.size * ngal[k]

Computes the expected number of objects per pixel nbar = ARCMIN2_SPHERE / n.size * ngal[k].
Then turns the biased delta_b in to an expected number count n = nbar * (1 + delta_b).

glass/glass/points.py

Lines 259 to 261 in fa012ac

    
           # apply visibility if given 
        
           if vis is not None: 
        
               n *= vis[k]

Apply a visibility by multiplying it with the expected number count n = vis * n.

glass/glass/points.py

Lines 263 to 267 in fa012ac

    
           # clip number density at zero 
        
           np.clip(n, 0, None, out=n) 
        
           # sample actual number in each pixel 
        
           n = rng.poisson(n)

Sample the actual number of galaxies in each pixel from the Poisson distribution.

glass/glass/points.py

Lines 269 to 315 in fa012ac

    
           # total number of points 
        
           count = n.sum() 
        
           # don't go through pixels if there are no points 
        
           if count == 0: 
        
               continue 
        
           # for converting randomly sampled positions to HEALPix indices 
        
           npix = n.shape[-1] 
        
           nside = healpix.npix2nside(npix) 
        
           # create a mask to report the count in the right axis 
        
           cmask: int | NDArray[np.int_] 
        
           if dims: 
        
               cmask = np.zeros(dims, dtype=int) 
        
               cmask[k] = 1 
        
           else: 
        
               cmask = 1 
        
           # sample the map in batches 
        
           step = 1000 
        
           start, stop, size = 0, 0, 0 
        
           while count: 
        
               # tally this group of pixels 
        
               q = np.cumsum(n[stop : stop + step]) 
        
               # does this group of pixels fill the batch? 
        
               if size + q[-1] < min(batch, count): 
        
                   # no, we need the next group of pixels to fill the batch 
        
                   stop += step 
        
                   size += q[-1] 
        
               else: 
        
                   # how many pixels from this group do we need? 
        
                   stop += int(np.searchsorted(q, batch - size, side="right")) 
        
                   # if the first pixel alone is too much, use it anyway 
        
                   if stop == start: 
        
                       stop += 1 
        
                   # sample this batch of pixels 
        
                   ipix = np.repeat(np.arange(start, stop), n[start:stop]) 
        
                   lon, lat = healpix.randang(nside, ipix, lonlat=True, rng=rng) 
        
                   # next batch 
        
                   start, size = stop, 0 
        
                   # keep track of remaining number of points 
        
                   count -= ipix.size 
        
                   # yield the batch 
        
                   yield lon, lat, ipix.size * cmask 
        
           # make sure that the correct number of pixels was sampled 
        
           assert np.sum(n[stop:]) == 0  # noqa: S101

Sample the individual galaxies in each pixel, randomly distributed over sub pixels, in batches.

mwiet · 2025-02-24T09:28:42Z

I have a similar local implementation of this which creates a new classes within glass.observations that act like a visibility mask for a given combination of tomographic bins and redshift shells: glass.observations.angular_variable_depth_mask and glass.observations.angular_los_variable_depth_mask. The class constructs a lookup table based on a model of variable depth akin to SALMO. It is designed to output the ratio between the galaxy counts when variable depth occurs and the galaxy counts, so it can be applied as a factor to the visibility mask when sampling galaxies from a matter field with glass.points.positions_from_delta.

The implementation adds two novel features:

The interpolation between the variable depth tracer variable and the change in galaxy density can be interpolated with any function (previously, in SALMO only linear interpolation was possible).
The effect of variable depth along the line-of-sight and in the angular direction can be modelled as fully independent (previously, in SALMO the selection of galaxy shapes and galaxy redshifts was always assumed to be the same).

I have documented and tested the code (incl. tests on the KiDS-1000). It seems stable and consistent with previous results, so I have created a pull request. Please provide any feedback on any additional functionalities and/or changes to make it fit within the rest of the GLASS code base.

ntessore added api An (incompatible) API change science Science improvement or question labels Nov 11, 2024

ntessore self-assigned this Nov 11, 2024

paddyroddy assigned paddyroddy and unassigned ntessore Jan 23, 2025

mwiet mentioned this issue Feb 24, 2025

gh-413: Implementation of variable depth #535

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Split `positions_from_delta()` into individual functions #413

Split `positions_from_delta()` into individual functions #413

ntessore commented Nov 11, 2024 •

edited

Loading

paddyroddy commented Feb 17, 2025

ntessore commented Feb 17, 2025

mwiet commented Feb 24, 2025

Split positions_from_delta() into individual functions #413

Split positions_from_delta() into individual functions #413

Comments

ntessore commented Nov 11, 2024 • edited Loading

Add your issue here

paddyroddy commented Feb 17, 2025

ntessore commented Feb 17, 2025

mwiet commented Feb 24, 2025

Split `positions_from_delta()` into individual functions #413

Split `positions_from_delta()` into individual functions #413

ntessore commented Nov 11, 2024 •

edited

Loading