Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split positions_from_delta() into individual functions #413

Open
ntessore opened this issue Nov 11, 2024 · 3 comments
Open

Split positions_from_delta() into individual functions #413

ntessore opened this issue Nov 11, 2024 · 3 comments
Assignees
Labels
api An (incompatible) API change science Science improvement or question

Comments

@ntessore
Copy link
Collaborator

ntessore commented Nov 11, 2024

Add your issue here

The current positions_from_delta() function carries out a large number of steps:

  • Apply bias model to given delta map.
  • Convert biased delta to mean number count in each map pixel.
  • Apply a visibility map.
  • Compute observed number counts from a distribution (e.g., Poisson).
  • Accumulate the galaxies in the number count map into batches of a given size (e.g., 1_000_000)
  • Sample random positions for each galaxy within its pixel.

This structure is very rigid, and makes it difficult to implement things such as #136.

We should split this up into individual functions, so that they can be mixed and matched with different implementations:

for i, delta in enumerate(matter):
    # mean number count in each pixel
    nbar = glass.nbar_from_density(arcmin2=ngal[i], nside=nside)
    # galaxy density from linear bias model, could be a complicated model
    gal = nbar * (1 + beff[i] * delta)
    # apply visibility map for this shell, could be a complicated model
    gal *= vmap[i]
    # Poisson-sample galaxy indices from map in batches
    for ipix in glass.batched_poisson(gal, rng=rng):
        # randomly sample galaxy positions within each pixel
        lon, lat = glass.random_healpix_positions(nside, ipix)

This needs a bit of experimentation for multi-dimensional arrays, but in principle seems feasible.

@ntessore ntessore added api An (incompatible) API change science Science improvement or question labels Nov 11, 2024
@ntessore ntessore self-assigned this Nov 11, 2024
@paddyroddy paddyroddy assigned paddyroddy and unassigned ntessore Jan 23, 2025
@paddyroddy
Copy link
Member

We should split this up into individual functions, so that they can be mixed and matched with different implementations:

for i, delta in enumerate(matter):
# mean number count in each pixel
nbar = glass.nbar_from_density(arcmin2=ngal[i], nside=nside)
# galaxy density from linear bias model, could be a complicated model
gal = nbar * (1 + beff[i] * delta)
# apply visibility map for this shell, could be a complicated model
gal *= vmap[i]
# Poisson-sample galaxy indices from map in batches
for ipix in glass.batched_poisson(gal, rng=rng):
# randomly sample galaxy positions within each pixel
lon, lat = glass.random_healpix_positions(nside, ipix)

This needs a bit of experimentation for multi-dimensional arrays, but in principle seems feasible.

Just to clarify on this, are you saying this is how you envisage the function being split up? Been trying to map the current function to this in my head, and it's not entirely clear to me.

@ntessore
Copy link
Collaborator Author

Just to clarify on this, are you saying this is how you envisage the function being split up? Been trying to map the current function to this in my head, and it's not entirely clear to me.

Yes, something along those lines. The functional split in the current positions_from_delta() is roughly this:

glass/glass/points.py

Lines 229 to 240 in fa012ac

# broadcast inputs to common shape of extra dimensions
inputs: list[tuple[float | NDArray[np.float64], int]] = [(ngal, 0), (delta, 1)]
if bias is not None:
inputs.append((bias, 0))
if vis is not None:
inputs.append((vis, 1))
dims, *rest = glass.core.array.broadcast_leading_axes(*inputs)
ngal, delta, *rest = rest
if bias is not None:
bias, *rest = rest
if vis is not None:
vis, *rest = rest

Figure out how many "populations" of objects we are dealing with, by broadcasting all inputs together. This is probably going to be the most tricky part to generalise neatly when not doing all subsequent steps in one function.

glass/glass/points.py

Lines 242 to 243 in fa012ac

# iterate the leading dimensions
for k in np.ndindex(dims):

This loops over the leading dimensions ("populations") and processes each "population" independently. This will need some thought on how to do it without having this big loop over everything. For the time being, imagine we are dealing with just one population of objects.

glass/glass/points.py

Lines 244 to 249 in fa012ac

# compute density contrast from bias model, or copy
n = (
np.copy(delta[k])
if bias is None
else bias_model_callable(delta[k], bias[k])
)

Applies the bias model to delta.

glass/glass/points.py

Lines 251 to 257 in fa012ac

# remove monopole if asked to
if remove_monopole:
n -= np.mean(n, keepdims=True)
# turn into number count, modifying the array in place
n += 1
n *= ARCMIN2_SPHERE / n.size * ngal[k]

Computes the expected number of objects per pixel nbar = ARCMIN2_SPHERE / n.size * ngal[k].
Then turns the biased delta_b in to an expected number count n = nbar * (1 + delta_b).

glass/glass/points.py

Lines 259 to 261 in fa012ac

# apply visibility if given
if vis is not None:
n *= vis[k]

Apply a visibility by multiplying it with the expected number count n = vis * n.

glass/glass/points.py

Lines 263 to 267 in fa012ac

# clip number density at zero
np.clip(n, 0, None, out=n)
# sample actual number in each pixel
n = rng.poisson(n)

Sample the actual number of galaxies in each pixel from the Poisson distribution.

glass/glass/points.py

Lines 269 to 315 in fa012ac

# total number of points
count = n.sum()
# don't go through pixels if there are no points
if count == 0:
continue
# for converting randomly sampled positions to HEALPix indices
npix = n.shape[-1]
nside = healpix.npix2nside(npix)
# create a mask to report the count in the right axis
cmask: int | NDArray[np.int_]
if dims:
cmask = np.zeros(dims, dtype=int)
cmask[k] = 1
else:
cmask = 1
# sample the map in batches
step = 1000
start, stop, size = 0, 0, 0
while count:
# tally this group of pixels
q = np.cumsum(n[stop : stop + step])
# does this group of pixels fill the batch?
if size + q[-1] < min(batch, count):
# no, we need the next group of pixels to fill the batch
stop += step
size += q[-1]
else:
# how many pixels from this group do we need?
stop += int(np.searchsorted(q, batch - size, side="right"))
# if the first pixel alone is too much, use it anyway
if stop == start:
stop += 1
# sample this batch of pixels
ipix = np.repeat(np.arange(start, stop), n[start:stop])
lon, lat = healpix.randang(nside, ipix, lonlat=True, rng=rng)
# next batch
start, size = stop, 0
# keep track of remaining number of points
count -= ipix.size
# yield the batch
yield lon, lat, ipix.size * cmask
# make sure that the correct number of pixels was sampled
assert np.sum(n[stop:]) == 0 # noqa: S101

Sample the individual galaxies in each pixel, randomly distributed over sub pixels, in batches.

@mwiet
Copy link
Contributor

mwiet commented Feb 24, 2025

I have a similar local implementation of this which creates a new classes within glass.observations that act like a visibility mask for a given combination of tomographic bins and redshift shells: glass.observations.angular_variable_depth_mask and glass.observations.angular_los_variable_depth_mask. The class constructs a lookup table based on a model of variable depth akin to SALMO. It is designed to output the ratio between the galaxy counts when variable depth occurs and the galaxy counts, so it can be applied as a factor to the visibility mask when sampling galaxies from a matter field with glass.points.positions_from_delta.

The implementation adds two novel features:

  • The interpolation between the variable depth tracer variable and the change in galaxy density can be interpolated with any function (previously, in SALMO only linear interpolation was possible).
  • The effect of variable depth along the line-of-sight and in the angular direction can be modelled as fully independent (previously, in SALMO the selection of galaxy shapes and galaxy redshifts was always assumed to be the same).

I have documented and tested the code (incl. tests on the KiDS-1000). It seems stable and consistent with previous results, so I have created a pull request. Please provide any feedback on any additional functionalities and/or changes to make it fit within the rest of the GLASS code base.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api An (incompatible) API change science Science improvement or question
Projects
None yet
Development

No branches or pull requests

3 participants