[Bug] Bottleneck in NewNarrowband Datasets #291

IsaiahHarvi · 2025-02-27T15:23:46Z

Version

1.0.0 (Default)

System Information

OS: Ubuntu 24
Environment: Devcontainer

Torch 2.6.0

Description

There is a bottleneck in generating samples from the NewNarrowband dataset, as shown in this profile:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    ...
    10057    0.023    0.000  496.589    0.049 dsp.py:370(multistage_polyphase_resampler)
    10057    0.137    0.000  489.180    0.049 dsp.py:481(polyphase_fractional_resampler)
    14582    0.046    0.000  484.712    0.033 dsp.py:550(prototype_polyphase_filter)
    14582    4.082    0.000  484.666    0.033 dsp.py:1003(low_pass_iterative_design)
    10057    0.562    0.000  478.585    0.048 dsp.py:518(prototype_polyphase_filter_interpolation)
    61671   68.314    0.001  470.223    0.008 _fir_filter_design.py:251(firwin)
    ...

This was found with a sample rate of 1024; lowering it from 1024 gave a significant speedup. The number of calls to firwin is significant. It's possible there are some optimizations that can be made here; otherwise, live dataset generation is not practical, given how inefficient it is at relatively small sampling rates.
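For anyone wanting to collect a comparable profile, here is a minimal sketch using the standard library's cProfile and pstats. The helper name `profile_samples` and the `fake_samples` generator are hypothetical stand-ins, not part of the dataset code; in practice the iterator passed in would be the dataset itself.

```python
import cProfile
import io
import pstats


def profile_samples(sample_iter, num_samples=100, top=10):
    """Draw num_samples items from sample_iter under cProfile; return the report text."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(num_samples):
        next(sample_iter)
    profiler.disable()

    # Sort by cumulative time so the expensive call chains appear first.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


def fake_samples():
    # Hypothetical stand-in workload; replace with the real dataset iterator.
    while True:
        yield sum(i * i for i in range(1000))


report = profile_samples(fake_samples(), num_samples=50)
print(report)
```

Sorting by cumulative time is what surfaces a chain like the one above, where most of the time sits under a single low-level function that is called many times.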

Note: Found the same issue with NewWideband, but haven't profiled it.

How to Reproduce the Bug

# target_mods is a list of modulation class names defined elsewhere
metadata = NarrowbandMetadata(
    num_iq_samples_dataset=4096,
    sample_rate=1024,
    fft_size=2048,
    impairment_level=2,
    snr_db_min=0,
    snr_db_max=100,
    class_list=target_mods,
    signal_duration_percent_min=100,
    num_samples=None,  # infinite
)

dataset = NewNarrowband(metadata)

# inside of some getitem method in a torch dataset
data, meta = next(dataset)

...


MattCarrickPL · 2025-02-27T15:35:12Z

Thanks for reporting this. Known issue for us and something we are working on. If you have separate breakouts for narrowband vs wideband profiling I'd be interested to see that. Our internal profiling showed that wideband is the problem, with narrowband being less so. The problem comes down to two things: each generated signal needs to be resampled to the appropriate bandwidth, which is an expensive operation in and of itself, especially for small bandwidths, and we recompute the weights for the resampling PFB each time. I've got a couple of ideas on how to improve that processing which will make it into the code in the near future.
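To illustrate the weight recomputation cost in general terms, here is a minimal sketch of memoizing a filter design so that repeated requests with the same parameters reuse the earlier result. This is not the planned fix and not the library's code: `lowpass_taps` is a hypothetical pure-Python Hamming-windowed sinc design standing in for whatever design routine is actually used (the profile above shows scipy's firwin), and it assumes the design parameters recur often enough for a cache to be hit.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=256)
def lowpass_taps(num_taps: int, cutoff: float) -> tuple:
    """Hamming-windowed sinc low-pass taps (hypothetical stand-in design).

    cutoff is normalized to Nyquist (0 < cutoff < 1); num_taps must be > 1.
    """
    center = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - center
        # Ideal low-pass impulse response: sin(pi * cutoff * x) / (pi * x)
        ideal = cutoff if x == 0 else math.sin(math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    # Normalize to unity DC gain; return a tuple so cached results are immutable.
    return tuple(t / gain for t in taps)


a = lowpass_taps(63, 0.25)  # designed once
b = lowpass_taps(63, 0.25)  # served from the cache, no redesign
print(lowpass_taps.cache_info())
```

A cache like this only helps when the same (num_taps, cutoff) pair comes up again. If bandwidths are drawn from a continuous range, the cutoff would likely need to be rounded to a grid first, which trades a small bandwidth error for cache hits.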

FWIW the sample rate field doesn't change how fast or slow data is processed; it is just metadata that is used internally so we can report results in "real world" terms. The dataset generation runs as fast as it can irrespective of the sample rate; it's just compute limited right now.

IsaiahHarvi · 2025-02-27T16:49:44Z

Thanks for the response! I appreciate the additional info. Haven't profiled WB just yet but planning to get to it shortly.

IsaiahHarvi added the "type: bug" label Feb 27, 2025

IsaiahHarvi changed the title ~~[Bug]~~ [Bug] Bottleneck in NewNarrowband and NewWideband Datasets Feb 27, 2025

IsaiahHarvi changed the title ~~[Bug] Bottleneck in NewNarrowband and NewWideband Datasets~~ [Bug] Bottleneck in NewNarrowband Datasets Feb 27, 2025
