[Bug] Bottleneck in NewNarrowband Datasets #291

Open
IsaiahHarvi opened this issue Feb 27, 2025 · 2 comments
Labels
type: bug Something isn't working

Comments

@IsaiahHarvi
Contributor

IsaiahHarvi commented Feb 27, 2025

Version

1.0.0 (Default)

System Information

OS: Ubuntu 24
Environment: Devcontainer

Torch 2.6.0

Description

There is a bottleneck in generating samples from the NewNarrowband dataset, found here:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    ...
    10057    0.023    0.000  496.589    0.049 dsp.py:370(multistage_polyphase_resampler)
    10057    0.137    0.000  489.180    0.049 dsp.py:481(polyphase_fractional_resampler)
    14582    0.046    0.000  484.712    0.033 dsp.py:550(prototype_polyphase_filter)
    14582    4.082    0.000  484.666    0.033 dsp.py:1003(low_pass_iterative_design)
    10057    0.562    0.000  478.585    0.048 dsp.py:518(prototype_polyphase_filter_interpolation)
    61671   68.314    0.001  470.223    0.008 _fir_filter_design.py:251(firwin)
    ...

This was found with a sample rate of 1024, and I saw a significant speedup when lowering it from 1024. The number of calls to firwin is significant. It's possible there are some optimizations that can be made here; otherwise, live dataset generation is not practical given these inefficiencies at relatively small sampling rates.

Note: Found the same issue with NewWideband, but haven't profiled it.
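
For anyone who wants to produce a comparable breakdown (for example for NewWideband), here is a minimal sketch of how a cumulative-time profile like the one above could be collected with the standard library. The helper name `profile_samples` and its parameters are made up for illustration and are not part of TorchSig; it only assumes the dataset can be consumed as an iterator, as in the reproduction snippet below.

```python
import cProfile
import io
import itertools
import pstats


def profile_samples(iterator, num_samples=100, top=15):
    """Draw num_samples items from iterator under cProfile and return the report text.

    Hypothetical helper for illustration; not a TorchSig function.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    # Consume a fixed number of items so that runs are comparable.
    for _ in itertools.islice(iterator, num_samples):
        pass
    profiler.disable()

    # Sort by cumulative time, which is the view shown in the table above.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()
```

Calling `print(profile_samples(dataset, num_samples=1000))` should then print the top entries by cumulative time, assuming `dataset` is the iterator from the reproduction snippet.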

How to Reproduce the Bug

metadata = NarrowbandMetadata(
    num_iq_samples_dataset=4096,
    sample_rate=1024,
    fft_size=2048,
    impairment_level=2,
    snr_db_min=0,
    snr_db_max=100,
    class_list=target_mods,
    signal_duration_percent_min=100,
    num_samples=None,  # infinite
)

dataset = NewNarrowband(metadata)

# inside of some getitem method in a torch dataset
data, meta = next(dataset)

...
@IsaiahHarvi IsaiahHarvi added the type: bug Something isn't working label Feb 27, 2025
@IsaiahHarvi IsaiahHarvi changed the title [Bug] [Bug] Bottleneck in NewNarrowband and NewWideband Datasets Feb 27, 2025
@IsaiahHarvi IsaiahHarvi changed the title [Bug] Bottleneck in NewNarrowband and NewWideband Datasets [Bug] Bottleneck in NewNarrowband Datasets Feb 27, 2025
@MattCarrickPL
Collaborator

Thanks for reporting this. This is a known issue for us and something we are working on. If you have separate breakouts for narrowband vs. wideband profiling, I'd be interested to see them. Our internal profiling showed that wideband is the main problem, with narrowband less so. The problem comes down to two things: each generated signal has to be resampled to the appropriate bandwidth, which is an expensive operation in and of itself, especially for small bandwidths, and we recompute the weights for the resampling PFB (polyphase filter bank) each time. I've got a couple of ideas on how to improve that processing, which will make it into the code in the near future.
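
To illustrate the "recompute the weights each time" cost, here is a rough sketch of one general technique: memoizing the filter design so that repeated requests with the same parameters reuse the earlier result. This is not necessarily what the maintainers have in mind. The function `cached_lowpass_taps` is hypothetical, and the pure-Python Hamming-windowed sinc design is only a stand-in for the real design call (`firwin` in the profile above).

```python
import math
from functools import lru_cache


@lru_cache(maxsize=256)
def cached_lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc low-pass taps, memoized on (num_taps, cutoff).

    cutoff is a fraction of Nyquist (0 < cutoff < 1); num_taps must be >= 2.
    Illustrative stand-in for the real filter design, not TorchSig code.
    """
    center = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - center
        # Ideal low-pass impulse response; the x == 0 limit is handled explicitly.
        ideal = cutoff if x == 0 else math.sin(math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    # Normalize to unity gain at DC; return a tuple so cached results are immutable.
    gain = sum(taps)
    return tuple(t / gain for t in taps)
```

A cache like this only helps when the same parameters recur. If the resampling rate is drawn from a continuous range, the parameters would probably need to be quantized to a finite grid first, which trades a small rate error for cache hits.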

FWIW, the sample rate field doesn't change how fast or slow data is processed; it is just metadata that is used internally so we can report results in "real world" terms. The dataset generation runs as fast as it can irrespective of the sample rate; it's just compute-limited right now.
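
One simple way to check a claim like this is to time a fixed number of draws under two configurations and compare the throughput. The helper below is a sketch with a made-up name (`samples_per_second`) and only assumes the dataset behaves as an iterator.

```python
import itertools
import time


def samples_per_second(iterator, num_samples=200):
    """Return the measured throughput while drawing up to num_samples items.

    Hypothetical helper for illustration; not a TorchSig function.
    """
    start = time.perf_counter()
    # Count the items actually drawn in case the iterator ends early.
    drawn = sum(1 for _ in itertools.islice(iterator, num_samples))
    elapsed = time.perf_counter() - start
    return drawn / elapsed if elapsed > 0 else float("inf")
```

Running this once per configuration, changing only one metadata field at a time, would show whether that field affects generation speed.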

@IsaiahHarvi
Contributor Author

Thanks for the response! I appreciate the additional info. Haven't profiled WB just yet but planning to get to it shortly.
