Releases: Labbeti/aac-datasets
Version 0.5.2
Version 0.5.1
[0.5.1] 2024-03-04
Fixed
- WavCaps download preparation (#3).
- `safe_rmdir` function when sub-directories are deleted.
Version 0.5.0
[0.5.0] 2024-01-05
Changed
- Update typing for paths with python class `Path`.
- Refactor functional interface to load raw metadata for each dataset.
- Refactor class variables to init arguments.
- Faster AudioCaps download with `ThreadPoolExecutor`.
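As a rough illustration of the `ThreadPoolExecutor` entry above, a thread pool lets I/O-bound downloads overlap instead of running one after another. This is a minimal sketch, not the package's actual code: `fetch_clip`, `fetch_all` and the clip ids are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def fetch_clip(clip_id: str) -> str:
    # Hypothetical placeholder for a per-clip download.
    # A real implementation would perform network I/O here.
    return f"{clip_id}.flac"


def fetch_all(clip_ids: List[str], max_workers: int = 4) -> Dict[str, str]:
    # executor.map runs fetch_clip concurrently and yields results
    # in the same order as the input ids.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        fnames = list(executor.map(fetch_clip, clip_ids))
    return dict(zip(clip_ids, fnames))


results = fetch_all(["clip_a", "clip_b", "clip_c"])
```

Threads suit this kind of workload because each task spends most of its time waiting on the network rather than on the CPU.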
Version 0.4.1
[0.4.1] 2023-10-25
Added
- `AudioCaps.DOWNLOAD_AUDIO` class variable for compatibility with audiocaps-download 1.0.
Changed
Version 0.4.0
[0.4.0] 2023-09-25
Added
- First experimental implementation of WavCaps dataset.
- Subsets `dcase_t2a_audio` and `dcase_t2a_captions` from the DCASE Challenge task 6b, in Clotho dataset.
- Subset `train_v2` for AudioCaps dataset.
- Dataset cards as separate dataclasses for each dataset.
- Get and set global user paths for root, ffmpeg and ytdl.
- Base class for all datasets to simplify manipulation of loaded data.
Changed
- Rename `test` subset to `dcase_aac_test`, `analysis` subset to `dcase_aac_analysis` from the DCASE Challenge task 6a, in Clotho dataset.
- Function `get_install_info` now returns `package_path`.
Version 0.3.3
[0.3.3] 2023-05-11
Added
- Script check.py now checks if the audio files exist.
- Option `VERIFY_FILES` for Clotho and MACS datasets to validate checksums.
- `CITATION` global constant for each dataset.
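For readers unfamiliar with checksum validation, the idea behind an option like `VERIFY_FILES` can be sketched as follows. This is only an assumption-laden illustration: the hash algorithm (SHA-256 here), the helper names `file_checksum` and `verify_file`, and the demo file are all hypothetical and do not reflect how the package implements the option.

```python
import hashlib
import os
import tempfile


def file_checksum(path: str, chunk_size: int = 65536) -> str:
    # Hash the file in chunks so large audio archives never sit fully in memory.
    # SHA-256 is an assumption; the package may use another algorithm.
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_file(path: str, expected_hex: str) -> bool:
    # Hypothetical helper: True when the file matches the expected digest.
    return file_checksum(path) == expected_hex


with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"hello")
    matches = verify_file(demo_path, hashlib.sha256(b"hello").hexdigest())
    mismatches = verify_file(demo_path, "0" * 64)
```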
Changed
- Methods `at` and `getitem` now use correct typing when passing an integer, list, slice or None values.
Fixed
- Python minimal version in README and pyproject.toml.
- Transform applied in `getitem` method when argument is not an integer.
- Incompatibility with `torchaudio>=2.0`.
- Remove 'tags' from AudioCaps columns when with_tags=False.
Version 0.3.2
[0.3.2] 2023-01-30
Added
- `AudioCaps.load_class_labels_indices` to load AudioSet classes map externally.
- Compatibility and tests from Python 3.7 to 3.10.
Changed
- Attributes in datasets classes are now weakly private.
- Documentation theme and descriptions.
Fixed
- Workflow badge with GitHub changes. (badges/shields#8671)
Version 0.3.1
[0.3.1] 2022-10-31
Changed
- The item order of AudioCaps, Clotho and MACS is now defined by the order in the corresponding captions CSV files when available.
- Update documentation usage and main page.
Fixed
- Workflow when requirements cache is invalid.
Version 0.3.0
[0.3.0] 2022-09-28
Added
- Add `column_names`, `info` and `shape` properties in datasets.
- Add `is_loaded` and `set_transform` methods in datasets.
- Add column argument for method `getitem` in datasets.
- Entrypoints for command line scripts `aac-datasets-check`, `aac-datasets-download` and `aac-datasets-info`.
Changed
- Enforce datasets order to sort by filename to avoid different orders returned by `os.listdir`.
- Function `check_directory` now returns the length of each dataset found in directory.
- Rename `get_field` methods in datasets to `at` and add support for Iterable of keys and None key.
- Change `at` arguments order and names.
- Split `BasicCollate` into 2 classes: `BasicCollate` without padding and `AdvancedCollate` with padding options.
- Weak private methods are now strongly private in datasets.
- Rename `item_transform` to `transform` in datasets.
- Rename `load_tags` to `with_tags` in `AudioCaps`.
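The `os.listdir` entry above refers to a general Python pitfall: the function returns entries in arbitrary order, which can differ between filesystems and machines. A minimal sketch of the usual remedy, sorting by filename, is shown below. The file names are made up for the demonstration and the snippet is not taken from the package.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create a few empty files in a deliberately unsorted order.
    for name in ["b.wav", "c.wav", "a.wav"]:
        open(os.path.join(root, name), "w").close()

    # os.listdir gives no ordering guarantee, so sort explicitly
    # to get the same item order on every machine.
    fnames = sorted(os.listdir(root))
```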
Fixed
- AudioCaps loading when `with_tags` is False.
- Clotho files download.
Version 0.2.0
[0.2.0] 2022-08-30
Added
- CHANGELOG file.
- First version of the API documentation.
- Supports slicing and list indexing for the three datasets.
- Competence values for MACS annotators.
- Fields scene_label and identifier from TAU Urban acoustic scene dataset in MACS.
- Add `examples/dataloader.ipynb` notebook.
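The slicing and list indexing support mentioned above can be pictured with a toy container. This is a hedged sketch only: `ToyDataset` is a hypothetical class written for illustration, and the real datasets return richer items than plain caption strings.

```python
from typing import List, Union


class ToyDataset:
    """Hypothetical stand-in, not an aac-datasets class."""

    def __init__(self, captions: List[str]) -> None:
        self._captions = captions

    def __len__(self) -> int:
        return len(self._captions)

    def __getitem__(
        self, index: Union[int, slice, List[int]]
    ) -> Union[str, List[str]]:
        if isinstance(index, int):
            # Single item.
            return self._captions[index]
        if isinstance(index, slice):
            # Contiguous or strided range of items.
            return self._captions[index]
        # Arbitrary list of indices, returned in the requested order.
        return [self._captions[i] for i in index]


toy = ToyDataset(["a dog barks", "rain falls", "a car passes"])
single = toy[0]
sliced = toy[1:]
picked = toy[[2, 0]]
```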
Changed
- Update README with PyPI install and software citation.
- Download functions return the downloaded datasets.
- MACS now has a subset parameter.
- Underscores in function names to avoid importing private functions.
- Function `aac_datasets.check.check_directory` now returns only the list of subsets loaded.
- Replace function `torchaudio.datasets.utils.download_url` with `torch.hub.download_url_to_file` to keep compatibility with future torchaudio version v0.12.
- Rename `get_raw` methods in datasets to `get_field` and add support for slicing and multi-indexing.
Fixed
- LICENCE.txt and MACS_competence.yaml download for MACS dataset.
- Clotho download archives files.
Removed
- Transforms dictionary in datasets.
- Argument item_type in datasets.
- Method `get` in datasets.