Quickstart Example updates #496

eroell · 2024-08-15T15:08:34Z

1. System Info

System Version: macOS 14.3 (23D2057)
Kernel Version: Darwin 23.3.0

Python 3.11.9

Name: pypots
Version: 0.7.1

Name: tsdb
Version: 0.6.2

2. Information

The official example scripts
My own created scripts

3. Reproduction

In a Jupyter Notebook, first install pypots:

!pip install pypots

Then try to run the example as shown on the Quick-start doc page:

import numpy as np
from sklearn.preprocessing import StandardScaler
from pygrinder import mcar
from pypots.data import load_specific_dataset
from pypots.imputation import SAITS
from pypots.utils.metrics import calc_mae

# Data preprocessing. Tedious, but PyPOTS can help. 🤓
data = load_specific_dataset('physionet_2012')  # PyPOTS will automatically download and extract it.
X = data['X']
num_samples = len(X['RecordID'].unique())
X = X.drop(['RecordID', 'Time'], axis = 1)
X = StandardScaler().fit_transform(X.to_numpy())
X = X.reshape(num_samples, 48, -1)
X_ori = X  # keep X_ori for validation
X = mcar(X, 0.1)  # randomly hold out 10% observed values as ground truth
dataset = {"X": X}  # X for model input
print(X.shape)  # (11988, 48, 37), 11988 samples, 48 time steps, 37 features

# initialize the model
saits = SAITS(
    n_steps=48,
    n_features=37,
    n_layers=2,
    d_model=256,
    d_ffn=128,
    n_heads=4,
    d_k=64,
    d_v=64,
    dropout=0.1,
    epochs=10,
    saving_path="examples/saits", # set the path for saving tensorboard logging file and model checkpoint
    model_saving_strategy="best", # only save the model with the best validation performance
)

# train the model. Here I use the whole dataset as the training set, because ground truth is not visible to the model.
saits.fit(dataset)
# impute the originally-missing values and artificially-missing values
imputation = saits.impute(dataset)
# calculate mean absolute error on the ground truth (artificially-missing values)
indicating_mask = np.isnan(X) ^ np.isnan(X_ori)  # indicating mask for imputation error calculation
mae = calc_mae(imputation, np.nan_to_num(X_ori), indicating_mask)  # calculate mean absolute error on the ground truth (artificially-missing values)

# the best model has been already saved, but you can still manually save it with function save_model() as below
saits.save_model(saving_dir="examples/saits",file_name="manually_saved_saits_model")
# you can load the saved model into a new initialized model
saits.load_model("examples/saits/manually_saved_saits_model")

The output of data = load_specific_dataset('physionet_2012') in the newer pypots version 0.7.1 (and, I suspect, in some earlier versions too, although I have not tracked this in detail) is no longer compatible with these instructions.

Furthermore, saits.save_model and saits.load_model seem to have been replaced by saits.save and saits.load, which take new arguments.
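The evaluation step of the script (the indicating mask and the masked MAE) relies only on NumPy, so it can be checked in isolation from the dataset loader. Below is a minimal sketch on toy data. Note that `masked_mae` is a hypothetical stand-in written for this illustration and is not the library's `calc_mae`; the toy arrays and imputed values are likewise made up.

```python
import numpy as np


def masked_mae(predictions, targets, mask):
    """Mean absolute error over positions where mask is True.

    Hypothetical stand-in for calc_mae, written only for this illustration.
    """
    mask = mask.astype(float)
    return np.sum(np.abs(predictions - targets) * mask) / (np.sum(mask) + 1e-12)


# Toy data: 1 sample, 2 time steps, 3 features; one value is originally missing.
X_ori = np.array([[[1.0, np.nan, 3.0],
                   [4.0, 5.0, 6.0]]])

# Artificially hold out two observed values (done by mcar in the script above).
X = X_ori.copy()
X[0, 0, 0] = np.nan
X[0, 1, 2] = np.nan

# True only where a value is missing in X but observed in X_ori,
# i.e. exactly the artificially held-out positions.
indicating_mask = np.isnan(X) ^ np.isnan(X_ori)

# Made-up imputation result, standing in for the model output.
imputation = np.nan_to_num(X_ori)
imputation[0, 0, 0] = 1.5  # held-out value 1.0, error 0.5
imputation[0, 1, 2] = 5.0  # held-out value 6.0, error 1.0
imputation[0, 0, 1] = 9.0  # originally missing, so excluded from scoring

mae = masked_mae(imputation, np.nan_to_num(X_ori), indicating_mask)
print(indicating_mask.sum(), mae)
```

Originally missing values have no ground truth, which is why the XOR mask excludes them from the error.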

I will open a PR trying to fix this example.

4. Expected behavior

A running example :)


eroell · 2024-08-22T06:15:07Z

Fixed by merged #497

eroell added the bug label Aug 15, 2024

eroell mentioned this issue Aug 15, 2024

Doc update Quickstart Example #497

Merged

4 tasks

eroell closed this as completed Aug 22, 2024
