Quickstart Example updates #496

Closed
1 of 2 tasks
eroell opened this issue Aug 15, 2024 · 1 comment
Labels
bug Something isn't working

eroell commented Aug 15, 2024

1. System Info

System Version: macOS 14.3 (23D2057)
Kernel Version: Darwin 23.3.0

Python 3.11.9

Name: pypots
Version: 0.7.1

Name: tsdb
Version: 0.6.2

2. Information

  • The official example scripts
  • My own created scripts

3. Reproduction

In a Jupyter Notebook, first install pypots:

!pip install pypots

Then try to run the example from the Quick-start doc page:

import numpy as np
from sklearn.preprocessing import StandardScaler
from pygrinder import mcar
from pypots.data import load_specific_dataset
from pypots.imputation import SAITS
from pypots.utils.metrics import calc_mae

# Data preprocessing. Tedious, but PyPOTS can help. 🤓
data = load_specific_dataset('physionet_2012')  # PyPOTS will automatically download and extract it.
X = data['X']
num_samples = len(X['RecordID'].unique())
X = X.drop(['RecordID', 'Time'], axis = 1)
X = StandardScaler().fit_transform(X.to_numpy())
X = X.reshape(num_samples, 48, -1)
X_ori = X  # keep X_ori for validation
X = mcar(X, 0.1)  # randomly hold out 10% observed values as ground truth
dataset = {"X": X}  # X for model input
print(X.shape)  # (11988, 48, 37), 11988 samples, 48 time steps, 37 features

# initialize the model
saits = SAITS(
    n_steps=48,
    n_features=37,
    n_layers=2,
    d_model=256,
    d_ffn=128,
    n_heads=4,
    d_k=64,
    d_v=64,
    dropout=0.1,
    epochs=10,
    saving_path="examples/saits", # set the path for saving tensorboard logging file and model checkpoint
    model_saving_strategy="best", # only save the model with the best validation performance
)

# train the model. Here I use the whole dataset as the training set, because ground truth is not visible to the model.
saits.fit(dataset)
# impute the originally-missing values and artificially-missing values
imputation = saits.impute(dataset)
# calculate mean absolute error on the ground truth (artificially-missing values)
indicating_mask = np.isnan(X) ^ np.isnan(X_ori)  # indicating mask for imputation error calculation
mae = calc_mae(imputation, np.nan_to_num(X_ori), indicating_mask)  # calculate mean absolute error on the ground truth (artificially-missing values)

# the best model has been already saved, but you can still manually save it with function save_model() as below
saits.save_model(saving_dir="examples/saits",file_name="manually_saved_saits_model")
# you can load the saved model into a new initialized model
saits.load_model("examples/saits/manually_saved_saits_model")
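For readers following the evaluation step of the script above, here is a minimal NumPy-only sketch of how the indicating mask isolates the artificially held-out values. The toy arrays and the zero-fill "imputation" are made up for illustration, and `masked_mae` is a hypothetical stand-in written here, not PyPOTS's `calc_mae`.

```python
import numpy as np


def masked_mae(predictions, targets, mask):
    """Mean absolute error over positions where mask is True.

    Hypothetical helper for illustration only; not the PyPOTS implementation.
    """
    return np.abs(predictions[mask] - targets[mask]).mean()


# Toy data: 2 samples, 3 time steps, 1 feature. One value is missing originally.
X_ori = np.array([[[1.0], [np.nan], [3.0]],
                  [[4.0], [5.0], [6.0]]])

# Artificially hold out two observed values (what mcar does at random).
X = X_ori.copy()
X[0, 2, 0] = np.nan
X[1, 1, 0] = np.nan

# True only where a value is missing in X but observed in X_ori.
indicating_mask = np.isnan(X) ^ np.isnan(X_ori)

# Placeholder "imputation": fill every missing entry with 0.
imputation = np.nan_to_num(X)

mae = masked_mae(imputation, np.nan_to_num(X_ori), indicating_mask)
print(indicating_mask.sum(), mae)
```

The originally missing entry is NaN in both arrays, so the XOR excludes it and the error is computed only on values whose ground truth is known.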

The data returned by load_specific_dataset('physionet_2012') in the newer pypots version 0.7.1 (and, I suspect, in earlier versions too, though I have not tracked everything in detail here) is no longer compatible with these instructions.

Further, saits.save_model and saits.load_model seem to have been updated to saits.save and saits.load, with new arguments.

I will open a PR trying to fix this example.

4. Expected behavior

A running example :)

@eroell eroell added the bug Something isn't working label Aug 15, 2024
eroell commented Aug 22, 2024

Fixed by #497, which has been merged.

@eroell eroell closed this as completed Aug 22, 2024