Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DATEDIFF being cast to date depending on column name #306

Open
mvankessel-EMC opened this issue Feb 25, 2025 · 3 comments
Open

DATEDIFF being cast to date depending on column name #306

mvankessel-EMC opened this issue Feb 25, 2025 · 3 comments

Comments

@mvankessel-EMC
Copy link

mvankessel-EMC commented Feb 25, 2025

I initially suspected a misstranslation in OHDSI/SqlRender#383, but I don't think that's the case.

With the migration to duckdb in Andromeda I noticed some strange behavior that casts the result of DATEDIFF to either double or date depending on the column name used for the result:

connectionDetails <- Eunomia::getEunomiaConnectionDetails()
con <- DatabaseConnector::connect(connectionDetails)
#> Connecting using SQLite driver

DatabaseConnector::renderTranslateExecuteSql(
  connection = con,
  sql = "
  SELECT
    1 AS cohort_definition_id,
    person_id AS subject_id,
    observation_period_start_date AS cohort_start_date,
    observation_period_end_date AS cohort_end_date
  INTO cohort_table
  FROM observation_period;"
)
#>   |                                                                              |                                                                      |   0%  |                                                                              |======================================================================| 100%
#> Executing SQL took 0.0188 secs

DatabaseConnector::renderTranslateQuerySql(
  connection = con,
  sql = "
  SELECT
    cohort_definition_id,
    subject_id,
    cohort_start_date,
    cohort_end_date,
    DATEDIFF(DAY, CAST(19700101 AS DATE), cohort_start_date) AS new_start,
    DATEDIFF(DAY, CAST(19700101 AS DATE), cohort_start_date) AS new_start_date
  FROM cohort_table
  ") |>
  head()
#>   COHORT_DEFINITION_ID SUBJECT_ID COHORT_START_DATE COHORT_END_DATE NEW_START
#> 1                    1          6        1963-12-31      2007-02-06     -2193
#> 2                    1         13        2009-04-26      2019-04-14     14360
#> 3                    1         27        2002-01-30      2018-11-21     11717
#> 4                    1         16        1971-10-14      2017-11-02       651
#> 5                    1         55        2009-05-30      2019-03-23     14394
#> 6                    1         60        1990-11-21      2019-01-23      7629
#>   NEW_START_DATE
#> 1     1969-12-31
#> 2     1970-01-01
#> 3     1970-01-01
#> 4     1970-01-01
#> 5     1970-01-01
#> 6     1970-01-01

DatabaseConnector::disconnect(con)

# DBI with DatabaseConnector
con <- DatabaseConnector::connect(
  DatabaseConnector::createDbiConnectionDetails(
    dbms = "sqlite",
    drv = RSQLite::SQLite(),
    connectionDetails$server()
  )
)

DatabaseConnector::renderTranslateQuerySql(
  connection = con,
  sql = "
  SELECT
    cohort_definition_id,
    subject_id,
    cohort_start_date,
    cohort_end_date,
    DATEDIFF(DAY, CAST(19700101 AS DATE), cohort_start_date) AS new_start,
    DATEDIFF(DAY, CAST(19700101 AS DATE), cohort_start_date) AS new_start_date
  FROM cohort_table
  ") |>
  head()
#>   COHORT_DEFINITION_ID SUBJECT_ID COHORT_START_DATE COHORT_END_DATE NEW_START
#> 1                    1          6        1963-12-31      2007-02-06     -2193
#> 2                    1         13        2009-04-26      2019-04-14     14360
#> 3                    1         27        2002-01-30      2018-11-21     11717
#> 4                    1         16        1971-10-14      2017-11-02       651
#> 5                    1         55        2009-05-30      2019-03-23     14394
#> 6                    1         60        1990-11-21      2019-01-23      7629
#>   NEW_START_DATE
#> 1     1969-12-31
#> 2     1970-01-01
#> 3     1970-01-01
#> 4     1970-01-01
#> 5     1970-01-01
#> 6     1970-01-01

DatabaseConnector::disconnect(con)

# DBI
con <- DBI::dbConnect(drv = RSQLite::SQLite(), connectionDetails$server())
cohort_table <- dplyr::tbl(con, "cohort_table")

translatedSql <- SqlRender::translate(
  "
  SELECT
    cohort_definition_id,
    subject_id,
    cohort_start_date,
    cohort_end_date,
    DATEDIFF(DAY, CAST(19700101 AS DATE), cohort_start_date) AS new_start,
    DATEDIFF(DAY, CAST(19700101 AS DATE), cohort_start_date) AS new_start_date
  FROM cohort_table
    ", targetDialect = "sqlite"
)

DBI::dbGetQuery(conn = con, translatedSql) |> head()
#>   cohort_definition_id subject_id cohort_start_date cohort_end_date new_start
#> 1                    1          6        -189475200      1170720000     -2193
#> 2                    1         13        1240704000      1555200000     14360
#> 3                    1         27        1012348800      1542758400     11717
#> 4                    1         16          56246400      1509580800       651
#> 5                    1         55        1243641600      1553299200     14394
#> 6                    1         60         659145600      1548201600      7629
#>   new_start_date
#> 1          -2193
#> 2          14360
#> 3          11717
#> 4            651
#> 5          14394
#> 6           7629
DBI::dbDisconnect(con)

Created on 2025-02-25 with reprex v2.1.1

@schuemie
Copy link
Member

Yes, this is done so that all HADES code can run on Eunomia. SQLite does not support dates natively (I know the extensions do add date types, but that has severe limitations making it unusable). For now, if you want to query fields that are not dates they cannot have their name end in '_date'.

@mvankessel-EMC
Copy link
Author

Thanks for the clarification. Is this documented somewhere? As it wasn't clear to me that I should expect this behavior.

Also to clarify for my own understanding, this only applies on SQLite, but not on other database platforms that do support dates, i.e. DuckDB?

@schuemie
Copy link
Member

This wasn't documented. I just added it, it will be included in the next release.

Yes, this only applies to SQLite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants