ENH: Add way to write DATE types to Hyper #100
Comments
Thanks for the report. This is “by design” in today’s world because there isn’t a first-class dtype in pandas for dates. Your workaround is the suggested approach, though if you really want a date and not a datetime in the extract it falls short. I think we could use a keyword argument that allows you to explicitly store datetime dtypes as dates. Interested in trying a PR for that?
Thanks Will, I'll make a PR.
I'm trying to make a PR for kwargs for casting datetime.date to pd.datetime. Can you grant me permission? Thanks
You shouldn’t need any extra access. Make sure you fork the repo then push the branch to your fork, then make a pull request from there.
The instructions in the contributing guide should help so make sure to give that a look. Ping if you get stuck again.
Thanks!
On Saturday, May 9, 2020 at 10:57:39 PM, Hadi wrote:
I'm trying to make a PR for kwargs for casting datetime.date to pd.datetime. Can you grant me permission? Thanks
def frame_to_hyper(
    df: pd.DataFrame,
    database: Union[str, pathlib.Path],
    *,
    table: pantab_types.TableType,
    table_mode: str = "w",
    **kwargs: Union[str, list]
) -> None:
    """See api.rst for documentation"""
    if 'date_column' in kwargs:
        date_column = kwargs.get('date_column')
        if isinstance(date_column, list):
            for col in date_column:
                df[col] = pd.to_datetime(df[col])
        elif isinstance(date_column, str):
            df[date_column] = pd.to_datetime(df[date_column])
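For comparison, here is a minimal sketch of the same coercion pulled out into a standalone helper that does not modify the caller's frame. The name `coerce_date_columns` and its signature are hypothetical and not part of pantab; it only assumes pandas is installed.

```python
from typing import List, Union

import pandas as pd


def coerce_date_columns(
    df: pd.DataFrame, date_column: Union[str, List[str]]
) -> pd.DataFrame:
    """Return a copy of df with the named column(s) converted via pd.to_datetime.

    Hypothetical helper for illustration only; not part of the pantab API.
    """
    # Accept either a single column name or a list of names.
    columns = [date_column] if isinstance(date_column, str) else list(date_column)
    # Work on a copy so the caller's DataFrame is left untouched.
    out = df.copy()
    for col in columns:
        out[col] = pd.to_datetime(out[col])
    return out
```

Returning a copy is a design choice rather than a requirement: the snippet quoted above assigns into `df` directly, which would change the frame the caller passed in.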
So there is a discussion of adding this as a type upstream in pandas. I think any work we do here would have to wait on that, so let's see if it gets traction.
The date field seems to have stalled in pandas. Can this be considered again? We have a fair few dates in our project and would love to use pantab for this.
@joshuataylor have you looked at hyperarrow? It is a similar tool but with arrow as a back end you get first class DATE support
I didn't know that library existed, awesome work 😍 . Will give it a go.
Is this still open? Running into this issue right now using pandas. TypeError: Invalid value "datetime.date(2023, 10, 5)" found (row 0 column 5)
@jstrauss18 your column dtype is likely object. If you want to write timestamps, make sure you use a datetime dtype column. Pandas does not natively support plain DATE types (pyarrow does, but pantab currently does not leverage pyarrow types).
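One way to check for this before writing is sketched below: it lists the object-dtype columns whose non-null values are all plain `datetime.date` objects. The helper name `find_plain_date_columns` is made up for illustration and is not a pandas or pantab function.

```python
import datetime

import pandas as pd


def find_plain_date_columns(df: pd.DataFrame) -> list:
    """List object-dtype columns holding only plain datetime.date values.

    Hypothetical helper for illustration; assumes unique column names.
    """
    found = []
    for name in df.columns:
        col = df[name]
        # Only object columns can hold raw Python date objects.
        if col.dtype != object:
            continue
        values = col.dropna()
        # datetime.datetime subclasses datetime.date, so exclude it explicitly.
        if len(values) > 0 and all(
            isinstance(v, datetime.date) and not isinstance(v, datetime.datetime)
            for v in values
        ):
            found.append(name)
    return found
```

Any column this returns can then be passed through `pd.to_datetime` before calling `pantab.frame_to_hyper`, as in the workaround from the original report.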
Sorry I'm not familiar with databricks so can't give specific advice. You might want to try StackOverflow for something more tailored. Most I/O methods in pandas provide a As a hack you could try
Hello,
Your best bet will be to keep track of the pantab 4.0 development, which will be a significant overhaul of the code base.
Any idea about the release date and who is handling this?
I am maintaining a checklist of things in #219, so feel free to comment there or ask questions. As far as a release date, I do not know. I am looking at using some new technology, so there are many variables at play. Because this is an open source project, things get developed as I or anyone in the community has time and interest, which adds another layer of uncertainty. The best I can say is "maybe a couple of months", but without any guarantee :-)
I'm trying to write a dataframe that contains datetime.date objects to Hyper using the pantab.frame_to_hyper method, and it raises a TypeError.
Steps to reproduce the problem:
import pandas as pd
import datetime
date = datetime.date(2020,5,8)
df = pd.DataFrame({'Date': [date,date,date], 'Col' : list('ABC') })
df.head()
df.info()
import pantab
from tableauhyperapi import TableName
table = TableName('Extract','Extract')
pantab.frame_to_hyper(df, 'random_db.hyper', table=table)
=> TypeError: Invalid value "datetime.date(2020, 5, 8)" found (row 0 column 0)
Converting the datetime.date values with pd.to_datetime solves the problem:
df.iloc[0,0]
df['Date'] = pd.to_datetime(df['Date'])
pantab.frame_to_hyper(df, 'random_db.hyper', table=table)
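To make the effect of that conversion visible without touching pantab, here is a small sketch using only pandas. It shows the column going from object dtype to a datetime dtype, with each value becoming a midnight timestamp rather than a pure date. The variable names `before`, `after` and `first` are just for illustration.

```python
import datetime

import pandas as pd

date = datetime.date(2020, 5, 8)
df = pd.DataFrame({"Date": [date, date, date], "Col": list("ABC")})

# Raw datetime.date values are stored as generic Python objects.
before = df["Date"].dtype

# After conversion the column has a pandas datetime dtype.
df["Date"] = pd.to_datetime(df["Date"])
after = df["Date"].dtype

# Each value is now a Timestamp at midnight, not a plain date.
first = df.iloc[0, 0]
print(before, after, first)
```

Because the converted values carry a time component, the extract would presumably get a timestamp column rather than a DATE column, which is the limitation the issue title is about.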
Other info:
OS: macOS Catalina 10.15.3
pandas version 1.0.0
pantab version 1.1.0
Thanks
Hadi