Retaining other columns in input dataframe #33

Open
plubbe opened this issue Sep 4, 2024 · 1 comment

plubbe commented Sep 4, 2024

Is there any way we can enable the software's output to include the additional columns of the input?

I'm thinking specifically of presence/absence signifiers, which are currently stripped from the input after thinning. Users are then left with the unenviable task of re-determining which points are presence (1) and which are (pseudo)absence. In my case, the output now seems to contain a lot of points that aren't in the input data.frame and therefore have no presence/absence designation: I tried using dplyr to join the pre-thinning and post-thinning data frames and ended up with a whole load of NAs. Retaining the extra columns would prevent this issue and would make the package usable at any stage of the data preparation process, regardless of which columns accompany the input, as long as the minimum required columns are present.
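
One possible cause of NAs after such a join is that the coordinates in the thinned output differ from the input in the last few decimal places, so an exact join on raw numeric columns finds no match. The sketch below is only an illustration under that assumption: the column names `lon`, `lat`, and `presence`, the `add_keys` helper, and the 5-decimal precision are all hypothetical, and `thinned` is a hand-made stand-in for the package's output rather than a call to it.

```r
library(dplyr)

# Hypothetical input: coordinates plus a presence/absence column.
original <- data.frame(
  lon = c(-3.70379, -3.70381, 2.17340, 2.35222),
  lat = c(40.41678, 40.41680, 41.38506, 48.85661),
  presence = c(1, 0, 1, 0)
)

# Stand-in for the thinned output (assumption): coordinates only, with
# tiny floating-point differences from the input values.
thinned <- data.frame(
  lon = c(-3.70379 + 1e-9, 2.35222),
  lat = c(40.41678, 48.85661 - 1e-9)
)

# Build character join keys by formatting both coordinates to a fixed
# number of decimals, so near-identical values produce identical keys.
add_keys <- function(df, digits = 5) {
  fmt <- paste0("%.", digits, "f")
  mutate(df, lon_key = sprintf(fmt, lon), lat_key = sprintf(fmt, lat))
}

# Keep one row per key so the join cannot duplicate thinned points.
lookup <- original %>%
  add_keys() %>%
  distinct(lon_key, lat_key, .keep_all = TRUE) %>%
  select(lon_key, lat_key, presence)

# Attach the presence column back onto the thinned points.
restored <- thinned %>%
  add_keys() %>%
  left_join(lookup, by = c("lon_key", "lat_key")) %>%
  select(lon, lat, presence)
```

Any rows that still come back as NA after a join like this would be points whose coordinates genuinely do not appear in the input at the chosen precision.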

@e42mercury

I second this request and in the meantime, I can offer a workaround in Excel.

  1. In the output file, I added a new column that multiplies Lat*Long. This gives each row a "Lat/Long ID" (unique as long as no two points share the same product).
  2. In the original file, I did the same and pasted these records into the same spreadsheet.
  3. I used VLOOKUP (exact match) to match records according to their "Lat/Long ID", and I was able to join the data in the thinned file and the original file.
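
For anyone who prefers to stay in R, the same lookup can be sketched with base R's `match()`, which returns the first exact match much like VLOOKUP. This is a rough sketch, not part of the package: the data, the `make_key` helper, and the 5-decimal precision are hypothetical. It pastes both coordinates into a text key instead of multiplying them, since a product can collide (2 * 3 equals 3 * 2).

```r
# Hypothetical original records with a GBIF identifier.
original <- data.frame(
  gbifID = c("1001", "1002", "1003"),
  Lat = c(2, 3, 10.5),
  Long = c(3, 2, -4.25),
  stringsAsFactors = FALSE
)

# Stand-in for the thinned output (assumption): coordinates only.
thinned <- data.frame(Lat = c(3, 10.5), Long = c(2, -4.25))

# Build a text key from both coordinates at a fixed precision.
# Unlike Lat * Long, this keeps (2, 3) and (3, 2) distinct.
make_key <- function(lat, long, digits = 5) {
  fmt <- paste0("%.", digits, "f")
  paste(sprintf(fmt, lat), sprintf(fmt, long), sep = "_")
}

# match() gives, for each thinned key, the row of the first identical
# original key (or NA if there is none).
idx <- match(
  make_key(thinned$Lat, thinned$Long),
  make_key(original$Lat, original$Long)
)

# Copy the identifier across using those row positions.
thinned$gbifID <- original$gbifID[idx]
```

If several original records share the same coordinates, only the first one is picked up, so duplicates in the input are worth checking beforehand.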

In my case, I want to retain the gbifID. It comes directly from the GBIF database, so it is a good way for me (and others) to keep track of which record we're looking at.

Hope this helps. -Erik
