Given a local directory (as a "file" URI) as the "source" and a regular expression "pattern" to match desired file name/path, this adapter should crawl the file system starting at that directory and make a Sample for each matching file. Matching groups in the regular expression should map to domain variables. The range should include the "uri" text variable with the absolute "file" URI. When modeling a file list dataset (like any other granule list) the domain variable should be chosen to provide uniqueness of samples. Other variables of interest could be included in the range.
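As a rough illustration of the behavior described above, here is a minimal Python sketch, not the LaTiS implementation. The function name `list_files` and the choice to match the pattern against the path relative to the source directory are assumptions made for this example. Each yielded pair stands in for a Sample: the matching groups are the domain values and the absolute "file" URI is the range.

```python
import re
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def list_files(source, pattern):
    """Yield (domain_values, uri) for each file under the "file" URI `source`
    whose path relative to that directory fully matches `pattern`.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    # Convert the "file" URI to a local directory path.
    root = Path(url2pathname(urlparse(source).path)).resolve()
    regex = re.compile(pattern)
    # Sorting the crawl makes the sample order deterministic.
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root).as_posix()
        match = regex.fullmatch(relative)
        if match:
            # Matching groups map to domain variables; "uri" is the range.
            yield match.groups(), path.as_uri()
```

With a pattern such as `\d{4}/data_(\d{8})\.txt`, the single matching group would serve as the domain variable, so it needs to be unique across the matching files.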
Stretch goals (could become new tickets but might influence the design):
Consider adding the uri variable as the first in the range. But that may complicate the mapping of matching groups to variables. (LaTiS v2 puts it after all other variables, but doesn't use domain variables appropriately to support uniqueness.)
Consider order implications. Crawl to preserve order so we can avoid sorting?
Consider relative file paths with a "baseUri" metadata property. Or hiding the baseUri altogether from the user while preserving it so a zip writer could access the files.
Option to include file size (only if it is defined in the model?)
Be smart about only reading directories that might possibly have a match. For example, if the file path has dates encoded in it, only crawl those directories that fall within the selected data range. (A hybrid approach with a granule list generator might work better for this.)
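The ordering and directory-pruning goals above could be sketched as follows in Python. This assumes a hypothetical layout in which the top-level directories under the source are four-digit years; the function name `crawl_years` and its parameters are illustrative and not part of LaTiS. It relies on `os.walk` allowing the directory list to be edited in place during a top-down walk.

```python
import os
import re


def crawl_years(root, start_year, end_year):
    """Yield file paths under `root`, visiting directories and files in name
    order and descending only into top-level year directories that fall
    within [start_year, end_year].

    Assumes top-level directories are named by four-digit year (hypothetical).
    """
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        if dirpath == root:
            # Prune in place so os.walk never reads directories out of range.
            dirnames[:] = [
                d for d in dirnames
                if re.fullmatch(r"[0-9]{4}", d)
                and start_year <= int(d) <= end_year
            ]
        # Sort in place so subdirectories are visited in name order.
        dirnames.sort()
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

If the file names within each year directory also sort chronologically, the samples come out in time order and no separate sort is needed. Other top-level directories are skipped entirely under this assumption.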
See latis' FileListAdapter and variations.
dlindhol changed the title from "Add a file list adapter" to "Review file list adapter" on Oct 6, 2020