Issue:
When you create a snapshot of a dataset with a large number of files via the UI, the snapshot does not include all the files.
Root Cause:
This can happen when the following two settings in the Central Config are set too low for the dataset; a Domino admin can check their values. Domino limits these because the Amazon Elastic File System (EFS) can be slow and creating a snapshot can be time-intensive, especially when there are a large number of files.
- com.cerebro.domino.dataset.maxFileListingLength: The maximum number of files that will be listed in the snapshot view (default is 1000 files).
- com.cerebro.domino.dataset.fileCacheTimeout: The maximum amount of time spent retrieving the file listing (default is 2 seconds).
Increasing these two settings helps the frontend list all the files correctly, which in turn makes the snapshot more accurate. However, note that raising them may degrade performance.
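To illustrate why low limits lead to an incomplete listing, here is a minimal Python sketch of a listing that stops at either a count cap or a time budget. It is only a model of the behavior described above, not Domino's actual implementation: the function name `list_files` and its parameters are hypothetical, and only the default values (1000 files, 2 seconds) come from this article.

```python
import time
from typing import Iterable, List, Tuple


def list_files(
    paths: Iterable[str],
    max_listing_length: int = 1000,   # mirrors the maxFileListingLength default
    timeout_seconds: float = 2.0,     # mirrors the fileCacheTimeout default
) -> Tuple[List[str], bool]:
    """Collect paths until the count cap or time budget is hit.

    Returns (listing, truncated). Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout_seconds
    listing: List[str] = []
    for path in paths:
        # Stop as soon as either limit is reached; remaining files are dropped.
        if len(listing) >= max_listing_length or time.monotonic() >= deadline:
            return listing, True
        listing.append(path)
    return listing, False


# A made-up dataset with 2500 files.
all_files = [f"data/file_{i:05d}.csv" for i in range(2500)]

listing, truncated = list_files(all_files)
print(len(listing), truncated)   # 1000 True

listing, truncated = list_files(all_files, max_listing_length=5000)
print(len(listing), truncated)   # 2500 False
```

In this model, any snapshot built from the truncated listing would miss 1500 of the 2500 files, which matches the symptom described under Issue.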
Resolution:
For datasets with a large number of files, Domino recommends that you create the snapshot via the Domino CLI by running this command:
domino create-snapshot <project-owner>/<project-name>/<dataset-name>
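If you script snapshot creation, a small helper can assemble the command shown above from its three parts. The sketch below only builds and prints the command; the helper name `build_snapshot_command` and the example owner, project, and dataset names are hypothetical, and the command form is taken directly from this article.

```python
import shlex
from typing import List


def build_snapshot_command(owner: str, project: str, dataset: str) -> List[str]:
    """Build the argument list for `domino create-snapshot`.

    Hypothetical helper; it only formats the command documented above.
    """
    for name, value in (("owner", owner), ("project", project), ("dataset", dataset)):
        # Reject empty parts or parts containing "/" so the target stays well formed.
        if not value or "/" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    return ["domino", "create-snapshot", f"{owner}/{project}/{dataset}"]


# Example values are placeholders, not real project names.
command = build_snapshot_command("jsmith", "quick-start", "training-data")
print(shlex.join(command))  # domino create-snapshot jsmith/quick-start/training-data
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where the Domino CLI is installed and logged in.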
Notes:
- Do not modify the files in the Dataset until the snapshot is done.
- When you create a snapshot via the CLI, you have more control over specifying the paths to include in the snapshot.
- In contrast, when you create a snapshot via the UI, the frontend explicitly passes all the file paths, which is why the pagination limit becomes a limiting factor. This is intentional: datasets are mutable, and other users could create new files or folders while a snapshot is being taken.
Additional Information:
- See Install the Domino Command Line (CLI) for more information.