If your execution has failed and you see an error similar to the following in the logs:

2021-05-20 10:15:51 : Critical error in run 60a681feff7cec790b12d872: java.lang.IllegalStateException: Blob transfer (download) failed because 24 operations failed: Download blob 616d0cf42662fd16915448aec68f5d52d99e67bb -> java.io.IOException: No space left on device
then you have exceeded the disk space available to your execution. This is usually a straightforward fix. Remember that your executions run inside a compute instance, and these instances have finite amounts of every resource: memory, CPU, and disk. To correct the problem, try the following steps:
1) Try a new hardware tier.
2) Reduce the amount of data in your project, and remember that imported git repositories also count toward this total.
3) Move project data to datasets.
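If you are not sure what is consuming the space, a quick check from a terminal inside your workspace can help. This is a generic Linux sketch rather than a Domino-specific tool, and the paths are typical defaults that may differ in your deployment:

```shell
# How full is the execution's volume?
df -h /

# Which top-level items are the largest? Run this from your project root.
du -sh ./* 2>/dev/null | sort -rh | head -n 10

# An imported git repository's history counts toward disk usage too
if [ -d .git ]; then du -sh .git; fi
```

The `sort -rh` step orders the `du` output by human-readable size, largest first, so the biggest offenders appear at the top.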
Changes you make to the volume size will not impact existing workspaces. Instead, the changes will be applied to subsequent, new workspaces.
Estimating how much room you need can be a little tricky: it is a combination of your project files, any files imported from an external repository, and any data your run generates along the way.
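As a rough rule of thumb, the volume needs to hold all three of those components plus some headroom. A back-of-the-envelope sketch, where the numbers are purely illustrative (substitute your own measurements, e.g. from `du -sh`):

```shell
# Illustrative sizes only -- replace with your own measurements
project_files_gb=3
imported_repos_gb=2
generated_output_gb=4

needed_gb=$((project_files_gb + imported_repos_gb + generated_output_gb))
echo "Estimated volume size needed: at least ${needed_gb} GB plus headroom"
# prints: Estimated volume size needed: at least 9 GB plus headroom
```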
It is still highly recommended to move large project files to a dataset.
The Gory Details
The information below this point is mostly for administrators.
Domino v4.x and above
At v4.0 we switched the architecture from Docker to Kubernetes, so each execution now gets its own volume and is no longer affected by the caching and shared-resource issues described below. That said, these volumes still have a finite size and can still be filled. From v4.0 to v4.4 the default disk size is 10GB; after v4.4 the default is 15GB. This default size can also be configured in the Central Config with the following setting...
In v4.4 and above, users can additionally change the volume size for an individual project in that project's settings.
Domino v3.6 and below
When an execution starts, we copy the user's project files and the Docker image for the user's Compute Environment to a partition on the executor, so that they are available to the Docker container started for the execution.
If the data moved to the executor for an execution exceeds the available disk space, the execution can fail on startup.
Under the hood, we cache project data and environment images to reduce startup times. Over time, if an executor is long-lived and is used by projects with a variety of compute images that all get cached, the disk partition can fill up.
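On a long-lived v3.6 executor, an administrator with shell access to the executor host can see how much of the partition the cached images are consuming using standard Docker commands (this is a generic sketch, not a Domino-specific procedure):

```shell
# How full is the executor's partition?
df -h /

# Disk used by cached Docker images, containers, and volumes,
# if Docker is installed on this host
if command -v docker >/dev/null 2>&1; then
  docker system df
fi

# To reclaim space, remove images not used by any container
# (you will be prompted to confirm; cached environments will be
# re-downloaded the next time they are needed):
#   docker image prune -a
```

Note that pruning trades disk space for startup time: the next execution that needs a pruned environment image will have to download it again.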