Chenguang Xiao


Chenguang is currently a PhD candidate in computer science.


12 June 2023

Manage your quota on a computer cluster

by Chenguang Xiao

When you use a computer cluster, you are usually given a disk quota for your home directory. The home directory is meant for the files that customize your working environment, for example .bashrc, .vimrc, and .tmux.conf. For that reason, the quota is usually not large, e.g., 10GB. That is a fairly small space once you put project data in it or fail to manage your files.

In this post, I will introduce some useful commands to help you manage your quota on a computer cluster. This is done in two steps:

The first step is to set up a soft link in your home directory that points to your project storage. This lets you reach your project storage from your home directory painlessly while keeping the home directory clean: the data you put in project storage can be accessed from your home directory but does not take up its space. For example, if your project storage is /cluster/project/ and your home directory is /cluster/user_name/home/, you can create a soft link at /cluster/user_name/home/project that points to /cluster/project/ by

ln -s /cluster/project /cluster/user_name/home/project

Then you can access your project storage by cd ~/project/ from your home directory.
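If you want to be careful not to overwrite something that already exists, the link step can be wrapped in a small check. This is only a minimal sketch: link_storage is a hypothetical helper name, and the demo runs in a throwaway directory instead of the real /cluster paths, which you would substitute yourself.

```shell
# Sketch: create a soft link only if the target exists and the link path is free.
# link_storage is a hypothetical helper name, not a standard command.
link_storage() {
    target=$1
    link=$2
    if [ ! -d "$target" ]; then
        echo "target directory not found: $target" >&2
        return 1
    fi
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "refusing to overwrite existing path: $link" >&2
        return 1
    fi
    ln -s "$target" "$link"
}

# Demo in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/project" "$demo/home"
link_storage "$demo/project" "$demo/home/project"
readlink "$demo/home/project"   # prints the path the link points to
```

On a real cluster the call would look like link_storage /cluster/project /cluster/user_name/home/project, with both paths adjusted to your site.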

Clean up your home directory if necessary

The second step is to clean up your home directory. Even if you have already set up soft links so that large files go to project storage, you may still run out of quota gradually, because temporary files and large files that you no longer need accumulate. For example, you may find that the .cache directory in your home directory takes up 5GB, since lots of applications store their temporary files in it. By running ls -lh ~/.cache/, you can see results like

total 10K
drwxr-xr-x. 3 user_name users 4.0K Nov 11  2022 matplotlib
drwxr-xr-x. 5 user_name users 4.0K May 24 11:38 pip
drwxr-xr-x. 2 user_name users 4.0K Jan 14 15:45 seaborn
drwxr-xr-x. 3 user_name users 4.0K Aug 10  2022 torch

Evidently, tools such as pip, matplotlib, seaborn, and torch (and, on other setups, typescript and yarn) place their temporary files in .cache.

Furthermore, you can use du -sh * to see the size of each directory in your home directory, and pipe (|) the results into sort -h to order them from the smallest folder to the largest. Note that * skips hidden entries, so $(ls -A) is used below to include them. As we want to see the usage of the .cache folder, we can run

cd ~/.cache
du -sh $(ls -A) | sort -h

The results are like

131K    matplotlib
257K    seaborn
188M    torch
5.2G    pip

Clearly, the pip folder takes up the most space. You can clean it up with rm -rf ~/.cache/pip/* or simply run pip cache purge to clear the pip cache.
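Before deleting anything, it can help to look at what pip itself reports. The sketch below assumes pip 20.1 or newer, where the pip cache subcommands exist; the status variable is only there for illustration, and the purge line is left commented out on purpose.

```shell
# Sketch: inspect the pip cache before removing it.
# Assumes pip >= 20.1; older versions do not have the `pip cache` subcommands.
if command -v pip >/dev/null 2>&1 && pip cache dir >/dev/null 2>&1; then
    pip cache dir             # where the cache lives (usually ~/.cache/pip on Linux)
    pip cache info || true    # how much space it uses
    # pip cache purge         # uncomment to delete every cached file
    status="inspected"
else
    echo "pip cache is unavailable here; check ~/.cache/pip with du -sh instead" >&2
    status="unavailable"
fi
```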

Similarly, you can run du -sh $(ls -A) | sort -h from your home directory to see the size of everything in it and clean up the large folders that you no longer need.
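To relate those sizes to your quota, you can also compute what share of the limit a directory uses. This is a rough sketch: usage_percent is a hypothetical helper name, and the limit is passed in by hand (in KB) because nothing here queries the cluster's actual quota system.

```shell
# Sketch: report what percentage of an assumed quota a directory uses.
# usage_percent is a hypothetical helper name; the limit is supplied manually in KB.
usage_percent() {
    dir=$1
    limit_kb=$2
    # du -sk prints "<size in KB><TAB><path>"; keep only the size.
    # Errors from unreadable subdirectories are silenced.
    used_kb=$(du -sk "$dir" 2>/dev/null | cut -f1)
    echo $(( ${used_kb:-0} * 100 / limit_kb ))
}

# Demo on a throwaway directory with a made-up 1000 KB limit.
demo=$(mktemp -d)
head -c 200000 /dev/urandom > "$demo/blob"
echo "demo directory uses $(usage_percent "$demo" 1000)% of the made-up limit"
```

For a real check against the 10GB example quota, something like usage_percent "$HOME" $((10 * 1024 * 1024)) should do, with the number replaced by your own limit.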

If you encounter the error Illegal variable name. when running du -sh $(ls -A) | sort -h, your shell is probably csh or tcsh, which does not support the $(...) syntax. You can try du -sh `ls -A` | sort -h -k 1 instead.
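Both forms above split file names that contain spaces into separate arguments. A possible alternative is to let find hand the entries to du directly. This is a minimal sketch, assuming GNU or BSD find (for -mindepth and -maxdepth) and a sort that supports -h; list_sizes is a hypothetical helper name, and the pipeline inside it uses no command substitution at all.

```shell
# Sketch: list every entry (hidden ones included) in a directory, smallest to largest.
# list_sizes is a hypothetical helper name. -mindepth/-maxdepth and sort -h are
# GNU/BSD extensions, assumed to be available on the cluster.
list_sizes() {
    find "${1:-.}" -mindepth 1 -maxdepth 1 -exec du -sh {} + | sort -h
}

# Demo on a throwaway directory with a hidden folder and a name containing a space.
demo=$(mktemp -d)
mkdir -p "$demo/.cache" "$demo/my data"
head -c 2000000 /dev/urandom > "$demo/my data/big.bin"
head -c 1000 /dev/urandom > "$demo/.cache/small.bin"
sizes=$(list_sizes "$demo")
printf '%s\n' "$sizes"
```

Calling list_sizes ~ or list_sizes ~/.cache would give the same kind of listing as the commands above.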
