runai-bgu submit Manual
Introduction
runai-bgu submit is a command-line interface (CLI) for submitting and managing workloads on the BGU HPC cluster.
It simplifies the process of running trainings and interactive workspaces, providing a unified interface for both.
This manual explains how to use runai-bgu submit to launch trainings and workspaces, including resource management, templates, and advanced options.
Quick Start
To submit a job, use:
$ runai-bgu submit \ (1)
<workload_image> \ (2)
--name myjob \ (3)
--cpu 2 \ (4)
--memory 2Gi \ (5)
--gpu-memory 4Gi (6)
| 1 | The submit command. |
| 2 | The workload image to run (see the list below for options). |
| 3 | The name of the job. Also -n. |
| 4 | Number of CPU cores. Also -c. |
| 5 | RAM resource limit, e.g., 2Gi. Also -m. |
| 6 | GPU memory (vRAM) resource limit, e.g., 4Gi. |
Available Workload Images
- python
- pycharm
- vscode
- cmd
- R
- jupyter
- julia
- matlab
- gmx
- namd
- gurobi
- stata
- sumo
- oneapi
- mathematica
| Both the --gpu-memory and --memory flags accept digital information units. See this article for the difference between Gi and G. |
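The difference between the two unit families can be made concrete with a little arithmetic. The sketch below assumes the usual convention, in which binary suffixes (Mi, Gi) are powers of 1024 and decimal suffixes (M, G) are powers of 1000. The to_bytes helper is hypothetical and is not part of runai-bgu.

```shell
# Hypothetical helper: convert a quantity such as 2Gi or 2G to bytes.
# Assumption: Mi/Gi are powers of 1024, M/G are powers of 1000.
to_bytes() {
  local qty="$1"
  local num="${qty%%[!0-9]*}"   # leading digits, e.g. "2"
  local unit="${qty#"$num"}"    # remaining suffix, e.g. "Gi"
  if [ -z "$num" ]; then
    echo "missing number in: $qty" >&2
    return 1
  fi
  case "$unit" in
    Mi) echo $(( num * 1024 * 1024 )) ;;
    Gi) echo $(( num * 1024 * 1024 * 1024 )) ;;
    M)  echo $(( num * 1000 * 1000 )) ;;
    G)  echo $(( num * 1000 * 1000 * 1000 )) ;;
    "") echo "$num" ;;
    *)  echo "unknown unit: $unit" >&2; return 1 ;;
  esac
}

to_bytes 2Gi   # prints 2147483648
to_bytes 2G    # prints 2000000000
```

Under that assumption, 2Gi requests about 7% more memory than 2G.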
Using Templates
Templates allow you to reuse resource configurations. You can use either user templates (--ut) or group templates (--gt):
$ runai-bgu submit pycharm -n dev-test --ut reg-train
This will create a job named dev-test using the pycharm workload and the resources defined in your user template reg-train.
To use a group template:
$ runai-bgu submit pycharm -n dev-test --gt reg-train
| When using a template, you cannot specify --cpu, --memory, or --gpu/--gpu-memory directly. |
See the template CLI guide for more details.
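The rule that templates and explicit resource flags are mutually exclusive can be checked before submitting. This is a minimal sketch of such a pre-flight check. The check_template_args function is hypothetical and not part of runai-bgu, and it only recognizes the space-separated flag forms shown in this manual.

```shell
# Hypothetical pre-flight check: returns 1 when a template flag is
# combined with an explicit resource flag, 0 otherwise.
check_template_args() {
  local has_template=0 has_resources=0 arg
  for arg in "$@"; do
    case "$arg" in
      --) break ;;                       # stop at the custom command
      --ut|--gt) has_template=1 ;;
      --cpu|-c|--memory|-m|--gpu|--gpu-memory) has_resources=1 ;;
    esac
  done
  if [ "$has_template" -eq 1 ] && [ "$has_resources" -eq 1 ]; then
    echo "error: templates cannot be combined with explicit resource flags" >&2
    return 1
  fi
  return 0
}

check_template_args pycharm -n dev-test --ut reg-train && echo "ok"
check_template_args pycharm -n dev-test --ut reg-train --cpu 4 || echo "rejected"
```

The first call prints ok. The second prints the error message on stderr, followed by rejected.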
Advanced Options
You can pass additional flags to control job behavior and debugging. These are passed after the main arguments:
- Debugging
  --loglevel: Set log level for submit and port-forward (info, warn, debug, error).
  --loglevel-port: Log level for port-forward only.
  --loglevel-submit: Log level for submit only.
- Toggles
  --attach: Attach to the job after submission.
  --stdin: Allocate stdin.
  --preemptible: Allow preemptible scheduling.
  --tty: Allocate a TTY.
- String Flags
  --working-dir: Set working directory in the container.
  --custom-url: Custom URL for job.
  --pod-running-timeout: Timeout for pod running state.
  --node-pools: Specify node pools.
  --node-type: Specify node type.
  --toleration: Set tolerations.
  --environment: Set environment variables.
  --project: Specify project.
- Integer Flags
  --backoff-limit: Set backoff limit for job retries.
  --completions: Number of completions for the job.
  --parallelism: Number of parallel jobs.
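As an illustration of combining these options, the sketch below validates a log level against the four values listed above and then prints (rather than runs) a submit command with the advanced flags placed after the main arguments. The valid_loglevel helper is hypothetical and not part of runai-bgu. The template name train-ultra is only a placeholder.

```shell
# Hypothetical helper: accept only the log levels listed in this manual.
valid_loglevel() {
  case "$1" in
    info|warn|debug|error) return 0 ;;
    *) echo "invalid log level: $1 (expected info, warn, debug, or error)" >&2
       return 1 ;;
  esac
}

# Print the command instead of running it, so it can be reviewed first.
level="debug"
if valid_loglevel "$level"; then
  echo runai-bgu submit python -n train-model --ut train-ultra \
    --loglevel "$level" --attach --tty
fi
```

Removing the echo runs the command for real.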
Running Custom Commands
You can run a custom command inside your job by appending -- followed by your command surrounded by quotes. For example:
$ runai-bgu submit python -n train-model \
--ut train-ultra --backoff-limit 3 \
--completions 3 --parallelism 3 \
--loglevel warn \
-- "python train.py" (1) (2)
| 1 | Note the space between the double dash (--) and the command itself. |
| 2 | Note the quotes surrounding the whole command. |
This will create a job called train-model with the resources defined in the train-ultra user template, set job control flags, and execute python train.py inside the container.
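The quotes matter because the shell splits unquoted text into separate arguments before the CLI sees it. The sketch below uses a hypothetical stand-in function, count_command_args, which is not part of runai-bgu. It counts how many arguments arrive after the double dash.

```shell
# Hypothetical stand-in for the CLI: prints the number of arguments
# that follow the first "--".
count_command_args() {
  while [ "$#" -gt 0 ]; do
    if [ "$1" = "--" ]; then
      shift
      echo "$#"
      return 0
    fi
    shift
  done
  echo 0
}

count_command_args python -n train-model -- "python train.py"   # prints 1
count_command_args python -n train-model -- python train.py     # prints 2
```

With quotes, the whole command travels as one argument, which is the form this manual asks for.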
Interactive Workspaces
For interactive development (e.g., with PyCharm, VSCode, or Jupyter), use the corresponding workload type. The CLI will handle port forwarding and SSH setup automatically.
Example:
$ runai-bgu submit pycharm -n dev-pycharm --cpu 4 --memory 8Gi --gpu-memory 8Gi
After submission, follow the CLI instructions to connect your IDE.
Tips
- Job names must be unique. If a job with the same name is terminating, the CLI will wait for it to finish before submitting a new one.
- Use templates to standardize resource requests across your team.
- For troubleshooting, increase verbosity with --loglevel debug.
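One way to keep job names unique across repeated submissions is to append a timestamp. This is a sketch only. The unique_job_name helper is hypothetical, and it assumes that job names may contain digits and hyphens, which this manual does not state.

```shell
# Hypothetical helper: append a timestamp so repeated submissions
# get distinct names. Assumes digits and hyphens are allowed in names.
unique_job_name() {
  printf '%s-%s\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

name="$(unique_job_name train-model)"
# Print the command for review instead of running it.
echo "runai-bgu submit python -n $name --ut train-ultra"
```

Two submissions within the same second would still collide, so a scripted loop may need a counter as well.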
More Information
- For a full list of workloads and their capabilities, see the project documentation.
- For template management, see the template CLI guide.
- For workspace-specific setup (e.g., PyCharm remote dev), see the following guides:
  - Workspaces
  - Trainings
- If you encounter issues, please contact your cluster administrator or