runai-bgu submit Manual

Introduction

runai-bgu submit is a command-line interface (CLI) for submitting and managing workloads on the BGU HPC cluster. It simplifies the process of running trainings and interactive workspaces, providing a unified interface for both.

This manual explains how to use runai-bgu submit to launch trainings and workspaces, including resource management, templates, and advanced options.

Quick Start

To submit a job, use:

Example 1. Required Arguments

$ runai-bgu submit \ (1)
  <workload_image> \ (2)
  --name myjob \ (3)
  --cpu 2 \ (4)
  --memory 2Gi \ (5)
  --gpu-memory 4Gi (6)

1	The `submit` command.
2	The workload type you want to use (see the list below for options).
3	The name of the job. Also `-n`.
4	Number of CPU cores. Also `-c`.
5	RAM resource limit, e.g., `2Gi`. Also `-m`.
6	vRAM (GPU memory) resource limit, e.g., `4Gi`.

workload_image: The workload type you want to use. See the list below for the available options.
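
The short aliases noted in the callouts above (`-n`, `-c`, `-m`) can stand in for their long forms. The sketch below only assembles and prints the equivalent command rather than running it; `python` and `myjob` are placeholder values chosen for illustration.

```shell
# Dry-run sketch: build the short-flag form of the Quick Start command and
# print it instead of executing it. The image and job name are placeholders.
image="python"
name="myjob"
cmd="runai-bgu submit $image -n $name -c 2 -m 2Gi --gpu-memory 4Gi"
echo "$cmd"
```

Printing the command first is a cheap way to review the resource values before actually submitting.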

Available Workload Images

python
pycharm
vscode
cmd
R
jupyter
julia
matlab
gmx
namd
gurobi
stata
sumo
oneapi
mathematica

Both the --gpu-memory and --memory flags accept digital information units (for example, 2Gi). See this article for the difference between Gi and G.
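
As a quick illustration of that difference, assuming the flags follow the common convention in which `Gi` is a binary unit (2^30 bytes) and `G` is a decimal unit (10^9 bytes), shell arithmetic shows how far apart the two are for a value of 2:

```shell
# Assumption: Gi = 2^30 bytes (binary), G = 10^9 bytes (decimal).
gi_bytes=$((2 * 1024 * 1024 * 1024))
g_bytes=$((2 * 1000 * 1000 * 1000))
echo "2Gi = $gi_bytes bytes"
echo "2G  = $g_bytes bytes"
echo "difference = $((gi_bytes - g_bytes)) bytes"
```

Under that assumption, 2Gi requests about 7% more memory than 2G.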

Using Templates

Templates allow you to reuse resource configurations. You can use either user templates (--ut) or group templates (--gt):

$ runai-bgu submit pycharm -n dev-test --ut reg-train

This will create a job named dev-test using the pycharm workload and the resources defined in your user template reg-train.

To use a group template:

$ runai-bgu submit pycharm -n dev-test --gt reg-train

When using a template, you cannot specify --cpu, --memory, or --gpu/--gpu-memory directly.

See the template CLI guide for more details.
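
The rule that templates and explicit resource flags are mutually exclusive can be checked before submission. The following is a minimal sketch of such a pre-flight check; `check_template_conflict` is a hypothetical helper that is not part of runai-bgu, and it only inspects the flag spellings mentioned in this manual.

```shell
# Hypothetical pre-flight check (not part of runai-bgu): reject argument lists
# that combine a template flag with explicit resource flags.
check_template_conflict() {
  has_template=0
  has_resource=0
  for arg in "$@"; do
    case "$arg" in
      --) break ;;  # everything after `--` is the custom command
      --ut|--gt) has_template=1 ;;
      --cpu|-c|--memory|-m|--gpu|--gpu-memory) has_resource=1 ;;
    esac
  done
  if [ "$has_template" -eq 1 ] && [ "$has_resource" -eq 1 ]; then
    echo "error: templates cannot be combined with explicit resource flags" >&2
    return 1
  fi
  return 0
}

# Usage: run the check with the same arguments you plan to pass to submit.
check_template_conflict pycharm -n dev-test --ut reg-train && echo "arguments look consistent"
```

The check stops scanning at `--` so that flags inside a custom command are not mistaken for resource flags.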

Advanced Options

You can pass additional flags to control job behavior and debugging. These are passed after the main arguments:

Debugging:
--loglevel: Set log level for submit and port-forward (info, warn, debug, error).
--loglevel-port: Log level for port-forward only.
--loglevel-submit: Log level for submit only.

Toggles:
--attach: Attach to the job after submission.
--stdin: Allocate stdin.
--preemptible: Allow preemptible scheduling.
--tty: Allocate a TTY.

String Flags:
--working-dir: Set working directory in the container.
--custom-url: Custom URL for job.
--pod-running-timeout: Timeout for pod running state.
--node-pools: Specify node pools.
--node-type: Specify node type.
--toleration: Set tolerations.
--environment: Set environment variables.
--project: Specify project.

Integer Flags:
--backoff-limit: Set backoff limit for job retries.
--completions: Number of completions for the job.
--parallelism: Number of parallel jobs.
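
Because --loglevel accepts a fixed set of values, a wrapper script could validate the value before submitting. The following is a minimal sketch; `valid_loglevel` is a hypothetical helper, not part of runai-bgu, and the command is printed rather than executed. The job name `myjob` and the template `reg-train` are placeholders.

```shell
# Hypothetical helper: succeed only for the log levels listed in this manual.
valid_loglevel() {
  case "$1" in
    info|warn|debug|error) return 0 ;;
    *) return 1 ;;
  esac
}

level="debug"
if valid_loglevel "$level"; then
  # Dry run: print the command that would be submitted.
  echo "runai-bgu submit python -n myjob --ut reg-train --loglevel $level"
else
  echo "unsupported log level: $level" >&2
fi
```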

Running Custom Commands

You can run a custom command inside your job by appending -- followed by your command surrounded by quotes. For example:

$ runai-bgu submit python -n train-model \
  --ut train-ultra --backoff-limit 3 \
  --completions 3 --parallelism 3 \
  --loglevel warn \
  -- "python train.py" (1) (2)

1	Note the space between the double dash (`--`) and the command itself.
2	Note the quotes surrounding the whole command.

This will create a job called train-model with the resources defined in the train-ultra user template, set job control flags, and execute python train.py inside the container.
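
To see why the quotes matter, the sketch below uses a stand-in function (hypothetical, not the real CLI) that counts how many arguments the shell delivers after `--`. It illustrates shell word splitting only; how runai-bgu itself treats an unquoted command is not covered here.

```shell
# Stand-in for the CLI (hypothetical): reports how many arguments follow `--`.
args_after_separator() {
  seen=0
  count=0
  for arg in "$@"; do
    if [ "$seen" -eq 1 ]; then
      count=$((count + 1))
    elif [ "$arg" = "--" ]; then
      seen=1
    fi
  done
  echo "$count"
}

quoted=$(args_after_separator submit python -n train-model -- "python train.py")
unquoted=$(args_after_separator submit python -n train-model -- python train.py)
echo "quoted: $quoted argument(s), unquoted: $unquoted argument(s)"
```

With quotes, the whole command arrives as a single argument; without them, the shell splits it into two.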

Interactive Workspaces

For interactive development (e.g., with PyCharm, VSCode, or Jupyter), use the corresponding workload type. The CLI will handle port forwarding and SSH setup automatically.

Example:

$ runai-bgu submit pycharm -n dev-pycharm --cpu 4 --memory 8Gi --gpu-memory 8Gi

After submission, follow the CLI instructions to connect your IDE.

Tips

Job names must be unique. If a job with the same name is terminating, the CLI will wait for it to finish before submitting a new one.
Use templates to standardize resource requests across your team.
For troubleshooting, increase verbosity with --loglevel debug.
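
One way to keep job names unique is to append a timestamp. The following is a minimal sketch, assuming job names may contain lowercase letters, digits, and hyphens (the manual does not state the naming rules); the `train-` prefix and the `reg-train` template are placeholders, and the command is printed rather than executed.

```shell
# Assumption: lowercase letters, digits, and hyphens are valid in job names.
name="train-$(date +%Y%m%d-%H%M%S)"
# Dry run: print the command that would be submitted.
echo "runai-bgu submit python -n $name --ut reg-train"
```

Two jobs generated within the same second would still collide, so this suits manual submissions better than tight loops.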

More Information

For a full list of workloads and their capabilities, see the project documentation.
For template management, see the template CLI guide.
For workspace-specific setup (e.g., PyCharm remote dev), see the following guides:
- Workspaces
  - Running PyCharm Workspaces
  - Running VS Code Workspaces
  - Running VS Code Browser Workspaces
  - Running PyCharm Browser Workspaces
  - Running Jupyter Notebook Workspaces
  - Running RStudio Browser Workspaces
  - Running Matlab Browser Workspaces
  - Running Stata Browser Workspaces
  - Running SUMO X11 Workspaces
  - Running Stata X11 Workspaces
- Trainings
  - Running Python Trainings
  - Running Mathematica Trainings
  - Running Matlab Trainings
  - Running R Trainings
  - Running Julia Trainings
  - Running Gromacs Training
  - Running NAMD Training
  - Running CMD Trainings
  - Running Gurobi Trainings
  - Running OneAPI Trainings
  - Running Jupyter Trainings
  - Running ImageJ Workloads

If you encounter issues, please contact your cluster administrator or