runai-bgu submit Manual

Introduction

runai-bgu submit is a command-line interface (CLI) for submitting and managing workloads on the BGU HPC cluster. It simplifies the process of running trainings and interactive workspaces, providing a unified interface for both.

This manual explains how to use runai-bgu submit to launch trainings and workspaces, including resource management, templates, and advanced options.

Quick Start

To submit a job, use:

Example 1. Required Arguments
$ runai-bgu submit \ (1)
  <workload_image> \ (2)
  --name myjob \ (3)
  --cpu 2 \ (4)
  --memory 2Gi \ (5)
  --gpu-memory 4Gi (6)
1 The submit command.
2 The workload type you want to use (see the list below for options).
3 The name of the job. Also -n.
4 Number of CPU cores. Also -c.
5 RAM resource limit, e.g., 2Gi. Also -m.
6 GPU memory (VRAM) resource limit, e.g., 4Gi.
workload_image

The workload type you want to use. See the list below for the available options.

Available Workload Images
  • python

  • pycharm

  • vscode

  • cmd

  • R

  • jupyter

  • julia

  • matlab

  • gmx

  • namd

  • gurobi

  • stata

  • sumo

  • oneapi

  • mathematica

Both the --gpu-memory and --memory flags accept digital information units. Gi (gibibyte, 2^30 bytes) and G (gigabyte, 10^9 bytes) are not the same size; see this article for the difference between Gi and G.
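
To show how large the gap between the two suffixes is, the following shell arithmetic converts 2Gi and 2G to bytes. It assumes the usual convention that Gi is a binary unit (2^30 bytes) and G is a decimal unit (10^9 bytes). How runai-bgu rounds or validates these values internally is not covered in this manual.

```shell
# Gi is a binary unit (2^30 bytes); G is a decimal unit (10^9 bytes).
gib_bytes=$((2 * 1024 * 1024 * 1024))   # 2Gi
gb_bytes=$((2 * 1000 * 1000 * 1000))    # 2G

echo "2Gi = ${gib_bytes} bytes"
echo "2G  = ${gb_bytes} bytes"
echo "difference = $((gib_bytes - gb_bytes)) bytes"
```

A request of 2Gi therefore asks for about 147 MB more than a request of 2G.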

Using Templates

Templates allow you to reuse resource configurations. You can use either user templates (--ut) or group templates (--gt):

$ runai-bgu submit pycharm -n dev-test --ut reg-train

This will create a job named dev-test using the pycharm workload and the resources defined in your user template reg-train.

To use a group template:

$ runai-bgu submit pycharm -n dev-test --gt reg-train
When using a template, you cannot specify --cpu, --memory, or --gpu/--gpu-memory directly.

See the template CLI guide for more details.
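
The rule that templates cannot be mixed with explicit resource flags can be checked before submission. The sketch below is a hypothetical helper (check_template_flags is not part of runai-bgu) that scans an argument list for the flag names mentioned in this manual. It only recognizes the space-separated form of each flag and stops scanning at a double dash.

```shell
# Hypothetical pre-flight check; not part of runai-bgu.
# Returns 1 if a template flag (--ut/--gt) is combined with an explicit
# resource flag, and 0 otherwise.
check_template_flags() {
  has_template=0
  has_resources=0
  for arg in "$@"; do
    case "$arg" in
      --) break ;;   # everything after "--" is the custom command
      --ut|--gt) has_template=1 ;;
      --cpu|-c|--memory|-m|--gpu|--gpu-memory) has_resources=1 ;;
    esac
  done
  if [ "$has_template" -eq 1 ] && [ "$has_resources" -eq 1 ]; then
    echo "error: templates cannot be combined with explicit resource flags" >&2
    return 1
  fi
  return 0
}

# Example: this combination mixes --ut with --cpu, so the check fails.
if check_template_flags pycharm -n dev-test --ut reg-train --cpu 2; then
  echo "ok to submit"
else
  echo "fix the arguments first"
fi
```

Running the check first gives a clear local error message without depending on how the CLI itself reports the conflict.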

Advanced Options

You can pass additional flags to control job behavior and debugging. These are passed after the main arguments:

Debugging
--loglevel

Set log level for submit and port-forward (info, warn, debug, error).

--loglevel-port

Log level for port-forward only.

--loglevel-submit

Log level for submit only.

Toggles
--attach

Attach to the job after submission.

--stdin

Allocate stdin.

--preemptible

Allow preemptible scheduling.

--tty

Allocate a TTY.

String Flags
--working-dir

Set working directory in the container.

--custom-url

Custom URL for job.

--pod-running-timeout

Timeout for pod running state.

--node-pools

Specify node pools.

--node-type

Specify node type.

--toleration

Set tolerations.

--environment

Set environment variables.

--project

Specify project.

Integer Flags
--backoff-limit

Set backoff limit for job retries.

--completions

Number of completions for the job.

--parallelism

Number of parallel jobs.

Running Custom Commands

You can run a custom command inside your job by appending -- followed by your command surrounded by quotes. For example:

$ runai-bgu submit python -n train-model \
  --ut train-ultra --backoff-limit 3 \
  --completions 3 --parallelism 3 \
  --loglevel warn \
  -- "python train.py" (1) (2)
1 Note the space between the double dash (--) and the command itself.
2 Note the quotes surrounding the whole command.

This will create a job called train-model with the resources defined in the train-ultra user template, set job control flags, and execute python train.py inside the container.
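
The quotes matter because the shell splits unquoted words into separate arguments before the CLI sees them. The sketch below uses a hypothetical stand-in function (count_after_dashes, not part of runai-bgu) that counts the arguments following the double dash. It shows only what the shell passes along, not how runai-bgu handles the unquoted form.

```shell
# Hypothetical stand-in for the CLI: prints how many arguments follow "--".
# runai-bgu itself is not invoked here.
count_after_dashes() {
  # Skip everything up to the double dash.
  while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
    shift
  done
  # Drop the "--" itself, if it was found.
  if [ "$#" -gt 0 ]; then
    shift
  fi
  echo "$#"
}

quoted=$(count_after_dashes python -n train-model -- "python train.py")
unquoted=$(count_after_dashes python -n train-model -- python train.py)

echo "quoted: ${quoted} argument(s), unquoted: ${unquoted} argument(s)"
# prints "quoted: 1 argument(s), unquoted: 2 argument(s)"
```

With quotes, the whole command travels as one argument, which matches the form this manual asks for.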

Interactive Workspaces

For interactive development (e.g., with PyCharm, VSCode, or Jupyter), use the corresponding workload type. The CLI will handle port forwarding and SSH setup automatically.

Example:

$ runai-bgu submit pycharm -n dev-pycharm --cpu 4 --memory 8Gi --gpu-memory 8Gi

After submission, follow the CLI instructions to connect your IDE.

Tips

  • Job names must be unique. If a job with the same name is terminating, the CLI will wait for it to finish before submitting a new one.

  • Use templates to standardize resource requests across your team.

  • For troubleshooting, increase verbosity with --loglevel debug.
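
Since job names must be unique, one way to avoid collisions is to add a timestamp to a fixed prefix. The sketch below is an assumption-laden illustration: the prefix, the timestamp format, and the idea that job names may contain lowercase letters, digits, and hyphens are not confirmed by this manual. The command is printed rather than executed so it can be reviewed first.

```shell
# Hypothetical naming scheme; the prefix and timestamp format are assumptions.
prefix="train-model"
job_name="${prefix}-$(date +%Y%m%d-%H%M%S)"

# Print the command instead of running it, so it can be reviewed first.
echo "runai-bgu submit python -n ${job_name} --ut train-ultra"
```

Two names generated within the same second would still collide, so a finer-grained suffix may be needed for scripted batch submissions.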