diff --git a/.gitignore b/.gitignore
new file mode 100644
index 00000000..8776d852
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,4 @@
+components/ai/
+components/button.tsx
+components/card3.tsx
+components/toc.tsx
diff --git a/README.md b/README.md
index 10688ca3..d326151f 100644
--- a/README.md
+++ b/README.md
@@ -14,59 +14,12 @@ The easiest way to start contributing is to find the `.mdx` file that is used to
In case you want to generate the static pages locally (could be useful for large changes) see below.
-1. Clone CERIT repo with some objects common to all eInfra docs: `git clone https://github.com/CERIT-SC/fumadocs` (you can do this only once, you just need to have the repo content somewhere.) There are several files that are needed to compile the docs, however they should not be copied to the repo and should be only used temporarily (ask Lukas Hejtmanek if in doubt).
-
-2. Clone the Metacentrum docs repo, checkout to the main branch:
-
-```
-git clone https://github.com/CESNET/metacentrum-user-docs
-
-git checkout remotes/origin/main
-
-# or you can make the remote branch local as:
-git checkout -b main origin/main
-
-```
-
-3. Make a small script similar to the following:
-
+1. Clone the CESNET/metacentrum-user-docs repo: `git clone https://github.com/CESNET/metacentrum-user-docs`
+2. Clone the CERIT-SC/fumadocs repo, which contains several components shared by all e-INFRA docs: `git clone https://github.com/CERIT-SC/fumadocs`
+3. Copy the required files: `cp -r fumadocs/components/* metacentrum-user-docs/components/`
+4. Enter the directory: `cd metacentrum-user-docs`
+5. Run the build:
```bash
-#!/bin/bash
-
-# path to where the Metacentrum docs repo
-repodir="/home/melounova/meta/metacentrum-user-docs"
-
-# path to the CERIT fumadocs repo
-fumadir="/home/melounova/meta/fumadocs"
-
-# Copy some stuff from CERIT repo to Metacentrum repo
-cd ${repodir}/components
-cp -r ${fumadir}/components/* .
-cd ${repodir}
-
-# run the build
-docker run -it --rm -p 3000:3000 -e STARTPAGE=/en/docs -v ${repodir}/public:/opt/fumadocs/public -v ${repodir}/components:/opt/fumadocs/components -v ${repodir}/content/docs:/opt/fumadocs/content/docs cerit.io/docs/fuma:v15.0.12 pnpm dev
-
-# remove again the stuff borrowed from CERIT repo
-cd ${repodir}/components ; rm -r ai ; rm button.tsx card3.tsx sidebar.tsx toc.tsx
+docker run -it --rm -p 3000:3000 -e STARTPAGE=/en/docs -v ./public:/opt/fumadocs/public -v ./components:/opt/fumadocs/components -v ./content/docs:/opt/fumadocs/content/docs cerit.io/docs/fuma:v16.4.6 pnpm dev
```
-
-4. run the script (as sudo if needed); in a browser, see the docs at `http://localhost:3000/en/docs/welcome`
-
-
-**Notes**
-
-- 8 GB of mem is just barely enough to run the build on an older ntb
-
-
-
-
-
-
-
-
-
-
-
-
-
+6. The documentation will be available at `http://localhost:3000/en/docs/welcome` and is rebuilt automatically whenever the sources change.
diff --git a/content/docs/access/account.mdx b/content/docs/access/account.mdx
index 3c17f1f5..d424db91 100644
--- a/content/docs/access/account.mdx
+++ b/content/docs/access/account.mdx
@@ -27,7 +27,7 @@ Expired accounts can be renewed at any time during the year [here](https://metav
## How to start with MetaCentrum
-A comprehensive tutorial for new users is [here](https://docs.metacentrum.cz/en/docs/computing/basic-tutorial).
+A comprehensive tutorial for new users is [here](https://docs.metacentrum.cz/en/docs/computing/run-basic-job).
## Group data access
diff --git a/content/docs/computing/advanced.mdx b/content/docs/computing/advanced.mdx
new file mode 100644
index 00000000..29b1cee7
--- /dev/null
+++ b/content/docs/computing/advanced.mdx
@@ -0,0 +1,360 @@
+---
+title: Running jobs (advanced)
+---
+
+This guide covers advanced topics for running jobs on MetaCentrum. If you're new to MetaCentrum, start with the [Getting started guide](./run-basic-job).
+
+## Kerberos authentication
+
+MetaCentrum uses Kerberos for internal authentication. Tickets expire after 10 hours.
+
+```bash
+klist # List tickets
+kdestroy # Delete tickets
+kinit # Create new ticket
+```
+
+When a ticket expires, run `kinit` to obtain a new one. OnDemand users should restart the web server via **Help → Restart Web Server**.
+
+For detailed Kerberos information, see [Kerberos security page](../access/security/kerberos).
+
+## Detailed resource configuration
+
+### Resource specification methods
+
+Resources can be specified in two ways:
+1. On the command line with `qsub`
+2. Inside the batch script on lines beginning with `#PBS`
+
+```bash
+# On command line
+qsub -l select=1:ncpus=4:mem=4gb:scratch_local=10gb -l walltime=1:00:00 myJob.sh
+```
+
+If both resource specifications are present (on the CLI as well as inside the script), the values on the CLI take priority.
+
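+The same request expressed as `#PBS` lines at the top of the batch script (job name and values are illustrative):
+
+```bash
+#!/bin/bash
+#PBS -N myJob
+#PBS -l select=1:ncpus=4:mem=4gb:scratch_local=10gb
+#PBS -l walltime=1:00:00
+
+# everything below runs on the allocated compute node
+echo "Running on $(hostname -f)"
+```
+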
+### Chunk-wide vs job-wide resources
+
+According to PBS terminology, a **chunk** is a subset of computational nodes on which the job runs. Resources can be:
+
+- **Chunk-wide**: Applied to each chunk separately (e.g., `ncpus`, `mem`, `scratch_local`)
+- **Job-wide**: Applied to the job as a whole (e.g., `walltime`, software licenses)
+
+For most "normal" jobs, the number of chunks is 1 (the default value). See [PBS resources guide](./resources/resources) for complex parallel computing scenarios.
+
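+A chunk-wide value is requested for each chunk, while a job-wide value is given once. For example, a two-chunk request (values illustrative):
+
+```bash
+# 2 chunks x (4 CPUs + 8 GB) = 8 CPUs and 16 GB in total; walltime covers the whole job
+qsub -l select=2:ncpus=4:mem=8gb -l walltime=1:00:00 myJob.sh
+```
+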
+### Scratch directories
+
+Four scratch types are available; the recommended default choice is `scratch_local`.
+
+**Recommended:**
+```bash
+qsub -I -l select=1:ncpus=2:mem=4gb:scratch_local=1gb -l walltime=2:00:00
+```
+
+Access scratch via `$SCRATCHDIR`. Use `go_to_scratch <job_ID>` to access scratch after a job failure.
+
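+A typical staging pattern inside a batch script, mirroring the batch job example from the Getting started guide (the `DATADIR` path and file name are illustrative):
+
+```bash
+DATADIR=/storage/brno2/home/user123/project
+
+# fail early if no scratch directory was allocated
+test -n "$SCRATCHDIR" || { echo >&2 "Variable SCRATCHDIR is not set!"; exit 1; }
+
+# stage input into scratch and work there
+cp "$DATADIR/input.txt" "$SCRATCHDIR" || { echo >&2 "Error while copying input file(s)!"; exit 2; }
+cd "$SCRATCHDIR"
+```
+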
+For detailed scratch type information, see [Scratch storage guide](./infrastructure/scratch-storages).
+
+## Interactive jobs
+
+### Starting interactive jobs
+
+Request interactive session: `qsub -I -l select=1:ncpus=4 -l walltime=2:00:00`
+
+Jobs are auto-terminated when walltime expires.
+
+### When useful
+
+- Testing software, input formats, resource estimates
+- Compiling, processing/moving large data
+- Running [GUI applications](../software/graphical-access)
+
+### Example
+
+Interactive jobs are useful for software testing, compiling, and data processing:
+
+```bash
+qsub -I -l select=1:ncpus=4 -l walltime=2:00:00
+# Once on compute node:
+module add mambaforge
+mamba create -n my_env
+mamba activate my_env
+python my_script.py
+```
+
+## Job ID details
+
+Job IDs identify jobs for tracking and management, e.g. `13010171.pbs-m1.metacentrum.cz` (the full form, including the server name, is required).
+
+Get your job ID:
+- After `qsub` command
+- Inside jobs: `echo $PBS_JOBID`
+- From qstat: `qstat -u username`
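+
+Because `qsub` prints the full job ID on submission, it can be captured for later `qstat`/`qdel` calls (script name illustrative):
+
+```bash
+JOB_ID=$(qsub myJob.sh)   # e.g. 13010171.pbs-m1.metacentrum.cz
+qstat "$JOB_ID"
+```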
+
+## Job monitoring and management
+
+### Job states
+
+PBS uses the following codes to mark job state:
+
+| State | Description |
+|-------|-------------|
+| Q | Queued |
+| H | Held. Job is suspended by the server, user, or administrator. Job stays in held state until released by user or administrator. |
+| R | Running |
+| S | Suspended (substate of R) |
+| E | Exiting after having run |
+| F | Finished |
+| X | Finished (subjobs only) |
+| W | Waiting. Job is waiting for its requested execution time or delayed due to stagein failure. |
+
+### Advanced qstat commands
+
+```bash
+qstat -u user123 # list all jobs (running or queued)
+qstat -xu user123 # list finished jobs
+qstat -f # full details of running/queued job
+qstat -xf # full details of finished job
+```
+
+For more detailed job monitoring and history, see [Job tracking](./jobs/job-tracking).
+
+### qstat output interpretation
+
+```
+Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
+-------------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
+11733550.pbs-m1 user123 q_2h myJob.sh -- 1 1 1gb 00:05 Q --
+```
+
+Key headers: `S`=status, `NDS`=nodes, `TSK`=tasks, `Memory`=requested memory, `Time`=elapsed.
+
+### Job deletion
+
+**Delete a submitted/running job:**
+
+```bash
+qdel 21732596.pbs-m1.metacentrum.cz
+```
+
+**Force deletion (if plain qdel doesn't work):**
+
+```bash
+qdel -W force 21732596.pbs-m1.metacentrum.cz
+```
+
+## PBS server and queues
+
+**Essential commands**: `qsub` (submit), `qstat` (query), `qdel` (delete)
+
+**Queues**: Jobs route automatically from routing queue to execution queues (`q_1h`, `q_1d`, etc.). Don't specify a queue unless necessary.
+
+View all queues at [PBSmon](https://metavo.metacentrum.cz/pbsmon2/queues/list). For more on queues, see [Queues guide](./resources/queues).
+
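+If you do have a reason to target a specific execution queue, name it with `-q`; otherwise omit the option and let the routing queue place the job:
+
+```bash
+qsub -q q_1d -l select=1:ncpus=2:mem=4gb -l walltime=20:00:00 myJob.sh
+```
+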
+## Output files and error handling
+
+When a job completes, two files are created in the submission directory: `jobname.o` (STDOUT) and `jobname.e` (STDERR). The `.e` file is the first place to look if a job fails.
+
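+By PBS convention the numeric part of the job ID is appended to the file names, so for job `11733571` (names assume the default scheme):
+
+```bash
+cat myJob.sh.o11733571   # STDOUT of the job
+cat myJob.sh.e11733571   # STDERR - check this first when a job fails
+```
+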
+For detailed output file handling, see [Job tracking guide](./jobs/job-tracking).
+
+## Exit status interpretation
+
+Exit status indicates how a batch job finished (interactive jobs always return 0).
+
+```bash
+qstat -xf job_ID | grep Exit_status # Get exit status
+```
+
+For jobs older than 24 hours, use `pbs-get-job-history` or [PBSmon](https://metavo.metacentrum.cz/pbsmon2/jobs/detail).
+
+**Ranges**:
+- `X < 0`: PBS killed job (resource exceeded)
+- `0 <= X < 256`: Shell/top process exit
+- `X >= 256`: OS signal (subtract 256 for signal code; use `kill -l` to list signals)
+
+**Common statuses**: `-23`=missing Kerberos, `-25`=exceeded CPUs, `-27`=exceeded memory, `-29`=exceeded walltime, `0`=normal, `271`=SIGTERM (qdel)
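+
+For example, status `271` decodes as signal `271 - 256 = 15`, which `kill -l` resolves to SIGTERM:
+
+```bash
+echo $(( 271 - 256 ))   # 15
+kill -l 15              # TERM
+```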
+
+## Scratch cleanup
+
+When a job ends with an error, data may remain in scratch. Clean up after retrieving useful data.
+
+### Manual cleanup
+
+Log in to the compute node and remove scratch contents:
+
+```bash
+ssh user123@node.fzu.cz
+cd /scratch/user123/job_JOBID
+rm -r *
+```
+
+Use `go_to_scratch <job_ID>` to access scratch after a job failure. The scratch directory itself is deleted automatically.
+
+### Automatic cleanup with trap
+
+```bash
+trap 'clean_scratch' EXIT TERM # Clean on normal exit or termination
+trap 'echo "$PBS_JOBID failed at $SCRATCHDIR" >> log.txt' TERM # Log for manual cleanup
+```
+
+The `trap` command ensures scratch cleanup even when jobs fail. See [Trap command guide](./jobs/trap-command) for details.
+
+## Custom output paths
+
+By default, job output files go to the submission directory (`$PBS_O_WORKDIR`). You can change this:
+
+```bash
+qsub -o /custom-path/myOutputFile -e /custom-path/myErrorFile script.sh
+```
+
+Or in the batch script:
+
+```bash
+#PBS -o /custom-path/myOutputFile
+#PBS -e /custom-path/myErrorFile
+```
+
+For more on output file customization, see [PBS resources guide](./resources/resources).
+
+## Job arrays
+
+Job arrays allow you to run many similar jobs with a single submission instead of submitting each one individually.
+
+### Submitting a job array
+
+```bash
+qsub -J X-Y[:Z] script.sh
+```
+
+- `X` – first index of the job
+- `Y` – last index of the job
+- `Z` – optional index step
+
+**Example:** `qsub -J 2-7:2 script.sh` creates subjobs with indexes 2, 4, 6.
+
+### Array job format
+
+The main job is displayed with `[]` (e.g., `969390[]`). Each subjob has an ID like `969390[1].pbs-m1.metacentrum.cz`.
+
+### Array job variables
+
+Inside your script, use:
+
+```bash
+$PBS_ARRAY_INDEX # Index of the current subjob
+$PBS_ARRAY_ID # Job ID of the main job
+```
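+
+A common pattern is to select the subjob's input by index (the file naming is hypothetical):
+
+```bash
+# with qsub -J 1-10 script.sh, each subjob picks its own input file
+INPUT="data_${PBS_ARRAY_INDEX}.txt"
+echo "Subjob $PBS_ARRAY_INDEX processes $INPUT"
+```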
+
+### Monitoring array jobs
+
+```bash
+qstat -t # List all subjobs
+qstat -f 969390'[]' -x | grep array_state_count # See overall status
+```
+
+For more on job arrays, see [Job arrays guide](./jobs/job-arrays).
+
+## Job dependencies
+
+Make a job wait until another job completes successfully.
+
+### Submit with dependencies
+
+```bash
+qsub -W depend=afterok:job1_ID.pbs-m1.metacentrum.cz job2_script.sh
+```
+
+This submits `job2_script.sh` to run only after `job1_ID` completes with exit code 0.
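+
+Since `qsub` prints the job ID, the two submissions can be chained directly (script names illustrative):
+
+```bash
+JOB1=$(qsub preprocess.sh)
+qsub -W depend=afterok:"$JOB1" analyze.sh
+```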
+
+### Modify existing job dependencies
+
+```bash
+qalter -W depend=afterok:job1_ID.pbs-m1.metacentrum.cz job2_ID.pbs-m1.metacentrum.cz
+```
+
+## Modifying job attributes
+
+Modify **queued** jobs (status Q) with `qalter`:
+
+```bash
+qalter -l select=1:ncpus=32:mem=10gb job_ID.pbs-m1.metacentrum.cz
+qalter -l walltime=02:00:00 job_ID.pbs-m1.metacentrum.cz
+```
+
+Walltime can only be modified within the queue's maximum. You must specify the entire `-l` attribute with `qalter`.
+
+For running jobs, see "Extend walltime" below. For more, see [Modify job attributes guide](./jobs/modify-job-attributes).
+
+## Extend walltime for running jobs
+
+Extend walltime of **running** jobs with `qextend`:
+
+```bash
+qextend job_ID.pbs-m1.metacentrum.cz 01:00:00 # hh:mm:ss or seconds
+```
+
+**Limits**: Max 20 times/month AND 1440 CPU-hours/month (CPU-hours = walltime × ncpus)
+
+```bash
+qextend info # Check your quota
+```
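+
+For example, extending an 8-CPU job by 2 hours consumes 8 × 2 = 16 of the 1440 monthly CPU-hours:
+
+```bash
+echo $(( 8 * 2 ))   # 16 CPU-hours charged against the quota
+```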
+
+To extend the walltime of an array job, contact user support at meta@cesnet.cz.
+
+For more, see [Extend walltime guide](./jobs/extend-walltime).
+
+## Module span management
+
+For conflicting modules, use subshells to isolate environments:
+
+```bash
+(module add python/3.8.0-gcc; python script.py) # Independent module environment
+```
+
+```bash
+module display module_name # Show module details
+```
+
+`module display` shows key variables: `PATH`, `LD_LIBRARY_PATH`, `LIBRARY_PATH`.
+
+For more, see [Software modules guide](../software/modules).
+
+## Research group annual report
+
+Submit annual reports by end of January: group name/members/contact, research interests, contributions (hardware, software), projects, publications.
+
+Send to [annual-report@metacentrum.cz](mailto:annual-report@metacentrum.cz).
+
+## Additional resources
+
+- [Parallel computing](./parallel-comput) – for running MPI/OpenMP jobs
+- [GPU computing](./gpu-comput) – for GPU-accelerated workloads
+- [PBS resources](./resources/resources) – detailed resource specification guide
+- [Job tracking](./jobs/job-tracking) – detailed job monitoring and history
+- [Email notifications](./jobs/email-notif) – configure job status emails
+- [Software modules](../software/modules) – advanced module management
+- [Frontend and storage details](./infrastructure/frontend-storage) – understanding the architecture
+- [Finished jobs](./jobs/finished-jobs) – retrieving information about completed jobs
+- [Containers](../software/containers) – using Apptainer/Singularity images
+
+## Web-based job running with usegalaxy.cz
+
+As an alternative to command-line job submission, use **usegalaxy.cz**, a web-based platform providing thousands of tools, large data quotas (250 GB for an e-INFRA CZ login), and workflow support.
+
+**Access**: https://usegalaxy.cz – log in with e-INFRA CZ or Life Science credentials
+
+**When useful**: if you prefer a web interface, the tools you need are available in Galaxy, you want to build workflows, or you want to avoid writing batch scripts
+
+**More resources**: For detailed features and quotas, see [usegalaxy.cz guide](../graphical/usegalaxy).
diff --git a/content/docs/computing/basic-tutorial.mdx b/content/docs/computing/basic-tutorial.mdx
deleted file mode 100644
index 582aabab..00000000
--- a/content/docs/computing/basic-tutorial.mdx
+++ /dev/null
@@ -1,38 +0,0 @@
----
-title: A comprehensive manual for beginners
----
-
-*This page is currently under construction and will be completed soon.*
-
-
-Please [let us know](https://docs.metacentrum.cz/en/docs/support) if you think any part of the manual could be expanded, or if you feel that any information is missing. We would appreciate your opinion for further improvement.
-
-
-This manual is intended for new users who are unfamiliar with MetaCentrum NGI (National Grid Infrastructure) or any similar infrastructure. However, it isn't easy to summarise the detailed use of the entire infrastructure in a reasonably short guide. The following tutorial aims to provide the minimum necessary knowledge to enable new users to start calculating without any crucial problems as soon as possible.
-
-## If you are confused, please let us know
-
-Sometimes, problems can be too complicated, and error messages can be too cryptic. Please do not hesitate to contact [user support](https://docs.metacentrum.cz/en/docs/support) as soon as possible and ask for help. We are here to help you. But this is also important for us. It is sometimes necessary to catch inconsistent system behaviour in the act, because it is not possible to detect the cause of an error from the system logs afterwards.
-
-## Available software tools
-
-We provide access to several hundred application tools in thousands of individual modules. For this reason, we are unable to periodically monitor and update all of them, and also, our [list of available tools](https://docs.metacentrum.cz/en/docs/software/alphabet) only includes those that require some further description. We rely on our users to [inform us](https://docs.metacentrum.cz/en/docs/support) when a tool needs updating, and we recommend simpler [installations in the home directory](https://docs.metacentrum.cz/en/docs/software/install-software) (with our assistance if required).
-
-## Where is my data located, and why are there so many frontend servers?
-
-The main principle of the grid infrastructure is that it connects various compute clusters that are distributed across the Czech Republic. The same scheme applies to [storage servers](https://docs.metacentrum.cz/en/docs/computing/infrastructure/mount-storages), where data is located, and [frontend servers](https://docs.metacentrum.cz/en/docs/computing/infrastructure/frontends), which are the main access points. This is to distribute the user load across multiple localities and prevent all users from losing access to all data in the event of a technical problem.
-
-Users can, however, use any frontend server to access data on any storage server. Each frontend server is mounted on an individual storage server, and all storage servers are accessible across MetaCentrum. You can use the standard Linux command `cd` to move yourself (or the current working directory) between storage servers. For example, the frontend server `nympha` (`nympha.metacentrum.cz`) is mounted on the storage server `storage-plzen1.metacentrum.cz`, which is accessible from compute nodes and other frontend servers via the path `/storage/plzen1/home/user_name`. However, switching to a different storage server is simple if needed.
-
-```bash
-local_user_name@local_pc_name:~$ ssh user_name@nympha.metacentrum.cz
-...
-(BOOKWORM)user_name@nympha:~$ pwd
-/storage/plzen1/home/user_name
-(BOOKWORM)user_name@nympha:~$ cd /storage/brno2/home/user_name
-(BOOKWORM)user_name@nympha:/storage/brno2/home/user_name$ pwd
-/storage/brno2/home/user_name
-(BOOKWORM)user_name@nympha:/storage/brno2/home/user_name$ cd /storage/brno12-cerit/home/user_name
-(BOOKWORM)user_name@nympha:/storage/brno12-cerit/home/user_name$ pwd
-/storage/brno12-cerit/home/user_name
-```
diff --git a/content/docs/computing/concepts.mdx b/content/docs/computing/concepts.mdx
deleted file mode 100644
index a32a0faf..00000000
--- a/content/docs/computing/concepts.mdx
+++ /dev/null
@@ -1,192 +0,0 @@
----
-title: Basic terms
----
-
-import FrontendTable from '@/components/frontends';
-
-## Frontends, storages, homes
-
-There are several **frontends** (login nodes) to access the grid. Each frontend has a native **home directory** on one of the **storages**.
-
-There are several storages (large-capacity harddisc arrays). They are named according to their physical location (a city).
-
-```bash
-user123@user123-XPS-13-9370:~$ ssh skirit.metacentrum.cz
-user123@skirit.ics.muni.cz's password:
-...
-(BUSTER)user123@skirit:~$ pwd # print current directory
-/storage/brno2/home/user123 # "brno2" is native storage for "skirit" frontend
-```
-
-**List of frontends together with their native /home directories**
-
-
-
-**Frontend do's and dont's**
-
-Frontend usage policy is different from the one on computational nodes. The frontend nodes are shared by all users, the command typed by any user is performed immediately and there is no resource planning. Frontend node are not intended for heavy computing.
-
-Frontends should be used only for:
-
-- preparing inputs, data pre- and postprocessing
-- managing batch jobs
-- light compiling and testing
-
-
- The resource load on frontend is monitored continuously. Processes not adhering to usage rules will be terminated without warning. For large compilations, running benchmark calculations or moving massive data volumes (> 10 GB, > 10 000 files), use interative job.
-
-
-## PBS server
-
-A set of instructions performed on computational nodes is **computational job**. Jobs require a set of **resources** such as CPUs, memory or time. A **scheduling system** plans execution of the jobs so as optimize the load and usage of computational nodes.
-
-The server on which the scheduling system is called **PBS server** or **PBS scheduler**.
-
-On the current scheduler `pbs-m1.metacentrum.cz` the **[OpenPBS](https://www.openpbs.org/)** is used.
-
-The most important PBS Pro commands are:
-
-- `qsub` - submit a computational job
-- `qstat` - query status of a job
-- `qdel` - delete a job
-
-## Resources
-
-Every jobs need to have defined set of computational resources at the point of submission. The resources can be specified
-
-- on CLI as `qsub` command options, or
-- inside the batch script on lines beginning with `#PBS` header.
-
-In the PBS terminology, a **chunk** is a subset of computational nodes on which the job runs. In most cases the concept of chunks is useful for parallelized computing only and "normal" jobs run on one chunk. We cannot avoid the concept of chunks, though, as the specification of resources differ according to whether they can be applied on a job as a whole or on a chunk.
-
-According to PBS internal logic, the resources are either **chunk-wide** or **job-wide**.
-
-**Job-wide** resources are defined for the job as a whole, e.g. maximal duration of the job or a license to run a commercial software. These cannot be divided in parts and distributed among computational nodes on which the job runs. Every job-wide resource is defined in the form of `-l =`, e.g. `-l walltime=1:00:00`.
-
-**Chunk-wide** resources can be ascribed to every chunk separately and differently.
-
-
- For the purpose of this intro, we assume that the number of chunks is always 1, which is also a default value. To see more complicated examples about per-chunk resource distribution, see [advanced chapter on PBS resources](../computing/resources/resources).
-
-
-Chunk-wide resources are defined as options of `select` statement in pairs `=` divided by `:`.
-
-The essential resources are:
-
-| Resource name | Keyword | Chunk-wide or job-wide? |
-|---------------|---------|-------------------------|
-| no. of CPUs | ncpus | chunk |
-| Memory | mem | chunk |
-| Maximal duration of the job | walltime | job |
-| Type and volume of space for temporary data | scratch\_local | chunk |
-
-There are a deal more resources than the ones shown here; for example, it is possible to specify a type of computational nodes' OS or their physical placement, software licences, speed of CPU, number pf GPU cards and more. For detailed information see [PBS options detailed page]().
-
-Examples:
-
-```bash
- qsub -l select=1:ncpus=2:mem=4gb:scratch_local=1gb -l walltime=2:00:00 myJob.sh
-```
-
-where
-
- ncpus is number of processors (2 in this example)
- mem is the size of memory that will be reserved for the job (4 GB in this example, default 400 MB),
- scratch_local specifies the size and type of scratch directory (1 GB in this example, no default)
- walltime is the maximum time the job will run, set in the format hh:mm:ss (2 hours in this example, default 24 hours)
-
-## Queues
-
-When the job is submitted, it is added to one of the **queues** managed by the scheduler. Queues can be defined arbitrarily by the admins based on various criteria - usually on walltime, but also on number of GPU cards, size of memory etc. Some queues are reserved for defined groups of users ("private" queues).
-
-Unless you [have a reason to send job to a specific queue](../computing/resources/queues), do not specify any. The job will be submitted into a default queue and from there routed to one of execution queues.
-
-The default queue is only **routing** one: it serves to sort jobs into another queues according to the job's walltime - e.g. `q_1h` (1-hour jobs), `q_1d` (1-day jobs), etc.
-
-The latter queues are **execution** ones, i.e. they serve to actually run the jobs.
-
-In PBSmon, the [list of queues for all planners can be found](https://metavo.metacentrum.cz/pbsmon2/queues/list).
-
-
-
-. . .
-
-
-
-
-with respective meaning of icons:
-
-| Icon | meaning |
-|----|----|
-|  | routing queue
(to send jobs into) |
-|  | execution queue
(not to send jobs into) |
-|  | private queue
(limited for a group of users) |
-
-## Modules
-
-The software istalled in Metacentrum is packed (together with dependencies, libraries and environment variables) in so-called **modules**.
-
-To be able to use a particular software, you must **load a module**.
-
-Key command to work with software is `module`, see `module --help` on any frontend.
-
-**Basic commands**
-
-```bash
-module avail orca/ # list versions of installed Orca
-
-module add orca # load Orca module (default version)
-module load orca # dtto
-
-module list # list currently loaded modules
-
-module unload orca # unload module orca
-module purge # unload all currently loaded modules
-```
-
-For more complicated examples of module usage, see [advanced chapter on modules](../software/modules).
-
-## Scratch directory
-
-Most application produce some large temporary files during the calculation.
-
-To store these files, as well as all the input data, on the computational node, a disc space must be reserved for them.
-
-
- If your HPC job crashes or fails to copy data back from scratch to your home directory, don't worry! Your output files remain stored in the scratch. To access these files, simply use the command `go_to_scratch `, replacing `` with your actual job ID. Please retrieve your data promptly since scratch storage is temporary and may be purged after a certain period to free up space for other users.
-
-
-This is a purpose of **scratch directory** on computational node.
-
-
- There is no default scratch directory and the user must always specify its type and volume.
-
-
-Currently we offer four types of scratch storage:
-
-| Type | Available on every node? | Location on machine | `$SCRATCHDIR` value | Key characteristic |
-|------| -------------------------|---------------------|-------------------|----------------------|
-| local | yes | `/scratch/USERNAME/job_JOBID` | `scratch_local`| universal, large capacity, available everywhere |
-| ssd | no | `/scratch.ssd/USERNAME/job_JOBID` | `scratch_ssd`| fast I/O operations |
-| shared | no | `/scratch.shared/USERNAME/job_JOBID` | `scratch_shared`| can be shared by more jobs |
-| shm | no | `/dev/shm/scratch.shm/USERNAME/job_JOBID` | `scratch_shm`| exists in RAM, ultra fast |
-
-As a default choice, we recommend users to use **local scratch**:
-
-```bash
-qsub -I -l select=1=ncpus=2:mem=4gb:scratch_local=1gb -l walltime=2:00:00
-```
-
-To access the scratch directory, use the system variable `SCRATCHDIR`:
-
-```bash
-(BULLSEYE)user123@skirit:~$ qsub -I -l select=1:ncpus=2:mem=4gb:scratch_local=1gb -l walltime=2:00:00
-qsub: waiting for job 14429322.pbs-m1.metacentrum.cz to start
-qsub: job 14429322.pbs-m1.metacentrum.cz ready
-
-user123@glados12:~$ echo $SCRATCHDIR
-/scratch.ssd/user123/job_14429322.pbs-m1.metacentrum.cz
-user123@glados12:~$ cd $SCRATCHDIR
-user123@glados12:/scratch.ssd/user123/job_14429322.pbs-m1.metacentrum.cz$
-```
-
diff --git a/content/docs/computing/meta.json b/content/docs/computing/meta.json
index 1037d570..d71ca36f 100644
--- a/content/docs/computing/meta.json
+++ b/content/docs/computing/meta.json
@@ -1,9 +1,8 @@
{
"title": "Grid computing",
"pages": [
- "concepts",
- "basic-tutorial",
"run-basic-job",
+ "advanced",
"jobs",
"infrastructure",
"resources",
diff --git a/content/docs/computing/run-basic-job.mdx b/content/docs/computing/run-basic-job.mdx
index c0c0716e..7542a580 100644
--- a/content/docs/computing/run-basic-job.mdx
+++ b/content/docs/computing/run-basic-job.mdx
@@ -1,358 +1,117 @@
---
-title: Run simple job
+title: Getting started
---
-Welcome to the basic guide on how to run calculations in the Metacentrum grid service. You will learn how to
-- navigate between **frontends**, **home directories** and **storages**,
-- make use of **batch** and **interactive** job,
-- **submit a job** to a **PBS server**,
-- set up **resources** for a job,
-- retrieve job **output**.
+## Welcome to MetaCentrum
-
-
- 1. have a Metacentrum account
- 2. be able to login to a frontend node
- 3. have elementary knowledge of the Linux command line
-
-*If anything is missing, see [Access](../access/terms) section.*
-
-
-
-## Lifecycle of a job
-
-### Batch job
-
-A typical use case for grid computing is a non-interactive batch job, when the user only prepares input and set of instructions at the beginning. The calculation itself then runs independently on the user.
-
-Batch jobs consist of the following steps:
-
-1. **User prepares data** to be used in the calculation **and instructions** what is to be done with them (input files + batch script).
-2. The batch script is submitted to the job planner (**PBS server**), which stages the job until the required resources are available.
-3. After the PBS server has released the job to be run, **the job runs** on one of the computational nodes.
-4. At this time, the applications (software) are loaded.
-5. When the job is finished, results are copied back to the user's directory according to instructions in the batch script.
-
-
-
-### Interactive job
-
-Interactive job works in different way. The user does not need to specify in advance what will be done, neither does not need to prepare any input data. Instead, they first reserve computational resources and, after the job starts to run, work interactively on the CLI.
-
-The interactive job consists of the following steps:
-
-1. User **submits request for specified resources** to the PBS server
-2. **PBS server stages** this request until the resources are available.
-3. When the job starts running, **user is redirected** to a computational node's CLI.
-4. **User does whatever they need** on the node.
-5. When the user logs out of the computational node or when the time reserved for the job runs out, the job is done.
-
-
-
-### Batch vs interactive
-
-A primary choice for grid computing is the batch job. Batch jobs allow users to run massive sets of calculations without the need to overview them, manipulate data, etc. They also optimize the usage of computational resources better, as there is no need to wait for user's input.
-
-Interactive jobs are good for:
-
-- testing what works and what does not (software versions, input data format, bash constructions to be used in batch script later, etc)
-- getting first guess about resources
-- compiling your own software
-- processing, moving or archiving large amounts of data
-
-Interactive jobs are **necessary** for [running GUI application](../software/graphical-access).
-
-## Batch job example
-
-The batch script in the following example is called myJob.sh.
-
-```bash
- (BUSTER)user123@skirit:~$ cat myJob.sh
- #!/bin/bash
- #PBS -N batch_job_example
- #PBS -l select=1:ncpus=4:mem=4gb:scratch_local=10gb
- #PBS -l walltime=1:00:00
- # The 3 lines above are options for the scheduling system: the job will run 1 hour at maximum, 1 machine with 4 processors + 4gb RAM memory + 10gb scratch memory are requested
-
- # define a DATADIR variable: directory where the input files are taken from and where the output will be copied to
- DATADIR=/storage/brno12-cerit/home/user123/test_directory # substitute username and path to your real username and path
-
- # append a line to a file "jobs_info.txt" containing the ID of the job, the hostname of the node it is run on, and the path to a scratch directory
- # this information helps to find a scratch directory in case the job fails, and you need to remove the scratch directory manually
- echo "$PBS_JOBID is running on node `hostname -f` in a scratch directory $SCRATCHDIR" >> $DATADIR/jobs_info.txt
-
- #loads the Gaussian's application modules, version 03
- module add g03
-
- # test if the scratch directory is set
- # if scratch directory is not set, issue error message and exit
- test -n "$SCRATCHDIR" || { echo >&2 "Variable SCRATCHDIR is not set!"; exit 1; }
-
- # copy input file "h2o.com" to scratch directory
- # if the copy operation fails, issue an error message and exit
- cp $DATADIR/h2o.com $SCRATCHDIR || { echo >&2 "Error while copying input file(s)!"; exit 2; }
-
- # move into scratch directory
- cd $SCRATCHDIR
-
- # run Gaussian 03 with h2o.com as input and save the results into h2o.out file
- # if the calculation ends with an error, issue an error message and exit
- g03 h2o.com || { echo >&2 "Calculation ended up erroneously (with a code $?) !!"; exit 3; }
-
- # move the output to user's DATADIR or exit in case of failure
- cp h2o.out $DATADIR/ || { echo >&2 "Result file(s) copying failed (with a code $?) !!"; exit 4; }
-
- # clean the SCRATCH directory
- clean_scratch
-```
-
-The last two lines can be combined into a single command:
-
-```bash
- cp h2o.out $DATADIR/ || export CLEAN_SCRATCH=false
-```
-
-SCRATCH will be automatically cleaned (by the `clean_scratch` utility) only if the copy command finishes without error.
-
-
-The job is then submitted as
-
-```bash
- (BUSTER)user123@skirit:~$ qsub myJob.sh
- 11733571.pbs-m1.metacentrum.cz # job ID is 11733571.pbs-m1.metacentrum.cz
-```
-
-Alternatively, you can specify resources on the command line. In this case, the lines starting with `#PBS` need not be in the batch script.
-
-```bash
- (BUSTER)user123@skirit:~$ qsub -l select=1:ncpus=4:mem=4gb:scratch_local=10gb -l walltime=1:00:00 myJob.sh
-```
+MetaCentrum provides free computing resources to Czech academic institutions through distributed compute clusters.
- If both resource specifications are present (on the CLI as well as inside the script), the values on the CLI take priority.
+ If you are new to grid computing environments, questions are natural. Please [contact user support](https://docs.metacentrum.cz/en/docs/support) if you need help; we are here for you, and your feedback helps us improve the documentation.
-## Interactive job example
+Before getting started, you need an active MetaCentrum account (see the [Account guide](../access/account)).
-An interactive job is requested via `qsub -I` command (uppercase "i").
+## Getting started: Logging in
-```bash
- (BUSTER)user123@skirit:~$ qsub -I -l select=1:ncpus=4:mem=10gb:scratch_local=10gb -l walltime=2:00:00 # submit interactive job
- qsub: waiting for job 13010171.pbs-m1.metacentrum.cz to start
- qsub: job 13010171.pbs-m1.metacentrum.cz ready # 13010171.pbs-m1.metacentrum.cz is the job ID
- (BULLSEYE)user123@elmo3-1:~$ # elmo3-1 is computational node
- (BULLSEYE)user123@elmo3-1:~$ module add mambaforge # make available mamba
- (BULLSEYE)user123@elmo3-1:~$ mamba list | grep scipy # make sure there is no scipy package already installed
- (BULLSEYE)user123@elmo3-1:~$ mamba search scipy
- ... # mamba returns list of scipy packages available in repositories
- (BULLSEYE)user123@elmo3-1:~$ mamba create -n my_scipy # create my environment to install scipy into
- ...
- environment location: /storage/praha1/home/user123/.conda/envs/my_scipy
- ...
- Proceed ([y]/n)? y
- ...
- (BULLSEYE)user123@elmo3-1:~$ mamba activate my_scipy # enter the environment
- (my_scipy) (BULLSEYE)user123@elmo3-1:~$
- (my_scipy) (BULLSEYE)user123@elmo3-1:~$ mamba install scipy
- ...
- Proceed ([y]/n)? y
- ...
- Downloading and Extracting Packages
- ...
- (my_scipy) (BULLSEYE)user123@elmo3-1:~$ python
- ...
- >>> import scipy as sp
- >>>
-```
-
-Unless you log out within the time quota (in this example 2 hours), you will get the following message:
+Once you have an activated account, connect to MetaCentrum over SSH with your username. Here we use the `tarkil.metacentrum.cz` login server (frontend). You can use any frontend, but we recommend the one closest to your physical location to minimize network latency.
```bash
- user123@elmo3-1:~$ =>> PBS: job killed: walltime 7230 exceeded limit 7200
- logout
- qsub: job 13010171.pbs-m1.metacentrum.cz completed
+ssh your_username@tarkil.metacentrum.cz
```
-## job ID
+For the full list of frontends, their locations, and detailed login instructions (including Windows/PuTTY), see the [Log in guide](../access/log-in).
-Job ID is a unique identifier of a job. It is crucial for tracking, manipulating, or deleting a job, as well as for identifying your problem to user support.
+## Understanding the architecture
-Under some circumstances, the job can be identified by the number only (e.g. `13010171.`). In general, however, the PBS server suffix is needed, too, to fully identify the job (e.g. `13010171.meta-pbs.metacentrum.cz`).
+MetaCentrum connects distributed compute clusters. Each frontend has a native home directory on a storage server. You can access any storage from any frontend.
-You can get the job ID:
+Frontends are shared by all users and are **not** intended for heavy computing. Use them only for data preparation, job management, or light compiling. For computing, use batch or interactive jobs.
-- after running the `qsub` command
-- by `echo $PBS_JOBID` in the interactive job or in the batch script
-- by `qstat -u your_username @pbs-m1.metacentrum.cz`
+For detailed infrastructure information see [Frontend & Storage guide](../infrastructure/frontend-storage).
-Within interactive job:
+## Basic job concepts
-```bash
- (BULLSEYE)user123@elmo3-1:~$ echo $PBS_JOBID
- 13010171.pbs-m1.metacentrum.cz
-```
+### Batch job vs interactive job
-By `qstat` command:
+**Batch job**: non-interactive; you submit a script and it runs unattended (the primary choice for grid computing)
-```bash
- (BULLSEYE)user123@perian:~$ qstat -u user123 @pbs-m1.metacentrum.cz
-
- Job id Name User Time Use S Queue
- --------------------- ---------------- ---------------- -------- - -----
- 1578105.pbs-m1 Boom-fr-bulk_12* fiserp 0 Q q_1w
-```
+**Interactive job**: reserves resources and drops you into a shell on a node (useful for testing, compiling, and running [GUI apps](../software/graphical-access))
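An interactive job is requested via the `qsub -I` option (uppercase "i"), for example:

```bash
# request an interactive session: 1 chunk, 4 CPUs, 10 GB RAM, 10 GB scratch, 2 hours
qsub -I -l select=1:ncpus=4:mem=10gb:scratch_local=10gb -l walltime=2:00:00
# qsub: waiting for job 13010171.pbs-m1.metacentrum.cz to start
# qsub: job 13010171.pbs-m1.metacentrum.cz ready
```

When the job starts, you are redirected to a shell on the computational node; logging out (or exceeding the walltime) ends the job.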
-## Job status
-
-The basic command for getting the status of your jobs is the `qstat` command.
-
-```bash
- qstat -u user123 # list all jobs of user "user123" running or queuing on the PBS server
- qstat -xu user123 # list finished jobs for user "user123"
- qstat -f job_ID # list details of the running or queueing job with the given job ID
- qstat -xf job_ID # list details of the finished job with the given job ID
-```
-
-You will see something like the following table:
-
-```bash
- Req'd Req'd Elap
- Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
- -------------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
- 11733550.meta-pbs.* user123 q_2h myJob.sh -- 1 1 1gb 00:05 Q --
-```
-
-The letter under the header 'S' (status) gives the status of the job. The most common states are:
-
-- Q – queued
-- R – running
-- F – finished
-
-To learn more about how to track running job and how to retrieve job history, see [Job tracking page](../computing/jobs/job-tracking).
-
-## Output files
-
-When a job is completed (no matter how), two files are created in the directory from which you have submitted the job:
-
-1. `.o` - job's standard output (STDOUT)
-2. `.e` - job's standard error output (STDERR)
-
-The STDERR file contains all error messages that occurred during the calculation. It is the first place to look if the job has failed.
-
-## Job termination
-
-### Done by user
-
-Sometimes, you need to delete the submitted/running job. This can be done by `qdel` command:
+## Your first batch job
```bash
+#!/bin/bash
+#PBS -N my_job
+#PBS -l select=1:ncpus=4:mem=4gb:scratch_local=10gb
+#PBS -l walltime=1:00:00
+
+# directory with the input data; substitute your real storage path
+DATADIR=/storage/cityXY/home/user/data
+
+cp $DATADIR/input.txt $SCRATCHDIR   # stage input to fast local scratch
+cd $SCRATCHDIR
+module add software_name            # load the application module
+run_calculation                     # run your program
+cp results.txt $DATADIR/            # copy results back to permanent storage
+clean_scratch                       # tidy up the scratch directory
```
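The minimal script above omits error handling; real jobs should guard every copy step so a failed transfer never silently discards data. A runnable sketch of that pattern (the directories here are stand-ins created with `mktemp`, not real storage paths; on a real node, PBS sets `$SCRATCHDIR` for you):

```bash
# stand-in directories for demonstration only
DATADIR=$(mktemp -d)
SCRATCHDIR=$(mktemp -d)
echo "input data" > "$DATADIR/input.txt"

# abort early if the scratch directory is not set
test -n "$SCRATCHDIR" || { echo >&2 "Variable SCRATCHDIR is not set!"; exit 1; }

# copy input defensively; exit with a distinct code on failure
cp "$DATADIR/input.txt" "$SCRATCHDIR" || { echo >&2 "Error while copying input file(s)!"; exit 2; }

# copy results back; if that fails, keep the scratch so data can be rescued
cp "$SCRATCHDIR/input.txt" "$DATADIR/result.txt" || export CLEAN_SCRATCH=false
```

Setting `CLEAN_SCRATCH=false` tells the `clean_scratch` utility not to wipe the scratch directory, so you can retrieve the data manually later.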
-If plain `qdel` does not work, add `-W` (force del) option:
+Essential resources: `select` (number of chunks), `ncpus` (cores per chunk), `mem` (RAM per chunk), `scratch_local` (fast local disk), `walltime` (maximum run time)
```bash
- (BULLSEYE)user123@skirit~: qdel -W force 21732596.pbs-m1.metacentrum.cz
+qsub myJob.sh # submit job, returns job ID
```
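Resources can also be specified directly on the command line, in which case the `#PBS` lines may be omitted from the script; if both are present, the command-line values take priority:

```bash
qsub -l select=1:ncpus=4:mem=4gb:scratch_local=10gb -l walltime=1:00:00 myJob.sh
```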
-### Done by PBS server
-
-The PBS server keeps track of resources used by the job. In case the job uses more resources than it has reserved, PBS server sends a **SIGKILL** signal to the execution host.
-
-You can see the signal as `Exit_status` on CLI:
+## Software modules
```bash
- (BULLSEYE)user123@tarkil:~$ qstat -x -f 13030457.pbs-m1.metacentrum.cz | grep Exit_status
- Exit_status = -29
+module avail tool/ # list versions
+module add tool # load default version
+module list # show loaded modules
```
-## Exit status
-
-When the job is finished (no matter how), it exits with a certain **exit status** (a number).
-
-
- Interactive jobs always have an exit status equal to 0.
+
+ Use wildcards to search: `module avail *python*`. Add `/` to see versions within a module directory.
-A normal termination is denoted by 0.
-
-Any non-zero exit status means the job failed for some reason.
-
-You can get the exit status by typing
+## Monitoring jobs
```bash
- (BULLSEYE)user123@skirit:~$ qstat -xf job_ID | grep Exit_status
- Exit_status = 271
+qstat -u username # list your jobs
```
-
- The `qstat -x -f` command works only for recently finished jobs (last 24 hours). For older jobs, use the `pbs-get-job-history` utility - see [advanced chapter on getting info about older jobs](../computing/jobs/finished-jobs#older).
-
-
-Alternatively, you can navigate to [your list of jobs in PBSmon](https://metavo.metacentrum.cz/pbsmon2/jobs/detail), go to tab "Jobs" and choose a particular finished job from the list.
-
-A gray table at the bottom of the page contains many variables connected to the job. Search for "Exit status" like shown in the picture below:
-
-
+Status codes: `Q`=queued, `R`=running, `F`=finished
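If you need to cancel a submitted or running job, pass its job ID to `qdel`; if a plain `qdel` does not work, add the `-W force` option:

```bash
qdel 21732596.pbs-m1.metacentrum.cz
qdel -W force 21732596.pbs-m1.metacentrum.cz   # force-delete a stuck job
```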
-### Exit status ranges
+Output files (in submission directory):
+- `jobname.o` – standard output (STDOUT)
+- `jobname.e` – standard error (STDERR); check here first if the job fails
-Exit status can fall into one of three categories, or ranges of numbers.
+## Account maintenance
-| Exit status range | Meaning |
-|--------------------|---------------------|
-| X < 0 | job killed by PBS; either some resource was exceeded or another problem occurred |
-| 0 <= X < 256 | exit value of shell or top process of the job |
-| X >= 256 | job was killed with an OS signal |
+**Renewal**: Accounts expire on February 2nd. You will be notified by email.
-### Exit status to `SIG*` type
+**Security**: Use a strong password and never share credentials. For password changes and complete security rules, see [Account page](../access/account) and [Terms and conditions](../access/terms).
-If the exit status exceeds 256, it means a signal from the operating system terminated the job.
+## Next steps
-Usually this means the user has deleted the job with `qdel`, upon which a `SIGKILL` and/or `SIGTERM` signal is sent.
+Now that you understand the basics, you can:
+- Learn [advanced job configuration and troubleshooting](./advanced)
+- Explore [available software](../software/alphabet)
+- Read about [parallel computing](./parallel-comput)
+- Check [GPU resources](./gpu-comput)
+- Try the usegalaxy.cz [web interface](../graphical/usegalaxy) – an alternative way to run jobs on a web-based platform with workflow support.
-The OS signals have numeric codes of their own.
+## Troubleshooting basics
-Type `kill -l` on any frontend to get list of OS signals together with their values.
-
-To translate a PBS exit code >= 256 to an OS signal, just subtract 256 from the exit code.
-
-For example, exit status of 271 means the OS signal no. 15 (a `SIGTERM`).
-
-
-
-
- `PBS exit status` - `256` = `OS signal code`.
-
+If your job fails:
+1. Check the error file (`*.e`)
+2. Verify your input files exist and have correct permissions
+3. Check if software modules are loaded correctly
+4. Ensure you requested adequate resources (memory, walltime, scratch)
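A job's exit status often pinpoints the failure: 0 means normal termination, negative values mean the PBS server killed the job (e.g. -29 for exceeded walltime), and values of 256 or more encode an OS signal. Subtracting 256 gives the signal number; the arithmetic below runs anywhere, while the `qstat` lookup works only on a frontend:

```bash
# on a frontend, look up the status of a recently finished job:
#   qstat -xf <job_ID> | grep Exit_status
# statuses >= 256 encode an OS signal; subtract 256 to get the signal number
status=271
echo $((status - 256))   # prints 15, i.e. SIGTERM (compare with `kill -l`)
```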
-### Common exit statuses
+For more advanced troubleshooting, see the [Advanced guide](./advanced).
-Most often you will meet some of the following signals:
-
-| Type of job ending | Exit status |
-|--------------------|---------------|
-| missing Kerberos credentials | -23 |
-| job exceeded number of CPUs | -25 |
-| job exceeded memory | -27 |
-| job exceeded walltime | -29 |
-| **normal termination** | **0** |
-| Job killed by `SIGTERM` (result of `qdel`) | 271 |
-
-## Manual scratch clean
-
-In case of erroneous job ending, the data are left in the scratch directory. You should always clean the scratch after all potentially useful data has been retrieved. To do so, you need to know the hostname of machine where the job was run, and path to the scratch directory.
-
-
- Users' rights allow only `rm -rf $SCRATCHDIR/*`, not `rm -rf $SCRATCHDIR`.
-
-
-For example:
-
-```bash
- user123@skirit:~$ ssh user123@luna13.fzu.cz # login to luna13.fzu.cz
- user123@luna13:~$ cd /scratch/user123/job_14053410.pbs-m1.metacentrum.cz # enter scratch directory
- user123@luna13:/scratch/user123/job_14053410.pbs-m1.metacentrum.cz$ rm -r * # remove all content
-```
+## Acknowledgements
-The scratch directory itself will be **deleted automatically** after some time.
+Publications created with MetaCentrum support must include the e-INFRA CZ acknowledgement (ID:90254) and be submitted to the [publications system](https://publications.e-infra.cz/all-publications). For ELIXIR CZ resources, please use ID:90255.
+>Computational resources were provided by the e-INFRA CZ project (ID:90254),
+>supported by the Ministry of Education, Youth and Sports of the Czech Republic.
diff --git a/content/docs/sandbox/index.mdx b/content/docs/sandbox/index.mdx
index 175ccaf8..733269d0 100644
--- a/content/docs/sandbox/index.mdx
+++ b/content/docs/sandbox/index.mdx
@@ -31,7 +31,7 @@ import IconGrid from '@/public/img/meta/welcome/icon-pbs.svg';
}>
Distributed HPC computing built on OpenPBS scheduler and NFS filesystem.
- MetaCentrum Grid docs
+ MetaCentrum Grid docs
diff --git a/content/docs/welcome.mdx b/content/docs/welcome.mdx
index 1f00c8b0..462af0e1 100644
--- a/content/docs/welcome.mdx
+++ b/content/docs/welcome.mdx
@@ -39,7 +39,7 @@ Welcome to the MetaCentrum documentation, the home of all MetaCentrum services.
}>
Distributed HPC computing built on OpenPBS scheduler and NFS filesystem.
- MetaCentrum Grid docs
+ MetaCentrum Grid docs