Extremely low performance with 20 simultaneous clients and 9Gb squashfs image #150

Hi all,

I have a student lab at my university with about 180 clients (i5, 16 GB RAM, Gigabit network) running Ubuntu 20.04 from PXE + NBD by means of LTSP.
The generated squashfs image is about 9 GB, so we have 5 NBD servers running in parallel to avoid load issues.
Things go fine: load is balanced across the servers, and startup time is acceptable when a single client starts... but problems arise when the class begins and 60+ clients are started at the same time.
 
Network load is almost none. The "htop" command shows that the servers (i7 10th gen, 8 CPUs, 32 GB RAM, NVMe disks) are doing minimal CPU work... but the clients seem frozen: they take 5+ minutes to show the login screen. It seems that something is thrashing data with no real effect. Load balancing works correctly: each NBD server handles about 20 clients.
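
As a sanity check on the network side, here is a rough sketch of how long a boot would take if a server's link were the only bottleneck and were shared evenly. The 20 clients per server figure is the one reported above; the Gigabit link on the server side and the 500 MB read by each client during boot are assumptions for illustration, not measurements, and the function name is made up.

```python
def per_client_boot_seconds(link_mbit_per_s, clients, boot_read_mb):
    """Seconds each client needs if the server link is shared evenly."""
    link_mb_per_s = link_mbit_per_s / 8              # megabits/s -> megabytes/s
    per_client_mb_per_s = link_mb_per_s / clients    # even share per client
    return boot_read_mb / per_client_mb_per_s

# 20 clients per server comes from the report above.
# ASSUMPTIONS: 1000 Mbit/s server link, 500 MB read per client during boot.
print(per_client_boot_seconds(1000, 20, 500))
```

Under those assumed numbers the estimate is well below the 5+ minutes we observe, which would be consistent with the link itself not being the limiting factor. The real amount of data read at boot would have to be measured to confirm this.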

We've tried some tips, such as changing the number of workers and splitting the image files... none of these solutions work.
So I've started diving into the LTSP internals... changing NBD to NFS, using a raw image instead of squashfs... or even changing the nbd-client and squashfs block sizes.
Perhaps LTSP is not designed to handle such big images with so many simultaneous clients...
Any ideas?
Thanks in advance
