Commit c991a5d

Fix XPU iGPU regressions (Comfy-Org#9322)
* Change bf16 check and switch non-blocking to off default with option to force to regain speed on certain classes of iGPUs and refactor xpu check.
* Turn non_blocking off by default for xpu.
* Update README.md for Intel GPUs.
1 parent 9df8792 commit c991a5d

3 files changed

Lines changed: 23 additions & 24 deletions

README.md

Lines changed: 9 additions & 17 deletions
@@ -39,7 +39,7 @@ ComfyUI lets you design and execute advanced stable diffusion pipelines using a
 ## Get Started
 
 #### [Desktop Application](https://www.comfy.org/download)
-- The easiest way to get started.
+- The easiest way to get started.
 - Available on Windows & macOS.
 
 #### [Windows Portable Package](#installing)
@@ -211,27 +211,19 @@ This is the command to install the nightly with ROCm 6.4 which might have some p
 
 ### Intel GPUs (Windows and Linux)
 
-(Option 1) Intel Arc GPU users can install native PyTorch with torch.xpu support using pip (currently available in PyTorch nightly builds). More information can be found [here](https://pytorch.org/docs/main/notes/get_start_xpu.html)
-
-1. To install PyTorch nightly, use the following command:
+(Option 1) Intel Arc GPU users can install native PyTorch with torch.xpu support using pip. More information can be found [here](https://pytorch.org/docs/main/notes/get_start_xpu.html)
 
-```pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/xpu```
-
-2. Launch ComfyUI by running `python main.py`
+1. To install PyTorch xpu, use the following command:
 
+```pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/xpu```
 
-(Option 2) Alternatively, Intel GPUs supported by Intel Extension for PyTorch (IPEX) can leverage IPEX for improved performance.
-
-1. For Intel® Arc™ A-Series Graphics utilizing IPEX, create a conda environment and use the commands below:
+This is the command to install the Pytorch xpu nightly which might have some performance improvements:
 
-```
-conda install libuv
-pip install torch==2.3.1.post0+cxx11.abi torchvision==0.18.1.post0+cxx11.abi torchaudio==2.3.1.post0+cxx11.abi intel-extension-for-pytorch==2.3.110.post0+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
-```
+```pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/xpu```
 
-For other supported Intel GPUs with IPEX, visit [Installation](https://intel.github.io/intel-extension-for-pytorch/index.html#installation?platform=gpu) for more information.
+(Option 2) Alternatively, Intel GPUs supported by Intel Extension for PyTorch (IPEX) can leverage IPEX for improved performance.
 
-Additional discussion and help can be found [here](https://github.com/comfyanonymous/ComfyUI/discussions/476).
+1. visit [Installation](https://intel.github.io/intel-extension-for-pytorch/index.html#installation?platform=gpu) for more information.
 
 ### NVIDIA
 
@@ -352,7 +344,7 @@ Generate a self-signed certificate (not appropriate for shared/production use) a
 
 Use `--tls-keyfile key.pem --tls-certfile cert.pem` to enable TLS/SSL, the app will now be accessible with `https://...` instead of `http://...`.
 
-> Note: Windows users can use [alexisrolland/docker-openssl](https://github.com/alexisrolland/docker-openssl) or one of the [3rd party binary distributions](https://wiki.openssl.org/index.php/Binaries) to run the command example above.
+> Note: Windows users can use [alexisrolland/docker-openssl](https://github.com/alexisrolland/docker-openssl) or one of the [3rd party binary distributions](https://wiki.openssl.org/index.php/Binaries) to run the command example above.
 <br/><br/>If you use a container, note that the volume mount `-v` can be a relative path so `... -v ".\:/openssl-certs" ...` would create the key & cert files in the current directory of your command prompt or powershell terminal.
 
 ## Support and dev channel

comfy/cli_args.py

Lines changed: 2 additions & 0 deletions
@@ -132,6 +132,8 @@ class LatentPreviewMethod(enum.Enum):
 
 parser.add_argument("--async-offload", action="store_true", help="Use async weight offloading.")
 
+parser.add_argument("--force-non-blocking", action="store_true", help="Force ComfyUI to use non-blocking operations for all applicable tensors. This may improve performance on some non-Nvidia systems but can cause issues with some workflows.")
+
 parser.add_argument("--default-hashing-function", type=str, choices=['md5', 'sha1', 'sha256', 'sha512'], default='sha256', help="Allows you to choose the hash function to use for duplicate filename / contents comparison. Default is sha256.")
 
 parser.add_argument("--disable-smart-memory", action="store_true", help="Force ComfyUI to agressively offload to regular ram instead of keeping models in vram when it can.")
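As a rough illustration of how a `store_true` option such as `--force-non-blocking` behaves, the standalone sketch below builds a parser containing only that flag. `build_parser` is a hypothetical helper written for this example (the diff shows ComfyUI adding the argument to an existing module-level `parser`); the help text is copied from the diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: a parser holding only the flag added in this commit."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--force-non-blocking",
        action="store_true",  # absent -> False, present -> True
        help=(
            "Force ComfyUI to use non-blocking operations for all applicable "
            "tensors. This may improve performance on some non-Nvidia systems "
            "but can cause issues with some workflows."
        ),
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    # argparse maps the dashed option name to the attribute force_non_blocking.
    print(parser.parse_args([]).force_non_blocking)
    print(parser.parse_args(["--force-non-blocking"]).force_non_blocking)
```

Because the action is `store_true`, the flag is opt-in: existing launch commands keep the new default, and only users who pass the flag get the forced behaviour.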

comfy/model_management.py

Lines changed: 12 additions & 7 deletions
@@ -78,7 +78,6 @@ def get_supported_float8_types():
     torch_version = torch.version.__version__
     temp = torch_version.split(".")
     torch_version_numeric = (int(temp[0]), int(temp[1]))
-    xpu_available = (torch_version_numeric[0] < 2 or (torch_version_numeric[0] == 2 and torch_version_numeric[1] <= 4)) and torch.xpu.is_available()
 except:
     pass
 
@@ -102,10 +101,14 @@ def get_supported_float8_types():
 
 try:
     import intel_extension_for_pytorch as ipex  # noqa: F401
+except:
+    pass
+
+try:
     _ = torch.xpu.device_count()
-    xpu_available = xpu_available or torch.xpu.is_available()
+    xpu_available = torch.xpu.is_available()
 except:
-    xpu_available = xpu_available or (hasattr(torch, "xpu") and torch.xpu.is_available())
+    xpu_available = False
 
 try:
     if torch.backends.mps.is_available():
@@ -946,10 +949,12 @@ def pick_weight_dtype(dtype, fallback_dtype, device=None):
     return dtype
 
 def device_supports_non_blocking(device):
+    if args.force_non_blocking:
+        return True
     if is_device_mps(device):
         return False #pytorch bug? mps doesn't support non blocking
-    if is_intel_xpu():
-        return True
+    if is_intel_xpu(): #xpu does support non blocking but it is slower on iGPUs for some reason so disable by default until situation changes
+        return False
     if args.deterministic: #TODO: figure out why deterministic breaks non blocking from gpu to cpu (previews)
         return False
     if directml_enabled:
@@ -1282,10 +1287,10 @@ def should_use_bf16(device=None, model_params=0, prioritize_performance=True, ma
         return False
 
     if is_intel_xpu():
-        if torch_version_numeric < (2, 6):
+        if torch_version_numeric < (2, 3):
             return True
         else:
-            return torch.xpu.get_device_capability(device)['has_bfloat16_conversions']
+            return torch.xpu.is_bf16_supported()
 
     if is_ascend_npu():
         return True
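The order of checks in the updated `device_supports_non_blocking` can be modelled without PyTorch. The sketch below is a simplification under stated assumptions: `Flags` and `supports_non_blocking` are hypothetical names, the device is reduced to a type string, the DirectML branch is omitted, and the final `return True` is an assumption because the diff does not show the end of the function.

```python
from dataclasses import dataclass


@dataclass
class Flags:
    # Hypothetical stand-ins for args.force_non_blocking and args.deterministic.
    force_non_blocking: bool = False
    deterministic: bool = False


def supports_non_blocking(device_type: str, flags: Flags) -> bool:
    """Simplified model of the check order shown in the diff."""
    if flags.force_non_blocking:
        return True   # the override is evaluated before any device-specific rule
    if device_type == "mps":
        return False  # MPS is reported as not supporting non-blocking
    if device_type == "xpu":
        return False  # new default for Intel XPU after this commit
    if flags.deterministic:
        return False
    # Assumption: remaining devices fall through to True; the real function
    # has further checks (such as DirectML) that are not modelled here.
    return True


if __name__ == "__main__":
    print(supports_non_blocking("xpu", Flags()))
    print(supports_non_blocking("xpu", Flags(force_non_blocking=True)))
```

One consequence of placing the override first, as the diff does, is that `--force-non-blocking` also bypasses the MPS and deterministic checks, not only the XPU one.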
