
how to use all tpu core in pytorch xla #8215

Open

fancy45daddy opened this issue Oct 4, 2024 · 1 comment

@fancy45daddy

❓ Questions and Help

I followed the code in https://github.com/pytorch/xla/blob/master/contrib/kaggle/distributed-pytorch-xla-basics-with-pjrt.ipynb, but used xmp.spawn(print_device, args=(lock,), nprocs=8, start_method='fork').

The source code:

import os
# Remove TPU_PROCESS_ADDRESSES so each spawned process initializes its own
# runtime; pass a default so pop() does not raise KeyError if it is unset.
os.environ.pop('TPU_PROCESS_ADDRESSES', None)

import multiprocessing as mp

import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_multiprocessing as xmp

lock = mp.Manager().Lock()

def print_device(i, lock):
    # Each process reports the XLA device it was assigned
    device = xm.xla_device()
    with lock:
        print('process', i, device)

xmp.spawn(print_device, args=(lock,), nprocs=8, start_method='fork')

WARNING:root:Unsupported nprocs (8), ignoring...
process 4 xla:0
process 5 xla:1
process 0 xla:0
process 1 xla:1
process 2 xla:0
process 3 xla:1
process 6 xla:0
process 7 xla:1

XLA can only see 2 devices. But when I run xm.get_xla_supported_devices(), it lists all of them: ['xla:0', 'xla:1', 'xla:2', 'xla:3', 'xla:4', 'xla:5', 'xla:6', 'xla:7']. I want to know how to use all TPU cores.

@zpcore
Collaborator

zpcore commented Oct 7, 2024

By default, it automatically uses all available cores.

nprocs only accepts two values: 1) leave nprocs as None and it will automatically use all cores, or 2) set it to 1 if you want to debug a single process.
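
For example, the spawn call from the snippet above would become the following (a minimal sketch reusing print_device and lock as defined in the question; with nprocs=None the "Unsupported nprocs" warning goes away):

# nprocs=None lets the runtime pick the process layout and use every available core
xmp.spawn(print_device, args=(lock,), nprocs=None, start_method='fork')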
