> Passwordless SSH connection is also enabled in the image, but the container does not contain any SSH ID keys. The user needs to mount those keys at `/root/.ssh/id_rsa` and `/etc/ssh/authorized_keys`.
 
-> **Note:** Passwordless SSH connection is also enabled in the image.
-> The container does not contain the SSH ID keys. The user needs to mount those keys at `/root/.ssh/id_rsa` and `/etc/ssh/authorized_keys`.
-> Since the SSH key is not owned by default user account in docker, please also do "chmod 600 authorized_keys; chmod 600 id_rsa" to grant read access for default user account.
+> [!TIP]
+> Before mounting any keys, modify the permissions of those files with `chmod 600 authorized_keys; chmod 600 id_rsa` to grant read access for the default user account.
 
 #### Setup and Run IPEX Multi-Node Container
 
@@ -132,30 +134,52 @@ To add these files correctly please follow the steps described below.
 
 1. Setup ID Keys
 
-You can use the commands provided below to [generate the Identity keys](https://www.ssh.com/academy/ssh/keygen#creating-an-ssh-key-pair-for-user-authentication) for OpenSSH.
+You can use the commands provided below to [generate the identity keys](https://www.ssh.com/academy/ssh/keygen#creating-an-ssh-key-pair-for-user-authentication) for OpenSSH.
 
 ```bash
 ssh-keygen -q -N "" -t rsa -b 4096 -f ./id_rsa
 touch authorized_keys
 cat id_rsa.pub >> authorized_keys
 ```
 
-2. Configure the permissions and ownership for all of the files you have created so far.
+2. Configure the permissions and ownership for all of the files you have created so far
2. Add hosts to config. (**Note:** This is an optional step)
-
-User can optionally mount their own custom client config file to define a list of hosts and ports where the SSH server is running inside the container. An example of a hostfile is provided below. This file is supposed to be mounted in the launcher container at `/etc/ssh/ssh_config`.
+> [!NOTE]
+> [Intel® MPI] can be configured based on your machine settings. If the above commands do not work for you, see the documentation for how to configure based on your network.
 
-```bash
-touch config
-```
+#### Enable [DeepSpeed*] optimizations
 
-```txt
-Host host1
-HostName <Hostname of host1>
-IdentitiesOnly yes
-IdentityFile ~/.root/id_rsa
-Port <SSH Port>
-Host host2
-HostName <Hostname of host2>
-IdentitiesOnly yes
-IdentityFile ~/.root/id_rsa
-Port <SSH Port>
-...
-```
+To enable [DeepSpeed*] optimizations with [Intel® oneAPI Collective Communications Library], add the following to your python script:
bash -c 'ipexrun cpu --nnodes 2 --nprocs-per-node 1 --master-addr 127.0.0.1 --master-port ${SSH_PORT} /workspace/tests/ipex-resnet50.py --ipex --device cpu --backend ccl'
-```
+# Rather than dist.init_process_group(), use deepspeed.init_distributed()
+deepspeed.init_distributed(backend="ccl")
+```
 
-> [!NOTE]
-> [Intel® MPI] can be configured based on your machine settings. If the above commands do not work for you, see the documentation for how to configure based on your network.
+Additionally, if you have a [DeepSpeed* configuration](https://www.deepspeed.ai/getting-started/#deepspeed-configuration) you can use the below command as your launcher to run your script with that configuration:
 
-> [!TIP]
-> Additionally, [DeepSpeed*] optimizations can be utilized in place of ipexrun with the `ccl` backend for multi-node training.
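Review note on the new `[!TIP]`: the `chmod 600 authorized_keys; chmod 600 id_rsa` step is easy to skip, so a small check run before mounting may help. The sketch below is one possible way to apply and confirm owner-only permissions from Python. The helper names `restrict_to_owner` and `prepare_key_files` are hypothetical and not part of this change, and the demo uses placeholder files in a temporary directory rather than real keys. It assumes a POSIX filesystem.

```python
import os
import stat
import tempfile


def restrict_to_owner(path):
    """Hypothetical helper: set mode 600 (owner read/write only) on path.

    Returns the permission bits actually recorded after the change.
    """
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


def prepare_key_files(directory, names=("id_rsa", "authorized_keys")):
    """Hypothetical helper: apply mode 600 to each named file in directory.

    Returns a dict mapping each file name to its resulting mode as an
    octal string, so the caller can confirm the change took effect.
    """
    modes = {}
    for name in names:
        path = os.path.join(directory, name)
        modes[name] = oct(restrict_to_owner(path))
    return modes


if __name__ == "__main__":
    # Demo on placeholder files only; these are not real SSH keys.
    with tempfile.TemporaryDirectory() as workdir:
        for name in ("id_rsa", "authorized_keys"):
            with open(os.path.join(workdir, name), "w") as handle:
                handle.write("placeholder\n")
        print(prepare_key_files(workdir))
```

To use it on real files, `prepare_key_files` would be pointed at the directory holding the generated `id_rsa` and `authorized_keys` before they are mounted into the container. The shell commands in the tip remain the documented path; this is only an optional cross-check.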