@@ -12,7 +12,7 @@ local development and cluster execution. Users are not required to think about
 how many workers exist or how to distribute and partition their data;
 Modin handles all of this seamlessly and transparently.
 
-.. image:: ../../../examples/tutorial/jupyter/img/modin_cluster.png
+.. image:: ../../examples/tutorial/jupyter/img/modin_cluster.png
    :alt: Modin cluster
    :align: center
    :scale: 90%
@@ -37,7 +37,7 @@ just run the following command:
 Starting and connecting to the cluster
 --------------------------------------
 
-This example starts 1 head node (m5.24xlarge) and 7 worker nodes (m5.24xlarge), 768 total CPUs.
+This example starts 1 head node (m5.24xlarge) and 5 worker nodes (m5.24xlarge), 576 total CPUs.
 You can check the `Amazon EC2 pricing`_ .
 
 You can manually create AWS EC2 instances and configure them or just use the `Ray autoscaler ` to
@@ -76,7 +76,7 @@ Executing on a cluster environment
 Modin lets you instantly speed up your workflows with a large data by scaling pandas
 on a cluster. In this tutorial, we will use a 12.5 GB `big_yellow.csv ` file that was
 created by concatenating a 200MB `NYC Taxi dataset`_ file 64 times. Preparing this
-file was provided as part of our `Modin's cluster setup config`_.
+file was provided as part of our `Modin's Ray cluster setup config`_.
 
 If you want use another dataset in your own script, you should provide it to each of
 the cluster nodes in the same path. We recomnend doing this by customizing the
@@ -119,7 +119,7 @@ with improvements in performance as we increase the number of resources Modin ca
 .. _`Ray's autoscaler options`: https://docs.ray.io/en/latest/cluster/vms/references/ray-cluster-configuration.html#cluster-config
 .. _`Ray's cluster docs`: https://docs.ray.io/en/latest/cluster/getting-started.html
 .. _`NYC Taxi dataset`: https://modin-datasets.intel.com/testing/yellow_tripdata_2015-01.csv
-.. _`Modin's cluster setup config`: https://github.com/modin-project/modin/blob/master/examples/tutorial/jupyter/execution/pandas_on_ray/cluster/modin-cluster.yaml
+.. _`Modin's Ray cluster setup config`: https://github.com/modin-project/modin/blob/master/examples/tutorial/jupyter/execution/pandas_on_ray/cluster/modin-cluster.yaml
 .. _`Amazon EC2 pricing`: https://aws.amazon.com/ec2/pricing/on-demand/
 .. _`exercise_5.py`: https://github.com/modin-project/modin/blob/master/examples/tutorial/jupyter/execution/pandas_on_ray/cluster/exercise_5.py
 .. _`Ray client`: https://docs.ray.io/en/latest/cluster/running-applications/job-submission/ray-client.html