docs/getting_started/using_modin/using_modin_cluster/using_modin_ray_cluster.rst
+13 -14
@@ -1,6 +1,5 @@
-================================
-Using Modin in a AWS Ray Cluster
-================================
+Using Modin on Ray in a Cluster
+===============================
 
 .. note::
 |*Estimated Reading Time: 15 minutes*
@@ -26,8 +25,8 @@ First of all, install the necessary dependencies in your environment:
 
 pip install boto3
 
-The next step is to setup your AWS credentials. One can set `AWS_ACCESS_KEY_ID`,
-`AWS_SECRET_ACCESS_KEY` and `AWS_SESSION_TOKEN` environment variables or
+The next step is to setup your AWS credentials. One can set ``AWS_ACCESS_KEY_ID``,
+``AWS_SECRET_ACCESS_KEY`` and ``AWS_SESSION_TOKEN`` environment variables or
 just run the following command:
 
 .. code-block:: bash
@@ -41,10 +40,10 @@ This example starts 1 head node (m5.24xlarge) and 5 worker nodes (m5.24xlarge),
 You can check the `Amazon EC2 pricing`_ .
 
 You can manually create AWS EC2 instances and configure them or just use the `Ray autoscaler` to
-create and initialize a Ray cluster on Amazon Web Service (AWS) using `Modin's Ray cluster setup config`_ .
-You can read more about how to modify `Ray's autoscaler options`_ .
+create and initialize a Ray cluster on AWS using `Modin's Ray cluster setup config`_ .
+You can read more about how to modify the file on `Ray's autoscaler options`_ .
 
-Detailed instructions can be found in `Ray's cluster docs`_.
+More details on how to launch a Ray cluster can be found on `Ray's cluster docs`_.
 
 To start up the Ray cluster, run the following command in your terminal:
 
@@ -64,7 +63,7 @@ To exit the ssh session and return back into your local shell session, type:
 
 exit
 
-Executing on a cluster environment
+Executing in a cluster environment
 ----------------------------------
 
 .. note::
@@ -78,15 +77,15 @@ on a cluster. In this tutorial, we will use a 12.5 GB `big_yellow.csv` file that
 created by concatenating a 200MB `NYC Taxi dataset`_ file 64 times. Preparing this
 file was provided as part of our `Modin's Ray cluster setup config`_.
 
-If you want use another dataset in your own script, you should provide it to each of
-the cluster nodes in the same path. We recomnend doing this by customizing the
+If you want to use the other dataset, you should provide it to each of
+the cluster nodes with the same path. We recomnend doing this by customizing the
 `setup_commands` section of the [configuration file](https://github.com/modin-project/modin/blob/master/examples/tutorial/jupyter/execution/pandas_on_ray/cluster/modin-cluster.yaml).
 
-To run any scripts on a remote cluster, you need to submit it to the ray. In this way,
+To run any script in a remote cluster, you need to submit it to the ray. In this way,
 the script file is sent to the the remote cluster head node and executed there.
 
-In this tutorial, we provide the `exercise_5.py`_ script, which read the data from the
-CSV file and executed some pandas Dataframe function such as count, groupby and applymap.
+In this tutorial, we provide the `exercise_5.py`_ script, which reads the data from the
+CSV file and executes such pandas operations as count, groupby and applymap.
 As a result of the script, you will see the size of the file being read and the execution