INSTALL.md (+28 -25)
@@ -1,26 +1,26 @@
1
-
In order to run the application you have to have Cloudera's Hadoop installed. The steps of installation procedure are described below.
1
+
In order to run the application, you need to have Cloudera's Hadoop installed. The steps of the installation procedure are given below.
2
2
3
3
Hadoop: installation and configuration
4
4
======================================
5
-
Warning: Because of a bug in the Oozie version provided with Cloudera's Hadoop (which is removed in the version available in the source code repository), **you have to have (Oracle) Java JDK 1.6 installed**. This version of Oozie does not work with JDK 1.7.
5
+
IMPORTANT: Because of a bug in the Oozie version provided with Cloudera's Hadoop (by the way: this bug is removed in the version of Oozie available in the source code repository), **you need to have Oracle Java JDK 1.6 installed**. Oozie **does not** work with JDK 1.7.
6
6
7
7
---
8
8
9
-
Install Cloudera Hadoop CDH4 with MRv1 in accordance with the instructions given in [Cloudera CDH4 intallation guide](https://ccp.cloudera.com/display/CDH4DOC/CDH4+Installation+Guide) , to be more specific:
9
+
The instructions below show how to install Cloudera Hadoop CDH4 with MRv1 in accordance with the instructions given in [Cloudera CDH4 intallation guide](https://ccp.cloudera.com/display/CDH4DOC/CDH4+Installation+Guide).
10
10
11
-
Hadoop can be run in one of three modes:
11
+
It is important to know that Hadoop can be run in one of three modes:
12
12
13
13
- **standalone mode** - runs all of the Hadoop processes in a single JVM which makes it easy to debug the application.
14
-
- **pseudo-distributed mode** - runs a full-fledged Hadoop on your local computer
15
-
- **distributed mode** - runs on a cluster consisting of many nodes/hosts
14
+
- **pseudo-distributed mode** - runs a full-fledged Hadoop on your local computer.
15
+
- **distributed mode** - runs the application on a cluster consisting of many nodes/hosts.
16
16
17
-
Below we will show how to install Hadoop initially in pseudo-distributed mode but we will be able to switch between standalone and pseudo-distributed modes.
17
+
Below we will show how to install Hadoop initially in the pseudo-distributed mode but with a possibility to switch between the standalone and the pseudo-distributed mode.
18
18
19
-
Install Hadoop in pseudo-distributed mode, see [Cloudera CDH4 pseudo distributed mode installation guide](https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode)
19
+
Hadoop: installation
20
+
--------------------
21
+
Installing Hadoop in pseudo-distributed mode (based on [Cloudera CDH4 pseudo distributed mode installation guide](https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode)) in case of 64-bit Ubuntu 12.04:
20
22
21
-
In case of Ubuntu 12.04:
22
-
23
-
- create new file `/etc/apt/sources.list.d/cloudera.list` with contents:
23
+
- create a new file `/etc/apt/sources.list.d/cloudera.list` with contents:
24
24
25
25
deb http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh precise-cdh4 contrib
- next, follow the steps described in the Cloudera's guide for installing Hadoop in pseudo-distributed mode starting from the step "Step 1: Format the NameNode." This is available at [Cloudera CDH4 pseudo distributed mode installation guide - "Step 1: Format the Namenode"](https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-Step1%3AFormattheNameNode.).
40
+
- next, follow the steps described in the Cloudera's guide to installing Hadoop in the pseudo-distributed mode starting from the step "Step 1: Format the NameNode." This is available at [Cloudera CDH4 pseudo distributed mode installation guide - "Step 1: Format the Namenode"](https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-Step1%3AFormattheNameNode.).
41
41
42
-
---
42
+
Hadoop: after install
43
+
---------------------
43
44
44
-
You can **switch between standalone and pseudo-distributed configurations** (or others) of Hadoop using the `update-alternatives` command, e.g.:
45
+
### Switching between Hadoop modes
46
+
When you have Hadoop installed, you can **switch between standalone and pseudo-distributed configurations** (or other kinds of configurations) of Hadoop using the `update-alternatives` command, e.g.:
45
47
46
-
- `update-alternatives --display hadoop-conf` for list of available configurations and information which is the active one
48
+
- `update-alternatives --display hadoop-conf` for list of available configurations and information which one is currently active
47
49
- `sudo update-alternatives --set hadoop-conf /etc/hadoop/conf.empty` to set the active configuration to `/etc/hadoop/conf.empty` which corresponds to Hadoop standalone mode.
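
The two commands above can be wrapped in a small helper so that switching modes is a single call. This is only a sketch: the function names `hadoop_conf_dir` and `switch_hadoop_mode` are hypothetical, and the `/etc/hadoop/conf.pseudo.mr1` path is an assumption about where the CDH4 MRv1 pseudo-distributed package places its configuration. Check the output of `update-alternatives --display hadoop-conf` on your machine before relying on it.

```shell
#!/bin/sh
# Map a mode name to a Hadoop configuration directory.
# conf.empty is the standalone configuration named in this guide;
# conf.pseudo.mr1 is an ASSUMED location for the pseudo-distributed one.
hadoop_conf_dir() {
    case "$1" in
        standalone) echo "/etc/hadoop/conf.empty" ;;
        pseudo)     echo "/etc/hadoop/conf.pseudo.mr1" ;;
        *)          echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Make the chosen configuration active (needs root), then display
# the alternatives so the change can be confirmed.
switch_hadoop_mode() {
    conf_dir=$(hadoop_conf_dir "$1") || return 1
    sudo update-alternatives --set hadoop-conf "$conf_dir" &&
        update-alternatives --display hadoop-conf
}
```

With these functions loaded in a shell, `switch_hadoop_mode standalone` or `switch_hadoop_mode pseudo` would select the corresponding configuration.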
48
50
49
-
You can view the web interfaces to the following services using the addresses:
51
+
### Web interfaces
52
+
You can view the web interfaces to the following services using appropriate addresses:
50
53
51
54
- **NameNode** - provides a web console for viewing HDFS, number of Data Nodes, and logs - [http://localhost:50070/](http://localhost:50070/)
52
55
- In the pseudo-distributed configuration, you should see one live DataNode named "localhost".
53
-
- **JobTracker** - allows viewing and running completedand failed jobs with logs - [http://localhost:50030/](http://localhost:50030/)
56
+
- **JobTracker** - allows viewing the completed, currently running, and failed jobs along with their logs - [http://localhost:50030/](http://localhost:50030/)
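
If no browser is at hand, the same addresses can be probed from the command line. The sketch below assumes `curl` is installed; `check_ui` is a hypothetical helper and not part of Hadoop. It only checks that the page answers over HTTP, not that the service behind it is healthy.

```shell
#!/bin/sh
# Report whether a web interface answers over HTTP.
# check_ui is a hypothetical helper introduced for illustration.
check_ui() {
    name=$1
    url=$2
    # -s: no progress output, -f: fail on HTTP error codes,
    # -o /dev/null: discard the page, --max-time 5: give up after 5 seconds
    if curl -s -f -o /dev/null --max-time 5 "$url"; then
        echo "$name: reachable at $url"
    else
        echo "$name: NOT reachable at $url"
    fi
}
```

For example, `check_ui NameNode http://localhost:50070/` and `check_ui JobTracker http://localhost:50030/` probe the two interfaces listed above.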
54
57
55
58
Oozie: installation and configuration
56
59
-------------------------------------
57
-
Based on [Cloudera CDH4 Oozie installation guide](https://ccp.cloudera.com/display/CDH4DOC/Oozie+Installation#OozieInstallation-ConfiguringOozieinstall)
60
+
The description below is based on [Cloudera CDH4 Oozie installation guide](https://ccp.cloudera.com/display/CDH4DOC/Oozie+Installation#OozieInstallation-ConfiguringOozieinstall).
58
61
59
62
- Install Oozie with
60
63
@@ -65,7 +68,7 @@ Based on [Cloudera CDH4 Oozie installation guide](https://ccp.cloudera.com/displ
- Through a webpage - use a web browser to open: [http://localhost:11000/oozie/](http://localhost:11000/oozie/)
113
+
- Through a webpage - use a web browser to open a webpage at the following address: [http://localhost:11000/oozie/](http://localhost:11000/oozie/)
111
114
112
-
If you want to check if Oozie correctly executes workflows, you can run some example workflows as described in [Cloudera Oozie example workflows](http://archive.cloudera.com/cdh4/cdh/4/oozie/DG_Examples.html). Note that contrary to what is written there, the Oozie server is not available at `http://localhost:8080/oozie` but at `http://localhost:11000/oozie` address
115
+
If you want to check if Oozie correctly executes its workflows, you can run some of the example workflows provided with Oozie as described in [Cloudera Oozie example workflows](http://archive.cloudera.com/cdh4/cdh/4/oozie/DG_Examples.html). Note that contrary to what is written there, the Oozie server is not available at `http://localhost:8080/oozie` but at `http://localhost:11000/oozie` address.
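
One way to avoid mistyping the port when following the example workflows is to keep the address in a variable. The sketch below is not taken from the Cloudera guide: `run_oozie_example` is a hypothetical helper, and the `examples/apps/<name>/job.properties` layout is an assumption about the unpacked Oozie examples archive, so adjust the path to match your copy.

```shell
#!/bin/sh
# Oozie server address used in this guide (port 11000, not 8080).
OOZIE_URL="http://localhost:11000/oozie"
export OOZIE_URL

# Submit one of the bundled example workflows to that server.
# run_oozie_example is a hypothetical helper; the examples/apps/<name>
# path is an ASSUMPTION about where the examples were unpacked.
run_oozie_example() {
    oozie job -oozie "$OOZIE_URL" -config "examples/apps/$1/job.properties" -run
}
```

Calling `run_oozie_example map-reduce` from the directory that contains `examples/` would then submit that workflow to the server on port 11000.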