This repository was archived by the owner on Jul 4, 2023. It is now read-only.

Commit 9fff927

Author: Riaan Nolan

Merge branch 'feature/update-layout-and-theme' into 'master'

adding typography, dbt and airflow, adding typography examples, embed youtube... See merge request all-staff/hashiqube!138

1 parent 8e81a7a commit 9fff927

25 files changed: +1438 −2 lines changed

README.md (+2)

```diff
@@ -140,6 +140,8 @@ Now you can use DNS like nomad.service.consul:9999 vault.service.consul:9999 via
 * [__Newrelic Kubernetes Monitoring__](newrelic-kubernetes-monitoring/#newrelic-kubernetes-monitoring) - Monitor Kubernetes Clusters and Workloads with Newrelic
 * [__Docsify__](docsify/#docsify) - A magical documentation site generator
 * [__Ansible-Tower__](ansible-tower/#ansible-tower) - Red Hat Ansible Tower
+* [__Dbt__](dbt/#dbt) - Dbt is a data transformation tool that enables data analysts and engineers to transform, test and document data in the cloud data warehouse
+* [__Airflow__](apache-airflow/#apache-airflow) - Apache Airflow is an open-source workflow management platform for data engineering pipelines

 Once the stack is up you will have a large number of services running and available on `localhost` <br />
 For Documentation please open http://localhost:3333 in your browser
```

SUMMARY.md (+3)

```diff
@@ -2,7 +2,9 @@
 * [Ansible](ansible/README.md)
 * [Ansible-tower](ansible-tower/README.md)
+* [Apache-airflow](apache-airflow/README.md)
 * [Database](database/README.md)
+* [Dbt](dbt/README.md)
 * [Docker](docker/README.md)
 * [Docsify](docsify/README.md)
 * [Git](git/README.md)
@@ -13,3 +15,4 @@
 * [Minikube](minikube/README.md)
 * [Multi-cloud](multi-cloud/README.md)
 * [Newrelic-kubernetes-monitoring](newrelic-kubernetes-monitoring/README.md)
+* [Typography](typography/README.md)
```

Vagrantfile (+7)

```diff
@@ -106,6 +106,7 @@ Vagrant::configure("2") do |config|
   config.vm.network "forwarded_port", guest: 18889, host: 18889 # apache airflow
   config.vm.network "forwarded_port", guest: 3333, host: 3333 # docsify
   config.vm.network "forwarded_port", guest: 8043, host: 8043 # ansible-tower
+  config.vm.network "forwarded_port", guest: 28080, host: 28080 # dbt docs serve

 end

@@ -297,6 +298,9 @@ Vagrant::configure("2") do |config|
   # vagrant up --provision-with minikube to only run this on vagrant up
   config.vm.provision "minikube", run: "never", type: "shell", preserve_order: true, privileged: false, path: "minikube/minikube.sh"

+  # apache-airflow
+  # vagrant up --provision-with apache-airflow to only run this on vagrant up
+  config.vm.provision "apache-airflow", run: "never", type: "shell", preserve_order: true, privileged: false, path: "apache-airflow/apache-airflow.sh"

@@ -309,6 +313,9 @@ Vagrant::configure("2") do |config|

+  # dbt
+  # vagrant up --provision-with dbt to only run this on vagrant up
+  config.vm.provision "dbt", run: "never", type: "shell", preserve_order: true, privileged: false, path: "dbt/dbt-global.sh"

   # vagrant up --provision-with bootstrap to only run this on vagrant up
   config.vm.provision "welcome", preserve_order: true, type: "shell", privileged: true, inline: <<-SHELL
```

apache-airflow/README.md (+59, new file)

# Apache Airflow
https://airflow.apache.org/

Airflow is a platform created by the community to programmatically author, schedule and monitor workflows.

![Airflow](images/airflow-logo.png?raw=true "Airflow")

## Provision

To provision Apache Airflow you need basetools, docker and minikube as dependencies.

```
vagrant up --provision-with basetools,docker,minikube,postgresql,dbt,apache-airflow
```

## Web UI Access

To access the web UI visit http://localhost:18889.
The default login is:
```
Username: admin
Password: admin
```

# Further Info
Airflow is deployed on Minikube (Kubernetes) using Helm, and additional values are supplied in the values.yaml file.

Example DAGs are supplied in the dags folder and are mounted into the Airflow scheduler pod; see the details in the values.yaml file.

# Airflow Information
In the dags folder you will find 2 DAGs:
- example-dag.py
- test-ssh.py

`example-dag.py` runs dbt commands by using the SSHOperator and SSHing into Hashiqube.
`test-ssh.py` simply SSHes into Hashiqube to test the connection.

# Airflow DAGs
![Airflow](images/airflow_dags.png?raw=true "Airflow")

# Airflow Connections
![Airflow](images/airflow_connections.png?raw=true "Airflow")

# Airflow DAG run
![Airflow](images/airflow_dag_run_dbt.png?raw=true "Airflow")

# Airflow Task Instance
![Airflow](images/airflow_task_instance.png?raw=true "Airflow")

# Airflow Task Instance Result
![Airflow](images/airflow_task_result.png?raw=true "Airflow")

# Links and further reading
- https://artifacthub.io/packages/helm/airflow-helm/airflow/8.3.1
- https://airflow.apache.org/docs/helm-chart/stable/index.html
- https://airflow.apache.org/docs/helm-chart/stable/adding-connections-and-variables.html
- https://airflow.readthedocs.io/_/downloads/en/1.10.2/pdf/
- https://airflow.apache.org/docs/helm-chart/stable/parameters-ref.html

apache-airflow/airflow-dag-pvc.yaml (+28, new file)

```yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: airflow-dags
  namespace: airflow
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 1Gi
  hostPath:
    path: "/vagrant/apache-airflow/dags"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: airflow-dags
  namespace: airflow
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  volumeName: airflow-dags
  resources:
    requests:
      storage: 1Gi
```

apache-airflow/apache-airflow.sh (+100, new file)

```shell
#!/bin/bash

# https://airflow.apache.org/docs/apache-airflow/stable/installation/index.html
# https://airflow.apache.org/docs/helm-chart/stable/index.html
# https://github.com/apache/airflow/tree/main/chart
# https://github.com/apache/airflow/blob/main/chart/values.yaml
# https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/guides/quickstart.md
# https://airflow.apache.org/docs/helm-chart/stable/adding-connections-and-variables.html
# https://airflow.readthedocs.io/_/downloads/en/1.10.2/pdf/
# https://airflow.apache.org/docs/helm-chart/stable/parameters-ref.html
# https://artifacthub.io/packages/helm/airflow-helm/airflow/

cd ~/

# Determine CPU Architecture
arch=$(lscpu | grep "Architecture" | awk '{print $NF}')
if [[ $arch == x86_64* ]]; then
  ARCH="amd64"
elif [[ $arch == aarch64 ]]; then
  ARCH="arm64"
fi
echo -e '\e[38;5;198m'"CPU is $ARCH"

echo -e '\e[38;5;198m'"++++ "
echo -e '\e[38;5;198m'"++++ Cleanup"
echo -e '\e[38;5;198m'"++++ "
for i in $(ps aux | grep kubectl | grep -ve sudo -ve grep -ve bin | grep -e airflow | tr -s " " | cut -d " " -f2); do kill -9 $i; done
sudo --preserve-env=PATH -u vagrant helm delete airflow --namespace airflow
sudo --preserve-env=PATH -u vagrant kubectl delete -f /vagrant/apache-airflow/airflow-dag-pvc.yaml
sudo --preserve-env=PATH -u vagrant kubectl delete namespace airflow

echo -e '\e[38;5;198m'"++++ "
echo -e '\e[38;5;198m'"++++ Create Namespace airflow for Airflow"
echo -e '\e[38;5;198m'"++++ "
sudo --preserve-env=PATH -u vagrant kubectl create namespace airflow

echo -e '\e[38;5;198m'"++++ "
echo -e '\e[38;5;198m'"++++ Create PVC for Airflow DAGs in /vagrant/apache-airflow/dags"
echo -e '\e[38;5;198m'"++++ "
sudo --preserve-env=PATH -u vagrant kubectl apply -f /vagrant/apache-airflow/airflow-dag-pvc.yaml

# Install with helm
# https://airflow.apache.org/docs/helm-chart/stable/index.html
echo -e '\e[38;5;198m'"++++ "
echo -e '\e[38;5;198m'"++++ Installing Apache Airflow using Helm Chart in namespace airflow"
echo -e '\e[38;5;198m'"++++ "

echo -e '\e[38;5;198m'"++++ "
echo -e '\e[38;5;198m'"++++ helm repo add apache-airflow https://airflow.apache.org"
echo -e '\e[38;5;198m'"++++ "
sudo --preserve-env=PATH -u vagrant helm repo add apache-airflow https://airflow.apache.org
sudo --preserve-env=PATH -u vagrant helm repo update

# https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/guides/quickstart.md
echo -e '\e[38;5;198m'"++++ "
echo -e '\e[38;5;198m'"++++ helm install airflow apache-airflow/airflow"
echo -e '\e[38;5;198m'"++++ "
sudo --preserve-env=PATH -u vagrant helm upgrade --install airflow apache-airflow/airflow --namespace airflow --create-namespace \
  --values /vagrant/apache-airflow/values.yaml \
  --set dags.persistence.enabled=true \
  --set dags.persistence.existingClaim=airflow-dags \
  --set dags.gitSync.enabled=false

attempts=0
max_attempts=15
while ! ( sudo --preserve-env=PATH -u vagrant kubectl get pods --namespace airflow | grep web | tr -s " " | cut -d " " -f3 | grep Running ) && (( $attempts < $max_attempts )); do
  attempts=$((attempts+1))
  sleep 60;
  echo -e '\e[38;5;198m'"++++ "
  echo -e '\e[38;5;198m'"++++ Waiting for Apache Airflow to become available, (${attempts}/${max_attempts}) sleep 60s"
  echo -e '\e[38;5;198m'"++++ "
  sudo --preserve-env=PATH -u vagrant kubectl get po --namespace airflow
  sudo --preserve-env=PATH -u vagrant kubectl get events | grep -e Memory -e OOM
done

echo -e '\e[38;5;198m'"++++ "
echo -e '\e[38;5;198m'"++++ kubectl port-forward 18889:8080"
echo -e '\e[38;5;198m'"++++ "
attempts=0
max_attempts=15
while ! ( sudo netstat -nlp | grep 18889 ) && (( $attempts < $max_attempts )); do
  attempts=$((attempts+1))
  sleep 60;
  echo -e '\e[38;5;198m'"++++ "
  echo -e '\e[38;5;198m'"++++ kubectl port-forward service/airflow-webserver 18889:8080 --namespace airflow --address=\"0.0.0.0\", (${attempts}/${max_attempts}) sleep 60s"
  echo -e '\e[38;5;198m'"++++ "
  sudo --preserve-env=PATH -u vagrant kubectl port-forward service/airflow-webserver 18889:8080 --namespace airflow --address="0.0.0.0" > /dev/null 2>&1 &
done

echo -e '\e[38;5;198m'"++++ "
echo -e '\e[38;5;198m'"++++ Add SSH Connection for Hashiqube"
echo -e '\e[38;5;198m'"++++ "
kubectl exec airflow-worker-0 -n airflow -- /bin/bash -c '/home/airflow/.local/bin/airflow connections add HASHIQUBE --conn-description "hashiqube ssh connection" --conn-host "10.9.99.10" --conn-login "vagrant" --conn-password "vagrant" --conn-port "22" --conn-type "ssh"'

echo -e '\e[38;5;198m'"++++ "
echo -e '\e[38;5;198m'"++++ Docker stats"
echo -e '\e[38;5;198m'"++++ "
sudo --preserve-env=PATH -u vagrant docker stats --no-stream -a

echo -e '\e[38;5;198m'"++++ Apache Airflow Web UI: http://localhost:18889"
echo -e '\e[38;5;198m'"++++ Username: admin; Password: admin"
```
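The two polling loops in the script share a retry-until-ready pattern: run a probe, sleep, and give up after `max_attempts`. A minimal sketch of that pattern as a reusable function (the `wait_for` name and the short sleep are illustrative, not part of the provisioner, which inlines this logic with `sleep 60`):

```shell
#!/bin/bash
# wait_for: retry a command until it succeeds or max_attempts is reached.
wait_for() {
  local max_attempts=$1; shift
  local attempts=0
  until "$@" > /dev/null 2>&1; do
    attempts=$((attempts+1))
    if (( attempts >= max_attempts )); then
      echo "gave up after ${attempts} attempts"
      return 1
    fi
    sleep 0.1   # the real script sleeps 60s between probes
  done
  echo "ready after ${attempts} retries"
}

wait_for 3 true           # probe succeeds immediately
wait_for 3 false || true  # probe never succeeds, retries are exhausted
```

In the provisioner the probe is `kubectl get pods ... | grep Running` (pod readiness) or `netstat -nlp | grep 18889` (port-forward liveness); any command whose exit status reflects readiness fits this shape.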

apache-airflow/dags/run-dbt.py (+34, new file)

```python
from airflow.decorators import dag
from datetime import datetime
from airflow.providers.ssh.operators.ssh import SSHOperator

@dag(
    dag_id="run-dbt",
    schedule_interval=None,
    start_date=datetime(2022, 1, 1),
    catchup=False,
)
def run_dbt():
    task_1 = SSHOperator(
        task_id="dbt-debug",
        ssh_conn_id='HASHIQUBE',
        command='cd /vagrant/dbt/jaffle_shop; /home/vagrant/.local/bin/dbt debug;',
    )
    task_2 = SSHOperator(
        task_id="dbt-seed",
        ssh_conn_id='HASHIQUBE',
        command='cd /vagrant/dbt/jaffle_shop; /home/vagrant/.local/bin/dbt seed;',
    )
    task_3 = SSHOperator(
        task_id="dbt-run",
        ssh_conn_id='HASHIQUBE',
        command='cd /vagrant/dbt/jaffle_shop; /home/vagrant/.local/bin/dbt run;',
    )
    task_4 = SSHOperator(
        task_id="dbt-test",
        ssh_conn_id='HASHIQUBE',
        command='cd /vagrant/dbt/jaffle_shop; /home/vagrant/.local/bin/dbt test;',
    )
    task_1 >> task_2 >> task_3 >> task_4

run_dbt_dag = run_dbt()
```
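The four SSHOperator tasks run the standard dbt lifecycle (debug, seed, run, test) in order, each sending one command over the `HASHIQUBE` SSH connection. A sketch of generating those same commands by hand, e.g. to replay them in a shell on the VM (paths are taken verbatim from the DAG):

```shell
#!/bin/bash
# Print the exact command each DAG task sends over SSH, in DAG order.
for step in debug seed run test; do
  echo "cd /vagrant/dbt/jaffle_shop; /home/vagrant/.local/bin/dbt ${step};"
done
```

Piping each printed line through `ssh vagrant@10.9.99.10` would reproduce the DAG's behaviour without Airflow in the loop, which is handy when debugging a failing task.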
apache-airflow/images/* (binary image files added, not shown)

apache-airflow/values.yaml (+25, new file)

```yaml
# https://artifacthub.io/packages/helm/airflow-helm/airflow/8.3.1
airflow:
  ## environment variables for airflow configs
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/configuration/airflow-configs.md
  config:
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: "True"
    AIRFLOW__CORE__LOAD_EXAMPLES: "True"
    AIRFLOW_CONN_HASHIQUBE: "ssh://vagrant:vagrant@10.9.99.10:22?timeout=10&compress=false&no_host_key_check=true&allow_host_key_change=true"

  ## extra VolumeMounts for the airflow Pods
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/kubernetes/mount-persistent-volumes.md
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/kubernetes/mount-files.md
  extraVolumeMounts:
    - name: dags-data-volume-mount
      mountPath: /opt/airflow/dags
      readOnly: false

  ## extra Volumes for the airflow Pods
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/kubernetes/mount-persistent-volumes.md
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/kubernetes/mount-files.md
  extraVolumes:
    - name: dags-data-volume
      persistentVolumeClaim:
        claimName: airflow-dags
        readOnly: false
```
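Airflow reads connections from `AIRFLOW_CONN_<ID>` environment variables formatted as URIs (`<type>://<login>:<password>@<host>:<port>?<extras>`). A sketch of assembling the `AIRFLOW_CONN_HASHIQUBE` value above from its parts, which makes the individual fields easier to see and override (the variable names are illustrative):

```shell
#!/bin/bash
# Assemble the Airflow connection URI used in values.yaml piece by piece.
user="vagrant"
pass="vagrant"
host="10.9.99.10"
port="22"
extras="timeout=10&compress=false&no_host_key_check=true&allow_host_key_change=true"
conn="ssh://${user}:${pass}@${host}:${port}?${extras}"
echo "$conn"
# → ssh://vagrant:vagrant@10.9.99.10:22?timeout=10&compress=false&no_host_key_check=true&allow_host_key_change=true
```

Note the same connection is also added a second way in apache-airflow.sh via `airflow connections add HASHIQUBE ...`; either mechanism alone is sufficient for the SSHOperator tasks to resolve `ssh_conn_id='HASHIQUBE'`.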
