- Owner: Nic Cope (@negz)
- Reviewers: Crossplane Maintainers
- Status: Accepted, revision 1.0
Crossplane is an open source multicloud control plane. It introduces workload and resource abstractions on top of existing managed services to enable a high degree of workload portability across cloud providers. A Crossplane `Workload` models an application that may be deployed to a Kubernetes cluster; it is a unit of scheduled work that cannot be split across multiple clusters. Crossplane managed clusters are represented by a resource claim named `KubernetesCluster`; a `Workload` scheduled to a `KubernetesCluster` is analogous to a `Pod` scheduled to a `Node`.

A contemporary `Workload`:
```yaml
---
apiVersion: compute.crossplane.io/v1alpha1
kind: Workload
metadata:
  name: demo
spec:
  clusterSelector:
    provider: gcp
  resources:
  - name: demo
    secretName: demo
  targetDeployment:
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: wordpress
      labels:
        app: wordpress
    spec:
      selector:
        app: wordpress
      template:
        metadata:
          labels:
            app: wordpress
        spec:
          containers:
          - name: wordpress
            image: wordpress:4.6.1-apache
            ports:
            - containerPort: 80
  targetNamespace: demo
  targetService:
    apiVersion: v1
    kind: Service
    metadata:
      name: wordpress
    spec:
      ports:
      - port: 80
      selector:
        app: wordpress
      type: LoadBalancer
```
Workloads are modeled in Crossplane 0.1 as a Custom Resource Definition (CRD) embedding a Kubernetes `Namespace`, `Deployment`, and `Service` - `.spec.targetNamespace`, `.spec.targetDeployment`, and `.spec.targetService` respectively. Once the scheduler has scheduled the `Workload` to a cluster, the workload controller connects to said cluster and creates the templated `Deployment` and `Service`. The controller polls the status of the `Deployment` and `Service` during its sync phase, persisting them inline in the `Workload`'s `.status` field. Each `Workload` may also contain a set of references to Crossplane resources or resource claims upon which the `Workload` depends - modeled as distinct Kubernetes resources - in order to replicate their connection `Secrets` to the cluster upon which the `Workload` is scheduled.
Complex applications such as GitLab exceed the capabilities of today's `Workload` resource. GitLab recommends deploying to Kubernetes via Helm. When configured to use managed services for Redis, SQL, and buckets the chart renders to almost 4,800 lines of YAML including 14 `Deployments`, 1 `StatefulSet`, 3 `Jobs`, 9 `Services`, 16 `ConfigMaps`, and many other resources. Crossplane must be able to model complex applications as complex workloads.
The goal of this document is to design part of the best possible user experience for deploying complex applications with Crossplane; `Workload` will not be responsible for the entire application installation and lifecycle management but rather be a building block that may be managed by higher level constructs.

It is important that:

- Workloads can model any Kubernetes resource, including built in resources and those defined by CRDs.
- Users do not need to connect to the cluster to which a `Workload` is scheduled in order to determine the status of the resources (`Deployments`, etc) managed by said `Workload`.
- Each `Workload` is a unit of scheduling; it may not be spread across multiple `KubernetesClusters`.
- The proposed design lays a foundation for supporting workloads that are not containerised.

The following are out of scope for the `Workload` resource:

- Deploying a single workload to multiple clusters simultaneously.
- Configuration and/or templating. Each `Workload` will be a 'static' resource; the task of generating or altering `Workloads` given a set of inputs will be that of a higher level construct.
- Package and dependency management. `Workloads` will not model dependencies on or relationships to other `Workloads`. Any resource types upon which a `Workload` depends are presumed to have been defined via CRD before instantiating the `Workload`.
This document proposes the `Workload` kind within the `compute.crossplane.io/v1alpha1` API group be replaced with the `KubernetesApplication` kind in the `workload.crossplane.io/v1alpha1` group. The `.spec` of each `KubernetesApplication` consists of a `KubernetesCluster` label selector used for scheduling, and a series of resource templates representing resources to be deployed to the scheduled `KubernetesCluster`.

A `KubernetesApplication` will not template arbitrary resources directly, but rather via an interstitial resource: `KubernetesApplicationResource`. Each `KubernetesApplication` therefore consists of one or more templated `KubernetesApplicationResources`, each of which templates exactly one arbitrary Kubernetes resource (for example a `Deployment` or `ConfigMap`).
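A minimal Go sketch of the proposed API shape may help orient readers before the example manifest below. The type and field names here are illustrative and follow the example, not final API definitions; the `KubernetesApplicationResourceSpec` type is shown later in this document.

```go
// Illustrative API shape only; not the final type definitions.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// KubernetesApplicationSpec mirrors the example manifest shown later in this
// document: a cluster selector used for scheduling, and a series of resource
// templates.
type KubernetesApplicationSpec struct {
	// ClusterSelector selects the KubernetesCluster to which the application,
	// and therefore all of its templated resources, will be scheduled.
	ClusterSelector *metav1.LabelSelector `json:"clusterSelector"`

	// ResourceTemplates each template exactly one KubernetesApplicationResource.
	ResourceTemplates []KubernetesApplicationResourceTemplate `json:"resourceTemplates"`
}

// A KubernetesApplicationResourceTemplate mirrors the object metadata and spec
// of the KubernetesApplicationResource it templates.
type KubernetesApplicationResourceTemplate struct {
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              KubernetesApplicationResourceSpec `json:"spec"`
}
```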
Each `KubernetesApplicationResource` represents a single Kubernetes resource to be deployed to a `KubernetesCluster`. The `KubernetesApplicationResource` encapsulates the resource, including type and object metadata, in its `.spec.template` field. If the templated resource kind exposes a `.status` field when deployed, said field will be copied verbatim to the `KubernetesApplicationResource`'s `.status.remote` field.

`KubernetesApplicationResources` will also specify a list of `Secrets` presumed to be the automatically created resource connection secrets for Crossplane managed resources upon which its templated Kubernetes resource depends. These `Secrets` will be propagated into the same namespace as the templated resource.

Crossplane will model the template using the `*unstructured.Unstructured` type internally. Unstructured types must include Kubernetes type and object metadata but are otherwise opaque. Status will be completely opaque - i.e. a `json.RawMessage` - to the controller code. The controller will copy the remote resource's `.status` field into the `KubernetesApplicationResource`'s `.status.remote` field. `.status.remote` will be absent from `KubernetesApplicationResources` that template resource kinds that do not expose a `.status` field.
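As a concrete illustration of the above, a minimal sketch (not the actual controller implementation) of extracting an arbitrary remote resource's `.status` via the `unstructured` package might look like the following; the `copyRemoteStatus` helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// copyRemoteStatus extracts the .status field of an arbitrary remote resource
// and returns it as opaque JSON. A nil result indicates the resource kind does
// not expose a .status field, in which case .status.remote would be omitted.
func copyRemoteStatus(remote *unstructured.Unstructured) (json.RawMessage, error) {
	status, found, err := unstructured.NestedMap(remote.Object, "status")
	if err != nil || !found {
		return nil, err
	}
	return json.Marshal(status)
}

func main() {
	d := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "apps/v1",
		"kind":       "Deployment",
		"status":     map[string]interface{}{"availableReplicas": int64(2)},
	}}
	raw, _ := copyRemoteStatus(d)
	fmt.Println(string(raw))
}
```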
An example complex workload:
```yaml
---
apiVersion: workload.crossplane.io/v1alpha1
kind: KubernetesApplication
metadata:
  name: wordpress-demo
  namespace: complex
  labels:
    app: wordpress-demo
spec:
  clusterSelector:
    matchLabels:
      app: wordpress-demo
  # Each resource template is used to create a KubernetesApplicationResource.
  resourceTemplates:
  - metadata:
      # Metadata of the KubernetesApplicationResource. The namespace is ignored;
      # KubernetesApplicationResources are always created in the namespace of
      # their controlling KubernetesApplication. This matches the behaviour of
      # Deployments and ReplicaSets.
      name: wordpress-demo-namespace
      labels:
        app: wordpress-demo
    spec:
      # This template specifies the actual resource to be deployed and managed
      # in a remote Kubernetes cluster by this KubernetesApplicationResource.
      # Note the two layers of templating; a KubernetesApplication templates
      # KubernetesApplicationResources, which template arbitrary resources.
      template:
        # These templates must contain type as well as object metadata, because
        # we allow templating of arbitrary resource kinds.
        apiVersion: v1
        kind: Namespace
        metadata:
          name: wordpress
          labels:
            app: wordpress
  - metadata:
      name: wordpress-demo-deployment
      labels:
        app: wordpress-demo
    spec:
      secrets:
      # sql is the name of a connection secret. It will be propagated to the
      # namespace of this KubernetesApplicationResource's template (i.e.
      # wordpress) as a Secret named wordpress-demo-deployment-sql.
      - name: sql
      template:
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          namespace: wordpress
          name: wordpress
          labels:
            app: wordpress
        spec:
          selector:
            matchLabels:
              app: wordpress
          template:
            metadata:
              labels:
                app: wordpress
            spec:
              containers:
              - name: wordpress
                image: wordpress:4.6.1-apache
                ports:
                - containerPort: 80
                  name: wordpress
  - metadata:
      name: wordpress-demo-service
      labels:
        app: wordpress-demo
    spec:
      template:
        apiVersion: v1
        kind: Service
        metadata:
          namespace: wordpress
          name: wordpress
          labels:
            app: wordpress
        spec:
          ports:
          - port: 80
          selector:
            app: wordpress
          type: LoadBalancer
```
Listing resources associated with a Kubernetes application:
```console
$ kubectl -n complex get kubernetesapplication wordpress-demo
NAME             CLUSTER                  STATUS               DESIRED   SUBMITTED
wordpress-demo   wordpress-demo-cluster   PartiallySubmitted   3         2

$ kubectl -n complex get kubernetesapplicationresource --selector app=wordpress-demo
NAME                        TEMPLATE-KIND   TEMPLATE-NAME   CLUSTER                  STATUS
wordpress-demo-deployment   Deployment      wordpress       wordpress-demo-cluster   Submitted
wordpress-demo-namespace    Namespace       wordpress       wordpress-demo-cluster   Submitted
wordpress-demo-service      Service         wordpress       wordpress-demo-cluster   Failed
```
The proposed `KubernetesApplication` and especially `KubernetesApplicationResource` names are rather verbose when compared to their contemporary: `Workload`. These names are best justified by breaking them down into their parts:
Kubernetes represents the deployment vector of the application. Prefixing the kind with Kubernetes leaves room to define applications that are deployed using other methods. This design proposes the explicit prefix Kubernetes rather than the abstract prefix Containerized because the proposed CRD is tightly coupled to Kubernetes; it could not be used to deploy a containerized application via Amazon ECS or Docker Swarm. The scheme chosen by this design impacts future implementations; would an application targeting Amazon Lambda be named a ServerlessApplication or a LambdaApplication? Kubernetes is arguably ubiquitous enough to be analogous with generic resource kind names like ServerlessApplication or VMApplication.
Application distinguishes a workload from a compute resource when interacting with Crossplane. It is synonymous in this context with Workload, which is implied by the workload.crossplane.io API namespace. Including Application would thus be redundant except that the API namespace is typically omitted when interacting with the API server. Assume `KubernetesApplication` was instead named `Kubernetes`, relying on the API namespace to indicate that it was a workload. In this scenario `kubectl get kubernetes` would return Kubernetes workloads while `kubectl get kubernetescluster` would return Crossplane managed Kubernetes clusters. These names are close enough that it's not unlikely Crossplane users would expect `kubectl get kubernetes` to return Kubernetes clusters rather than workloads. Application is preferable to Workload to avoid stuttering when the API namespace is considered, and provides symmetry with similar concepts like sig-apps' Application.
Resource templates an arbitrary Kubernetes resource of which an application consists. A resource could template a compute resource such as a `Deployment`, `StatefulSet`, or `Job`; a configuration resource such as a `ConfigMap` or `Secret`; or a networking resource such as a `Service` or `Ingress`. The term 'Resource' is overloaded in the Crossplane world; it can refer to both a generic Kubernetes resource (roughly synonymous with 'object' in Kubernetes parlance) as well as a Crossplane 'managed resource', for example an `SQLInstance` resource claim or an `RDSInstance` as a concrete managed resource. This document uses 'resource template' interchangeably with `KubernetesApplicationResource` and explicitly refers to managed resources as 'managed resources'.
workload.crossplane.io is the API namespace in which applications and their resource templates exist, regardless of whether the application targets Kubernetes or something else. Moving the kinds from `compute.crossplane.io` to `workload.crossplane.io` clearly delineates compute resources from things that run on compute resources.
Kubernetes resource kinds may be namespace or cluster scoped. The former exist within a namespace that must be created before the resource, allowing a named resource to be instantiated multiple times; once per namespace. The latter are singletons; only one named instance of a resource can exist per cluster. Most Kubernetes resource kinds are namespaced. Cluster scoped resources include `CustomResourceDefinition`, `ClusterRole`, `PersistentVolume`, and `Namespace` itself. Cluster scoped resources use the same object metadata schema as namespaced resources but ignore the `.metadata.namespace` field.
The contemporary `Workload` templates two namespaced resources (a `Deployment` and `Service`) and one cluster scoped resource (a `Namespace`). This document proposes that application resource templates avoid special handling of namespaces; an application could consist of three resource templates - templating a `Namespace` named `coolns`, a `Deployment` in namespace `coolns`, and a `Deployment` without a namespace. Templated resources of a namespaced kind that do not specify a namespace will be created in the namespace `default`, as would any other Kubernetes resource. No relationship will exist between the namespace of the `KubernetesApplication` or `KubernetesApplicationResource` in the Crossplane API server and the namespace of templated resources to be deployed to a cluster.
At first glance this may seem more complicated than requiring a namespace be specified one time at the application level. On the contrary, doing so would both complicate Crossplane's controller logic and result in surprising behaviours for users. Recall that a `KubernetesApplicationResource` may template any valid Kubernetes resource kind, including those unknown to the Crossplane API server. This means a `KubernetesApplication` specifying an explicit target namespace for its resource templates could consist of `KubernetesApplicationResources` that template cluster scoped resources, including other namespaces, that cannot be created in said target namespace. This confusing behaviour could be eliminated by eliminating support for cluster scoped resources; such resources are typically more closely related to clusters themselves than the workloads running upon them. Unfortunately the ability to require templated resources be namespaced is mutually exclusive with the ability to template resource kinds unknown to the Crossplane API server. Namespaced and cluster scoped resources are indistinguishable. Both use standard Kubernetes object metadata, but cluster scoped resources ignore `.metadata.namespace`. It is possible to determine whether a resource is namespaced by inspecting its kind's API resource definition, but this would require resource definitions be applied to the Crossplane API server before Crossplane was able to template their resources.
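To illustrate the lookup that would be required, a minimal sketch using the Kubernetes discovery client follows. The mechanism and the `isNamespaced` helper are assumptions made for illustration; the proposal does not mandate them.

```go
package scope

import (
	"fmt"

	"k8s.io/client-go/discovery"
)

// isNamespaced reports whether the supplied kind within the supplied API group
// version is namespace scoped. It fails for kinds whose definitions (e.g.
// CRDs) have not been applied to the queried API server, which is exactly why
// the proposed design avoids depending on this knowledge.
func isNamespaced(d discovery.DiscoveryInterface, groupVersion, kind string) (bool, error) {
	rl, err := d.ServerResourcesForGroupVersion(groupVersion)
	if err != nil {
		return false, err
	}
	for _, r := range rl.APIResources {
		if r.Kind == kind {
			return r.Namespaced, nil
		}
	}
	return false, fmt.Errorf("kind %s not found in %s", kind, groupVersion)
}
```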
The main arguments for specifying target namespaces at the application rather
than resource template level involve avoiding repetition. Most applications will
be composed of several namespaced resources deployed to one namespace.
Specifying the namespace via a resource template's object metadata would require
an application with ten resource templates to repeat the namespace ten times. In
cases where one application is deployed per cluster this is a moot point; there
is no need for namespacing when a cluster runs only one application. Simply omit
the namespace altogether and let resources be created in the namespace default
as is the Kubernetes API server's standard behaviour.
References to dependent managed resources are also specified at the resource template level in the proposed design. Recall that the contemporary `Workload` contains a set of references to managed resources. This allows Crossplane to propagate their connection `Secrets` to the cluster upon which the `Workload` is scheduled. `Secrets` are namespaced, and may only be consumed from within their own namespace, so Crossplane must ensure secrets are propagated to the same namespace as their consumers. It could be repetitive to specify dependent managed resources at the resource template level, for example if an application was composed of three `Deployments` all connecting to the same message queue. Each resource template of a `Deployment` would need to reference the same message queue resource.
On the other hand, this repetition is born of explicitness. Imagine a complex workload consisting of three `Deployments` dependent upon two `SQLInstances`. Specifying resource dependencies at the resource template level makes it explicit which `Deployment` depends upon which `SQLInstance`. In this case it's less ideal to model dependent resources at the application level, as doing so would effectively represent that "some of the resource templates of this application depend on some of these managed resources" rather than "this Kubernetes resource depends on exactly these managed resources".
An application and its resource templates are static representations of a complex workload to be deployed to a cluster. Requiring that templated resources exist in exactly one namespace specified at the application scope complicates Crossplane's controller code and results in surprising behaviours. This document proposes that applications be unopinionated about resource namespaces and instead rely on convention. Most workloads will be generated via a higher level tool such as Helm. Such tools are the better place for strong opinions; they can easily take a namespace as an input and output a `KubernetesApplication` consisting of a `KubernetesApplicationResource` templating a `Namespace` along with several other `KubernetesApplicationResources` templating resources to be deployed to that namespace.
As mentioned in Namespacing, this document proposes that the set of Crossplane managed resource references used to propagate connection secrets be scoped at the resource, not the application, level.
```go
type ResourceReference struct {
	// These first seven fields are in reality an embedded
	// corev1.ObjectReference.
	Kind            string
	Namespace       string
	Name            string
	UID             types.UID
	APIVersion      string
	ResourceVersion string
	FieldPath       string

	SecretName string
}
```
The `resources` field of the contemporary `Workload` is a slice of `ResourceReference` structs. These references are used, by convention, to refer to either a Crossplane resource claim (e.g. `SQLInstance`) or a concrete Crossplane managed resource (e.g. `RDSInstance`), but could just as easily refer to a `Deployment` or `ConfigMap`, which would not make sense in this context. In practice, the contemporary workload controller code only uses `ResourceReference`'s `SecretName` and `Name` fields. If `SecretName` is specified a `Secret` of that name will be retrieved. If `SecretName` is not specified a secret named `Name` will be retrieved. In either case all other fields of the `ResourceReference`, including `Namespace`, are ignored. The contemporary controller always looks for connection secrets in the `Workload`'s namespace. Naming this field `.resources` makes it seem that a user could simply provide a set of resource claims or concrete resources and let Crossplane figure out the rest, but this is not the case. The user must either provide a set of resources that follow Crossplane's default convention of storing their connection secret in a `Secret` with the same name as the resource, or explicitly tell Crossplane which `Secret` name to propagate.
```go
type KubernetesApplicationResourceSpec struct {
	Template *unstructured.Unstructured
	Secrets  []corev1.LocalObjectReference
}

type LocalObjectReference struct {
	Name string
}
```
Given that the only purpose of the contemporary `resources` field is to load resource connection `Secrets` for propagation, and given that the contemporary workload only loads `Secrets` from within the `Workload`'s namespace, `KubernetesApplicationResource` instead uses a slice of `corev1.LocalObjectReference` in a field named `.secrets`. Doing so clarifies the purpose and constraints of the field without having to read documentation or the controller code.
The contemporary `Workload` is watched by two controllers within Crossplane - the scheduler and the workload controller. The former is responsible for allocating a `KubernetesCluster` to a `Workload` while the latter is responsible for connecting to said cluster and managing the lifecycle of the `Workload`'s `Namespace`, `Deployment`, and `Service`.
This document proposes the responsibilities of the existing workload controller be broken up between two controllers - application and resource. Under this proposal the three controllers would have the following responsibilities:
- The scheduler controller watches for `KubernetesApplications`. It allocates each application to a `KubernetesCluster`. This is unchanged from today's scheduler implementation.
- The application controller watches for scheduled `KubernetesApplications`. It is responsible for:
  - Creating, updating, and deleting `KubernetesApplicationResources` according to its templates.
  - Ensuring the controller reference is set on its extant `KubernetesApplicationResources`.
  - Updating the application's `.status.desiredResources` and `.status.submittedResources` fields. The former represents the number of resource templates the application specifies. The latter represents the subset of those resource templates that have been successfully submitted to their scheduled Kubernetes cluster.
- The resource controller watches for scheduled `KubernetesApplicationResources`. It is responsible for:
  - Propagating its `.secrets` to its scheduled `KubernetesCluster`. Propagated `Secret` names are derived from the `KubernetesApplicationResource` and connection secret names in order to avoid conflicts when two resource templates reference the same `Secret`. For example a `Secret` named `mysql` referenced by a resource template named `wordpress-deployment` would be propagated to the scheduled cluster as a `Secret` named `wordpress-deployment-mysql` (a propagation sketch follows this list).
  - Creating or updating the resource templated in its `.spec.template` (e.g. a `Deployment`, `Service`, `Job`, `ConfigMap`, etc) in its scheduled `KubernetesCluster`.
  - Copying the templated resource's `.status` into its own `.status.remote`.
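The following is a minimal sketch, not actual controller code, of the `Secret` propagation and naming convention described above. The helper, its parameters, and the use of two controller-runtime clients (one for the Crossplane API server, one for the scheduled cluster) are illustrative assumptions.

```go
package propagate

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// propagateSecret reads the connection Secret named secretName from the
// KubernetesApplicationResource's namespace in the Crossplane API server
// (local) and writes it to the templated resource's namespace in the scheduled
// cluster (remote), prefixing its name with the KubernetesApplicationResource
// name to avoid conflicts between resource templates referencing the same
// Secret.
func propagateSecret(ctx context.Context, local, remote client.Client, arNamespace, arName, secretName, targetNamespace string) error {
	s := &corev1.Secret{}
	if err := local.Get(ctx, types.NamespacedName{Namespace: arNamespace, Name: secretName}, s); err != nil {
		return err
	}
	propagated := &corev1.Secret{
		ObjectMeta: metav1.ObjectMeta{
			Name:      fmt.Sprintf("%s-%s", arName, secretName),
			Namespace: targetNamespace,
		},
		Data: s.Data,
	}
	// A real controller would update the Secret if it already exists.
	return remote.Create(ctx, propagated)
}
```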
This design ensures `KubernetesApplication` is our atomic unit of scheduling, while making it possible to reflect the status of each templated resource on the `KubernetesApplicationResource` that envelopes it. Resources templated by a `KubernetesApplicationResource` are opaque to the Crossplane API server - their group, version, and kind need only be known to the Kubernetes cluster upon which they're scheduled. A `KubernetesApplicationResource` may be retroactively added to or removed from a `KubernetesApplication` after it has been created by updating the application's templates.
Kubernetes object metadata allows any resource to reflect that it is owned by one or more resources. Exactly one owner of a resource may be marked as its controller. A `Pod` may mark a `ReplicaSet` as its controller, which in turn may mark a `Deployment` as its controller. Controllers are expected to respect this metadata in order to avoid fighting over a resource.
This is relevant in the case of two `KubernetesApplications` both containing a template for a `KubernetesApplicationResource` named `cool`. Despite the desired one-to-many application-to-resource relationship, both applications would assume they owned the `KubernetesApplicationResource`, resulting in a potential many-to-many relationship and undefined, racy behaviour. The application controller must use controller references to claim its templated `KubernetesApplicationResources`.
The relationship between an application and its resource templates is as follows:
- The application controller takes a watch on all `KubernetesApplications` and `KubernetesApplicationResources`. Any activity for either kind triggers a reconciliation of the `KubernetesApplication`.
- During each reconciliation the controller:
  - Attempts to create or update a `KubernetesApplicationResource` for each of its extant templates. This will fail if a named template conflicts with an existing `KubernetesApplicationResource` not controlled (in the controller reference sense) by the `KubernetesApplication`.
  - Iterates through all extant `KubernetesApplicationResources`, deleting any resource that is controlled by the application but that does not match the name of an extant template within the application's spec (a pruning sketch follows this list).
- The application controller uses the `foregroundDeletion` finalizer. This ensures all of an application's controlled resource templates are garbage collected (i.e. deleted) upon deletion of the application.
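A minimal sketch of the pruning rule described above follows. The reconciler context is omitted and the parameters (the set of extant template names and the list of `KubernetesApplicationResources` already fetched from the API server) are assumptions made for illustration.

```go
package app

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// prune deletes any KubernetesApplicationResource controlled by the supplied
// application whose name no longer matches an extant resource template.
// Illustrative only; not the proposed controller implementation.
func prune(ctx context.Context, c client.Client, app metav1.Object, templateNames map[string]bool, resources []client.Object) error {
	for _, ar := range resources {
		ref := metav1.GetControllerOf(ar)
		if ref == nil || ref.UID != app.GetUID() {
			// Not controlled by this application; never delete (or adopt) it.
			continue
		}
		if templateNames[ar.GetName()] {
			// Still matches an extant template; leave it alone.
			continue
		}
		if err := c.Delete(ctx, ar); err != nil {
			return err
		}
	}
	return nil
}
```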
A `KubernetesApplication` can only ever be associated with the `KubernetesApplicationResources` that it templates; a `KubernetesApplication` will never orphan or adopt orphaned `KubernetesApplicationResources`. This is in line with the controller reference design, which states:
> If a controller finds an orphaned object (an object with no ControllerRef) that matches its selector, it may try to adopt the object by adding a ControllerRef. Note that whether or not the controller should try to adopt the object depends on the particular controller and object.
The controller reference pattern applies only to resources defined in the same API server. It uses a `metav1.OwnerReference` that assumes the controlling resource exists in the same cluster and namespace as the controlled resource.
Consider two resource templates, both owned by the same application and thus scheduled
to the same cluster:
- A `KubernetesApplicationResource` named `coolns/cooldeployment`, templating a `Deployment` named `remotens/cooldeployment`.
- A `KubernetesApplicationResource` named `coolns/lamedeployment`, also templating a `Deployment` named `remotens/cooldeployment`, but with a different `.spec.template.spec`.
In this example the two resource templates will race to create or update `remotens/cooldeployment`. The resource controller will avoid this race by adding annotations to the remote resource templated by a particular resource and obeying the three laws of controllers. All remote resources owned by a `KubernetesApplicationResource` will be annotated with key `kubernetesapplicationresource.workload.crossplane.io/uid` set to the UID of the `KubernetesApplicationResource` that created the remote resource.
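A minimal sketch of that ownership check, using the annotation key proposed above; the `ownedBy` helper itself is illustrative and not part of any existing codebase.

```go
package resource

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// The annotation key proposed above for recording which
// KubernetesApplicationResource created a remote resource.
const uidAnnotation = "kubernetesapplicationresource.workload.crossplane.io/uid"

// ownedBy reports whether the remote object records the supplied
// KubernetesApplicationResource UID as its creator. The resource controller
// must not create or update a remote resource owned by another resource
// template.
func ownedBy(remote metav1.Object, uid types.UID) bool {
	return remote.GetAnnotations()[uidAnnotation] == string(uid)
}
```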
All Crossplane resources, including `KubernetesApplication` and `KubernetesApplicationResource`, are CRDs. CRDs are validated against an OpenAPI v3 schema, but some kinds of validation require the use of a `ValidatingAdmissionWebhook`. In particular a webhook is required to enforce immutability; it's not possible via OpenAPI schema alone to specify fields that may be set at creation time but that may not be subsequently altered.
The design proposed by this document requires a handful of fields be immutable. Updating a `KubernetesApplication`'s `.spec.clusterSelector` would require all resources be removed from the old cluster and recreated on the new cluster. This is more cleanly handled by deleting and recreating the application. The cluster selector should be immutable.
A `KubernetesApplicationResource`'s `.spec.template.kind`, `.spec.template.apiVersion`, `.spec.template.name`, and `.spec.template.namespace` fields must also be immutable. Changing any of these fields after creation time would cause the templated resource to be orphaned and a new resource created with the new kind, API version, name, or namespace. The controller-runtime library upon which Crossplane is built does not expose the old version of an object during updates, making it impossible to determine whether these fields have changed, but validating webhooks do.

Crossplane does not currently leverage Kubernetes webhooks, but controller-runtime has support for both validating and mutating admission webhooks. This document proposes two validating webhooks be added to Crossplane; one each for `KubernetesApplication` and `KubernetesApplicationResource` to enforce immutability of the aforementioned fields.
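A minimal sketch, assuming controller-runtime's `admission` package, of the kind of validation such a webhook could perform for a `KubernetesApplication`'s cluster selector. The handler and field paths are illustrative, not a finished implementation; a real webhook would also need to be registered with a webhook server and a `ValidatingWebhookConfiguration`.

```go
package webhook

import (
	"context"
	"net/http"
	"reflect"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// immutableClusterSelector denies updates that change a KubernetesApplication's
// .spec.clusterSelector.
type immutableClusterSelector struct{}

func (v *immutableClusterSelector) Handle(ctx context.Context, req admission.Request) admission.Response {
	if len(req.OldObject.Raw) == 0 {
		// This is a create, not an update; there is nothing to compare.
		return admission.Allowed("")
	}
	oldObj, newObj := &unstructured.Unstructured{}, &unstructured.Unstructured{}
	if err := oldObj.UnmarshalJSON(req.OldObject.Raw); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}
	if err := newObj.UnmarshalJSON(req.Object.Raw); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}
	oldSel, _, _ := unstructured.NestedMap(oldObj.Object, "spec", "clusterSelector")
	newSel, _, _ := unstructured.NestedMap(newObj.Object, "spec", "clusterSelector")
	if !reflect.DeepEqual(oldSel, newSel) {
		return admission.Denied("spec.clusterSelector is immutable")
	}
	return admission.Allowed("")
}
```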
The following alternative designs were considered and discarded or deferred in favor of the design proposed by this document.
The proposed relationship between a `KubernetesApplication` and its `KubernetesApplicationResources` is unlike that of any built in Kubernetes controller resources and their controlled resources. Most controller resources (as opposed to controller logic) include a single template that is used to create one or more identical replicas of the templated resource. `ReplicaSet` is an example of this pattern: a `ReplicaSet` includes a single pod template that is used to instantiate N homogenous replicas. A `KubernetesApplication` on the other hand includes one or more heterogenous resource templates that are used to instantiate one or more heterogenous resources. This pattern is closer to the relationship between a `Pod` and its containers, except that Kubernetes does not model containers as a distinct API resource.
Managing a set of heterogeneous resources is more complicated than managing several homogenous replicas. A `ReplicaSet` can support only a handful of operations:

- Increase running replicas by instantiating N randomly named `Pod` resources from its current pod template.
- Decrease running replicas by deleting N random controlled `Pods`.
- Update its pod template. Note that doing so does not affect running `Pods`, only `Pods` that are created in future scale ups.
A `KubernetesApplication` must support:

- Creating a `KubernetesApplicationResource` that has been added to its set of templates. This resource template has an explicit, non-random name, increasing the likelihood of an irreconcilable conflict with an existing `KubernetesApplicationResource`.
- Deleting a `KubernetesApplicationResource` that has been removed from its set of templates. There's no reliable way to observe the previous generation of the application, so the controller logic must assume any resource template referencing the application as its controller that does not match an extant template's name should be deleted.
- Updating a `KubernetesApplicationResource`.
One alternative to the pattern proposed by this design is closer to the loosely coupled relationship between a `Service` and its backing `Pods`; the Crossplane user would submit a series of `KubernetesApplicationResources`, then group them all into a co-scheduled unit under a `KubernetesApplication` via a label selector. A `KubernetesApplication` would be associated with its constituent `KubernetesApplicationResources` purely via label selectors (and controller references) rather than actively managing their lifecycles based on templates encoded in its `.spec`. This defers conflict resolution to the Crossplane user and avoids unwieldy, potentially gigantic, `KubernetesApplication` resources.
The main drawback of this loosely coupled approach is that the system is eventually consistent with the user's intent. When all desired resources are specified as templates in the application's `.spec` it's always obvious how many resources the user desired and how many have been successfully submitted. If a resource template is invalid the entire application will be rejected by the Crossplane API server. In the loosely coupled approach the invalid `KubernetesApplicationResource` would be rejected by the API server, but the `KubernetesApplication` would, according to the API server, otherwise appear to be a healthy application that happens to desire one less resource than the user intended.
This alternative proposes a 'monolithic' workload. A monolithic workload is similar to the design proposed by this document but with the various resources and statuses nested directly within the `KubernetesApplication` rather than via the interstitial `KubernetesApplicationResource` resource.
An example monolithic complex workload:
```yaml
---
apiVersion: workload.crossplane.io/v1alpha1
kind: KubernetesApplication
metadata:
  name: demo
spec:
  clusterSelector:
    provider: gcp
  resources:
  - name: demo
    secretName: demo
  resourceTemplates:
  # The monolithic workload does not template KubernetesApplicationResources, but
  # instead templates arbitrary Kubernetes resources directly.
  - apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: wordpress
      labels:
        app: wordpress
    spec:
      selector:
        app: wordpress
      template:
        metadata:
          labels:
            app: wordpress
        spec:
          containers:
          - name: wordpress
            image: wordpress:4.6.1-apache
            ports:
            - containerPort: 80
status:
  cluster:
    namespace: cool
    name: theperfectkubernetescluster
  conditions:
  - lastTransitionTime: 2018-10-02T12:25:39Z
    lastUpdateTime: 2018-10-02T12:25:39Z
    message: Successfully submitted cool/supercoolwork
    status: "True"
  remote:
  # There's no distinct API resource within the Crossplane API server with which
  # to associate the status of each remote resource, so instead we maintain an
  # array of statuses 'keyed' by their resource's type and object metadata.
  - apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: wordpress
      labels:
        app: wordpress
    status:
      replicas: 2
      availableReplicas: 2
      unavailableReplicas: 2
      observedGeneration: 3
      conditions:
      - lastTransitionTime: 2016-10-04T12:25:39Z
        lastUpdateTime: 2016-10-04T12:25:39Z
        message: Replica set "nginx-deployment-4262182780" is progressing.
        reason: ReplicaSetUpdated
        status: "True"
        type: Progressing
```
The monolithic workload design is functionally close to that proposed by this document, but has two major drawbacks:
- Representing the status of remote resources would become unwieldy. Each `KubernetesApplication` would need to maintain a map of resource statuses keyed by their type and object metadata.
- It precludes breaking out the logic of the workload controller into separate application and resource controllers, resulting in a single more complicated controller.
It's worth noting that this monolithic design has a lot of symmetry with the
relationship between a Pod
and its containers. Containers are not modelled as
distinct Kubernetes API resources, and are always coscheduled to a node, much as
resources under the monolithic design are always coscheduled to a Kubernetes
cluster and are not modelled as distinct API resources in the Crossplane API
server. Container status is modeled as an array 'keyed' by container name.
Both the contemporary and proposed workload designs poll the status of the resources they create in their scheduled cluster, reflecting them in the status of the `Workload` or `KubernetesApplicationResource` that created them. This allows a Crossplane user to inspect the status of the resources they created in a remote cluster without ever explicitly connecting to said cluster.
Resource statuses have arbitrary schemas; there is no standard even amongst built in types. This makes it impossible to consistently model the health of a resource managed by a resource template. The status field exposed by a healthy `Deployment` is completely different from the status field exposed by a healthy `Ingress`, let alone the status field exposed by a custom resource. This forces both the controller code and the `KubernetesApplicationResource` CRD OpenAPI validation specification to treat status as an opaque JSON object.
One alternative would be to avoid polling the status altogether; resource templates would simply reflect that they had submitted their templated resource to their scheduled `KubernetesCluster` either successfully or unsuccessfully. It would be left as an exercise for the Crossplane user to connect to the scheduled cluster, locate the managed resources, and inspect them directly.
The Kubernetes Federation project has similar but not identical goals to Crossplane's workloads. Federation defines Kubernetes resources in one cluster, which runs controllers that propagate said resources to another set of clusters.
Federation v2 uses 'envelope' resources similar to the proposed `KubernetesApplicationResource`, but with stronger typing. A federated resource of kind `<K>` is specified using a `Federated<K>`; for example a `Service` is modeled using a `FederatedService`. These `Federated<K>` envelopes are CRDs generated via a command line tool that introspects the underlying resource. `Federated<K>` is associated with `<K>` via a `FederatedTypeConfig`. The federation controller watches for `FederatedTypeConfig`, creating two more controllers for each `Federated<K>` referenced by a `FederatedTypeConfig`. One controller is responsible for propagating the `Federated<K>`'s templated `<K>` resource to the clusters upon which it is scheduled while the other is responsible for polling the status of the managed resources.
Crossplane could replace `KubernetesApplicationResource` with a series of resources similar to the `Federated<K>` envelope resources, for example `Cross<K>`. This is appealing because it allows for stronger typing; generating a `Cross<K>` analog to a resource would require introspecting `<K>`, allowing the `Cross<K>` to derive the schema for its `.spec.template` and `.status.remote` fields from the underlying `<K>` kind.
Unfortunately this approach has several drawbacks:
- It requires the Crossplane API server to understand each kind of resource that it wishes to propagate as part of a workload. Assuming a resource of kind `Cool` is specified via a CRD, said CRD must be applied to the Crossplane API server before a `CrossCool` can be generated.
- Even when the `Cool` CRD has been applied to the Crossplane API server Crossplane does not have a Go object to associate with said CRD and thus must resort to using `*unstructured.Unstructured` and `json.RawMessage` to represent the kind's template and status.
- Additional complexity is introduced in order to generate strongly typed envelopes. The Federation project requires the operator to explicitly create these envelope CRDs by running a command line tool. A Crossplane controller could automate this by watching for `APIResource`.
- Applications cannot be associated with several different resource kinds by label selector alone. It's possible to get all of a particular resource kind by label (e.g. `kubectl get pod -l thislabel=cool`) but it's not possible to get all resources (e.g. `kubectl get all -l thislabel=cool`). Workloads would need to be associated to strongly typed envelope kinds via either an array of `corev1.ObjectReferences`, or a label selector and an array of kinds.
A Federated resource status is still a `map[string]interface{}` in the controller code:
```go
type FederatedResource struct {
	metav1.TypeMeta
	metav1.ObjectMeta
	ClusterStatus []ResourceClusterStatus
}

type ResourceClusterStatus struct {
	ClusterName string
	Status      map[string]interface{}
}
```
One alternative to a simple annotation representing that a remote resource is owned by a `KubernetesApplicationResource` is to model said ownership using a distinct resource in the `KubernetesCluster` to which a `KubernetesApplicationResource` is scheduled. This resource would act as the controller reference of the remote, templated resource. Assuming we named this intermediary resource `CrossplaneApplicationResourceReference`, a `Deployment` templated by a `KubernetesApplicationResource` in the Crossplane API server would be 'owned' (in the controller reference sense) by a `CrossplaneApplicationResourceReference` in the remote cluster:
```yaml
---
apiVersion: workload.crossplane.io/v1alpha1
kind: CrossplaneApplicationResourceReference
metadata:
  name: demo
remote:
  apiServer: https://some.crossplane.apiserver.example.org
  # Everything below represents the controlling resource in the controlling
  # Crossplane API server.
  apiVersion: workload.crossplane.io/v1alpha1
  kind: KubernetesApplication
  metadata:
    name: demo
    namespace: demo
    uid: some-cool-uuid
```
An intermediary resource would provide context to uninitiated users of the remote Kubernetes cluster as to what Crossplane is and which Crossplane instance is managing a particular resource, but comes at the expense of increased complexity. Crossplane would need to propagate the `CrossplaneApplicationResourceReference` CRD to each cluster it managed, and manage a `CrossplaneApplicationResourceReference` for every actual remote resource. This complexity is only worthwhile if it is expected that Crossplane will frequently deploy applications to clusters that are also used directly by users who are unfamiliar with Crossplane.
Namespaced resources often depend on cluster scoped resources; `Namespace` and `CustomResourceDefinition`, for example, are cluster scoped resources that are used by namespaced resources. The order in which the resource templates of an application are reconciled is undefined. This means that, for example, an application consisting of a resource templating a `Namespace` and another resource templating a `Deployment` to be created in said namespace may take a few reconcile loops to be created:
- Random chance causes the resource templating the `Deployment` to be submitted first. This fails due to the `Deployment` targeting a namespace that has yet to be created. The reconcile of this resource is requeued.
- The resource containing the `Namespace` is submitted successfully.
- The resource containing the `Deployment` tries again. It now succeeds (a requeue sketch follows this list).
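A minimal sketch of how the resource controller could lean on requeueing to resolve such ordering; the reconciler and its `submit` function are hypothetical, and the signature assumes a recent controller-runtime release.

```go
package resource

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

type reconciler struct {
	// submit creates or updates the templated resource in the scheduled
	// cluster; hypothetical, shown only to illustrate the requeue behaviour.
	submit func(ctx context.Context, name string) error
}

func (r *reconciler) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
	if err := r.submit(ctx, req.Name); err != nil {
		// Returning an error requeues this KubernetesApplicationResource with
		// backoff. Once its Namespace has been submitted by another resource
		// template a later attempt will succeed.
		return reconcile.Result{}, err
	}
	return reconcile.Result{}, nil
}
```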
One way to avoid this would be to break a large application up into smaller ones, applied sequentially. The issue here is that there is no guarantee the second `KubernetesApplication` will be scheduled to the same cluster as the first. The first application could add a label to the `KubernetesCluster` it is scheduled to that the second could select, but this devolves into a flawed dependency system. The requirements of the second `KubernetesApplication` are not considered when the first is scheduled, despite the fact that they must be co-scheduled.
Another alternative is to allow `KubernetesApplicationResources` to be associated directly with a `KubernetesCluster` (instead of a `KubernetesApplication`) via a label selector. This circumvents the scheduling of a `KubernetesApplication`; the `KubernetesCluster` controller would find all associated resource templates and explicitly 'schedule' them to itself when instantiated. This pattern could be used to model resource templates that were more strongly associated with the cluster itself rather than the applications running upon it, for example ensuring every `KubernetesCluster` ran a functional ingress controller or had a base set of `ClusterRoles` available.
Per Secret Propagation this document proposes `KubernetesApplicationResources` use a set of `Secret` references rather than a set of managed resource references. Doing so makes the purpose of the field clearer given that it is in practice only used to propagate connection `Secrets`. If there are worthwhile uses for associating managed resources or managed resource claims with a `KubernetesApplicationResource` beside connection `Secret` propagation it would be preferable to maintain the contemporary `Workload` pattern of taking a set of managed resource references rather than `Secrets`. One speculative use could be to automatically ensure connectivity between said managed resources and the `KubernetesCluster` to which their consuming Kubernetes resources are scheduled.
Referencing Crossplane managed resources or resource claims in a fashion that avoids the flaws of the contemporary design (see Secret Propagation for details) is complicated by the fact that the controller must know whether the referenced managed resource is concrete or a claim (i.e. an `RDSInstance` or a `SQLInstance`). This is difficult because Crossplane managed resources and claims are Kubernetes resources with arbitrary kinds, e.g. `RedisCluster`, `Bucket`, `RDSInstance`, `CloudMemorystoreInstance`, etc.