# Helm Charts for Apache Drill

Helm Charts for deploying Apache Drill Clusters on Kubernetes.
> [!NOTE]
> This project was originally forked from Agirish/drill-helm-charts, but was moved to this repository for automated builds and discoverability.
## Overview
This repository contains a collection of files that can be used to deploy Apache Drill on Kubernetes using Helm Charts. Both single-node and cluster modes are supported.
### What are Helm and Charts?
Helm is a package manager for Kubernetes. Charts are a packaging format in Helm that can simplify deploying Kubernetes applications such as Drill Clusters.
## Pre-requisites
- A Kubernetes Cluster (this project is tested on a K3s cluster)
- Helm version 3 or greater
- Kubectl version 1.16.0 or greater
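You can check the installed versions, for example:

```console
helm version --short
kubectl version --client
```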
## Usage
Helm must be installed to use the charts. Please refer to Helm's documentation to get started.
Once Helm has been set up correctly, add the repo as follows:

```console
helm repo add wearefrank https://wearefrank.github.io/charts
```

If you had already added this repo earlier, run `helm repo update` to retrieve the latest versions of the packages. You can then run `helm search repo wearefrank` to see the charts.

To install the Drill chart:

```console
helm install drill wearefrank/drill
```

To uninstall the chart:

```console
helm delete drill
```
## Values

Helm charts use a `values.yaml` file to provide default values for the 'variables' used in the chart templates, such as the namespace, the number of drillbits and more. These values can be overridden either by editing the `values.yaml` file or by passing overrides during `helm install`.

Please refer to the `values.yaml` file for details on the default values for the Drill Helm Charts.
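Values can also be overridden on the command line at install time. For example, to deploy two drillbits into a dedicated namespace (an illustrative invocation; `replicaCount` is documented in the parameter tables below):

```console
helm install drill wearefrank/drill \
  --namespace drill --create-namespace \
  --set replicaCount=2
```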
## Access Drill Web UI
The chart creates a `web` service that can be used to reach the Drill Web UI, but requests may land on a different pod each time, which isn't very user friendly. Depending on the ingress class you can make sessions sticky with annotations. You could also change the `service.web.type` (see the Traffic Exposure Parameters below).
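As an illustration, assuming the ingress-nginx controller, sticky sessions could be configured through the chart's ingress values roughly as follows (a sketch; the annotations are specific to ingress-nginx and the hostname is a placeholder):

```yaml
ingress:
  enabled: true
  className: nginx
  annotations:
    # ingress-nginx cookie affinity keeps a session on the same drillbit pod
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "drill-session"
  hosts:
    - host: drill.example.com
      paths:
        - path: /
          pathType: Prefix
```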
## Chart Structure

Drill Helm charts are organized as a collection of files inside the `drill` directory. As Drill depends on ZooKeeper for cluster coordination, a ZooKeeper chart is added as a dependency in the chart definition. The ZooKeeper chart is maintained by Bitnami.
```
drill/
  Chart.yaml    # A YAML file with information about the chart
  Chart.lock    # A YAML file containing information about the fetched dependencies
  values.yaml   # The default configuration values for this chart
  charts/       # A directory containing the ZooKeeper chart dependency
  templates/    # A directory of templates that, combined with values, generate valid Kubernetes manifest files
  docs/         # A directory containing files for the documentation
```
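For reference, the ZooKeeper dependency is declared in `Chart.yaml` along these lines (a sketch; the version constraint and condition are assumptions, check the chart's own `Chart.yaml`):

```yaml
dependencies:
  - name: zookeeper
    repository: https://charts.bitnami.com/bitnami   # Bitnami chart repository
    version: "13.x.x"                                # hypothetical version constraint
    condition: zookeeper.enabled                     # hypothetical toggle
```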
### Templates
Helm Charts contain templates which are used to generate Kubernetes manifest files. These are YAML-formatted resource descriptions that Kubernetes can understand. The templates contain 'variables', the values for which are picked up from the `values.yaml` file.
The templates used by the Drill Helm chart can be found in the `templates/` directory.
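For illustration, a template references these values roughly like this (a simplified fragment, not an excerpt from this chart):

```yaml
# replicas is filled in from .Values.replicaCount in values.yaml
spec:
  replicas: {{ .Values.replicaCount }}
```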
## Autoscaling Drill Clusters
The size of the Drill cluster (the number of Drill Pod replicas, i.e. drillbits) can not only be scaled up or down manually, but can also be autoscaled to simplify cluster management. When autoscaling is enabled, higher CPU utilization automatically adds drillbits, and as the cluster load goes down, so does the number of drillbits in the Drill cluster. Drillbits deemed excessive shut down gracefully by entering quiescent mode, which allows running queries to complete.
> [!IMPORTANT]
> For the graceful shutdown to succeed, a sigfile is created in the `$DRILL_HOME` folder. This requires running as `root` (uid 0). If the application runs as `drilluser`, the `stop` command is used instead.
Enable autoscaling by editing the `autoscale` section in the `drill/values.yaml` file.
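A rough sketch of what such a section could look like is shown below; the key names are assumptions and should be verified against the chart's `values.yaml`:

```yaml
autoscale:
  enabled: true        # assumed flag name, derived from the section name above
  minCount: 1          # hypothetical lower bound on the number of drillbits
  maxCount: 5          # hypothetical upper bound
  cpuThreshold: 75     # hypothetical target CPU utilization (percent) that triggers scaling
```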
## Parameters

### Common parameters
Name | Description | Value |
---|---|---|
nameOverride | String to partially override common.names.fullname template (will maintain the release name) | "" |
fullnameOverride | String to fully override common.names.fullname template | "" |
### Drill image parameters
Name | Description | Value |
---|---|---|
image.registry | Drill image registry | "" |
image.repository | Drill image repository | apache/drill |
image.tag | Drill image tag (immutable tags are recommended) | "" |
image.pullPolicy | Drill image pull policy | IfNotPresent |
image.pullSecrets | Drill image pull secrets | [] |
### Drill deployment parameters
Name | Description | Value |
---|---|---|
replicaCount | Number of Drill replicas to deploy | 1 |
startupProbe.initialDelaySeconds | Initial delay seconds for startupProbe | 10 |
startupProbe.periodSeconds | Period seconds for startupProbe | 10 |
startupProbe.timeoutSeconds | Timeout seconds for startupProbe | 1 |
startupProbe.failureThreshold | Failure threshold for startupProbe | 6 |
startupProbe.successThreshold | Success threshold for startupProbe | 1 |
readinessProbe.initialDelaySeconds | Initial delay seconds for readinessProbe | 0 |
readinessProbe.periodSeconds | Period seconds for readinessProbe | 5 |
readinessProbe.timeoutSeconds | Timeout seconds for readinessProbe | 1 |
readinessProbe.failureThreshold | Failure threshold for readinessProbe | 3 |
readinessProbe.successThreshold | Success threshold for readinessProbe | 1 |
livenessProbe.initialDelaySeconds | Initial delay seconds for livenessProbe | 0 |
livenessProbe.periodSeconds | Period seconds for livenessProbe | 10 |
livenessProbe.timeoutSeconds | Timeout seconds for livenessProbe | 1 |
livenessProbe.failureThreshold | Failure threshold for livenessProbe | 6 |
livenessProbe.successThreshold | Success threshold for livenessProbe | 1 |
resources | Set the resources for the Drill containers | {} |
resources.limits | The resources limits for the Drill containers | "" |
resources.requests.memory | The requested memory for the Drill containers | "" |
resources.requests.cpu | The requested cpu for the Drill containers | "" |
terminationGracePeriodSeconds | Number of seconds after which pods are forcefully killed. Note: lower values may cause running queries to fail | 25 |
nodeSelector | Node labels for pod assignment | {} |
tolerations | Set tolerations for pod assignment | [] |
affinity | Set affinity for pod assignment | {} |
timeZone | Used for database connections and log timestamps | Etc/UTC |
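For example, resource requests and limits for the Drill containers can be set like this (the amounts are illustrative; size them for your workload):

```yaml
resources:
  requests:
    cpu: 500m
    memory: 2Gi
  limits:
    memory: 3Gi
```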
### Traffic Exposure Parameters
Name | Description | Value |
---|---|---|
service.web.type | Drill Web service type | ClusterIP |
service.web.port | Drill Web service port | 80 |
service.user.type | Drill User API service type | ClusterIP |
service.user.port | Drill User API service port | 31010 |
ingress.enabled | Enable ingress record generation for Drill | false |
ingress.className | IngressClass that will be used to implement the Ingress (Kubernetes 1.18+) | "" |
ingress.annotations | Additional annotations for the Ingress resource. To enable certificate auto-generation, place here your cert-manager annotations. | {} |
ingress.hosts | Set hosts for ingress | [] |
ingress.hosts.host | Set hostname | "" |
ingress.hosts.paths | Set multiple paths | [] |
ingress.hosts.paths.path | Set path (context url) | "" |
ingress.hosts.paths.pathType | Set type of path | "" |
ingress.tls | Define tls secrets for hosts (implementation not done yet) | [] |
### Other Parameters
Name | Description | Value |
---|---|---|
serviceAccount.create | Enable creation of ServiceAccount for Drill pod | true |
serviceAccount.annotations | Additional custom annotations for the ServiceAccount | {} |
serviceAccount.name | The name of the ServiceAccount to use. | "" |
podAnnotations | Annotations for Drill pods | {} |
podLabels | Extra labels for Drill pods | {} |
podSecurityContext | Set Drill pod's Security Context | {} |
securityContext | Set Drill container's Security Context | {} |
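As noted in the autoscaling section, the graceful-shutdown sigfile requires running as root. If you want that behaviour, the container security context could be set along these lines (a sketch using the standard Kubernetes `runAsUser` field):

```yaml
securityContext:
  runAsUser: 0   # run as root (uid 0) so the sigfile for graceful shutdown can be created
```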
### Drill configuration

Drill can be configured with override files or in the Web UI, although some properties can only be set in an override file. When using the Web UI, ZooKeeper is used to store the values. Make sure that ZooKeeper's storage is persistent if you intend to configure Drill this way.

This is an example where the Web UI and authentication for local (plain) users are enabled:
```
drill.exec: {
  http.enabled: true,
  impersonation: {
    enabled: true,
    max_chained_user_hops: 3
  },
  security: {
    auth.mechanisms: ["PLAIN"]
  },
  security.user.auth: {
    enabled: true,
    packages += "org.apache.drill.exec.rpc.user.security",
    impl: "pam4j",
    pam_profiles: [ "sudo", "login" ]
  }
}
```
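Such a configuration can be supplied through the chart values, for instance as a multiline value for `drill.overrideConfiguration.drill` (a sketch using the parameter documented below):

```yaml
drill:
  overrideConfiguration:
    drill: |
      drill.exec: {
        http.enabled: true
      }
```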
For more options refer to the Apache Drill documentation.
Name | Description | Value |
---|---|---|
drill.drivers | JDBC drivers to download can be configured here. This can be used if the Docker image doesn't contain the correct drivers | [] |
drill.drivers.name | The name of the driver; used as the filename (with .jar appended) and as the name of the initContainer | "" |
drill.drivers.url | The URL to download the driver from | "" |
drill.drivers.noCheckCertificate | Skip certificate check | "" |
drill.overrideConfiguration.existingConfigMap | The name of the configmap, containing configuration files to override | "" |
drill.overrideConfiguration.drill | Multiline value for drill-override.conf | "" |
drill.overrideConfiguration.drillMetastore | Multiline value for drill-metastore-override.conf | "" |
drill.overrideConfiguration.drillOnYarn | Multiline value for drill-on-yarn-override.conf | "" |
drill.overrideConfiguration.drillSqlLine | Multiline value for drill-sqlline-override.conf | "" |
drill.overrideConfiguration.storagePlugins | Multiline value for storage-plugins-override.conf. Can also be configured in the Web UI and saved in persistent ZooKeeper | "" |
drill.authentication.existingSecret | Name of the secret containing a passwd file | "" |
drill.authentication.users | Users to create on the system | [] |
drill.authentication.users.name | Username for the user | "" |
drill.authentication.users.password | Password for the user | "" |
drill.authentication.users.admin | Configures if the user should be admin | "" |
extraVolumes | Optionally specify extra list of additional volumes for Drill pods | [] |
extraVolumeMounts | Optionally specify extra list of additional volumeMounts for Drill container(s) | [] |
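As an illustration, the driver and authentication parameters above could be combined in a values override like this (the driver name, URL and credentials are placeholders):

```yaml
drill:
  drivers:
    - name: postgresql                                  # used as initContainer name and jar filename
      url: https://example.com/drivers/postgresql.jar   # placeholder download URL
  authentication:
    users:
      - name: admin
        password: changeme
        admin: true
```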
### Persistence
Persistence is used for logging and for JDBC drivers. These can be configured separately.
Configuration for Drill will be saved in ZooKeeper. Make sure that ZooKeeper is persistent if you want to keep changes in the Web UI.
Name | Description | Value |
---|---|---|
persistence.enabled | Enable persistence using Persistent Volume Claims | false |
persistence.storageClass | Persistent Volume storage class | "" |
persistence.accessModes | Persistent Volume access modes | [] |
persistence.size | Persistent Volume size | 2Gi |
persistence.dataSource | Custom PVC data source | {} |
persistence.existingClaim | The name of an existing PVC to use for persistence | "" |
persistence.selector | Selector to match an existing Persistent Volume for Drill's data PVC | {} |
persistence.annotations | Persistent Volume Claim annotations | {} |
persistence.dataLogDir.size | PVC Storage Request for Drill's dedicated data log directory | 2Gi |
persistence.dataLogDir.existingClaim | The name of an existing PVC to use for persistence | "" |
persistence.dataLogDir.selector | Selector to match an existing Persistent Volume for Drill's data log PVC | {} |
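For example, persistence for logs and data could be enabled like this (illustrative sizes; the storage class is left empty to use the cluster default):

```yaml
persistence:
  enabled: true
  storageClass: ""
  accessModes:
    - ReadWriteOnce
  size: 2Gi
  dataLogDir:
    size: 2Gi
```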
## Notable changes

### 1.2.7
The notation for `.Values.service` has been changed. This makes it possible to configure the `web` and `user` services separately.
### 1.2.6
`.Values.replicaCount` has been changed from `3` to `1`, to default to a less complex install. Having three replicas introduces some complexity regarding authentication, logging and how queries are executed. Until features have been added to simplify this, the user needs to take these things into account.
The notation for `.Values.persistence` has changed so that storage for logs and data can be configured separately. The values for persistent logging are now located at `.Values.persistence.dataLogDir`.