WARN! Kubernetes is NOT for you!
Extracted from @[https://pythonspeed.com/articles/dont-need-kubernetes/]
"""...
BºMicroservices are an organizational scaling technique: when you have º
Bº500 developers working on one live website, it makes sense to pay the º
Bºcost of a large-scale distributed system if it means the developer º
Bºteams can work independently. So you give each team of 5 developers a º
Bºsingle microservice, and that team pretends the rest of the º
Bºmicroservices are external services they can’t trust. º
RºIf you’re a team of 5 and you have 20 microservices, and you º
Rºdon’t have a very compelling need for a distributed system, º
Rºyou’re doing it wrong. Instead of 5 people per service like the big º
Rºcompany has, you have 0.25 people per service. º
"""
- Extracted from Local Persistent Volume documentation:
""" Because of these constraints, Rºit’s best to exclude nodes with º
Rºlocal volumes from automatic upgrades or repairsº, and in fact some
cloud providers explicitly mention this as a best practice. """
  Basically, most of the benefits of k8s are lost for apps that manage
  storage at the application level. This is the case for most modern databases
  and streaming architectures (PostgreSQL, MySQL, Kafka, p2p-like Ethereum, ...).
- As a reference, Google, the original creator of K8s, launches
  Bº2 billion containers per week!!!º in its infrastructure.
  This is not exactly a normal IT load.
- See also:
@[https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing]
1.- "The network is reliable"
2.- "Latency is zero"
3.- "Bandwidth is infinite"
4.- "The network is secure"
5.- "Topology doesn't change"
6.- "There is one administrator"
7.- "Transport cost is zero"
8.- "The network is homogeneous"
See also: Reasons NOT to use k8s:
https://thenewstack.io/question-the-current-dogma-is-kubernetes-hyper-scale-necessary-for-everyone/
Why Coinbase Is Not Using Kubernetes to Run Their Container Workloads
https://www.infoq.com/news/2020/07/coinbase-kubernetes/
WARN! Kubernetes is for you!
  - You plan to have a running infrastructure for years,
    and you know in advance that you will need to automate
    a lot of RESTful deployments in a standard way.
  - You want to bet on a technology that is well known
    across the software industry, so you can hire people
    who are already familiar with it.
  - Distributed systems are complex and better avoided,
    but you cannot avoid them. In that case k8s is the most
    familiar approach.
  - Your company has different isolated teams with diverse knowledge
    and skills, from finance, to science, to marketing, ...
    They all want a common and future-proof virtual
    computing infrastructure.
    They all want to share compute resources and balance
    computation when needed, as well as reuse DevOps knowledge
    for databases, web servers, networking, ...
  - Some k8s operators automate the whole life-cycle of software
    management for a given database, CMS, ... You want to benefit
    from such existing operators and from the knowledge and experience
    of the operator developers.
External Links
- About Desired State management @[https://www.youtube.com/watch?v=PH-2FfFD2PU]
- Online training:
@[https://www.katacoda.com/courses/kubernetes/playground]
@[https://labs.play-with-k8s.com/]
- @[https://kubernetes.io/]
- @[https://kubectl.docs.kubernetes.io/] ← Book
- @[https://kubernetes.io/docs/reference/] (Generated from source)
- @[https://kubernetes.io/docs/reference/glossary/]
- @[https://kubernetes.io/docs/reference/tools/]
- Most voted questions k8s on serverfault:
@[https://serverfault.com/questions/tagged/kubernetes?sort=votes]
- Most voted questions k8s on stackoverflow:
@[https://stackoverflow.com/questions/tagged/kubernetes?sort=votes]
- Real app Examples:
@[https://github.com/kubernetes/examples]
ºk8s Issues:º
- @[https://kubernetes.io/docs/reference/issues-security/issues/]
- @[https://kubernetes.io/docs/reference/issues-security/security/]
ºk8s Enhancements/Backlogº
- @[https://github.com/kubernetes/enhancements]
ºSIG-Appsº
- @[https://github.com/kubernetes/community/blob/master/sig-apps/README.md]
Special Interest Group for deploying and operating apps in Kubernetes.
- They meet each week to demo and discuss tools and projects.
""" Covers deploying and operating applications in Kubernetes. We focus
on the developer and devops experience of running applications in
Kubernetes. We discuss how to define and run apps in Kubernetes, demo
relevant tools and projects, and discuss areas of friction that can lead
to suggesting improvements or feature requests """
- Yaml Tips:
@[https://www.redhat.com/sysadmin/yaml-tips]
Summary ref:@[https://kubernetes.io/docs/concepts/overview/components/] @[https://kubernetes.io/docs/reference/#config-reference] - k8s orchestates pools of CPUs, storage and networks Kubernetes Cluster Components: │ºetcdº: k8s "brain memory" │ º1+ Masterº: │- High availability key/value│ ^ cluster of masters │ddbb used to save the cluster│ - manages the cluster. │metadata, service registrat. │ - keeps tracks of: │and discovery │ -ºdesired stateº └─────────────────────────────┘ - application scalation - rolling updates - Raft consensus used in multimaster mode (requires 1,3,5,... masters @[https://raft.github.io/]) └────────┬─────────────────┘ │ ┌───────────────┬────────────┬────────┴────────────────┬──────────┬─────────────────────────────┐ ºkube─apiserver:º │ºkube─controller─managerº │ºkube-schedulerº ºcloud─controller─managerº ─ (REST) Wrapper around│ ─Oº:EMBEDS K8S CORE CONTROL LOOPS!!!º│ ─ Assigns workloads to nodes ─ k8s 1.6Rºalpha featureº k8s objects │ ─ºhandles a number of controllers:º │ ─ tracks host-resource ussage ─ runs controllers - Listen for management│ ─ regulating the state of cluster │ ─ tracks total resources interacting with underlying tools (kubectl, │ ─ perform routine tasks │ available per server and cloud provider dashboard,...) │ ─ ensures number of replicas for │ resources allocated to ─ affected controllers │ a service, │ existing workloads assigned ─ Node Controller │ ─ ... │ to each server ─ Route Controller │ │ ─ Manages availability, ─ Service Controller │ │ performance and capacity ─ Volume ControllerCluster │ │ ºfederation-apiserverº ºfederation-controller-managerº - API server for federated - embeds the core control loops shipped clusters with Kubernetes federation. (RºWARNº: An "Application" will contain also Controllers like replica-sets, batch ... not shown in next diagram) │ K8S │ │CLUSTER│ 1 ←··→ N│Namespace│ 1 ←───→ NBº│Application│º └┬──────┘ ========= ============ ├ etcd "Virtual cluster" "Application" is an informal term in k8s. ├ Master º*1º In practice it could be something similar to: └ Node Pool 1 DDBB Deployment → (0+ PVCs, Service, ConfigMaps, Secrets, ...) └┬──────┘ + 1 ESB Deployment → (0+ PVCs, Service, ConfigMaps, Secrets, 0-1 Ingress,...) ├ node 1 + 1 Middlewa.Deployment → (0+ PVCs, Service, ConfigMaps, Secrets, 1 Ingress,...) ├ node 2 + ... └────┬───┘ └ ... N A deployment defines: └┬───┘ - (OCI) image/s to deploy in pods. ├ kubelet agent º*1º - Desired states for pods (Containers, Num. or replicas, live-probes, ...) ├ Container Runtime (Docker/rkt/...) - Rolling policies └ kube-proxy - ... ---------- - ... ^ - Gº"SDN enabler"º · - simple or round-robin TCP/UDP · stream forwarding across set · of back-ends. · - Listen for new Service/Service.EndPoint · events from ºMasterº to update · Service Virtual IPs to · real EndPoint IPs with the help of · proxy-ports, iptables,.... · Node VM/computer serving as a worker machine where pods. To get External IPs of all nodes: $º$ kubectl get nodes -o jsonpath=\ º $º '{.items[*].status.addresses[?(@.type=="ExternalIP")].address}'º º*1º kubelet agents takes a set of PodSpecs and ensures thate described containers are running and healthy. Pod 1 ←················→ 1+ Container ==== ========= · Minimum Unit of Exec/Sched. - executable OCI ("Docker") image · At creation time resource -ºLifecycle Hooksº limits for CPU/memory are - PostStart defined (min and max-opt) - PreStop · Ussually Pod 1←→1 Container - containers in a Pod are co-located/co-scheduled on the same cluster node. 
(rarely 2 when such - type Container struct { containers are closely Name string related to make Pod work) Image string ·RºPods are ephemeral.º Command []string Deploy Controller+ Args [] string Pods is high available. WorkingDir string Ports []ContainerPort · Define shared resources EnvFrom []EnvFromSource for its container/s: Env []EnvVar o Volumes: (Shared Storage) Resources ResourceRequirements o Networking: (Sharedºuniqueº VolumeMounts []VolumeMount ºcluster IPº) VolumeDevices []VolumeDevice o Metadata: (Container image LivenessProbe *Probe ←······· kubelet exec the probes version, ports to use,...) ReadinessProbe *Probe - ExecAction Lifecycle *Lifecycle - TCPSocketAction · Represent a "logical host" TerminationMessagePath string - HTTPGetAction TerminationMessagePolicy TerminationMessagePolicy · A Pod is tied to the Node where ImagePullPolicy PullPolicy it has been scheduler to run, SecurityContext *SecurityContext and remains there until Stdin bool termination. } · In case of failure, controllers can be scheduled to run on other available Nodes in the cluster. ·ºControllers takes charge ofº ºscalling /"healing" pods. º REF: @[https://kubernetes.io/docs/Architecture/architecture_map.html?query=service+registration+discovery] ┌───────────┐ ─ Describes the resources available ─┐ │Node Status│ ─ CPU │ ├───────────┤ ─ memory │ │Addresses ←─(HostName, ExternalIP, InternalIP) ─ max.number of pods supported └───────→Capacity │ ^^^^^^^^^^ ┌───────────────────────────────────────────→Condition │ Typically visible status ofºallºRunning nodes. │Info ← General info. like within cluster Node Description └───────────┘ kernel/k8s/docker ver, ----------------- ------------------------------------- OS name,... OutOfDisk True → insufficient free space on the node for adding new pods Ready True → node is healthy and ready to accept pods if not True after ºpod─eviction─timeoutº (5 minutes by default) an argument is passed to the kube-controller-manager and all Pods MemoryPressure True → pressure exists on the node memory PIDPressure True → pressure exists on the processes DiskPressure True → pressure exists on the disk size NetworkUnavailable True → node network not correctly configured
Network REF:@[https://kubernetes.io/docs/concepts/architecture/cloud-controller/] NETWORK SCHEMA WITHOUT CLOUD-CONTROLLER | NETWORK SCHEMAºWITHº CLOUD-CONTROLLER: o Cloud connector _ _ | o Cloud connector _ _ ___| | ___ _ _ __| | | ___| | ___ _ _ __| | kube─controller─managero──→ / __| |/ _ \| | | |/ _` | | cloud-controller─managero──→ / __| |/ _ \| | | |/ _` | ^ | (__| | (_) | |_| | (_| | | ^ | (__| | (_) | |_| | (_| | │ \___|_|\___/ \__,_|\__,_| | kube─controller─manager │ \___|_|\___/ \__,_|\__,_| │ ^ | ^ │ │ │ | │ │ │ │ | └─────────────────┤ v │ node | │ etcd ←───→ kube─apiserver ←──┬─┐ ┌─o────────┐ | v node │ └──→kubelet ←───┐ | etcd ←───────→ kube─apiserver ←──┬─┐ ┌──────────┐ ^ │ │ │ │ | │ └─→kubelet ←───┐ │ │ │ │ │ | ^ │ │ │ │ v │ │ │ │ | │ │ │ │ │ kube─scheduler └────→kube─proxy│ │ | v │ │ │ │ └──────────┘ │ | kube─scheduler └───→kube─proxy│ │ │ | └──────────┘ │ │ ┌────────┐ │ └─→Image ←───────────────────────────────────────────────┘ │Registry│ └────────┘
Object Management REF: - @[https://kubernetes.io/docs/tasks/tools/install-kubectl/] - @[https://kubernetes.io/docs/concepts/overview/object-management-kubectl/overview/] - @[https://kubernetes.io/docs/reference/kubectl/cheatsheet/] º┌──────────┐º ┌───────────┐ º│k8s object│º ┌─→│metadata │ º├──────────┤º │ ├───────────┤ º│kind │º one of theGºResource typesº │ │name │← maps to /api/v1/pods/name º│metadata ←───────────────────────────────┘º │UID │← Distinguish between historical º│spec │← desired stateº │namespace │ occurrences of similar object º│state │← present stateº │labels │ º└──────────┘º │annotations│ └───────────┘ Bºmetadata is organized around the concept of an application.º k8s does NOT enforce a formal notion of application. Apps are described thorugh metadata in a loose definition. etcd kubernetes (/openapi/v2) cli management: (serialized ←────── objects ←────────────────→ API ←───→ $ºkubectlº'action'Gº'resource'º API resource - Represent API resources Serverº*1º ^^^^^^ states) (persistent entities ºget º: list resources in a cluster) ºdescribeº: show details (and events for pods) - They can describe: ºlogs º: print container logs - apps running on nodes ºexec º: exec command on - resources available to container a given app ºapply º: creates and updates resources - policies around apps ... (restart, upgrades, ...) common kubectl flags: $º--all-namespaces º $º-o wide º $º--include-uninitialized º $º--sort-by=.metadata.name º $º--sort-by='.status.containerStatuses[0].restartCount'º $º--selector=app=cassandra º º*1º: @[https://kubernetes.io/docs/conceptsverview/kubernetes-api/]
Resource Types GºRESOURCE TYPES SUMMARYº clusters │podtemplates │statefulsets (cs)componentstatuses │(rs)replicasets │(pvc)persistentvolumeclaims (cm)configmaps │(rc)replicationcontrollers │(pv) persistentvolumes (ds)daemonsets │(quota)resourcequotas │(po) pods (deploy)deployments │cronjob │(psp)podsecuritypolicies (ep)endpoints │jobs │secrets (ev)event │(limits)limitranges │(sa)serviceaccount (hpa)horizon...oscalers │(ns)namespaces │(svc)services (ing)ingresses │networkpolicies │storageclasses │(no)nodes │thirdpartyresources GºRESOURCE TYPES EXTENDEDº @[https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/types.go] 001 Volume 021 FlexPersistentVolumeSource 041 DownwardAPIProjection 002 VolumeSource 022 FlexVolumeSource 042 AzureFileVolumeSource 003 PersistentVolumeSource 023 AWSElasticBlockStoreVolumeSource 043 AzureFilePersistentVolumeSource 004 PersistentVolumeClaimVolumeSource 024 GitRepoVolumeSource 044 VsphereVirtualDiskVolumeSource 005 PersistentVolume 025 SecretVolumeSource 045 PhotonPersistentDiskVolumeSource 006 PersistentVolumeSpec 026 SecretProjection 046 PortworxVolumeSource 007 VolumeNodeAffinity 027 NFSVolumeSource 047 AzureDiskVolumeSource 008 PersistentVolumeStatus 028 QuobyteVolumeSource 048 ScaleIOVolumeSource 009 PersistentVolumeList 029 GlusterfsVolumeSource 049 ScaleIOPersistentVolumeSource 010 PersistentVolumeClaim 030 GlusterfsPersistentVolumeSource 050 StorageOSVolumeSource 011 PersistentVolumeClaimList 031 RBDVolumeSource 051 StorageOSPersistentVolumeSource 012 PersistentVolumeClaimSpec 032 RBDPersistentVolumeSource 052 ConfigMapVolumeSource 013 PersistentVolumeClaimCondition 033 CinderVolumeSource 053 ConfigMapProjection 014 PersistentVolumeClaimStatus 034 CinderPersistentVolumeSource 054 ServiceAccountTokenProjection 015 HostPathVolumeSource 035 CephFSVolumeSource 055 ProjectedVolumeSource 016 EmptyDirVolumeSource 036 SecretReference 056 VolumeProjection 017 GCEPersistentDiskVolumeSource 037 CephFSPersistentVolumeSource 057 KeyToPath 018 ISCSIVolumeSource 038 FlockerVolumeSource 058 LocalVolumeSource 019 ISCSIPersistentVolumeSource 039 DownwardAPIVolumeSource 059 CSIPersistentVolumeSource 020 FCVolumeSource 040 DownwardAPIVolumeFile 060 CSIVolumeSource 061 ContainerPort 081 Handler 101 PreferredSchedulingTerm 062 VolumeMount 082 Lifecycle 102 Taint 063 VolumeDevice 083 ContainerStateWaiting 103 Toleration 064 EnvVar 084 ContainerStateRunning 104 PodReadinessGate 065 EnvVarSource 085 ContainerStateTerminated 105 PodSpec 066 ObjectFieldSelector 086 ContainerState 106 HostAlias 067 ResourceFieldSelector 087 ContainerStatus 107 Sysctl 068 ConfigMapKeySelector 088 PodCondition 108 PodSecurityContext 069 SecretKeySelector 089 PodList 109 PodDNSConfig 070 EnvFromSource 090 NodeSelector 110 PodDNSConfigOption 071 ConfigMapEnvSource 091 NodeSelectorTerm 111 PodStatus 072 SecretEnvSource 092 NodeSelectorRequirement 112 PodStatusResult 073 HTTPHeader 093 TopologySelectorTerm 113 Pod 074 HTTPGetAction 094 TopologySelectorLabelRequirement 114 PodTemplateSpec 075 TCPSocketAction 095 Affinity 115 PodTemplate 076 ExecAction 096 PodAffinity 116 PodTemplateList 077 Probe 097 PodAntiAffinity 117 ReplicationControllerSpec 078 Capabilities 098 WeightedPodAffinityTerm 118 ReplicationControllerStatus 079 ResourceRequirements 099 PodAffinityTerm 119 ReplicationControllerCondition 080 Container 100 NodeAffinity 120 ReplicationController 121 ReplicationControllerList 141 DaemonEndpoint 161 Preconditions 181 ResourceQuotaSpec 122 ServiceList 142 NodeDaemonEndpoints 162 
PodLogOptions 182 ScopeSelector 123 SessionAffinityConfig 143 NodeSystemInfo 163 PodAttachOptions 183 ScopedResourceSelerRequirement 124 ClientIPConfig 144 NodeConfigStatus 164 PodExecOptions 184 ResourceQuotaStatus 125 ServiceStatus 145 NodeStatus 165 PodPortForwardOptions 185 ResourceQuota 126 LoadBalancerStatus 146 AttachedVolume 166 PodProxyOptions 186 ResourceQuotaList 127 LoadBalancerIngress 147 AvoidPods 167 NodeProxyOptions 187 Secret 128 ServiceSpec 148 PreferAvoidPodsEntry 168 ServiceProxyOptions 188 SecretList 129 ServicePort 149 PodSignature 169 ObjectReference 189 ConfigMap 130 Service 150 ContainerImage 170 LocalObjectReference 190 ConfigMapList 131 ServiceAccount 151 NodeCondition 171 TypedLocalObjectReference 191 ComponentCondition 132 ServiceAccountList 152 NodeAddress 172 SerializedReference 192 ComponentStatus 133 Endpoints 153 NodeResources 173 EventSource 193 ComponentStatusList 134 EndpointSubset 154 Node 174 Event 194 SecurityContext 135 EndpointAddress 155 NodeList 175 EventSeries 195 SELinuxOptions 136 EndpointPort 156 NamespaceSpec 176 EventList 196 WindowsSecurityContextOptions 137 EndpointsList 157 NamespaceStatus 177 LimitRangeItem 197 RangeAllocation 138 NodeSpec 158 Namespace 178 LimitRangeSpec 139 NodeConfigSource 159 NamespaceList 179 LimitRange 140 ConfigMapNodeConfigSource 160 Binding 180 LimitRangeList
k8s (def.)TCP Ports
  REF @[https://kubernetes.io/docs/tasks/tools/install-kubeadm/]
  ┌─────────────┬──────────────────────────┬───────┬───────┐
  │ Port Range  │ Purpose                  │Master │Worker │
  │             │                          │node(s)│node(s)│
  ├─────────────┼──────────────────────────┼───────┼───────┤
  │ 6443*       │ Kubernetes API server    │   X   │       │
  ├─────────────┼──────────────────────────┼───────┼───────┤
  │ 2379─2380   │ etcd server client API   │   X   │       │
  ├─────────────┼──────────────────────────┼───────┼───────┤
  │ 10250       │ Kubelet API              │   X   │   X   │
  ├─────────────┼──────────────────────────┼───────┼───────┤
  │ 10251       │ kube─scheduler           │   X   │       │
  ├─────────────┼──────────────────────────┼───────┼───────┤
  │ 10252       │ kube─controller─manager  │   X   │       │
  ├─────────────┼──────────────────────────┼───────┼───────┤
  │ 10255       │ Read─only Kubelet API    │   X   │   X   │
  ├─────────────┼──────────────────────────┼───────┼───────┤
  │ 30000─32767 │ NodePort Services        │       │   X   │
  └─────────────┴──────────────────────────┴───────┴───────┘
Application Troubleshooting
REF: @[https://learnk8s.io/]
- @[https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application/]
On a separate (tmux) window monitor kubectl resources like:
$º$ watch -n 4 "kubectl get pods,services,... -n $namespace" º
$º$ kubectl get pods º
└─────┬────────┘
QºQ: Is there any pending Pod?º
└─────────┬────────────────┘
┌─────────┘
├ YES → $º$ kubectl describe º←QºQ: Is the cluster full?º
│ $º pod $pod_name º └────────┬────────────┘
│ ┌─────────────────────────────┘
│ ├ NO →QºQ: Are you hitting ResourceQuotaLimits?º
│ │ └────────────────────┬────────────────┘
│ │ ┌─────────────────────┘
│ │ ├ NO →QºQ: Are you mounting a PENDINGº
│ │ │ Qº PersistentVolumeClaim?º
│  │  │     (kubectl describe pod $pod will show an event
│  │  │      "pod has unbound immediate PersistentVolumeClaim")
│ │ │ └───────────┬────────────────┘
│ │ │ ┌────────────┘
│ │ │ ├ NO →$º$ kubectl get pods º←QºQ: Is the Pod assignedº
│ │ │ │ $º -o wide º Qº to the Node?º
│ │ │ │ └────┬───────────────┘
│ │ │ │ ┌─────────────────────────────┘
│ │ │ │ ├ YES →RºThere is an issue with the Kubeletº
│ │ │ │ │
│ │ │ │ └ NO →RºThere is an issue with the Schedulerº
│ │ │ │
│ │ │ └ YES →BºFix the PersistentVolumeClaimº
│ │ │
│ │ └ YES →BºRelax Quota Limitsº
│ │
│ └ YES →BºProvision a bigger clusterº
│
└ NO →QºQ: Are the Pods Running?º
└──────┬───────────────┘
┌───────────────┘
├ NO → $º$ kubectl logs $pod_name º←QºQ: Can you see the logsº
│ ↑ Qº for the App?º
│ │ └─────────┬───────────┘
│ │ ┌────────────────────────────────────┘
│ │ ├ Yes →BºFix the issue in the Appº
│ │ └ NO →QºQ: Did the container died too Quickly?º
│ │ └──────┬─────────────────────────────┘
│ │ ┌───────────┘
│ │ ├ NO → $º$ kubectl describe º←QºQ: Is the Pod in statusº
│ │ │ $º pod $pod_name º Qº ImagePullBackOff? º
│ │ │ └──┬──────────────────┘
│ │ │ ┌────────────────────────────────┘
│ │ │ ├ NO →QºQ: Is the Pode Status CrashLoopBackOff?º
│ │ │ │ └─────────┬─────────────────────────┘
│ │ │ │ ┌────────────────┘
│ │ │ │ ├ NO →QºQ: Is the Pod status RunContainerError?º
│ │ │ │ │ └─────────┬───────────────────────────┘
│ │ │ │ │ ┌───────────────┘
│ │ │ │ │ ├ NO →BºConsult StackOverflowº
│ │ │ │ │ │
│ │ │ │ │ └ YES →BºThe issue is likely to be withº
│ │ │ │ │ Bºmounting volumesº
│ │ │ │ │
│ │ │ │ └ YES →QºQ: Did you inspect the logs and fixed the crashes?º
│ │ │ │ $º$ kubectl logs --previous $POD_NAMEº
│ │ │ │ (--previous: See logs of chrased pod)
│ │ │ │ └─────────┬─────────────────────────┘
│ │ │ │ ┌───────────────┘
│ │ │ │ ├ NO →BºFix the app crahsesº
│ │ │ │ │
│ │ │ │ └ YES →QºQ: Did you forget the 'CMD' instructionº
│ │ │ │ Qº in the Dockerfile? º
│ │ │ │ └──────────┬──────────────────────────┘
│ │ │ │ ┌────────────────┘
│ │ │ │ ├ YES →BºFix the Dockerfileº
│ │ │ │ │
│ │ │ │ └ NO →QºQ: Is the Pod restarting frequently?º
│ │ │ │ Qº Cycling between Running and º
│ │ │ │ Qº CrashLoopBackoff?º
│ │ │ │ └──────────┬────────────────────────┘
│ │ │ │ ┌──────────────────┘
│ │ │ │ ├ YES →BºFix the liveness probeº
│ │ │ │ │
│ │ │ │ └ NO →RºUnknown Stateº
│ │ │ │
│ │ │ └ YES →QºQ: Is the name of the image correct?º
│ │ │ └────┬─────────────────────────────┘
│ │ │ ┌────────────┘
│ │ │ ├ NO →BºFix the image nameº
│ │ │ │
│ │ │ └ YES →QºQ: Is the image tag valid?º
│ │ │ Qº Does it exists?º
│ │ │ └──┬──────────────────────┘
│ │ │ ┌───────────┘
│ │ │ ├ NO →BºFix the tagº
│ │ │ │
│ │ │ └ YES →QºQ: Are you pulling images from aº
│ │ │ Qº private registry?º
│ │ │ └─────────┬────────────────────┘
│ │ │ ┌─────────────────┘
│ │ │ ├ NO →BºThe Issue could be with CRI|Kubeletº
│ │ │ │
│ │ │ └ YES →BºConfigure pulling images from aº
│ │ │ Bºprivate registryº
│ │ │
│ │ └ YES → $º$ kubectl logs $pod_name --previous º
│ │ └────────────┬──────────────────┘
│ └───────────────────────────────┘
 └ YES →QºQ: Are the Pods READY?º
└──┬──────────────────┘
┌───────────┘
├ NO → $º$ kubectl describe º←QºQ: Is the Readiness probeº
│ $º pod $pod_name º Qº failing?º
│ └──┬────────────────────┘
│ ┌───────────────────────┘
│ ├ YES →BºFix the Readiness Probeº
│ │
│ └ NO →RºUnknown Stateº
│
└ YES → $º$ kubectl port-forward \ º ←QºQ: Can you access the app?º º*1º
$º $pod_name 8080:$pod_port º └────────────┬───────────┘
│
º*1ºTIP: Test with a command similar to $º$ wget localhost:8080/... º
if error Rº"... error forwarding port 8080 to pod 1234..."º
is displayed check that pod 1234... is the intended one by
executing $º kubectl describe pods º and checking pod number
is correct. If it isn't, delete and recreate the service.
             try again to see if $ºwget ...º works. Continue with next
checks otherwise.) │
│
┌─────────────────────────────────────────────────────────┘
└ QºQ: Do you have 2+ different deployments with colliding selector names?º
(this can be the case in "complex" Apps composed of N deployments)
└──────────────────────┬─────────────────────────────────────────┘
┌─────────────────────────────┘
├ YES: Fix one or more deployments to avoid colliding selectors
│ For example if two deployments have a selector labels like
│ 'app: myApp' split into 'app: myApp-ddbb' and 'app: myApp-frontend' or
│ selector 1: selector 2:
│ 'app: myapp' 'app: myapp'
│ 'layer: ddbb' 'layer: frontend'
│ update both on Deployment and related services
│
├ NO →QºQ: Is the port exposed by container correctº
│ Qº and listening on 0.0.0.0?º
│ You can check it like: (-c flag optional for 1 container pods)
│ $º$ kubectl exec -ti $pod -c $container -- /bin/sh º
│ $º# netstat -ntlp º
│ └────────────────────┬────────────────────┘
│ ┌─────────────────────┘
│ ├ NO →BºFix the app. It should listen onº
│ │ Bº0.0.0.0.º
│ │ BºUpdate the containerPortº
│ │
│ └ YES →RºUnknown Stateº
│ Try debugging issues in cluster:
│ $º$ SELECTOR="" º
│ $º$ SELECTOR="${SELECTOR},status.phase!=Running" º
│ $º$ SELECTOR="${SELECTOR},spec.restartPolicy=Always" º
│ $º$ kubectl get pods --field-selector=${SELECTOR} \ º ← Check failing pods in
│ $º -n kube-system º kube-system namespace
↓
YES
↓
Bº***************************º
Bº*POD ARE RUNNING CORRECTLY*º
Bº***************************º
└──────────┬────────────┘
↓
$º$ kubectl describe \ º ←QºQ: Can you see a list of endpoints?º
$º service $SERVICE_NAME º └────────────┬────────────────────┘
┌─────────────────────────────────────────┘
├ NO →QºQ: Is the Selector matching the right Pod label?º
│ └──┬───────────────────────────────────────────┘
│ ┌─────┘
│ ├ NO → Fix the Service selector to match targeted-Pod labels
│ │
│ └ YES →QºQ: Does the Pod have an IP address assigned?º
│ └──┬───────────────────────────────────────┘
│ ┌─────┘
│ ├ NO → There is an issue with
│ │ the Controller Manager
│ │
│ └ YES → There is an issue with the Kubelet
│
└ YES →$º$ kubectl port-forward \ º←QºQ: Can you visit the app?º
$º service/$SERVICE_NAME \ º └──┬────────────────────┘
$º 8080:$SERVICE_PORT º │
│
┌───────────────────────────────────────────┘
├ NO →QºQ: Is the targetPort on the Service º
│ Qº matching the containerPort in theº
│ Qº Pod?º
│ └──┬────────────────────────────────┘
│ ┌─────┘
│ ├ NO →BºFix the Service targetPort andº
│     │        Bºthe containerPortº
│ │
│ └ YES →BºThe issue could be with Kube Proxyº
↓
YES
↓
Bº******************************º
Bº*SERVICE IS RUNNING CORRECTLY*º
Bº******************************º
↓
$º$ kubectl describe \ º←QºQ: Can you see a list of Backends?º
$º ingress $INGRESS_NAME º └──┬─────────────────────────────┘
┌──────────────────────────────┘
├ NO →QºQ: Are the serviceName and servicePortº
│     Qº  matching the service?º
│ └──┬─────────────────────────────────┘
│ ┌───┘
│ ├ NO →BºFix the ingress serviceName and servicePortº
│ │
│ └ YES →BºThe issue is specific to the Ingress Controllerº
│ BºConsult the docs for your Ingressº
↓
YES
↓
Bº**********************************º
Bº*THE INGRESS IS RUNNING CORRECTLY*º
Bº**********************************º
Bº(The app should be working now!!!)º
│
↓ ┌ NO →RºThe issue is likely to be with the º
│ RºInfrastructure and how the cluster isº
QºQ: Can you visit app º→ ─┤ Rºexposedº
Qº from the Internet?º │
│ Bº*****º
└ YES → Bº*END*º
Bº*****º
Kubectl
- @[https://kubernetes.io/docs/reference/kubectl/jsonpath/]
@[https://kubernetes.io/docs/reference/kubectl/overview/]
@[https://kubernetes.io/docs/reference/kubectl/kubectl/]
@[https://kubernetes.io/docs/reference/kubectl/kubectl-cmds/]
@[https://kubernetes.io/docs/reference/kubectl/conventions/]
@[https://kubernetes.io/docs/reference/kubectl/docker-cli-to-kubectl/]
$º$ KUBE_EDITOR="vim" º
$º$ source ˂(kubectl completion bash) º
$º$ source ˂(kubectl completion zsh) º
- use multiple kubeconfig files
$º$ KUBECONFIG=~/.kube/config:~/.kube/kubconfig2 \ º
$º kubectl config view # Show Merged kubeconfig º
$º # settings º
$º$ kubectl config current-context º
$º$ kubectl config use-context my-cluster-name º
$º$ kubectl run nginx --image=nginx º
$º$ kubectl explain pods,svc º
$º$ kubectl edit svc/my-service-1 # Edit resource º
ºkubectl Field Selectorsº
- filter k8s objects based on the value of one or more object fields.
- the possible field depends on each object type/kind but
all resource-types support 'metadata.name' and 'metadata.namespace'
Ex:
$º$ kubectl get pods --field-selector metadata.name==my-service,spec.restartPolicy=Alwaysº
$º$ kubectl get pods --field-selector metadata.namespace!=default,status.phase==Running º
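  Field selectors can be combined with label selectors ('-l' / '--selector') and
  output flags; a hedged example ('app=nginx' and namespace 'prod' are illustrative
  names only, not from the notes above):
  $º$ kubectl get pods -l app=nginx \                                   º
  $º    --field-selector=status.phase=Running -n prod -o wide           º
  $º$ kubectl get services --field-selector metadata.namespace!=default º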
Secrets
(Handling passwords, OAuth tokens, ssh keys, ... in k8s)
REFs: @[https://kubernetes.io/docs/concepts/configuration/secret/]
@[https://github.com/kubernetes/community/blob/master/contributors/design-proposals/auth/secrets.md]
@[https://kubernetes.io/docs/tasks/inject-data-application/distribute-credentials-secure/]
- At runtime Pods reference the secrets in 2 ways:
- as files in a mounted volume.
- as ENVIRONMENT variables.
- Also used by kubelet when pulling images for the pod
-ºk8s Built-in Secretsº
  - k8s automatically creates Secrets with API credentials and attaches them
    to Service Accounts, granting API access.
    BºNOTEº: Pods are automatically modified to use them.
ºCreating User Secretsº
ºSTEP 1: declare secretsº
┌───────────────────────────────────────┬──────────────────────────────────────────┬────────────────────────────────────────
│ ALTERNATIVE 1 │ ALTERNATIVE 2 │ ALTERNATIVE 3
│ Create un-encrypted secrets locally │ Create Secret with data │ Create Secret with stringData
│ │ │
│ $ echo -n 'admin' ˃ ./username.txt │ cat ˂˂ EOF ˃ secret.yaml │ apiVersion: v1
│ $ echo -n '1f2d....' ˃ ./password.txt │ apiVersion: v1 │ kind:ºSecretº
│ ^^^^^^^^^^^^ │ kind:ºSecretº │ metadata:
│ special chars. must be │ metadata: │ name: mysecret
│ '\' escaped. │ name: mysecret │ type: Opaque
│ │ type: Opaque │ stringData:
│ │ data: │ config.yaml: |-
│ │ username: $(echo -n 'admin' | base64)│ apiUrl: "https://my.api.com/api/v1"
│ │ password: $(echo -n '1f2..' | base64)│ username: admin
│ │ EOF │ password: 1f2...
ºSTEP 2: "Inject" to k8s cluster º
┌───────────────────────────────────────┬──────────────────────────────────────────┬────────────────────────────────────────
│ Package into a Secret k8s object │$º$ kubectl apply -f ./secret.yamlº │$º$ kubectl apply -f ./secret.yamlº
│$º$ kubectlºcreate secretº\ º │ │
│$º genericGºdb-user-passº \ º │ │
│$º --from-file=./username.txt \ º │ │
│$º --from-file=./password.txt º │ │
ºSTEP 3: check created secretº
┌──────────────────────────────────────────┐
│$º$ kubectl get secrets º│
│ → NAME TYPE DATA AGE │
│Gº→ db-user-pass Opaque 2 51s º│
│ │
│$º$ kubectl describe secrets/db-user-passº│
│ → Name: Gºdb-user-passº │
│ → Namespace: default │
│ → ... │
│ → Type: Opaque │
│ → │
│ → Data │
│ → ==== │
│ →ºpassword.txt:º 12 bytes │
│ →ºusername.txt:º 5 bytes │
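  To read a stored value back while debugging (values are only base64-encoded,
  not encrypted), something like the following should work; the '\.' escapes the
  dot inside the 'username.txt' key name:
  $º$ kubectl get secret db-user-pass \                        º
  $º    -o jsonpath='{.data.username\.txt}' | base64 --decode  º ← prints 'admin'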
º"Injecting" secrets into Podsº
│ Alt 1.a: Mount file as volume │ Alt 1.b: Mount items file as volume │ Alt 2: Consume as ENV.VAR
│ (App will read the file to fetch the secrets) │ (App will read env.vars to fetch
│ │ │ the secrets)
│ │ │
│ apiVersion: v1 │ apiVersion: v1 │ apiVersion: v1
│ kind:ºPodº │ kind:ºPodº │ kind: Pod
│ metadata: │ metadata: │ metadata:
│ name: mypod │ name: mypod │ name: secret-env-pod
│ spec: │ spec: │ spec:
│ containers: │ containers: │ containers:
│ ─ name: mypod │ - name: mypod │ - name: mycontainer
│ image: redis │ image: redis │ image: redis
│ volumeMounts: │ volumeMounts: │ ºenv:º
│ ─ name:ºfoOº │ - name:ºfooº │ - name: SECRET_USERNAME
│ mountPath: "/etc/foo" │ mountPath: "/etc/foo" │ valueFrom:
│ readOnly:ºtrueº │ readOnly: true │ ºsecretKeyRef:º
│ volumes: │ volumes: │ name:Gºdb─user─passº
│ ─ name:ºfoOº │ - name:ºfooº │ ºkey: usernameº
│ secret: │ secret: │ - name: SECRET_PASSWORD
│ secretName:Gºdb─user─passº│ secretName:Gºdb─user─passº │ valueFrom:
│ defaultMode: 256 │ items: │ ºsecretKeyRef:º
│ ^ │ - key: username │ name:Gºdb─user─passº
│ · │ path: my-group/my-username │ key: password
· ^^^^^^^^^^^^^^^^^^^^
JSON does NOT support · username will be seen in container as:
octal notation. /etc/foo/my-group/my-username
256 = 0400 · password secret is not projected
BºExample 2:º
SECRET CREATION | SECRET USSAGE
$ kubectl create secret generic \ | apiVersion: v1
Gºssh-key-secretº \ | kind: Pod
--from-file=ssh-privatekey=/path/to/.ssh/id_rsa \ | metadata:
--from-file=ssh-publickey=/path/to/.ssh/id_rsa.pub | name: secret-test-pod
| labels:
| name: secret-test
| spec:
| volumes:
| - name: secret-volume
| secret:
| secretName:Gºssh-key-secretº
| containers:
| - name: ssh-test-container
| image: mySshImage
| volumeMounts:
| - name: secret-volume
| readOnly: true
| mountPath: "/etc/secret-volume"
| ^^^^^^^^^^^^^^^^^^^^
| secret files available visible like:
| /etc/secret-volume/ssh-publickey
| /etc/secret-volume/ssh-privatekey
FEATURE STATE: Kubernetes v1.13 beta
You can enable encryption at rest for secret data, so that secrets are not stored in the clear in etcd.
(Pod)ConfigMap @[https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/] ConfigMaps: decouple configuration artifacts from image content ºCreate a ConfigMapº $ kubectl create configmap \ 'map-name' \ 'data-source' ← data-source: directories, files or literal values translates to the key-value pair in the ConfigMap where key = file_name or key provided on the cli value = file_contents or literal_value provided on the cli You can use kubectl describe or kubectl get to retrieve information about a ConfigMap. ConfigMapºfrom config-fileº: ┌───────────────────────────── ┌────────────────────────────────────── │ºSTEP 0:º Input to config map │ºSTEP 1:º create ConfigMap object │ ...configure─pod─container/ │ $ kubectl create configmap \ │ └─ configmap/ │ game-config │ ├─ºgame.propertiesº │ --from-file=configmap/ │ └─º ui.propertiesº │ ^^^^^^^^^^ │ │ combines the contents of │ │ all files in the directory ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────── │ºSTEP 2:º check STEP 1 │$ kubectlºget configmapsºgame-config -o yaml │$ kubectlºdescribe configmapsºgame-config │→ apiVersion: v1 │→ │→ kind: ConfigMap │→ Name: game-config │→ metadata: │→ Namespace: default │→ creationTimestamp: 2016-02-18T18:52:05Z │→ Labels:Containers│→ name: game-config │→ Annotations: │→ namespace: default │→ │→ resourceVersion: "516" │→ Data │→ selfLink: /api/v1/namespaces/default/configmaps/game-config │→ ==== │→ uid: b4952dc3-d670-11e5-8cd0-68f728db1985 │→ºgame.properties:º 158 bytes │→ data: │→ºui.properties: º 83 bytes │→ ºgame.properties:º| │→ enemies=aliens │→ lives=3 ┌─────────────────────────────────── │→ enemies.cheat=true │ºSTEP 3:ºUse in Pod container* │→ enemies.cheat.level=noGoodRotten │ apiVersion: v1 │→ secret.code.passphrase=UUDDLRLRBABAS │ kind: Pod │→ secret.code.allowed=true │ metadata: │→ secret.code.lives=30 │ name: dapi-test-pod │→ ºui.properties:º| │ spec: │→ color.good=purple │ containers: │→ color.bad=yellow │ - name: test-container │→ allow.textmode=true │ ... │→ how.nice.to.look=fairlyNice │ env: │ º- name: ENEMIES_CHEAT º │ º valueFrom: º │ º configMapKeyRef: º │ º name: game-config º │ º key: enemies.cheat º ConfigMapºfrom env-fileº: │ºSTEP 0:º Input to config map │ºSTEP 1:º create ConfigMap object │ ...configure─pod─container/ │ $ kubectl create configmap \ │ └─ configmap/ │ game-config-env-file \ │ └─ºgame-env-file.propertiesº │ --from-env-file=game-env-file.properties │ ^^^^^^^^^^^^^^^^^^^^^^^^ │ game-env-file.properties │ enemies=aliens │ ºSTEP 2:º Check STEP 1 │ lives=3 │ $ kubectl get configmap game-config-env-file -o yaml │ allowed="true" │ → apiVersion: v1 │ → kind: ConfigMap │ → metadata: │ → creationTimestamp: 2017-12-27T18:36:28Z │ → name: game-config-env-file │ → namespace: default │ → resourceVersion: "809965" │ → selfLink: /api/v1/namespaces/default/configmaps/game-config-env-file │ → uid: d9d1ca5b-eb34-11e7-887b-42010a8002b8 │ → data: │ → allowed: '"true"' │ → enemies: aliens │ → lives: "3" NOTE: kubectl create configmap ... --from-file=Bº'my-key-name'º='path-to-file' will create the data under: → .... → data: → Bºmy-key-name:º → key1: value1 → ... ºConfigMaps from literal valuesº º STEP 1º | º STEP 2º $ kubectl create configmap special-config \ | → ... --from-literal=special.how=very \ | → data: --from-literal=special.type=charm | → special.how: very | → special.type: charm restartPolicy: Never
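  Ex. (illustrative sketch): a Pod consuming the 'special-config' ConfigMap created
  above, both as environment variables (envFrom) and as files under a mounted volume;
  the pod and volume names are hypothetical:
    apiVersion: v1
    kind: Pod
    metadata:
      name: configmap-demo-pod          # hypothetical name
    spec:
      restartPolicy: Never
      containers:
      - name: demo
        image: busybox:1.28
        command: ['sh', '-c', 'env; cat /etc/config/special.how']
        envFrom:
        - configMapRef:
            name: special-config        # every key becomes an ENV.VAR
        volumeMounts:
        - name: config-volume
          mountPath: /etc/config        # every key becomes a file under /etc/config
      volumes:
      - name: config-volume
        configMap:
          name: special-config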
Init Containers
@[https://kubernetes.io/docs/concepts/workloads/pods/init-containers/]
- one or more specialized Containers that run before app Containers
-ºcan contain utilities or setup scripts not present in an app imageº
- exactly equal to regular Containers, except:
- They always run to completion.
    k8s will restart it repeatedly until it succeeds
(unless restartPolicy == Never)
- Each one must complete successfully before the next one is started.
- status is returned in .status.initContainerStatuses
^^^^^^^^^^^^^^^^^^^^^
vs .status.containerStatuses
- readiness probes do not apply
Note: Init Containers can also be given access to Secrets that app Containers
are not able to access.
- Ex: Wait for 'myService' and then for 'mydb'.
│ apiVersion: v1 │apiVersion: v1 │apiVersion: v1
│ kind:ºPodº │kind:ºServiceº │kind:ºServiceº
│ metadata: │metadata: │metadata:
│ name: myapp─pod │ name:Gºmyserviceº │ name:Bºmydbº
│ labels: │spec: │spec:
│ app: myapp │ ports: │ ports:
│ spec: │ ─ protocol: TCP │ ─ protocol: TCP
│ containers: │ port: 80 │ port: 80
│ ─ name: myapp─container │ targetPort: 9376 │ targetPort: 9377
│ image: busybox:1.28 └──────────────────── └─────────────────────
│ command: ['sh', '─c', 'echo The app is running! ⅋⅋ sleep 3600']
│ ºinitContainers:º
│ ─ name:Oºinit─myserviceº
│ image: busybox:1.28
│    command: ['sh', '─c', 'until nslookup Gºmyserviceº; do sleep 2; done;']
│ ─ name:Oºinit─mydbº
│ image: busybox:1.28
│    command: ['sh', '─c', 'until nslookup Bºmydbº ; do sleep 2; done;']
Inspect init Containers like:
$ kubectl logs myapp-pod -c Oºinit-myserviceº
$ kubectl logs myapp-pod -c Oºinit-mydb º
RºWARN:º
- Use activeDeadlineSeconds on the Pod and livenessProbe on the Container
to prevent Init Containers from failing forever.
- Debugging Init Containers:
@[https://kubernetes.io/docs/tasks/debug-application-cluster/debug-init-containers/]
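  Ex. (sketch of the activeDeadlineSeconds hint above; note the field bounds the
  lifetime of theºwholeºPod, so it is mainly useful for Jobs / short-lived pods,
  not for long-running services):
    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp-pod
    spec:
      activeDeadlineSeconds: 300        # whole Pod (init + app containers) killed after 300s
      initContainers:
      - name: init-myservice
        image: busybox:1.28
        command: ['sh', '-c', 'until nslookup myservice; do sleep 2; done;']
      containers:
      - name: myapp-container
        image: busybox:1.28
        command: ['sh', '-c', 'echo running; sleep 3600']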
Containers Limits
@[https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/]
- See also design proposal:
@[https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/resource-qos.md]
A Container:
- ºcan exceedº its resourceºrequestºif the Node has memory available,
- isºnot allowedºto use more than its resource ºlimitº.
- spec.containers[].resources.limits.cpu ← can NOT exceed resource limit → it will be scheduled
- spec.containers[].resources.limits.memory for termination
- spec.containers[].resources.requests.cpu ← can exceed resource request
- spec.containers[].resources.requests.memory
- spec.containers[].resources.limits.ephemeral-storage k8s 1.8+
- spec.containers[].resources.requests.ephemeral-storage k8s 1.8+
Note: resource quota feature can be configured to limit the total amount of resources
 that can be consumed. In conjunction with namespaces, it can prevent one team from
hogging all the resources.
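 Ex. (illustrative sketch, hypothetical names): a container declaring both requests
 (used for scheduling) and limits (enforced at runtime):
    apiVersion: v1
    kind: Pod
    metadata:
      name: limits-demo                 # hypothetical name
    spec:
      containers:
      - name: app
        image: nginx
        resources:
          requests:
            cpu: "250m"                 # 0.25 core reserved for scheduling
            memory: "64Mi"
          limits:
            cpu: "500m"                 # throttled above 0.5 core
            memory: "128Mi"             # above this the container may be OOM-killed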
 Container ºComputeº Resource types:
- CPU : units of cores
- total amount of CPU time that a container can use every 100ms.
(minimum resolution can be setup to 1ms)
- 1 cpu is equivalent to: 1 (AWS vCPU|GCP Core|Azure vCore,IBM vCPU,Hyperthread bare-metal )
- 0.1 cpu == 100m
- memory : units of bytes:
- Ex: 128974848, 129e6, 129M, 123Mi
- Local ephemeral storage(k8s 1.8+):
- No QoS can be applied.
- Ex: 128974848, 129e6, 129M, 123Mi
 When using Docker:
   the CPU request (in cores) is multiplied by 1024, and the greater of that
   number or 2 is used as the value of the --cpu-shares flag in the docker run command.
ºExtended resourcesº
- fully-qualified resource names outside the kubernetes.io domain.
  Usage:
- STEP 1: cluster operator must advertise an Extended Resource.
- STEP 2: users must request the Extended Resource in Pods.
(See official doc for more info)
It is planned to add new resource types, including a node disk space resource, and a framework for adding custom resource types.
Life Cycle
  @[https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/]
  DEFINITION                            END
  defined → scheduled in node → run → alt1: run until their container(s) exit
                                      alt2: pod removed (for any other reason)
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                      depending on policy and exit code, the pod
                                      may be removed after exiting, or may be
                                      retained in order to keep access to the
                                      containers' logs.
  Pod.PodStatus.phase := Pending|Running|Succeeded|Failed
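  A quick way to inspect the phase from the CLI (the pod name 'mypod' is illustrative):
  $º$ kubectl get pod mypod -o jsonpath='{.status.phase}'      º ← Pending|Running|Succeeded|Failed
  $º$ kubectl get pods --field-selector=status.phase=Succeeded º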
Pod Preset
  @[https://v1-17.docs.kubernetes.io/docs/concepts/workloads/pods/podpreset/#]
  - objects used to inject information into pods at creation time like
    secrets, volumes, volume mounts, and environment variables.

    Pod      →  PodPreset  →  Apply        →  Merge (inject)    →  new Pod
    Creation    controller    Pod Presets     all found Presets
    Request                   ºmatchingº      into new pod
                              ºnew pod º
                              ºlabels  º

    RºWARNº: In case of errors the process continues (non fail-fast)

  BºPod Preset N ←→ M Pod:º A preset can be used as a default set for common pods.
  Ex:
    apiVersion: settings.k8s.io/v1alpha1
    kind: PodPreset
    metadata:
      ºname: dbcommonº
    spec:
      selector:
        matchLabels:
          use_presets: ddbb01   ← Inject common database into all pods with this label
      env:                    ┐
      ─ name: DB_PORT         │
        value: "6379"         │
      volumeMounts:           │ Common setup to
      ─ mountPath: /cache     ├ ddbb clients
        name: cache─volume    │
      volumes:                │
      ─ name: cache─volume    │
        emptyDir: {}          ┘

  -ºDisable Pod Preset for a Specific Podº: @ Pod Spec add:
    - podpreset.admission.kubernetes.io/exclude: "true".
Pod QoS
  @[https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/]
  Pod.spec.qosClass :=
    Guaranteed  Assigned to pod when:
                - All Pod-Containers have a defined cpu/mem. limit.
                - All Pod-Containers have same cpu/mem.limit and same cpu/mem.request
    Burstable   Assigned to pod when:
                - criteria for QoS class Guaranteed isºNOTºmet.
                - 1+ Pod-Containers have memory or CPU request.
    BestEffort  Assigned to pod when:
                - Pod-Containers do NOT have any memory/CPU request/limit

  Ex. Pod YAML Definition:
    apiVersion: v1
    kind: Pod
    metadata:
      name: test-pd
    spec:
      containers:
      - image: gcr.io/google_containers/test-webserver
        name: test-container
        volumeMounts:
        - mountPath: /cache          ← where to mount
          name:Bºcache-volumeº       ← volume (name) to mount
        - mountPath: /test-pd
          name:Gºtest-volumeº
      volumes:
      - name:Bºcache-volumeº
        emptyDir: {}
      - name:Gºtest-volumeº
        hostPath:                    # directory location on host
          path: /data                # this field is optional
          type: Directory

  Managing running pods
    $ kubectl get pod shell-demo                   # ← ensure pod is running
    $ kubectlºlogsºmy-pod (-c my-container) (-f)   # ← Get logs
                                            ^^ "tail"
    $ kubectlºattachºmy-pod -i
    $ kubectlºport-forwardºmy-pod 5000:6000
                                  ^    ^
                              local    pod
                            machine    port
    $ kubectlºrunº-i \              # ← Exec. shell for image
        --tty busybox \             #   (launch new temporal pod)
        --image=busybox \
        -- sh
    $ kubectlºexecºmy-pod \         # ← Get a shell inside already-running Pod
        (-c my-container) \         # ← Required if Pod has 2+ containers
        -it \                       # ← Allocate terminal.
        -- /bin/bash
    $ kubectlºexecºmy-pod \         # ← Execute non-interactively inside already-running Pod
        (-c my-container) \         # ← Required if Pod has 2+ containers
        env                         # ← env execution will dump all ENV.VARs inside container
    $ kubectlºtopºpod POD_NAME --containers
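  Ex. (sketch) for the QoS classes listed at the top of this section: requests == limits
  on every container yields 'Guaranteed'; the pod name 'qos-demo' is hypothetical:
    apiVersion: v1
    kind: Pod
    metadata:
      name: qos-demo
    spec:
      containers:
      - name: app
        image: nginx
        resources:
          requests: { cpu: "200m", memory: "128Mi" }
          limits:   { cpu: "200m", memory: "128Mi" }   # same as requests → Guaranteed
  $º$ kubectl get pod qos-demo -o jsonpath='{.status.qosClass}' º ← Guaranteed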
SecurityContext
@[https://kubernetes.io/docs/tasks/configure-pod-container/security-context/]
@[https://github.com/kubernetes/community/blob/master/contributors/design-proposals/auth/security_context.md]
Pod.spec.securityContext Ex:
Ex:
apiVersion: v1 Exec a terminal into a running container adnd execute:
kind: Pod $ id
metadata: ºuid=1000 gid=3000 groups=2000º
name: security-context-demo ^^^^ ^^^^
spec: uid/gid 0/0 if not specified
ºsecurityContext: º
º runAsUser: 1000 º
º runAsGroup: 3000º
º fsGroup: 2000º ← owned/writable by this GID when supported by volume
volumes:
- name: sec-ctx-vol ← Volumes will be relabeled with provided seLinuxOptions values
emptyDir: {}
containers:
- name: sec-ctx-demo
...
volumeMounts:
- name: sec-ctx-vol
mountPath: /data/demo
ºsecurityContext:º
º allowPrivilegeEscalation: falseº
º capabilities:º
º add: ["NET_ADMIN", "SYS_TIME"]º ← Provides a subset of 'root' capabilities:
@[https://github.com/torvalds/linux/blob/master/include/uapi/linux/capability.h]
º seLinuxOptions:º
º level: "s0:c123,c456"º
SecurityContext holds security configuration that will be applied to a container.
Some fields are present in both SecurityContext and PodSecurityContext. When both
are set, the values in SecurityContext take precedence.
typeºSecurityContextºstruct {
// The capabilities to add/drop when running containers.
// Defaults to the default set of capabilities granted by the container runtime.
// +optional
Capabilities *Capabilities
// Run container in privileged mode.
// Processes in privileged containers are essentially equivalent to root on the host.
// Defaults to false.
// +optional
Privileged *bool
// The SELinux context to be applied to the container.
// If unspecified, the container runtime will allocate a random SELinux context for each
// container. May also be set in PodSecurityContext. If set in both SecurityContext and
// PodSecurityContext, the value specified in SecurityContext takes precedence.
// +optional
SELinuxOptions *SELinuxOptions
// Windows security options.
// +optional
WindowsOptions *WindowsSecurityContextOptions
// The UID to run the entrypoint of the container process.
// Defaults to user specified in image metadata if unspecified.
// May also be set in PodSecurityContext. If set in both SecurityContext and
// PodSecurityContext, the value specified in SecurityContext takes precedence.
// +optional
RunAsUser *int64
// The GID to run the entrypoint of the container process.
// Uses runtime default if unset.
// May also be set in PodSecurityContext. If set in both SecurityContext and
// PodSecurityContext, the value specified in SecurityContext takes precedence.
// +optional
RunAsGroup *int64
// Indicates that the container must run as a non-root user.
// If true, the Kubelet will validate the image at runtime to ensure that it
// does not run as UID 0 (root) and fail to start the container if it does.
// If unset or false, no such validation will be performed.
// May also be set in PodSecurityContext. If set in both SecurityContext and
// PodSecurityContext, the value specified in SecurityContext takes precedence.
// +optional
RunAsNonRoot *bool
// The read-only root filesystem allows you to restrict the locations that an application can write
// files to, ensuring the persistent data can only be written to mounts.
// +optional
ReadOnlyRootFilesystem *bool
// AllowPrivilegeEscalation controls whether a process can gain more
// privileges than its parent process. This bool directly controls if
// the no_new_privs flag will be set on the container process.
// +optional
AllowPrivilegeEscalation *bool
// ProcMount denotes the type of proc mount to use for the containers.
// The default is DefaultProcMount which uses the container runtime defaults for
// readonly paths and masked paths.
// +optional
ProcMount *ProcMountType
}
─────────────────────────────────────────────────────────────────────────────
// PodSecurityContext holds pod-level security attributes and common container settings
// Some fields are also present in container.securityContext. Field values of
// container.securityContext take precedence over field values of PodSecurityContext.
typeºPodSecurityContextºstruct {
// Use the host's network namespace. If this option is set, the ports that will be
// used must be specified.
// Optional: Default to false
// +k8s:conversion-gen=false
// +optional
HostNetwork bool
// Use the host's pid namespace.
// Optional: Default to false.
// +k8s:conversion-gen=false
// +optional
HostPID bool
// Use the host's ipc namespace.
// Optional: Default to false.
// +k8s:conversion-gen=false
// +optional
HostIPC bool
// Share a single process namespace between all of the containers in a pod.
// When this is set containers will be able to view and signal processes from other conts
// in the same pod, and the first process in each container will not be assigned PID 1.
// HostPID and ShareProcessNamespace cannot both be set.
// Optional: Default to false.
// This field is beta-level and may be disabled with the PodShareProcessNamespace feature.
// +k8s:conversion-gen=false
// +optional
ShareProcessNamespace *bool
// The SELinux context to be applied to all containers.
// If unspecified, the container runtime will allocate a random SELinux context for each
// container. May also be set in SecurityContext. If set in
// both SecurityContext and PodSecurityContext, the value specified in SecurityContext
// takes precedence for that container.
// +optional
SELinuxOptions *SELinuxOptions
// Windows security options.
// +optional
WindowsOptions *WindowsSecurityContextOptions
// The UID to run the entrypoint of the container process.
// Defaults to user specified in image metadata if unspecified.
// May also be set in SecurityContext. If set in both SecurityContext and
// PodSecurityContext, the value specified in SecurityContext takes precedence
// for that container.
// +optional
RunAsUser *int64
// The GID to run the entrypoint of the container process.
// Uses runtime default if unset.
// May also be set in SecurityContext. If set in both SecurityContext and
// PodSecurityContext, the value specified in SecurityContext takes precedence
// for that container.
// +optional
RunAsGroup *int64
// Indicates that the container must run as a non-root user.
// If true, the Kubelet will validate the image at runtime to ensure that it
// does not run as UID 0 (root) and fail to start the container if it does.
// If unset or false, no such validation will be performed.
// May also be set in SecurityContext. If set in both SecurityContext and
// PodSecurityContext, the value specified in SecurityContext takes precedence
// for that container.
// +optional
RunAsNonRoot *bool
// A list of groups applied to the first process run in each container, in addition
// to the container's primary GID. If unspecified, no groups will be added to
// any container.
// +optional
SupplementalGroups []int64
// A special supplemental group that applies to all containers in a pod.
// Some volume types allow the Kubelet to change the ownership of that volume
// to be owned by the pod:
//
// 1. The owning GID will be the FSGroup
// 2. The setgid bit is set (new files created in the volume will be owned by FSGroup)
// 3. The permission bits are OR'd with rw-rw----
//
// If unset, the Kubelet will not modify the ownership and permissions of any volume.
// +optional
FSGroup *int64
// Sysctls hold a list of namespaced sysctls used for the pod. Pods with unsupported
// sysctls (by the container runtime) might fail to launch.
// +optional
Sysctls []Sysctl
}
ServiceAccount
@[https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/]
@[https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/]
- Processes in containers inside pods can also contact the apiserver.
When they do, they are authenticated as a particular Service Account
- provides an identity for processes that run in a Pod.
- to access the API from inside a pod a mounted service account with
a 1-hour-expiration-token is provided.
(can be disabled with automountServiceAccountToken)
- User accounts are for humans. Service accounts are for processes running on pods.
- Service accounts are namespaced.
- cluster users can create service accounts for specific tasks (i.e. principle of least privilege).
- Allows to split auditing between humans and services.
- Three separate components cooperate to implement the automation around service accounts:
  - AºService account admission controllerº, part of the apiserver, that synchronously modifies
    pods as they are created or updated:
- sets the ServiceAccount to default when not specified by pod.
- abort if ServiceAccount in the pod spec does NOT exists.
- ImagePullSecrets of ServiceAccount are added to the pod if none is specified by pod.
- Adds a volume to the pod which contains a token for API access.
(service account token expires after 1 hour or on pod deletion)
- Adds a volumeSource to each container of the pod mounted at
*/var/run/secrets/kubernetes.io/serviceaccount*
  - AºToken controllerº, running as part of the controller-manager. It acts asynchronously and
- observes serviceAccount creation and creates a corresponding Secret to allow API access.
- observes serviceAccount deletion and deletes all corresponding ServiceAccountToken Secrets.
- observes secret addition, and ensures the referenced ServiceAccount exists, and adds a
token to the secret if needed.
- observes secret deletion and removes a reference from the corresponding ServiceAccount if needed.
NOTE:
      controller-manager º'--service-account-private-key-file'º indicates the priv.key signing tokens
- kube-apiserver º'--service-account-key-file'º indicates the matching pub.key verifying signatures
- A controller loop ensures a secret with an API token exists for each service account.
To create additional API tokens for a service account:
- create a new secret of type ServiceAccountToken like:
(controller will update it with a generated token)
{
"kind":º"Secret"º,
"type":º"kubernetes.io/service-account-token"º
"apiVersion": "v1",
"metadata": {
"name": "mysecretname",
"annotations": {
"kubernetes.io/service-account.name" :
"myserviceaccount" ← reference to service account
}
},
}
- AºService account controller:º
- manages ServiceAccount inside namespaces,
- ensures a ServiceAccount named “default” exists in every active namespace.
$ kubectl get serviceAccounts
NAME SECRETS AGE
default 1 1d ← Exists for each namespace
...
  - Additional ServiceAccounts can be created like:
      $ kubectl apply -f - ˂˂EOF
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: build-robot
      EOF
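  Ex. (sketch): running a Pod under the 'build-robot' ServiceAccount created above;
  the token automount can optionally be disabled as mentioned earlier:
    apiVersion: v1
    kind: Pod
    metadata:
      name: build-robot-pod                 # hypothetical name
    spec:
      serviceAccountName: build-robot       # must exist in the Pod's namespace
      automountServiceAccountToken: false   # skip mounting the API token if unused
      containers:
      - name: main
        image: busybox:1.28
        command: ['sh', '-c', 'sleep 3600']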
Liveness/Readiness Probes
@[https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/]
@[https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.14/#probe-v1-core]
.
- liveness probes : used by kubelet to know when to restart a Container.
- readiness probes : used by kubelet to know whether a Container can accept new requests.
- A Pod is considered ready when all of its Containers are ready.
- The Pod is not restarted in this case but no new requests are
forwarded.
- Readiness probes continue to execute for the lifetime of the Pod
(not only at startup)
ºliveness COMMANDº │ ºliveness HTTP requestº │ ºTCP liveness probeº
│ │
apiVersion: v1 │ apiVersion: v1 │ apiVersion: v1
kind:ºPodº │ kind:ºPodº │ kind:ºPodº
metadata: │ metadata: │ metadata:
... │ ... │ ...
spec: │ spec: │ spec:
ºcontainers:º │ ºcontainers:º │ ºcontainers:º
- name: ... │ - name: ... │ - name: ...
... │ ... │ ºreadinessProbe: º
ºlivenessProbe: º │ ºlivenessProbe: º │ º tcpSocket: º
º exec: º │ º httpGet: º │ º port: 8080 º
º command: º │ º path: /healthz º │ º initialDelaySeconds: 5 º
º - cat º │ º port: 8080 º │ º periodSeconds: 10 º
º - /tmp/healthy º │ º httpHeaders: º │ ºlivenessProbe: º
º initialDelaySeconds: 5º │ º - name: Custom-HeadeRº │ º tcpSocket: º
º periodSeconds: 5 º │ º value: Awesome º │ º port: 8080 º
│ º initialDelaySeconds: 3 º │ º initialDelaySeconds: 15º
│ º periodSeconds: 3 º │ º periodSeconds: 20 º
HTTP proxy ENV.VARS is ignored
in liveness probes (k8s v1.13+)
↑ ↑
│ │
ºnamedºContainerPort can also ──────┴────────────────────────────────┘
be used for HTTP and TCP probes
like:
│ ports:
│ - name: liveness-port
│ containerPort: 8080
│ hostPort: 8080
│
│ livenessProbe:
│ httpGet:
│ path: /healthz
│ port: liveness-port
ºMonitor liveness test resultº
$ kubectl describe pod liveness-exec
Other optional parameters:
- timeoutSeconds : secs after which the probe times out. (1 sec by default)
- successThreshold: Minimum consecutive successes for the probe to be
considered successful after having failed. (1 by default)
- failureThreshold: number of failures before "giving up": Pod will be marked Unready.(Def 3)
HTTP optional parameters:
- host : for example 127.0.0.1 (defaults to pod IP)
- port : Name or number(1-65535)
     - scheme: HTTP or HTTPS (skipping cert. validation)
- path :
- httpHeaders: Custom headers
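  Ex. (sketch) combining the tuning parameters above in a readiness probe
  (successThreshold ˃ 1 is only allowed for readiness, not liveness):
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
      timeoutSeconds: 2        # give up on a single attempt after 2s
      failureThreshold: 6      # mark the Pod Unready after 6 consecutive failures
      successThreshold: 2      # require 2 consecutive successes to become Ready again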
PodDisruptionBudget
@[https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#how-disruption-budgets-work]
@[https://kubernetes.io/docs/tasks/run-application/configure-pdb/]
A PodDisruptionBudget object (PDB) can be defined for each deployment(application),
limiting the number of pods of a replicated application that
are down simultaneously from ºvoluntaryº disruptions.
Ex: A Deployment has:
- .spec.replicas: 5 (5 pods desired at any given time)
- The PDB is defined as:
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: zookeeper-pdb
spec:
maxUnavailable: 1 ←-- Eviction API will allow voluntary disruption of one,
selector: but not two pods, at a time trying to have 4 running
matchLabels: at all times
app: zookeeper
Eviction API compatible tools/commands
like 'kubectl drain' must be used
(vs directly deleting pods/deployments)
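  Ex. (hedged): listing PDBs and draining a node through the Eviction API; 'node01'
  is a hypothetical node name, and 'kubectl drain' waits and retries while an
  eviction would violate a PodDisruptionBudget:
  $º$ kubectl get poddisruptionbudgets          º
  $º$ kubectl drain node01 --ignore-daemonsets  º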
Service Problem: Pods can be rescheduled to a different node in k8s pool, "at random", changing their internal cluster IP. How Pods can deterministically access some other Pod's services (TCP/UDP/Web service, ...)? Solution: Use k8s Services - A Services logically groups a set of Pods through a Label-selection, and for such logical group, it also established a ºnetwork policyº. to access them. - Kubernetes-native applications: K8s offers a simple Endpoints API - non-native applications: K8s offers virtual-IP-based-2-Service bridge -OºService Types:º - OºClusterIPº : (default) internal-to-cluster virtual IP. No IP exposed outside the cluster. - OºExternalNameº: Maps apps (Pods services) to an externally visible DNS entry (ex: foo.example.com) ─ OºNodePortº : Exposes an external NodeIP:NodePort to ←───────┐ k8s ºIngressº can also be used Pods providing the service. ├─ to expose HTTP/S services to │ external clients. ─ OºLoadBalancerº: Exposes Service externally using a cloud provider's ←─┘ NodePort/LoadBalancer Services load balancer. NodePort and ClusterIP services, allows for TCP,UDP,PROXY and SCTP(k8s 1.2+) to which the external load balancer routes, are - Advanced load balancing automatically created. (persistent sessions, dynamic weights) are RºNOTº yet supported by Ingress rules - On the other side ingress allows to expose services based on HTTP host (virtual hosting), paths/regex,... not available to NodePort/LoadBalancer Services kube-proxy, installed on each node of the k8s pool, will get in charge of forwarding correctly to the correct target. Note: kube-proxy is more reliable than DNS due to problems with TTLs, DNS caching on clients,... OºClusterIPº │ OºNodePortº │ OºExternalNameº Oº=========º │ Oº========º │ Oº============º │ │ ┌───┐ │ :3100┌────┐ :80┌───┐ │ ┌────┐:80───┐ ┌─→:80│Pod│ │ ┌───→│node├───┐ ┌──→│Pod│ │ ┌─→│node├─→│Pod│ │ └───┘ │ │ └────┘ │ │ └───┘ │ │ └────┘ └───┘ :80┌───────────┐ │ Incomming ┌──v───────┐ │ :80 ┌──────────┐ Internal──→│ºClusterIPº│ │ traffic │ºNodePortº│ │ ┌─→:│ ºLoadº │ ┌───┐ Traffic └───────────┘ │ │ └──^───────┘ │ │ │ºBalancerº│ │Pod│ │ ┌───┐ │ │ ┌────┐ │ │ ┌───┐ │ │ └──────────┘ └───┘ └─→:80│Pod│ │ └───→│node├───┘ └──→│Pod│ │ Incomming │ ┌────┐ ↑:80 └───┘ │ :3100└────┘ :80└───┘ │ traffic └──→│node├──┘ │ ^ ^ │ └────┘ │ Creates mapping between │ configures an external │ node port(3100) and │ IP address in balancer. │ Pod port (80) │ Then connects requested │ │ pods to balancer apiVersion: v1 kind:ºServiceº metadata: name: Redis_Master ← Must be a valid DNS label name spec: ┌→ · selector: ← Targeted Pods defined in selector | · app: MyApp ← Matching label. | · clusterIP: 1.1.1.1 ← Optional. set it to match any existing DNS entry or | · legacy systems with hardcoded IPs difficult to reconfigure | · In k8s apps, clusterIP can be auto-assigned by k8s, | · App will use Environment Variables (injected into the Pod) | · and/or DNS. | · REDIS_MASTER_SERVICE_HOST=10.0.0.11 | · REDIS_MASTER_SERVICE_PORT=6379 | · REDIS_MASTER_PORT=tcp://10.0.0.11:6379 | · REDIS_MASTER_PORT_6379_TCP=tcp://10.0.0.11:6379 | · REDIS_MASTER_PORT_6379_TCP_PROTO=tcp | · REDIS_MASTER_PORT_6379_TCP_PORT=6379 | · REDIS_MASTER_PORT_6379_TCP_ADDR=10.0.0.11 | · RºWARNº: Service must be setup before Pod is started. | | | · The DNS entry (if CoreDNS or other k8s-aware DNS server is setup) | · will be like: | · redis_master.myNameSpace | · ^^^^^^^^^^^^ | · Needed only for Pods in another Namespace. 
  ports:
  - name: http
    protocol: TCP             ← TCP*, UDP, HTTP, PROXY, SCTP (k8s 1.12+) º*1º
    port: 80                  ← Requests to port 80 will be forwarded to port 9376
    targetPort: 9376            on "any" pod matching the selector.
  - name: https
    protocol: TCP
    port: 443
    targetPort: 9377
  sessionAffinity: "ClientIP" ← Optionally sessionAffinityConfig.clientIP.timeoutSeconds
                                can be set (defaults to 10800)

The selector can be empty if the Service doesn't target internal pods. This is the
case when we want the Service to point to:
- an external TCP service (DDBB, ERP, ...)
- a different Namespace or a different k8s cluster
- a k8s migration in progress
In this case the Endpoints object is not automatically created. Admins can point
manually to an external end-point like:

apiVersion: v1
kind:ºEndpointsº
metadata:
  name: my-service            ← Must also be a valid DNS subdomain name
                                (and match the Service name).
subsets:
- addresses:
  - ip: 10.0.0.10             ← External PostgreSQL @10.0.0.10:5432
  ports:
  - port: 5432

ºUser-space Proxy Mode:º
 (kube-proxy on each node of the k8s node-pool keeps listening for events from Master.ApiServer)
 master     → kube-proxy: Event +Service.EndPoint
 kube-proxy → kube-proxy: - Open random port 01
                          - Set up round-robin from random port 01 to *all existing*
                            backend Pods matching the Service selector.
 kube-proxy → iptables  : Install an iptables rule in the kernel to capture traffic
                          to the (virtual) clusterIP:port and send it to random port 01
 ...
 app        → kernel    : request to clusterIP:port
 kernel     → iptables  : request to clusterIP:port
 iptables   → kube-proxy: request to random port
 kube-proxy → kube-proxy: Round-robin amongst all available Pods.
                          (kube-proxy keeps a list of live Pods matching the Service selector)
                          (Using also SessionAffinity for better balancing)
 kube-proxy → Pod N     : request

ºiptables Proxy Mode:º
 (kube-proxy on each node of the k8s node-pool keeps listening for events from Master.ApiServer)
 master     → kube-proxy: Event +Service.EndPoint
 kube-proxy → iptables  : Install an iptables rule in the kernel to capture traffic
                          to the (virtual) clusterIP:port and send it to some Pod i
                          (and just one) from all Pods matching the Service selector
 ...
 app        → kernel    : request to clusterIP:port
 kernel     → iptables  : request to clusterIP:port
 iptables   → Pod i     : request
                          ^^^^^
                          - If the Pod is down the request fails.
                          - Readiness probes can be set to avoid forwarding to a faulty Pod.

ºIPVS proxy modeº
 PRE-SETUP: the IPVS kernel module must be available on the node.
 master     → kube-proxy: Event +Service.EndPoint
 kube-proxy → netlink   : Create IPVS rules (synced periodically) to all Pods
                          in the selector (allowing for balancing)
 app        → kernel    : request to clusterIP:port
 kernel     → IPVS      : request to clusterIP:port
 IPVS       → Pod i     : request

 Note: IPVS allows the next balancing options:
 · rr : round-robin
 · lc : least connection (smallest number of open connections)
 · dh : destination hashing
 · sh : source hashing
 · sed: shortest expected delay
 · nq : never queue

º*1º:@[https://kubernetes.io/docs/concepts/services-networking/service/#protocol-support]

ºLoad Balancerº works at ºlayer 4º, unaware of the actual apps (layer 7).
☜ ☞ Use OºIngress controllersº (Oºlayer 7º) for advanced rules based on the inbound URL, ...
(original ASCII panel, condensed) OºIngress controllerº routing incoming traffic by URL:
  myapp.com/blog  → Blog service  → Pods
  myapp.com/store → Store service → Pods
- Another common feature of Ingress is ºSSL/TLS terminationº, removing
  TLS cert complexity from the Apps.

-ºService.spec.clusterIP can be used to force a given IPº

┌─────────────────┐
│SERVICE DISCOVERY│
└─────────────────┘
ALT.1: using ENV.VARS
  Ex: given the service "redis-master" with OºClusterIPº = 10.0.0.11:6379,
  the next ENV.VARs are created:
    REDIS_MASTER_SERVICE_HOST=10.0.0.11
    REDIS_MASTER_SERVICE_PORT=6379
    REDIS_MASTER_PORT=tcp://10.0.0.11:6379
    REDIS_MASTER_PORT_6379_TCP=tcp://10.0.0.11:6379
    REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
    REDIS_MASTER_PORT_6379_TCP_PORT=6379
    REDIS_MASTER_PORT_6379_TCP_ADDR=10.0.0.11

ALT.2: using the k8s DNS add-on (strongly recommended)
  Updates dynamically the set of DNS records for services.
  The entry will be similar to: "my-service.my-namespace"

ºQuick App (Deployment) service creationº
REF: @[https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/]
$ kubectl expose deployment/my-nginx
is equivalent to:
$ kubectl apply -f service.yaml
                  ^^^^^^^^^^^^
                  apiVersion: v1
                  kind: Service
                  metadata:
                    name: my-nginx
                    labels:
                      run: my-nginx
                  spec:
                    ports:
                    - port: 80
                      protocol: TCP
                    selector:
                      run: my-nginx

@[https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/]
Debug Services
@[https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/]
☞ (Much more) detailed info available at:
  (about kube-proxy + iptables + ... advanced config settings)
  @[https://kubernetes.io/docs/concepts/services-networking/service/]
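To complement the Service types above, a minimal sketch of an ExternalName and a
NodePort Service (the names, the external DNS entry and the 30080 port are
assumptions, not taken from the text):
  apiVersion: v1
  kind: Service
  metadata:
    name: external-db               # hypothetical name
  spec:
    type: ExternalName
    externalName: db1.example.com   # in-cluster clients get a DNS CNAME to this name
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: my-nginx-nodeport         # hypothetical name
  spec:
    type: NodePort
    selector:
      run: my-nginx
    ports:
    - port: 80          # ClusterIP port
      targetPort: 80    # Pod port
      nodePort: 30080   # must fall inside the cluster's NodePort range (30000-32767 by default)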
Ingress Rules
@[https://kubernetes.io/docs/concepts/services-networking/ingress/]
- Ingress objects (and their rules) allow exposing HTTP/S-based k8s services
  to clients outside the cluster.
- Usually a WebApp (HTTP/S) setup can be summarised as:
  Creating Pods            → Creating Services           → Creating Ingress Rules
    running on                 to access Pods from           to expose HTTP/S Services
    k8s pool-nodes,            other Pods.                   externally, allowing to
    listening for              Services listen for           split requests based on host
    HTTP/S requests            Pod creation/destruction      name, HTTP path/subpaths, ...
                               events to adjust the
                               correct node/port target

PRE-SETUP: an ingress controller must be in place (ingress-nginx, istio, ...).
RºWARNº: Different Ingress controllers operate slightly differently from the
         expected specification.

BºIngress Resource Example:º
apiVersion: networking.k8s.io/v1beta1
ºkind: Ingressº
metadata:
  name: test-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /  ← Annotation used before IngressClass was
                                                     added to define the underlying controller
                                                     implementation.
                                                     .spec.ingressClassName can replace it now.
spec:
  tls:                              ← Optional but recommended. Only port 443 supported.
  - hosts:                          ← values must match the cert CNs
    - secure.foo.com
    secretName: secretTLSPrivateKey ←····→ apiVersion: v1
                                           kind: Secret
                                           metadata:
                                             name: secretTLSPrivateKey
                                             namespace: default
                                           data:
                                             tls.crt: base64 encoded cert
                                             tls.key: base64 encoded key
                                           type: kubernetes.io/tls
  rules:                    ← A backend with no rules can be used to expose a single service
  - host: reddis.bar.com    ← Optional. Filters out HTTP requests not addressing this host
    http:
      paths:                ← If more than one path in the list matches the request,
                              the longest match "wins".
      - path: /reddis
        pathType: Prefix    ← ImplementationSpecific: delegate matching to the IngressClass
                                                      (nginx, istio, ...)
                              Exact : matches the URL path exactly (RºWARNº: case sensitive)
                              Prefix: matching based on the request URL prefix (RºWARNº: case sensitive)
        backend:
          serviceName: reddissrv    ← targeted k8s service
          servicePort: 80           ← Needed if more than one listening port is defined
                                      for the Service
  - host: monit.bar.com     ← Optional. Filters out HTTP requests not addressing this host
    http:
      paths:
      - path: /grafana
        pathType: Prefix
        backend:
          serviceName: grafanasrv    ← targeted k8s service
      - path: /prometheus
        pathType: Prefix
        backend:
          serviceName: prometheussrv ← targeted k8s service
  - http:                   ← No host specified. All traffic not matching
      paths:                  reddis.bar.com/monit.bar.com will be forwarded here.
      - path: /
        pathType: Prefix
        backend:
          serviceName: httpdsrv      ← targeted k8s service
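The example above uses the pre-GA networking.k8s.io/v1beta1 schema. A minimal sketch
of the same first rule with the GA networking.k8s.io/v1 schema and spec.ingressClassName
(the class name "nginx" is an assumption about the installed controller):
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: test-ingress
  spec:
    ingressClassName: nginx        # replaces the old controller-selection annotation
    rules:
    - host: reddis.bar.com
      http:
        paths:
        - path: /reddis
          pathType: Prefix
          backend:
            service:               # v1 moves serviceName/servicePort under backend.service
              name: reddissrv
              port:
                number: 80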
Labels
@[https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors/]
- Labels = key/value pairs attached to objects.
- labels ºdo not provide uniquenessº
- we expect many objects to carry the same label(s)
- used to specify ºidentifying attributes of objectsº
ºmeaningful and relevant to usersº, (vs k8s core system)
- Normally used to organize and to select subsets of objects.
- Label format:
("prefix"/)name
name : [a-z0-9A-Z\-_.]{1,63}
prefix : must be a DNS subdomain no longer than 253 chars
- Example labels:
"release" : "stable" # "canary" ...
"environment": "dev" # "qa", "pre", "production"
"tier" : "frontend" # "backend" "cache"
"partition" : "customerA", "partition" : "customerB"
"track" : "daily" # "weekly" "monthly" ...
ºRecommended Labelsº
@[https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/]
Key Description Example Type
app.kubernetes.io/name The name of the application mysql string
app.kubernetes.io/instance A unique name identifying wordpress-abcxzy string
the instance of an application
app.kubernetes.io/version current version of the app 5.7.21 string
app.kubernetes.io/component component within the arch database string
app.kubernetes.io/part-of name of higher level app wordpress string
app.kubernetes.io/managed-by tool used to manage the helm string
operation of an application
See also recommended labels for Helm charts:
@[https://helm.sh/docs/chart_best_practices/labels/]
Ex.1 use in an StatefulSet object:
apiVersion: apps/v1 |apiVersion: apps/v1 |apiVersion: v1
kind: StatefulSet |kind: Deployment |kind: Service
metadata: |metadata: |metadata:
labels: | labels: | labels:
app.kubernetes.io/name : mysql | .../name : wordpress | .../name : wordpress
app.kubernetes.io/instance : wordpress-abc | .../instance : wordpress-abc | .../instance : wordpress-abcxzy
app.kubernetes.io/version : "5.7.21" | .../version : "4.9.4" | .../version : "4.9.4"
app.kubernetes.io/component : database | .../component : server | .../component : server
ºapp.kubernetes.io/part-of :ºwordpressº | .../part-of :ºwordpressº | .../part-of : wordpressº
app.kubernetes.io/managed-by: helm | .../managed-by: helm | .../managed-by: helm
|... |...
-ºWell-Known Labels, Annotations and Taintsº
@[https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/]
kubernetes.io/arch
kubernetes.io/os
beta.kubernetes.io/arch (deprecated)
beta.kubernetes.io/os (deprecated)
kubernetes.io/hostname
beta.kubernetes.io/instance-type
failure-domain.beta.kubernetes.io/region
failure-domain.beta.kubernetes.io/zone
Label selectors
- core (object) grouping primitive.
- two types of selectors:
- equality-based
Ex. environment=production,tier!=frontend
^
"AND" represented with ','
- set-based
- filter keys according to a set of values.
(in, notin and exists )
Ex. 'ºkeyº equal to environment and ºvalueº equal to production or qa'
Ex.'environment in (production, qa)'
Ex.'tier notin (frontend, backend)'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
selects all resources with
(key == "tier" AND
 value not in (frontend, backend))
plus all resources without
a "tier" key at all
Ex:'partition' !partition
^^^^^^^^^ ^^^^^^^^^
all resources including all resources without
a label with key 'partition' a label with key 'partition'
- LIST and WATCH operations may specify label selectors to filter the sets
of objects returned using a query parameter.
Ex.:
$ kubectl get pods -l environment=production,tier=frontend # equality-based
$ kubectl get pods -l 'environment in (production),tier in (frontend)' # set-based
$ kubectl get pods -l 'environment in (production, qa)' # "OR" only set-based
$ kubectl get pods -l 'environment,environment notin (frontend)' # "NOTIN"
Storage
Volume
@[https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/volume-ownership-management.md]
REF: https://www.youtube.com/watch?v=OulmwTYTauI
once underlying "hardware" resource applying
storage has been to the whole cluster
assigned to a pod (ºlifespanº independent of pods)
┌──────────────────┐ PV is bound to PVC ┌─↓───────────┐
│ Pod │ one-to-one relation┌···→│ Persistent ·····plugin to
│ │ ┌───────────────────┐· │ Volume(ºPVº)│ one of
│┌───────────────┐ │ ┌··→│ Persistent *1 │· └─────────────┘ · Underlying
││ºVolume Mountsº│ │ · │ Volume │· · Storage
││ /foo │ │ · │ Claim (ºPVCº) │· ┌───↓────────────────┐
│└───────────────┘ │ · │ │· ┌────────────┐ │-Local HD │
│ ┌──────────────┐· │- 100Gi │· ┌·→│ Storage │ │-NFS: path,serverDNS│
│ │ Volumes: │· │- Selector ·········┘ · │ Class(ºSCº)│ │-Amazon:EBS,EFS,... │
│ │ -PVC ·········┘ │- StorageClassName····┘ └────────────┘ │-Azure:... │
│ │ ─claimName ←┐ └───────────────────┘ │-... │
│ └──────────────┘└─ Volumeºlifespamºis └────────────────────┘
└──────────────────┘ that of the Pod.
├────────Created by users (Dev)───────────────────┤ ├──────Created by Admin (Ops)────────────┤
- many types supported, and a Pod can use any number of them simultaneously.
- a pod specifies what volumes to provide for the pod (spec.volumes) and where
to mount those into each container(spec.containers.volumeMounts).
- Sometimes, it is useful to share one volume for multiple uses in a single
pod. The volumeMounts.subPath property can be used to specify a sub-path inside
the referenced volume instead of its root.
*1:Similar to how Pods can request specific levels of resources
(CPU and Memory), Claims can request specific size and access modes
(e.g., can be mounted once read/write or many times read-only)
☞ BºPV contains max size, PVC contains min sizeº
i.e. a PVC binds only to a PV whose capacity satisfies PVC min.size ≤ PV max.size
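A minimal PVC + Pod sketch tying the diagram together (names, sizes and the
"standard" StorageClass are assumptions):
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: data-pvc
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: standard      # must exist in the cluster
    resources:
      requests:
        storage: 100Gi              # PVC min size: binds to a PV of at least 100Gi
  ---
  apiVersion: v1
  kind: Pod
  metadata:
    name: app-pod
  spec:
    containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
      - name: data
        mountPath: /foo
        subPath: app-a              # mount a sub-directory instead of the volume root
    volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc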
Networking Model
@[https://kubernetes.io/docs/concepts/cluster-administration/networking/]
Local Persistent Vol (1.12+)
REF: @[https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/]
Authors: Michelle Au (Google), Matt Schallert (Uber), Celina Ward (Uber)
- local persistent volumes (k8s GA 1.14+) leverage local disks and enable using
  them through persistent volume claims (PVC).
- Local PVs provide consistent, high-performance guarantees of local,
  directly-attached storage.
-Bºthis type of storage is suitable for applications that handle dataº
 Bºreplication themselves: software-defined storage or replicated databasesº
 Bºlike Kafka, Cassandra, p2p-alike apps (Ethereum), ...º
 Apps that do not should instead opt for standard (remote) PVs that transparently
 take care of data replication.
- to use this as a persistent volume, we have some manual steps to do:
- Pre-partition, format and mount disks to nodes
- Create Persistent Volumes:
- Alt 1: Manually
- Alt 2: Let a DaemonSet handle the creation.
- Create Storage Class
(can be skipped in newer k8s releases)
NOTE:
Q: k8s "hostPath" already allows using a local disk as storage for Pods.
   Why should we use a Local Persistent Volume instead?
A: With a Local PV, the scheduler is aware of the volume's node, so on Pod
   restart, execution will be assigned to the same worker node.
   With hostPath, k8s can reschedule the Pod on a different node, losing all
   previously stored data.
Note: Most other k8s block and file storage plugins provide for remote storage
where data is persisted independently of the node executing the Pod.
- raw block device support is also provided.
- K8s is also able to format the block device on demand, avoiding manual formatting.
- dynamic volume provisioning RºNOTº supported.
(@[https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner]
can be used to help manage the Local PV lifecycle - create,clean up,reuse -
for individual disks),
BºUsage Steps:º
Bº└PRE-SETUP Planning:º
   - how many local disks will each cluster node have?
   - how will they be partitioned?
     (The local static provisioner provides guidance to help answer these questions).
Hint: It’s best to be able to dedicate a full disk to each local volume (for IO isolation)
and a full partition per-volume (for capacity isolation).
Bº└1.- Create StorageClass:º
Qº kind: StorageClass º
Qº apiVersion: storage.k8s.io/v1 º
Qº metadata: º
Qº name: local-storage º
Qº provisioner: kubernetes.io/no-provisionerº
Qº volumeBindingMode: WaitForFirstConsumer º ← *1: enable Bºvolume topology-aware schedulingº
└────── *1 ────────┘ Consumer == "Pod requesting the volume"
Bº└2.-ºConfigure and run the external static provisioner to
       create PVs for all the local disks on your nodes.
$º$ kubectl get pv º
NAME CAPA ACC. RECLAIM STATUS CLAIM STORAGECLASS REASON AGE
MODE POLICY
local-pv-27c0f084 368Gi RWO Delete Available local-storage 8s
local-pv-3796b049 368Gi RWO Delete Available local-storage 7s
local-pv-3ddecaea 368Gi RWO Delete Available local-storage 7s
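  For "Alt 1: Manually" above, a sketch of a hand-written local PV (disk path and
  node name are assumptions; the nodeAffinity block is what lets the scheduler pin
  Pods to the right node):
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: local-pv-example
    spec:
      capacity:
        storage: 368Gi
      accessModes: [ "ReadWriteOnce" ]
      persistentVolumeReclaimPolicy: Delete
      storageClassName: local-storage
      local:
        path: /mnt/disks/ssd1          # pre-partitioned, formatted and mounted disk
      nodeAffinity:
        required:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: [ "worker-node-1" ]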
Bº└3.- Start using PVs in workloads by:º
- Alt 1: creating a PVC and Pod
- Alt 2: creating a StatefulSet with volumeClaimTemplates
Ex:
QºapiVersion: apps/v1 º
Qºkind: StatefulSet º
Qºmetadata: º
Qº name: local-test º
Qºspec: º
Qº serviceName: "local-service" º
Qº replicas: 3 º
Qº selector: º
Qº matchLabels: º
Qº app: local-test º
Qº template: º
Qº metadata: º
Qº labels: º
Qº app: local-test º
Qº spec: º
Qº containers: º
Qº - name: test-container º
Qº image: k8s.gcr.io/busybox º
Qº command: º
Qº - "/bin/sh" º
Qº args: º
Qº - "-c" º
Qº - "sleep 100000" º
Qº volumeMounts: º
Qº - name: local-vol º
Qº mountPath: /usr/test-pod º
Qº volumeClaimTemplates: º
Qº - metadata: º
Qº name: local-vol º
Qº spec: º
Qº accessModes: [ "ReadWriteOnce" ] º
Qº storageClassName: "local-storage"º
Qº resources: º
Qº requests: º
Qº storage: 368Gi º
Once the StatefulSet is up and running,
the PVCs will be bound:
$º$ kubectl get pvc º
NAME STATUS VOLUME CAPACITY ACC.. STORAGECLASS AGE
local-vol-local-test-0 Bound local-pv-27c0f084 368Gi RWO local-storage 3m45s
local-vol-local-test-1 Bound local-pv-3ddecaea 368Gi RWO local-storage 3m40s
local-vol-local-test-2 Bound local-pv-3796b049 368Gi RWO local-storage 3m36s
Bº└4.- Automatic Clean Up Ex:º
$º$ kubectl patch sts local-test \ º
$º -p '{"spec":{"replicas":2}}' º
└────────────┘
Reduce Pod replicas 3→2
Associated local-pv-... is not needed anymore.
The external static provisioner will clean up
the disk and make the PV available for use again.
$ºstatefulset.apps/local-test patchedº
$º$ kubectl delete pvc local-vol-local-test-2º
persistentvolumeclaim "local-vol-local-test-2" deleted
$º$ kubectl get pv º
NAME CAPA ACC. RECLAIM STATUS CLAIM STORAGECLASS REASON AGE
MODE POLICY
local-pv-27c0f084 368Gi RWO Delete Bound ...-0 local-storage 11m
local-pv-3796b049 368Gi RWO Delete Available local-storage 7s
local-pv-3ddecaea 368Gi RWO Delete Bound ...-1 local-storage 19m
└─┬─┘
default/local-vol-local-test-0
default/local-vol-local-test-1
BºLIMITATIONS AND CAVEATSº:
- It ties apps to a specific node, making them harder to schedule.
  Those apps Bºshould specify a high priorityº so that lower-priority pods
  can be preempted if necessary.
  Also, if the node or local volume they are tied to becomes inaccessible, the pod
  becomes inaccessible too, requiring manual intervention, external controllers
  or operators.
  If a node becomes unavailable (removed from the cluster or drained), pods using
  local volumes on that node are stuck in "Unknown" or "Pending" state depending
  on whether or not the node was removed gracefully.
  Recovering pods from these interim states means having to delete the PVC binding
  the pod to its local volume and then delete the pod in order for it to be
  rescheduled (or wait until the node and disk are available again).
"... We took this into account when building our operator for M3DB, which makes changes
to the cluster topology when a pod is rescheduled such that the new one gracefully
streams data from the remaining two peers..."
Because of these constraints, Rºit’s best to exclude nodes with º
Rºlocal volumes from automatic upgrades or repairsº, and in fact some
cloud providers explicitly mention this as a best practice.
(Basically most of the benefits of k8s are lost for apps managing
storage at App level. This is the case with most DDBBs and stream
architectures).
Running storage services
(GlusterFS,iSCSI,...)
@[https://opensource.com/article/17/12/storage-services-kubernetes]
Rook.io
@[https://rook.io/]
Rook turns distributed storage systems into self-managing,
self-scaling, self-healing storage services. It automates the tasks
of a storage administrator: deployment, bootstrapping, configuration,
provisioning, scaling, upgrading, migration, disaster recovery,
monitoring, and resource management.
Rook uses the power of the Kubernetes platform to deliver its
services via a Kubernetes Operator for each storage provider.
https://www.infoq.com/news/2019/08/rook-v1-release/
Rook, a storage orchestrator for Kubernetes, has released version 1.0
for production-ready workloads that use file, block, and object
storage in containers. Highlights of Rook 1.0 include support for
storage providers through operators like Ceph Nautilus, EdgeFS, and
NFS. For instance, when a pod requests an NFS file system, Rook can
provision it without any manual intervention.
Rook was the first storage project accepted into the Cloud Native
Computing Foundation (CNCF), and it helps storage administrators to
automate everyday tasks like provisioning, configuration, disaster
recovery, deployment, and upgrading storage providers. Rook turns a
distributed file system into storage services that scale and heal
automatically by leveraging the Kubernetes features with the operator
pattern. When administrators use Rook with a storage provider like
Ceph, they only have to worry about declaring the desired state of
the cluster and the operator will be responsible for setting up and
configuring the storage layer in the cluster.
Ceph
@[https://ceph.io/]
Noobaa
https://www.noobaa.io/try
NooBaa can collapse multiple storage silos into a single, scalable
storage fabric, by its ability to virtualize any local storage,
whether shared or dedicated, physical or virtual and include both
private and public cloud storage, using the same S3 API and
management tools. NooBaa also gives you full control over data
placement, letting you place data based on security, strategy and
cost considerations, in the granularity of an application.
- Easily scales locally on top of PVs or Ceph external clusters
- Workload Portability
Easily mirror data to other cluster or native cloud storage
Networking Explained
@[https://www.slideshare.net/CJCullen/kubernetes-networking-55835829]
SDNs
SDN intro:
@[https://www.redhat.com/sysadmin/getting-started-sdn]
- SDNs allow admin teams to control network traffic in
  complex networking topologies through a centralized panel,
  rather than handling each network device (routers,
  switches, ...) manually in a "hierarchical" topology.
ºContainer Network Interface (CNI):º
- a library definition and a set of tools to configure network
  interfaces in Linux containers through many supported plugins.
- multiple plugins can run at the same time in a container
  that participates in a network driven by different plugins.
- networks use JSON configuration files and are instantiated as
  new namespaces when the CNI plugin is invoked.
- common cni plugins include:
-ºcalicoº:
- high scalability.
@[https://docs.projectcalico.org/v3.7/getting-started/kubernetes/]
@[https://kubernetes.io/docs/tasks/administer-cluster/network-policy-provider/calico-network-policy/]
-ºciliumº:
- provides network connectivity and load balancing
between application workloads, such as application
containers and processes, and ensures transparent security.
-ºcontivº:
- integrates containers, virtualization, and physical servers
based on the container network using a single networking fabric.
-ºcontrailº:
- provides overlay networking for multi-cloud and
hybrid cloud through network policy enforcement.
-ºflannelº:
- makes it easier for developers to configure a layer 3
network fabric for kubernetes.
-ºmultusº:
- supports multiple network interfaces in a single pod on
kubernetes for sriov, sriov-dpdk, ovs-dpdk, and vpp workloads.
-ºopen vswitch (ovs)º:
- production-grade cni platform with a standard management
interface on openshift and openstack.
-ºovn-kubernetesº:
- enables virtual networks for multiple containers on different
hosts using an overlay function.
-ºromanaº:
- makes cloud network functions less expensive to build,
easier to operate, and better performing than traditional
cloud networks.
- in addition to network namespaces, an SDN should increase security by
  offering isolation between multiple namespaces with a multi-tenant plugin:
  packets from one namespace, by default, will not be visible to
  other namespaces, so containers from different namespaces cannot
  send packets to, or receive packets from, pods and services of a
  different namespace.
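  A minimal NetworkPolicy sketch of that kind of namespace isolation (the namespace
  "team-a" is hypothetical; it requires a CNI plugin that enforces NetworkPolicy,
  e.g. Calico or Cilium from the list above):
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-same-namespace-only
      namespace: team-a
    spec:
      podSelector: {}          # applies to every Pod in the namespace
      policyTypes: [ "Ingress" ]
      ingress:
      - from:
        - podSelector: {}      # only Pods of this same namespace may connect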
Network policy providers (see the kubernetes.io tasks section): Calico, Cilium, Kube-router, Romana, Weave Net.
Service Mesh
istio.io
linkerd
ReplicaSet
@[https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/]
- Target: ensure "N" pod replicas are running simultaneously.
- Most of the time it is used indirectly by "Deployments" to orchestrate
  pod creation/deletion/updates.
  BºNOTE:º the Job controller is preferred for pods that terminate on their own.
  BºNOTE:º the DaemonSet controller is preferred for pods providing a machine-level
           function (monitoring, logging, pods that need to be running before other
           pods start). OºDaemonSet pods lifetime == machine lifetimeº

|controllers/frontend.yaml
|
|apiVersion: apps/v1
|kind:ºReplicaSetº
|metadata:
|  name: frontend
|  labels:
|    app: guestbook
|    tier: frontend
|spec:
|  # modify replicas according to your case
| ºreplicas: 3º                                          ← Defaults to 1
|Oºselector:º
|Oº  matchLabels:º
|Oº    tier: frontendº
|Oº  matchExpressions:º
|Oº  - {key: tier, operator: In, values: [frontend]}º
|  template:                           ←················· Pod template (nested pod schema,
|    metadata:                                             removing apiVersion/kind properties)
|      labels:                         ←····· Needed in pod-template (vs isolated pod):
|        app: guestbook                       .spec.template.metadata.labels must match
|       Oºtier: frontendº                     .spec.selector
|    spec:
|     ºrestartPolicy: Alwaysº          ←··· (default/only allowed value)
|     Qºcontainers:º
|      - name: php-redis
|        image: gcr.io/google_samples/gb-frontend:v3
|        resources:                    ←············ "Best Pattern": indicate resources.
|          requests:
|            cpu: 100m
|            memory: 100Mi
|        env:
|        - name: GET_HOSTS_FROM
|          value: dns                  ←·········· replace 'dns' by 'env' if DNS is not configured
|        ports:
|        - containerPort: 80

$º$ kubectl create -f http://.../frontend.yaml º
→ replicaset.apps/frontend created

$º$ kubectl describe rs/frontend º
→ Name:         frontend
· Namespace:    default
· Selector:     tier=frontend,tier in (frontend)
· Labels:       app=guestbook
·               tier=frontend
· Annotations:  ˂none˃
· Replicas:     Bº3 current / 3 desiredº
· Pods Status:  3 Running / 0 Waiting / 0 Succeeded / 0 Failed
· Pod Template:
· OºLabels: app=guestbookº
· Oº        tier=frontendº
·   Containers:
·   Qºphp-redis:º
·     Image:   gcr.io/google_samples/gb-frontend:v3
·     Port:    80/TCP
·     Requests:
·       cpu:     100m
·       memory:  100Mi
·     Environment:
·       GET_HOSTS_FROM: dns
·     Mounts:  ˂none˃
·   Volumes:   ˂none˃
· Events:
·   FirstSeen LastSeen Count From              ··· Type   Reason           Message
·   --------- -------- ----- ----                  ----   ------           -------
·   1m        1m       1     {replicaset-cont }    Normal SuccessfulCreate Created pod: frontend-qhloh
·   1m        1m       1     {replicaset-cont }    Normal SuccessfulCreate Created pod: frontend-dnjpy
·   1m        1m       1     {replicaset-cont }    Normal SuccessfulCreate Created pod: frontend-9si5l

$º$ kubectl get pods º
→ NAME           READY STATUS  RESTARTS AGE
· frontend-9si5l 1/1   Running 0        1m
· frontend-dnjpy 1/1   Running 0        1m
· frontend-qhloh 1/1   Running 0        1m
Deployment ("Application") @[https://kubernetes.io/docs/concepts/workloads/controllers/deployment/] - Built "on top" of ReplicaSets adding lifecycle management (creation, updates,...). For example, if a Pod definition is changed, Deployments takes care of gradually moving from a running (old) ReplicaSet to a new (automatically created) ReplicaSet. Deployments also rollback to (old) ReplicaSet if new one is not stable. Old ReplicaSets are automatically removed once the updated one is detected to be stable. - Ussually a running applications maps to a Deployment (plus some network ingress rules, to make pod exposed services visibles "outside" k8s) - Ex. Deployment: | # for versions before 1.7.0 use apps/v1beta1 | apiVersion: apps/v1beta2 | kind: Deployment | metadata: | name: nginx-deployment | labels: | app: nginx | # namespace: production | spec: | replicas: 3 ← 3 replicated Pods | strategy: | - type : Recreate ← Recreate | RollingUpdate* | # Alt. strategy example | # strategy: | # rollingUpdate: | # maxSurge: 2 | # maxUnavailable: 0 | # type: RollingUpdate | selector: | matchLabels: | app: nginx | template: ← pod template | metadata: | labels: | app: nginx | spec: ← template pod spec | containers: change triggers new rollout | - name: nginx | image: nginx:1.7.9 | ports: | - containerPort: 80 | livenessProbe: | httpGet: | path: /heartbeat | port: 80 | scheme: HTTP $º$ kubectl create -f nginx-deployment.yaml º $º$ kubectl get deployments º → NAME DESIRED CURRENT ... · nginx-deployment 3 0 $º$ kubectl rollout status deployment/nginx-deployment º → Waiting for rollout to finish: 2 out of 3 new replicas · have been updated... · deployment "nginx-deployment" successfully rolled out $º$ kubectl get deployments º → NAME DESIRED CURRENT ... · nginx-deployment 3 3 ºTo see the ReplicaSet (rs) created by the deployment:º $ kubectl get rs NAME DESIRED ... nginx-deployment-...4211 3 └─────────┬────────────┘ Format: [deployment-name]-[pod-template-hash-value] $º$ kubectl get pods --show-labels º ← Display ALL labels automatically → NAME ... LABELS generated for ALL pods · nginx-..7ci7o ... app=nginx,..., · nginx-..kzszj ... app=nginx,..., · nginx-..qqcnn ... app=nginx,..., - Ex: Update nginx Pods from nginx:1.7.9 to nginx:1.9.1: $º$ kubectl set image deployment/nginx-deployment \ º $º nginx=nginx:1.9.1 º ºCheck the revisions of deployment:º $ kubectl rollout history deployment/nginx-deployment deployments "nginx-deployment" R CHANGE-CAUSE 1 kubectl create -f nginx-deployment.yaml ---record 2 kubectl set image deployment/nginx-deployment \ nginx=nginx:1.9.1 3 kubectl set image deployment/nginx-deployment \ nginx=nginx:1.91 $ kubectl rollout undo deployment/nginx-deployment \ --to-revision=2 ºScale Deployment:º $ kubectl scale deployment \ nginx-deployment --replicas=10 $ kubectl autoscale deployment nginx-deployment \ --min=10 --max=15 --cpu-percent=80
StatefulSet (v:1.9+)
@[https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/]
- naming convention, network names, and storage persist as replicas are rescheduled.
- underlying persistent storage remains even when the StatefulSet is deleted.
- Pods in a StatefulSet are scheduled and run across any available node in the
  cluster (vs DaemonSet pods, attached to a given node).
- Manages stateful apps: useful for apps requiring one+ of:
  - stable, unique network identifiers
  - persistent storage across Pod (re)scheduling
  - ordered, graceful deployment and scaling
  - ordered, graceful deletion and termination
  - ordered, automated rolling updates
- Manages the deployment+scaling of Pods, providing guarantees about ordering and uniqueness.
- Unlike Deployments, a StatefulSet maintains a sticky identity for each of its Pods.
  These pods are created from the same spec, but are not interchangeable: each has a
  persistent identifier that it maintains across any rescheduling.
- Pod Identity: StatefulSet Pods have a unique identity comprised of
  [ordinal, stable network identity, stable storage] that sticks even if the Pod is
  rescheduled on another node.
  - Ordinal: each Pod will be assigned a unique integer ordinal, from 0 up through N-1,
    where N = number of replicas.
  - Stable Network: Pod host-name = $(statefulset name)-$(ordinal)
    Ex. full DNS using a headless service: (web == StatefulSet.name)
       Oº← pod-host →º Bº← service ns  →º Qº←clusterDoma→º
       Oºweb-{0..N-1}º.Bºnginx.default.svcº.Qºcluster.localº
       Oºweb-{0..N-1}º.Bºnginx.foo    .svcº.Qºcluster.localº
       Oºweb-{0..N-1}º.Bºnginx.foo    .svcº.Qºkube.local   º
       *1: Cluster Domain defaults to cluster.local
  - Pod Name Label: when the controller creates a Pod, it adds a label
    statefulset.kubernetes.io/"pod-name" set to the name of the pod, allowing
    to attach a Service to a unique Pod.
ºLimitationsº
- The storage for a given Pod must either be provisioned by a PersistentVolume
  Provisioner based on the requested storage class, or pre-provisioned by an admin.
- Deleting and/or scaling a StatefulSet down will not delete the volumes associated
  with the StatefulSet, in order to ensure data safety, which is generally more
  valuable than an automatic purge of all related StatefulSet resources.
- StatefulSets currently require a Headless Service to be responsible for the
  network identity of the Pods. You are responsible for creating this Service.
ºExample StatefulSetº
- The Bºheadless Service (named nginx)º is used to control the network domain.
- The OºStatefulSet (named web)º has a Spec indicating that 3 replicas of the
  nginx container will be launched in unique Pods.
- The GºvolumeClaimTemplatesº will provide stable storage using PersistentVolumes
  provisioned by a PersistentVolume Provisioner.
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  replicas: 3   # by default is 1
  selector:
    matchLabels:
      Qºapp: nginx   # has to match .spec.template.metadata.labelsº
  serviceName: "nginx"
  template:
    metadata:
      labels:
        app: nginx   # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
 ºvolumeClaimTemplatesº:   # Kubernetes creates one PersistentVolume for each VolumeClaimTemplate
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]   # each Pod will receive a single PersistentVolume
                                         # When a Pod is (re)scheduled onto a node, its volumeMounts
                                         # mount the PersistentVolumes associated with its
                                         # PersistentVolumeClaims.
      storageClassName: "my-storage-class"
      resources:
        requests:
          storage: 1Gi
DONT'S
- The StatefulSet should NOT specify a pod.Spec.TerminationGracePeriodSeconds of 0.
  Unsafe and ºstrongly discouragedº.
Deployment and Scaling Guarantees
- Pods are deployed sequentially in order from {0..N-1}.
- When Pods are deleted they are terminated in reverse order, from {N-1..0}.
- Before a scaling operation is applied to a Pod, all of its predecessors must be
  Running and Ready.
- Before a Pod is terminated, all of its successors must be completely shut down.
-ºOrdering policy guarantees can be relaxed via .spec.podManagementPolicy (k8s 1.7+)º:
  - OrderedReady: default, implements the behavior described above.
  - Parallel: launch/terminate all Pods in parallel, not waiting for Pods to become
    Running and Ready or completely terminated.
-º.spec.updateStrategyº allows configuring or disabling automated rolling updates
  for containers, labels, resource requests/limits, and annotations of the Pods
  in a StatefulSet:
  - "OnDelete": implements the legacy (1.6 and prior) behavior. The StatefulSet
    controller will not automatically update the Pods. Users must manually delete
    Pods to cause the controller to create new Pods reflecting modifications made
    to the StatefulSet's .spec.template.
  - "RollingUpdate" (default 1.7+): implements automated rolling updates for Pods.
    The StatefulSet controller will delete and recreate each Pod, proceeding in the
    same order as Pod termination (largest to smallest ordinal), updating each Pod
    one at a time and waiting until an updated Pod is Running and Ready prior to
    updating its predecessor.
  - Partitions:
    - The RollingUpdate strategy can be partitioned by specifying
      .spec.updateStrategy.rollingUpdate.partition.
    - If specified, all Pods with an ordinal greater than or equal to the partition
      will be updated when the StatefulSet's .spec.template is updated. All Pods
      with an ordinal less than the partition will not be updated and, even if they
      are deleted, they will be recreated at the previous version.
    - If the partition is greater than .spec.replicas, updates to .spec.template
      will not be propagated to the Pods.
    - In most cases you will not need to use a partition, but partitions are useful
      to stage an update, roll out a canary, or perform a phased roll out.
Debug @[https://kubernetes.io/docs/tasks/debug-application-cluster/debug-stateful-set/]
DaemonSet
- ensures "N" nodes run a Pod instance.
- typical uses: cluster storage, log collection or monitoring.
- As nodes are added to the cluster, Pods are added to them. As nodes are removed
  from the cluster, those Pods are garbage collected. Deleting a DaemonSet will
  clean up the Pods it created.
- Ex (simple case): one DaemonSet, covering all nodes, would be used for each type
  of daemon. A more complex setup might use multiple DaemonSets for a single type
  of daemon, but with different flags and/or different memory and cpu requests for
  different hardware types.

Ex. DaemonSet for Bºfluentd-elasticsearchº:
$ cat daemonset.yaml
apiVersion: apps/v1
kind: ºDaemonSetº
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:    # Pod template
               # Pod Template must have RestartPolicy equal to Always (default if un-specified)
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: Bºk8s.gcr.io/fluentd-elasticsearch:1.20º
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
Garbage Collection
@[https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/]
Jobs
@[https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/]
  Related docs: Running Automated Tasks with a CronJob,
                Parallel Processing using Expansions,
                Coarse Parallel Processing Using a Work Queue,
                Fine Parallel Processing Using a Work Queue.
- One example of the Job pattern would be a Job which starts a Pod which runs a
  script that in turn starts a Spark master controller (see spark example), runs a
  spark driver, and then cleans up.
- reliably run 1+ Pod/s to "N" completions.
- creates one+ pods and ensures that a specified number of them successfully terminate.
- Jobs are complementary to Deployment controllers. A Deployment controller manages
  pods which are not expected to terminate (e.g. web servers), and a Job manages pods
  that are expected to terminate (e.g. batch jobs).
- As pods successfully complete, the job tracks the successful completions. When a
  specified number of successful completions is reached, the job itself is complete.
  Deleting a Job will clean up the pods it created.
- Pod backoff failure policy: if you want to fail a Job after N retries, set
  .spec.backoffLimit (defaults to 6).
- Pods are not deleted on completion, in order to allow viewing the logs/output/errors
  of completed pods. They will show up with kubectl get pods º-aº. Neither is the job
  object, in order to allow viewing its status.
- Another way to terminate a Job is by setting an active deadline in
  .spec.activeDeadlineSeconds or .spec.template.spec.activeDeadlineSeconds.

Example: compute 2000 digits of "pi"
$ cat job.yaml
apiVersion: batch/v1
kind: ºJobº
metadata:
  name: pi
spec:
  template:    # Required (== Pod template - apiVersion - kind)
    spec:
      containers:
      - name: pi
       ºimage: perlº
       ºcommand: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]º
     OºrestartPolicy: Neverº    # Only Never/OnFailure allowed
  backoffLimit: 4

# ºRun job using:º
# $ kubectl create -f ./job.yaml
# ºCheck job current status like:º
# $ kubectl describe jobs/pi
# output will be similar to:
# Name:           pi
# Namespace:      default
# Selector:       controller-uid=b1db589a-2c8d-11e6-b324-0209dc45a495
# Labels:         controller-uid=b1db589a-2c8d-11e6-b324-0209dc45a495
#                 job-name=pi
# Annotations:    ˂none˃
# Parallelism:    1
# Completions:    1
# Start Time:     Tue, 07 Jun 2016 10:56:16 +0200
# Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
# Pod Template:
#   Labels: controller-uid=b1db589a-2c8d-11e6-b324-0209dc45a495
#           job-name=pi
#   Containers:
#    pi:
#     Image:      perl
#     Port:
#     Command:
#       perl
#       -Mbignum=bpi
#       -wle
#       print bpi(2000)
#     Environment: ˂none˃
#     Mounts:      ˂none˃
#   Volumes:       ˂none˃
# Events:
#   FirstSeen LastSeen Count From             SubobjectPath Type   Reason           Message
#   --------- -------- ----- ----             ------------- ------ ------           -------
#   1m        1m       1     {job-controller}               Normal SuccessfulCreate Created pod: pi-dtn4q
#
# ºTo view completed pods of a job, useº
# $ kubectl get pods
# ºTo list all pods belonging to a job in machine-readable formº:
#
# $ pods=$(kubectl get pods --selector=ºjob-name=piº --output=ºjsonpath={.items..metadata.name}º)
# $ echo $pods
# pi-aiw0a
# ºView the standard output of one of the pods:º
# $ kubectl logs $pods
# 3.1415926535897...9
Parallel Jobs
- Parallel Jobs with a fixed completion count (.spec.completions greater than zero):
  the Job is complete when there is one successful Pod for each value in the range
  1 to .spec.completions (see the sketch after this block).
- Parallel Jobs with a work queue: do not specify .spec.completions; Pods must
  coordinate among themselves or with an external service to determine what each
  should work on. Each Pod is independently capable of determining whether or not
  all its peers are done, and thus whether the entire Job is done.
- For a non-parallel Job, leave both .spec.completions and .spec.parallelism unset.
- Actual parallelism (number of Pods running at any instant) may be more or less
  than the requested parallelism, for a variety of reasons.
- (read the official k8s docs for Job pattern usages)

Cron Jobs (1.8+)
- written in Cron format (a question mark (?) has the same meaning as an asterisk *)
- Concurrency Policy:
  - Allow (default): allows concurrently running jobs
  - Forbid : forbids concurrent runs, skipping the next run if the previous one is still running
  - Replace: cancels the currently running job and replaces it with a new one
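A hedged sketch of the fixed-completion-count parallel Job pattern mentioned above
(names and numbers are arbitrary):
  apiVersion: batch/v1
  kind: Job
  metadata:
    name: work-items
  spec:
    completions: 8        # Job is done after 8 successful Pods
    parallelism: 2        # at most 2 Pods run at the same time
    backoffLimit: 4
    template:
      spec:
        restartPolicy: Never
        containers:
        - name: worker
          image: busybox
          command: ["/bin/sh", "-c", "echo processing one work item; sleep 5"]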
CronJob
- @[https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/]
- Ex. cronjob:
$ cat cronjob.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: Oº"*/1 * * * *"º
  jobTemplate:
    spec:
      template:
        spec:
          containers:
         Oº- name: hello º
         Oº  image: busybox º
         Oº  args: º
         Oº  - /bin/sh º
         Oº  - -c º
         Oº  - date; echo 'Hi from K8s'º
          restartPolicy: OnFailure

# Alternatively:
$ kubectl run hello \
    --schedule="*/1 * * * *" \
    --restart=OnFailure \
    --image=busybox \
    -- /bin/sh -c "date; echo Hello from the Kubernetes cluster"

# get status:
$ kubectl get cronjob hello
NAME  SCHEDULE     SUSPEND ACTIVE LAST-SCHEDULE
hello */1 * * * *  False   0

# Watch for the job to be created:
$ kubectl get jobs --watch
NAME             DESIRED SUCCESSFUL AGE
hello-4111706356 1       1          2s
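The concurrency policy described above is set on the CronJob spec. A hedged variant
of the example (the history limits and starting deadline are assumptions, not part
of the original example):
  apiVersion: batch/v1beta1
  kind: CronJob
  metadata:
    name: hello
  spec:
    schedule: "*/1 * * * *"
    concurrencyPolicy: Forbid        # skip a run if the previous one is still running
    successfulJobsHistoryLimit: 3    # keep only the last 3 successful Jobs
    failedJobsHistoryLimit: 1
    startingDeadlineSeconds: 120     # give up on a run that cannot start within 2 minutes
    jobTemplate:
      spec:
        template:
          spec:
            restartPolicy: OnFailure
            containers:
            - name: hello
              image: busybox
              args: ["/bin/sh", "-c", "date; echo 'Hi from K8s'"]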
Node Health (monitoring the cluster) @[https://kubernetes.io/docs/tasks/debug-application-cluster/monitor-node-health/]
crictl: Debug node @[https://kubernetes.io/docs/tasks/debug-application-cluster/crictl/]
Auditing
@[https://kubernetes.io/docs/tasks/debug-application-cluster/audit/]
Events in Stackdriver
@[https://kubernetes.io/docs/tasks/debug-application-cluster/events-stackdriver/]
- Kubernetes events are objects that provide insight into what is
happening inside a cluster, such as what decisions were made by
scheduler or why some pods were evicted from the node.
- Since events are API objects, they are stored in the apiserver on
master. To avoid filling up master's disk, a retention policy is
enforced: events are removed one hour after the last occurrence. To
provide longer history and aggregation capabilities, a third party
solution should be installed to capture events.
- This article describes a solution that exports Kubernetes events to
Stackdriver Logging, where they can be processed and analyzed.
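Even without Stackdriver, the retained (last hour of) events can be inspected
directly from the API server; a few examples:
  $ kubectl get events --sort-by=.metadata.creationTimestamp   ← oldest first
  $ kubectl get events --field-selector type=Warning -n kube-system
  $ kubectl get events -w                                      ← stream new events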
Metrics API+Pipeline
- Resource usage metrics, such as container CPU and memory usage, are
available in Kubernetes through the Metrics API. These metrics can be
either accessed directly by user, for example by using kubectl top
command, or used by a controller in the cluster, e.g. Horizontal Pod
Autoscaler, to make decisions.
@[https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/]
@[https://github.com/kubernetes-incubator/metrics-server]
Extracted from @[https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/]
"""...If running on Minikube, run the following command to enable the metrics-server:
$º$ minikube addons enable metrics-server º
... to see whether the metrics-server is running, or another provider of the resource metrics
API (metrics.k8s.io), run the following command:
$º$ kubectl get apiservices º
output must include a reference to metrics.k8s.io.
→ ...
→ v1beta1.metrics.k8s.io
"""
Auditing
@[https://kubernetes.io/docs/tasks/debug-application-cluster/audit/]
- Kubernetes auditing provides a security-relevant chronological set of
records documenting the sequence of activities that have affected
system by individual users, administrators or other components of the
system. It allows cluster administrator to answer the following
questions:
- what happened?
- when did it happen?
- who initiated it?
- on what did it happen?
- where was it observed?
- from where was it initiated?
- to where was it going?
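A minimal sketch of an audit Policy and the kube-apiserver flags that load it
(file paths are assumptions; real rules must be tuned per cluster):
  # /etc/kubernetes/audit-policy.yaml
  apiVersion: audit.k8s.io/v1
  kind: Policy
  rules:
  - level: Metadata              # who/what/when, but no request body, for sensitive objects
    resources:
    - group: ""
      resources: ["secrets", "configmaps"]
  - level: None                  # drop high-volume read-only noise
    verbs: ["get", "list", "watch"]
  - level: RequestResponse       # full request+response for everything else

  kube-apiserver flags (static pod manifest or systemd unit):
    --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    --audit-log-path=/var/log/kubernetes/audit.log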
HELM Charts Summary: HELM Charts is Package manager for k8s apps with versioning, upgrades and rollbacks. ºPRE-SETUPº └ "kubectl" installed locally. ºINITIAL-SETUPº └ Download from @[https://github.com/helm/helm/releases], use homebrew (MacOSX),... and add "helm" command to path. Oº#######################º Oº# Creating new Charts #º Oº#######################º @[https://www.youtube.com/watch?v=3GPpm2nZb2s] @[https://helm.sh/docs/developing_charts/] $º$ helm create myapp º ← alt 1: Create from helm template myapp/ ← template layout ├─ charts/ ← A complex app can consists of many "parallel" charts. One for the │ front-end service/s, other one for the backend/s, middleware, cache, ... │ other chart-dependencies come here. │ Alternatively use requirements.yaml like: │ $º$ cat requirements.yaml │ | dependencies: │ | - name: apache │ | version: 1.2.3 │ | repository: http://repo1.com/charts ← Pre-Setup: │ | - name: mysql $º$ helm repo add ...º │ | version: 4.5.6 │ | repository: http://repo2.com/charts │ $º$ helm dependency update º ← Dowload deps. to charts/ │ ├─ Chart.yaml ← Sort of chart metadata │ | apiversion: v1 │ | name: my-nginx │ | version: 0.1.0 ← Change to 0.2.0 then : │ | $º$ helm upgrade my-nginx . º ← Upgrade 0.1.0 → 0.2.0 │ | $º$ helm rollback my-nginx 1 º ← Rollback to rev 1 (0.1.0) │ | $º$ helm rollback my-nginx 2 º ← Rollback to rev 2 (0.2.0) │ | appVersion: 1.0-SNAPSHOT (For example we can have different type of │ | helm deployments -version:- with different types of │ | replica instances for the same web application │ | - same appVersion: ... ) │ | │ | description: (description of chart -vs app-) │ | ... (kubeVersion, maintainers, template engine, deprecated, ...) │ ├─ values.yaml ← Default values for parameters in yaml templates │ (ports, replica counts, max/min, ...) to be replaced as │ {{ .Values.configKey.subConfigKey.param }}. │ They can be overloaded at install/upgrade like: │ $º$ helm install|upgrade ... --set configKey.subConfigKey.param=... º │ └─ templates ├─ deployment.yaml ← Can be created manually like: │ $º$ kubectl create deploy nginx --image nginx \º │ $º --dry-run -o yaml ˃ deployment.yaml º │ ├─ service.yaml ← Can be created manually like: │ $º$ kubectl expose deploy mn-nginx -port 80 \º │ $º --dry-run -o yaml ˃ service.yaml º │ ├─ _helpers.tpl ← files prefixed with '_' do not create k8s output │ Used for reusable pieces of template ├─ hpa.yaml ├─ ingress.yaml ├─ NOTES.txt ├─ serviceaccount.yaml └─ tests └── test-connection.yaml BºDebug the template/s like: $º$ helm lint º ← verify best-practices $º$ helm install --dry-run --debug myApp º $º$ helm template --debug º -BºInstalling the chart:º $º$ helm install --name my-nginx . º ← Intall from current dir. (vs URL / Repo) ^ TIP: Monitor deployment in real time in another console like: $º$ watch -n 2 "kubectl get all" º $º$ helm delete --purge my-nginx º ← Clean-up install Oº#################################º Oº# Consumming third party charts #º Oº#################################º @[https://helm.sh/docs/intro/quickstart/] └ Initialize aºHelm Chart Repositoryº $º$ helm repo add stable \ º $º https://kubernetes-charts.storage.googleapis.com/º ← Official repo. Oº################º Oº# DAILY USSAGE #º Oº################º $º$ helm get -h º $º$ helm search repo stable º ← list charts available for installation. 
$ºNAME CHART VERSION APP VERSION DESCRIPTION º $ºstable/acs-engine-autoscaler 2.2.2 2.1.1 DEPRECATED Scales worker nodes within agent pools º $ºstable/aerospike 0.2.8 v4.5.0.5 A Helm chart for Aerospike in Kubernetes º $ºstable/airflow 4.1.0 1.10.4 Airflow is a platform to programmatically autho...º $º... º $º$ helm repo update º ← Make sure we get the latest list of charts ┌→$º$ helm install stable/mysql --generate-nameº ← Install MySQL │ ^^^^^^ │ Helm has several ways to find and install a chart, │ official stable charts is the easiests. │ $º$ helm show chart stable/mysql º ← get an idea of the chart features │ $º$ helm show all stable/mysql º ← get all information about chart │ └ Whenever an install is done, a new local release is created cluster, allowing to install multiple times into the same cluster. Each local release can be independently managed and upgraded. To list local released: $º$ helm ls º $ºNAME VERSION UPDATED STATUS CHART º $ºsmiling-penguin 1 Wed Sep 28 12:59:46 2016 DEPLOYED mysql-0.1.0 º To uninstall a release: $º$ helm uninstall \ º $º$ smiling-penguin --keep-history º ←·························┐ $ºRemoved smiling-penguin º | $º$ helm status smiling-penguin º ← works only in using the --keep-history $ºStatus: UNINSTALLED º flag during uninstall. $... º It also allows to roolback a deleted release using $º$ helm rollback ...º TODO: @[https://helm.sh/docs/intro/using_helm/] NOTE:@[https://www.infoworld.com/article/3541608/kubernetes-helm-gets-full-cncf-approval.html]
Helm Operator @[https://docs.fluxcd.io/projects/helm-operator/en/1.0.0-rc9/references/helmrelease-custom-resource.html] - operator watching for (Custom Resource) HelmRelease change events (from K8s). It reacts by installing or upgrading the named Helm release. (See oficial dock for setup/config params) apiVersion: helm.fluxcd.io/v1 Bºkind: HelmReleaseº ← Custom resource metadata: name: rabbit namespace: default spec: · releaseName: rabbitmq ← Automatically generated if not provided · targetNamespace: mq ← same as HelmRelease project if not provided. · timeout: 500 ← Defaults to 300secs. · resetValues: false ← Reset values on helm upgrade · wait: false ← true: operator waits for Helm upgrade completion · forceUpgrade: false ← true: force Helm upgrade through delete/recreate · chart: ←ºalt 1:ºchart from helm repository · repository: https://charts.... ← HTTP/s and also S3 / GCP Storage through extensions · name: rabbitmq · version: 3.3.6 · # chart: ←ºalt 2:ºchart from git repository: Repo will be cloned · # - git pooled every 5 minutes (See oficial doc. to · # skip waiting) º*1º · # · # git: git@github.com:fluxcd/flux-get-started · # ref: master · # path: charts/ghost · values: ← values to override on source chart · replicas: 1 · valuesFrom: ← Secrets, config Maps, external sources, ... · - configMapKeyRef: · name: default-values ← mandatory · namespace: my-ns ← (of config. map) defaults to HelmRelease one · key: values.yaml ← Key in config map to get values from · (defaults to values.yaml) · optional: false ← defaults to false (fail-fast), · true=˃ continue if values not found · - secretKeyRef: · name: default-values ← mandatory · namespace: my-ns ← (Of secrets) defautl to HelRelease one · key: values.yaml ← Key in the secret to get the values from · (defaults to values.yaml) · optional: false ← defaults to false (fail-fast), · true=˃ continue if values not found · - externalSourceRef: · url: https://example.com/static/raw/values.yaml · optional: false ← defaults to false (fail-fast), · true=˃ continue if values not found · · rollback: ← What to do if new release fails enable: false ← true : perform rollbacks for releas. force: false ← Rollback through delete/recreate if needed. disableHooks: false ← Prevent hooks from running during rollback. timeout: 300 ← Timeout to consider failure install wait: false ← true =˃ wait for min.number of Pods, PVCs, Services to be ready state before marking release as successful. $º$ kubectl delete hr/my-release º ← force reinstall Helm release (upgrade fails, ... ) Helm Operator will receive delete-event and force purge of Helm release. On next Flux sync, a new Helm Release object will be created and Helm Operator will install it. º*1º: Git Authentication how-to: - Setup the SSH key with read-only access ("deploy key" in GitHub) (one for each accessed repo) - Inject each read-only ssh key into Helm Release operator by mounting "/etc/fluxd/ssh/ssh_config" into the container operator and mounting also (as secrets) each referenced priv.key in "/etc/fluxd/ssh/ssh_config"
Kubeapps
- Application Dashboard for Kubernetes
@[https://kubeapps.com/]
- Deploy Apps Internally:
Browse Helm charts from public or your own private chart repositories
and deploy them into your cluster.
- Manage Apps:
Upgrade, manage and delete the applications that are deployed in your
Kubernetes cluster.
- Service Catalog:
Browse and provision external services from the Service Catalog and
available Service Brokers.
Teresa
@[https://github.com/luizalabs/teresa]
- extremely simple platform as a service that runs on top
of Kubernetes. It uses a client-server model: the client sends high
level commands (create application, deploy, etc.) to the server,
which translates them to the Kubernetes API.
Gravity Portable clusters
"take a cluster as is and deploy it somewhere else"
- Gravity takes snapshots of Kubernetes clusters, their container
registries, and their running applications, called "application
bundles." The bundle, which is just a .tar file, can replicate the
cluster anywhere Kubernetes runs.
- Gravity also ensures that the target infrastructure can support the
same behavioral requirements as the source, and that the Kubernetes
runtime on the target is up to snuff. The enterprise version of
Gravity adds security features including role-based access controls
and the ability to synchronize security configurations across
multiple cluster deployments.
- latest major version, Gravity 7, can deploy a Gravity image into
an existing Kubernetes cluster, versus spinning up an all-new cluster
using the image. Gravity 7 can also deploy into clusters that
aren’t already running a Gravity-defined image. Plus, Gravity now
supports SELinux and integrates natively with the Teleport SSH
gateway.
Kaniko, Container Build
- Kaniko performs container builds inside a container environment,
Bºwithout relying on a container daemon (Docker Daemon)º.
- Kaniko extracts the file system from the base image,
executes the build commands in user space atop the extracted
file system, taking a snapshot of the file system after each command.
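  A hedged sketch of a one-shot Kaniko build Pod (the image tag, git context,
  destination registry and the "regcred" secret are assumptions):
    apiVersion: v1
    kind: Pod
    metadata:
      name: kaniko-build
    spec:
      restartPolicy: Never
      containers:
      - name: kaniko
        image: gcr.io/kaniko-project/executor:latest
        args:
        - --dockerfile=Dockerfile
        - --context=git://github.com/my-org/my-app.git
        - --destination=registry.example.com/my-app:latest
        volumeMounts:
        - name: docker-config      # kaniko reads push credentials from /kaniko/.docker/config.json
          mountPath: /kaniko/.docker
      volumes:
      - name: docker-config
        secret:
          secretName: regcred      # a docker-registry type secret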
KubeDB: Production DBs
- Operators exist for MySQL/PostgreSQL/Redis/... Rºbut there are plenty of gapsº.
BºKubeDB allows you to create custom Kubernetes operators for managing databasesº.
- Running backups, cloning, monitoring, snapshotting, and declaratively
creating databases are all part of the mix.
Rºsupported features vary among databasesº. Ex:
clustering is available for PostgreSQL but not MySQL.
Kube-monkey: Chaos testing
- Stress-tests the cluster by breaking stuff at random.
- It works by randomly killing pods in a cluster
that you specifically designate, and can be fine-tuned
to operate within specific time windows.
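- Hedged sketch of opting a Deployment into kube-monkey via labels (label keys
  follow the kube-monkey README; double-check them against the version you deploy,
  and the Deployment itself is a placeholder):

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: my-app
    labels:
      kube-monkey/enabled: enabled      # opt this workload into chaos testing
      kube-monkey/identifier: my-app    # must match the pod template labels below
      kube-monkey/mtbf: "2"             # mean time between "failures", in days
      kube-monkey/kill-mode: "fixed"    # kill a fixed number of pods per run ...
      kube-monkey/kill-value: "1"       # ... exactly one pod
  spec:
    replicas: 3
    selector:
      matchLabels: { app: my-app }
    template:
      metadata:
        labels:
          app: my-app
          kube-monkey/enabled: enabled
          kube-monkey/identifier: my-app
      spec:
        containers:
        - name: my-app
          image: nginx                  # placeholder workload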
See also:
https://kubernetes.io/blog/2020/01/22/kubeinvaders-gamified-chaos-engineering-tool-for-kubernetes/
Ingress Controller for AWS
- Allows reusing AWS load-balancing functionality.
- It uses AWS CloudFormation to ensure that cluster state
remains consistent.
Gatekeeper policy controls
The Open Policy Agent project (OPA) provides a way to create policies
across cloud-native application stacks, from ingress to service-mesh
components to Kubernetes. Gatekeeper provides a Kubernetes-native way
to enforce OPA policies on a cluster automatically, and to audit for
any events or resources violating policy. All this is handled by a
relatively new mechanism in Kubernetes, admission controller
Webhooks, that fire on changes to resources. With Gatekeeper, OPA
policies can be maintained as just another part of your Kubernetes
cluster’s defined state, without needing constant babysitting.
Teleport
- Implements industry best practices for SSH and
Kubernetes access, meets compliance requirements, and provides complete
visibility into access and behavior.
· Security best practices out-of-the-box.
· Isolate critical infra and enforce 2FA with SSH and Kubernetes.
· Provide role-based access controls (RBAC) using short-lived
certificates and your existing identity management service.
· Log events and record session activity for full auditability.
Kubecost: Cost metrics running k8s
- Monitors the dollar cost of running Kubernetes.
- Kubecost uses real-time Kubernetes metrics, and real-world cost
information derived from running clusters on the major cloud
providers, to provide a dashboard view of the monthly cost of each
cluster deployment. Costs for memory, CPU, GPU, and storage are all
broken out by Kubernetes component (container, pod, service,
deployment, etc.).
- Kubecost can also track the costs of “out of cluster” resources,
such as Amazon S3 buckets, although this is currently limited to AWS.
- Kubecost is free to use if you only need to keep 15 days of logs. For
more advanced features, pricing starts at $199 per month for
monitoring 50 nodes.
Kubespray: Cluster Bootstrap
@[https://github.com/kubernetes-sigs/kubespray/blob/master/docs/comparisons.md]
- Kubespray is formed by a set of Ansible playbooks (optionally Vagrant) allowing
  you to create a new production-ready k8s cluster (mono/multi-master,
  single/distributed etcd, Flannel|Calico|Weave|... network, ...).
  Extra playbooks allow adding/removing nodes to/from a running k8s cluster.
- It targets "mostly" any environment, from bare metal (RedHat, CoreOS, Ubuntu, ...)
  to public clouds.
- Compared to "Kops": "Kops" is more tightly integrated and takes better advantage of
  the unique features of the supported clouds (AWS, GCE, ...). "Kops" doesn't
  allow deploying on standard linux distros running on bare-metal/VMs.
KOPS: Deploy on Cloud
@[https://github.com/kubernetes/kops/blob/master/README.md]
- The Kops cli-tool automates production-grade k8s cluster operations
  (creation/destruction, upgrade, maintenance).
- AWS is currently officially supported, with GCE in beta support,
  VMware vSphere in alpha, and other platforms planned.
- It does NOT support standard linux distros on bare-metal/VMs;
  Kubespray is preferred in that case.
Kubeadm
- Reference docs: Overview of kubeadm, kubeadm init, kubeadm join, kubeadm upgrade,
  kubeadm config, kubeadm reset, kubeadm token, kubeadm version, kubeadm alpha,
  Implementation details.
  Note: Kubespray offers a "simpler" front-end to Kubeadm based on Ansible playbooks.
- Example Tasks:
  - Upgrading kubeadm HA clusters from 1.9.x to 1.9.y:
    @[https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-ha/]
  - Upgrading/downgrading kubeadm clusters between v1.8 and v1.9:
    @[https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-9/]
  - Upgrading kubeadm clusters from 1.7 to 1.8:
    @[https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-8/]
  - Upgrading kubeadm clusters from v1.10 to v1.11:
    @[https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-11/]
WKSctl GitOps install
- Tool for Kubernetes Cluster Management Using BºGitOpsº.
WKSctl is an open-source project to install, bootstrap, and manage Kubernetes
clusters, including add-ons, through SSH. WKS is a provider of the Cluster API
(CAPI) using the GitOps approach. Kubernetes cluster configuration is defined
in YAML, and WKSctl applies the updates after every push in Git, allowing users
to have repeatable clusters on-demand.
- CAPI (@[https://cluster-api.sigs.k8s.io/]) is a k8s sub-project focused on providing
  declarative APIs and tooling to simplify provisioning, upgrading, and operating
  multiple Kubernetes clusters.
Other Admin Tasks
TLS Cert Mng:
- Managing TLS Certs:
@[https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster/]
@[https://kubernetes.io/docs/tasks/tls/certificate-rotation/]
- kubelet TLS setup:
@[https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/]
@[https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/]
private Reg.
- using a private (image) registry:
@[https://kubernetes.io/docs/concepts/containers/images/#using-a-private-registry]
- Pull Image from a Private Registry:
@[https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/]
- imagePullSecrets:
@[https://kubernetes.io/docs/concepts/configuration/secret/]
- method to pass a secret that contains a Docker image registry password
to the Kubelet so it can pull a private image on behalf of your Pod.
Mng.Cluster DaemonSets
- Perform a Rollback on a DaemonSet:
@[https://kubernetes.io/docs/tasks/manage-daemon/rollback-daemon-set/]
- Perform a Rolling Update on a DaemonSet:
@[https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/]
Install Service Catalog
- Install Service Catalog using Helm:
@[https://kubernetes.io/docs/tasks/service-catalog/install-service-catalog-using-helm/]
- Install Service Catalog using SC:
@[https://kubernetes.io/docs/tasks/service-catalog/install-service-catalog-using-sc/]
Max Node Latency
@[https://stackoverflow.com/questions/46891273/kubernetes-what-is-the-maximum-distance-latency-supported-between-kubernetes-no]
"""....There is no latency limitation between nodes in kubernetes cluster. They are configurable parameters.
└ For kubelet on worker node:
--node-status-update-frequency duration Specifies how often kubelet posts node status to master.
Note: be cautious when changing the constant, it must work
with nodeMonitorGracePeriod in nodecontroller. (default 10s)
└ For controller-manager on master node:
--node-monitor-grace-period duration Amount of time which we allow running Node to be unresponsive
before marking it unhealthy. Must be N times more than kubelet's
nodeStatusUpdateFrequency, where N means number of retries allowed
for kubelet to post node status. (default 40s)
--node-monitor-period duration The period for syncing NodeStatus in NodeController. (default 5s)
--node-startup-grace-period duration Amount of time which we allow starting Node to be unresponsive before
marking it unhealthy. (default 1m0s)
"""
Admin Tasks Config Ref
@[https://kubernetes.io/docs/concepts/cluster-administration/cluster-administration-overview/]
@[https://kubernetes.io/docs/concepts/cluster-administration/certificates/]
@[https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/]
@[https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/]
@[https://kubernetes.io/docs/concepts/cluster-administration/networking/]
@[https://kubernetes.io/docs/concepts/cluster-administration/logging/]
@[https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/]
@[https://kubernetes.io/docs/concepts/cluster-administration/proxies/]
@[https://kubernetes.io/docs/concepts/cluster-administration/controller-metrics/]
@[https://kubernetes.io/docs/concepts/cluster-administration/addons/]
Config. files
- Documented in the Reference section of the online documentation, under each binary:
  kubelet, kube-apiserver, kube-controller-manager, kube-scheduler.
Compute, Storage, and Networking Extensions
- Network Plugins, Device Plugins, Service Catalog.
kubecfg: k8s as code
@[https://github.com/ksonnet/kubecfg]
@[https://www.youtube.com/watch?v=zpgp3yCmXok] "Writing less yaml"
- A tool for managing Kubernetes resources as code.
  kubecfg allows you to express the patterns across your infrastructure and reuse
  these powerful "templates" across many services, and then manage those templates
  as files in version control. The more complex your infrastructure is, the more
  you will gain from using kubecfg.
- The idea is to describe as much as possible about your configuration as files
  in version control (e.g. git).
Cluster Federation
@[https://kubernetes.io/docs/concepts/cluster-administration/federation/]
ºFederation APIº:
- @[https://kubernetes.io/docs/reference/federation/extensions/v1beta1/definitions/]
- @[https://kubernetes.io/docs/reference/federation/extensions/v1beta1/operations/]
- @[https://kubernetes.io/docs/reference/federation/v1/definitions/]
- @[https://kubernetes.io/docs/reference/federation/v1/operations/]
ºExternal references:º
- @[https://kubernetes.io/docs/reference/command-line-tools-reference/federation-apiserver/]
- @[https://kubernetes.io/docs/reference/command-line-tools-reference/federation-controller-manager/]
- Federated Cluster
- Federated ConfigMap
- Federated DaemonSet
- Federated Deployment
- Federated Events
- Federated Horizontal Pod Autoscalers (HPA)
- Federated Ingress
- Federated Jobs
- Federated Namespaces
- Federated ReplicaSets
- Federated Secrets
- Extend kubectl with plugins
- Manage HugePages
kubefed(eration):
Controls cluster federation
- Cross-cluster Service Discovery using Federated Services
https://kubernetes.io/docs/tasks/federation/federation-service-discovery/
- https://kubernetes.io/docs/tasks/federation/set-up-cluster-federation-kubefed/
- https://kubernetes.io/docs/tasks/federation/set-up-coredns-provider-federation/
- https://kubernetes.io/docs/tasks/federation/set-up-placement-policies-federation/
REFERENCE:
@[https://kubernetes.io/docs/reference/setup-tools/kubefed/kubefed/]
@[https://kubernetes.io/docs/reference/setup-tools/kubefed/kubefed-options/]
@[https://kubernetes.io/docs/reference/setup-tools/kubefed/kubefed-init/]
@[https://kubernetes.io/docs/reference/setup-tools/kubefed/kubefed-join/]
@[https://kubernetes.io/docs/reference/setup-tools/kubefed/kubefed-unjoin/]
@[https://kubernetes.io/docs/reference/setup-tools/kubefed/kubefed-version/]
Federated Services
@[https://kubernetes.io/docs/tasks/federation/federation-service-discovery/]
Using the k8s API
- Kubernetes API Overview
- Accessing the API: Controlling Access to the Kubernetes API, Authenticating,
  Authenticating with Bootstrap Tokens, Using Admission Controllers,
  Dynamic Admission Control, Managing Service Accounts.
- Authorization Overview: Using RBAC Authorization, Using ABAC Authorization,
  Using Node Authorization, Webhook Mode.
K8s Implementation Summary
- Julia Evans, "A few things I've learned about Kubernetes":
  """... you can run the kubelet by itself! And if you have a kubelet, you can add
  the API server and just run those two things by themselves! Okay, awesome, now
  let's add the scheduler!"""
- The "kubelet" is in charge of running containers on nodes.
- If you tell the API server to run a container on a node, it will tell the
  kubelet to get it done (indirectly).
- The scheduler translates "run a container" to "run a container on node X".
ºetcd is Kubernetes' brainº
- Every component in Kubernetes (API server, scheduler, kubelets, controller
  manager, ...) is stateless. All of the state is stored in the (key-value store)
  etcd database.
- Communication between components (often) happens via etcd.
- Oºbasically everything in Kubernetes works by watching etcd for stuff it has to do,º
  Oºdoing it, and then writing the new state back into etcd                          º
  Ex 1: Run a container on Machine "X":
    Wrong way: ask kubelet@Machine"X" to run the container.
    Right way:
      kubectl*1          → (API Server) → etcd : "This pod should run on Machine X"
      kubelet@Machine"X" → etcd               : check work to do
      kubelet@Machine"X" ← etcd               : "This pod should run on Machine X"
      kubelet@Machine"X" ← kubelet@Machine"X" : Run pod
  Ex 2: Run a container anywhere on the k8s cluster:
      kubectl*1 → (API Server) → etcd   : "This pod should run somewhere"
      scheduler → etcd                  : Something to run?
      scheduler ← etcd                  : "This pod should run somewhere"
      scheduler → kubelet@Machine"Y"    : Run pod
  *1 kubectl is used from the command line. In the sequence diagram it can be
     replaced by any of the existing controllers (ReplicaSet, Deployment,
     DaemonSet, Job, ...).
ºAPI server roles in cluster:º
  The API Server is responsible for:
  1.- Putting stuff into etcd:
      kubectl    → API Server : put "stuff" in etcd
      API Server → API Server : check "stuff"
      alt 1: kubectl ← API Server : error: "stuff" is wrong
      alt 2: API Server → etcd    : set/update "stuff"
  2.- Managing authentication ("who is allowed to put what stuff into etcd"):
      The normal way is through X509 client certs.
ºcontroller manager does a bunch of stuffº
  Responsible for:
  - Inspecting etcd for pods pending to be scheduled.
  - daemon-set controllers will inspect etcd for pending daemonsets and will call
    the scheduler to run them on every machine with the given pod configuration.
  - The "replica set controller" will inspect etcd for pending replicasets and
    will create 5 pods that the scheduler will then schedule.
  - "deployment controller" ...
ºTroubleshooting:º
  Something isn't working? Figure out which controller is responsible and look at
  its logs.
ºCore K8s components run inside of k8sº
- Only 5 things need to be running before k8s starts up:
  - the scheduler
  - the API server
  - etcd
  - kubelets on every node (to actually execute containers)
  - the controller manager (because to set up daemonsets you need the controller
    manager)
  Any other core system (DNS, overlay network, ...) can be scheduled by k8s
  inside k8s.
API Conventions
@[https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md]
Source Code Layout
Note: Main k8s packages are placed in kubernetes/pkg/ (API, kubectl, kubelet, controller, ...)
- REF: "A Tour of the Kubernetes Source Code Part One: From kubectl to API Server"
ºExamining kubectl sourceº
  Locating the implementation of kubectl commands in the Kubernetes source code:
  - kubectl entry point for all commands.
  - Inside there is the name of a go directory that matches the kubectl command.
    Example: kubectl create/create.go
ºK8s loves the Cobra Command Frameworkº
  - k8s commands are implemented using the Cobra command framework.
  - Cobra provides a lot of features for building command line interfaces.
    Amongst them, Cobra puts the command usage message and command descriptions
    adjacent to the code that runs the command. Ex.:
    | // NewCmdCreate returns new initialized instance of create sub command
    | func NewCmdCreate(f cmdutil.Factory, ioStreams genericclioptions.IOStreams) *cobra.Command {
    |     o := NewCreateOptions(ioStreams)
    |
    |     cmd := &cobra.Command{
    |         Use: "create -f FILENAME",
    |         DisableFlagsInUseLine: true,
    |         Short:   i18n.T("Create a resource from a file or from stdin."),
    |         Long:    createLong,
    |         Example: createExample,
    |         Run: func(cmd *cobra.Command, args []string) {
    |             if cmdutil.IsFilenameSliceEmpty(o.FilenameOptions.Filenames) {
    |                 defaultRunFunc := cmdutil.DefaultSubCommandRun(ioStreams.ErrOut)
    |                 defaultRunFunc(cmd, args)
    |                 return
    |             }
    |             cmdutil.CheckErr(o.Complete(f, cmd))
    |             cmdutil.CheckErr(o.ValidateArgs(cmd, args))
    |             cmdutil.CheckErr(o.RunCreate(f, cmd))
    |         },
    |     }
    |
    |     // bind flag structs
    |     o.RecordFlags.AddFlags(cmd)
    |
    |     usage := "to use to create the resource"
    |     cmdutil.AddFilenameOptionFlags(cmd, &o.FilenameOptions, usage)
    |     ...
    |     o.PrintFlags.AddFlags(cmd)
    |
    |     // create subcommands
    |     cmd.AddCommand(NewCmdCreateNamespace(f, ioStreams))
    |     ...
    |     return cmd
    | }
ºBuilders and Visitors Abound in Kubernetesº
  Ex. code:
    | r := f.NewBuilder().
    |     Unstructured().
    |     Schema(schema).
    |     ContinueOnError().
    |     NamespaceParam(cmdNamespace).DefaultNamespace().
    |     FilenameParam(enforceNamespace, &o.FilenameOptions).
    |     LabelSelectorParam(o.Selector).
    |     Flatten().
    |     Do()
  The functions Unstructured, Schema, ContinueOnError, ... Flatten all take in a
  pointer to a Builder struct, perform some form of modification on the Builder
  struct, and then return the pointer to the Builder struct for the next method in
  the chain to use when it performs its modifications. They are defined at
  https://github.com/kubernetes/cli-runtime/blob/master/pkg/genericclioptions/resource/builder.go:
    | ...
    | func (b *Builder) Schema(schema validation.Schema) *Builder {
    |     b.schema = schema
    |     return b
    | }
    | ...
    | func (b *Builder) ContinueOnError() *Builder {
    |     b.continueOnError = true
    |     return b
    | }
  The Do function finally returns a Result object that will be used to drive the
  creation of our resource. It also creates a Visitor object that can be used to
  traverse the list of resources that were associated with this invocation of
  resource.NewBuilder. In the Do function implementation, a new DecoratedVisitor is
  created and stored as part of the Result object that is returned by the Builder
  Do function. The DecoratedVisitor has a Visit function that will call the Visitor
  function that is passed into it:
    | // Visit implements Visitor
    | func (v DecoratedVisitor) Visit(fn VisitorFunc) error {
    |     return v.visitor.Visit(func(info *Info, err error) error {
    |         if err != nil {
    |             return err
    |         }
    |         for i := range v.decorators {
    |             if err := v.decorators[i](info, nil); err != nil {
    |                 return err
    |             }
    |         }
    |         return fn(info, nil)
    |     })
    | }
  Create eventually will call the anonymous function that contains the
  createAndRefresh function, which will lead to the code making a REST call to the
  API server. The createAndRefresh function invokes the Resource NewHelper function
  found in ...helper.go, returning a new Helper object:
    | func NewHelper(client RESTClient, mapping *meta.RESTMapping) *Helper {
    |     return &Helper{
    |         Resource:        mapping.Resource,
    |         RESTClient:      client,
    |         Versioner:       mapping.MetadataAccessor,
    |         NamespaceScoped: mapping.Scope.Name() == meta.RESTScopeNameNamespace,
    |     }
    | }
  Finally the Create function will invoke the createResource function of the
  Helper. The Helper createResource function performs the actual REST call to the
  API server to create the resource we defined in our YAML file.
ºCompiling and Running Kubernetesº
  - We use a special option that informs the Kubernetes build process to compile
    only kubectl:
    $ make WHAT='cmd/kubectl'   # ← compile only kubectl
    Test it like:
    On terminal 1, boot up a local test hack cluster:
    $ PATH=$PATH KUBERNETES_PROVIDER=local hack/local-up-cluster.sh
    On terminal 2, execute the compiled kubectl:
    $ cluster/kubectl.sh create -f nginx_replica_pod.yaml
ºCode Learning Toolsº
  Tools and techniques that can really help accelerate your ability to learn the
  k8s source:
  - Chrome Sourcegraph Plugin: provides several advanced IDE features that make it
    dramatically easier to understand Kubernetes Go code when browsing GitHub
    repositories. Usage:
    - Start by looking at an absolutely depressing snippet of code, with a ton of
      functions.
    - Hover over each function with the Chrome browser + Sourcegraph extension
      installed: it will pop up a description of the function, what is passed into
      it and what it returns.
    - It also provides an advanced view with the ability to peek into the function
      being invoked.
  - Properly formatted print statements:
      fmt.Println("\n createAndRefresh Info = %#v", info)
  - Use of a go panic to get desperately needed stack traces:
    | func createAndRefresh(info *resource.Info) error {
    |     fmt.Println("\n createAndRefresh Info = %#v", info)
    |     ºpanic("Want Stack Trace")º
    |     obj, err := resource.NewHelper(info.Client, info.Mapping).Create(info.Namespace, true, info.Object)
    |     if err != nil {
    |         return err
    |     }
    |     info.Refresh(obj, true)
    |     return nil
    | }
  - GitHub Blame to travel back in time:
    "What was the person thinking when they committed those lines of code?"
    The GitHub browser interface has a blame option available as a button on the
    user interface: it returns a view of the code with the commits responsible for
    each line in the source file. This allows you to go back in time, look at the
    commit that added a particular line of code, and determine what the developer
    was trying to accomplish when that line was added.
Extending k8s API
- Custom Resources:
@[https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/]
- Extend API with CustomResourceDefinitions:
@[https://kubernetes.io/docs/tasks/access-kubernetes-api/custom-resources/custom-resource-definitions/]
- Versions of CustomResourceDefinitions:
@[https://kubernetes.io/docs/tasks/access-kubernetes-api/custom-resources/custom-resource-definition-versioning/]
- Migrate a ThirdPartyResource to CustomResourceDefinition:
@[https://kubernetes.io/docs/tasks/access-kubernetes-api/custom-resources/migrate-third-party-resource/]
- Aggregation Layer:
@[https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/]
- Configure the Aggregation Layer:
@[https://kubernetes.io/docs/tasks/access-kubernetes-api/configure-aggregation-layer/]
- Setup an Extension API Server:
@[https://kubernetes.io/docs/tasks/access-kubernetes-api/setup-extension-api-server/]
- Use HTTP Proxy to Access the k8s API:
@[https://kubernetes.io/docs/tasks/access-kubernetes-api/http-proxy-access-api/]
GPU and Kubernetes
@[https://www.infoq.com/news/2018/01/gpu-workflows-kubernetes]
@[https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/]
Skaffold
@[https://www.infoq.com/news/2018/03/skaffold-kubernetes/]
- Tool to facilitate Continuous Development by allowing Local Kubernetes Development,
optimized for “Source to Kubernetes”: Skaffold detects changes in
your source code and handles the pipeline to build, push, and deploy
your application automatically, with policy-based image tagging and
highly optimized, fast local workflows.
- Skaffold handles the workflow for building, pushing and deploying the application,
allowing to focus on what matters most: writing code.
-ºLightweightº: client-side only.
-ºWorks Everywhereº: share your project with 'git clone' → 'skaffold run'.
- Support forºprofiles, local user config, environment variables, and flagsº
to easily incorporate differences across environments.
-ºFeature Richº: policy-based image tagging, resource port-forwarding and
logging, file syncing, and much more.
-ºOptimized Developmentº: Skaffold's inner loop is tight and highly optimized,
providing instant feedback while developing.
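- A hedged skaffold.yaml sketch (the schema version and image name are assumptions
  to adjust to your Skaffold release and registry; k8s/*.yaml are placeholder manifests):

  apiVersion: skaffold/v2beta29            # schema version; adjust to your Skaffold release
  kind: Config
  build:
    artifacts:
    - image: registry.example.com/my-app   # placeholder image name
      docker:
        dockerfile: Dockerfile
  deploy:
    kubectl:
      manifests:
      - k8s/*.yaml                         # plain manifests applied on every change

  $º$ skaffold dev º ← watch sources, rebuild and redeploy on every change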
JAVA api Client
- Useful for example to get k8s service config inside
running container.
Summary extracted from:
https://github.com/hyperledger/besu/blob/master/nat/src/main/java/org/hyperledger/besu/nat/kubernetes/KubernetesNatManager.java
import io.kubernetes.client.ApiClient;
import io.kubernetes.client.Configuration;
import io.kubernetes.client.apis.CoreV1Api;
import io.kubernetes.client.models.V1Service;
import io.kubernetes.client.util.ClientBuilder;
import io.kubernetes.client.util.KubeConfig;
import io.kubernetes.client.util.authenticators.GCPAuthenticator;
...
try {
KubeConfig.registerAuthenticator(new GCPAuthenticator());
final ApiClient client = ClientBuilder.cluster().build();
Configuration.setDefaultApiClient(client);
final CoreV1Api api = new CoreV1Api();
// invokes the CoreV1Api client
final V1Service service =
api.listServiceForAllNamespaces(null, null, null, null, null, null, null, null, null)
.getItems().stream()
.filter(
v1Service -> v1Service.getMetadata().getName().contains(besuServiceNameFilter))
.findFirst()
.orElseThrow(() -> new NatInitializationException("Service not found"));
updateUsingBesuService(service);
internalAdvertisedHost =
getIpDetector(service)
.detectAdvertisedIp()
.orElseThrow(
() -> new NatInitializationException("Unable to retrieve IP from service"));
final String internalHost = queryLocalIPAddress().get(TIMEOUT_SECONDS, TimeUnit.SECONDS);
service.getSpec().getPorts().forEach( v1ServicePort -> { ... } );
final String serviceType = service.getSpec().getType();
final Optional<String> clusterIP = Optional.ofNullable(service.getSpec().getClusterIP());
} catch (Exception e) {
throw new RuntimeException(
"Failed update information using pod metadata : " + e.getMessage(), e);
}
Running Apps
- Run Applications:
  Run a Single-Instance Stateful Application, Run a Replicated Stateful Application,
  Update API Objects in Place Using kubectl patch, Scale a StatefulSet,
  Delete a StatefulSet, Force Delete StatefulSet Pods,
  Perform Rolling Update Using a Replication Controller,
  Horizontal Pod Autoscaler, Horizontal Pod Autoscaler Walkthrough.
CRDs (Custom Resource Definition)
https://kubernetes.io/blog/page/17/
KubeDirector: The easy way to run complex stateful applications on Kubernetes
open source project designed to make it easy to run complex stateful
scale-out application clusters on Kubernetes. KubeDirector is built
using the custom resource definition (CRD) framework and leverages
the native Kubernetes API extensions and design philosophy. This
enables transparent integration with Kubernetes user/resource
management as well as existing clients and tools.
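- For reference, a minimal CustomResourceDefinition, close to the canonical CronTab
  example from the Kubernetes docs (group/kind names are illustrative); frameworks
  like KubeDirector or the Operator SDK generate and manage objects of this kind
  for you:

  apiVersion: apiextensions.k8s.io/v1
  kind: CustomResourceDefinition
  metadata:
    name: crontabs.stable.example.com     # must be <plural>.<group>
  spec:
    group: stable.example.com
    scope: Namespaced
    names:
      plural: crontabs
      singular: crontab
      kind: CronTab
      shortNames: [ct]
    versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                cronSpec: { type: string }
                image:    { type: string }
                replicas: { type: integer }

  $º$ kubectl get crontabs º ← the new resource type becomes available once the CRD is applied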
Topology-Aware Volume Provisioning
@[https://kubernetes.io/blog/page/12/]
- multi-zone cluster experience with persistent volumes is
improving in k8s 1.12 with topology-aware dynamic provisioning
beta feature.
- It allows k8s to make intelligent decisions when dynamically
provisioning volumes by getting scheduler input on the best place(zone)
to provision a volume for a pod.
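- A hedged StorageClass sketch using the delayed-binding mode behind this feature
  (the provisioner and the zone label key depend on your cloud/CSI driver and k8s
  version; zone names are placeholders):

  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: topology-aware-ssd
  provisioner: kubernetes.io/gce-pd        # example provisioner; use your cloud/CSI driver
  parameters:
    type: pd-ssd
  volumeBindingMode: WaitForFirstConsumer  # delay binding until a pod is scheduled,
                                           # so the volume is created in the pod's zone
  allowedTopologies:                       # optional: restrict candidate zones
  - matchLabelExpressions:
    - key: failure-domain.beta.kubernetes.io/zone
      values: [europe-west1-b, europe-west1-c]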
IPVS-Based In-Cluster Load Balancing
https://kubernetes.io/blog/2018/07/09/ipvs-based-in-cluster-load-balancing-deep-dive/
IPVS-Based In-Cluster Load Balancing Deep Dive
What Is IPVS?
IPVS (IP Virtual Server) is built on top of the Netfilter and
implements transport-layer load balancing as part of the Linux kernel.
IPVS is incorporated into the LVS (Linux Virtual Server), where it
runs on a host and acts as a load balancer in front of a cluster of
real servers. IPVS can direct requests for TCP- and UDP-based
services to the real servers, and make services of the real servers
appear as virtual services on a single IP address. Therefore, IPVS
naturally supports Kubernetes Service.
audits log events
(1.11+)
k8s DNS
RBAC TLS Sec
https://sysdig.com/blog/kubernetes-security-rbac-tls/
https://www.cncf.io/blog/2018/11/05/34097/
Failures learnt
https://es.slideshare.net/try_except_/kubernetes-on-aws-at-zalando-failures-learnings-devops-nrw
IntelliJ k8s plugin
https://blog.jetbrains.com/idea/2018/03/intellij-idea-2018-1-kubernetes-support/
k8s Native IDE
https://www.zdnet.com/article/red-hat-introduces-first-kubernetes-native-ide/?ftag=TRE-03-10aaa6b&bhid=28374205867001011904732094012637
https://www.infoq.com/articles/ambassador-api-gateway-kubernetes
Best Patterns
"...Without an indication how much CPU and memory a container needs,
Kubernetes has no other option than to treat all containers equally.
That often produces a very uneven distribution of resource usage.
Asking Kubernetes to schedule containers without resource
specifications is like entering a taxi driven by a blind person..."
"...The fact that we can use Deployments with PersistentVolumes
does not mean that is the best way to run stateful applications..."
Hybrid Cloud+On-Premise
https://cloud.google.com/blog/products/gcp/going-hybrid-with-kubernetes-on-google-cloud-platform-and-nutanix
Rancher K3s
https://www.itprotoday.com/containers/rancher-labs-k3s-shrinks-kubernetes-edge
Rancher Labs' K3s Shrinks Kubernetes for the Edge
For running containers at the edge, Rancher Labs has created K3s, a
Kubernetes distribution that weighs in at 40MB and needs only 512MB
RAM to run.
"Clearing House" for k8s Ops
https://www.datacenterknowledge.com/open-source/aws-google-microsoft-red-hats-new-registry-act-clearing-house-kubernetes-operators
AWS, Google, Microsoft, Red Hat's New Registry to Act as Clearing
House for Kubernetes Operators
Operators make life easier for Kubernetes users, but they're so
popular that finding good ones is not easy. Operatorhub.io is an
attempt to fix that.
Gluster-k8s
https://github.com/gluster/gluster-kubernetes
gluster-kubernetes is a project to provide Kubernetes administrators a
mechanism to easily deploy GlusterFS as a native storage service onto an
existing Kubernetes cluster. Here, GlusterFS is managed and orchestrated like
any other app in Kubernetes. This is a convenient way to unlock the power of
dynamically provisioned, persistent GlusterFS volumes in Kubernetes.
airbnb workflow
https://www.infoq.com/news/2019/03/airbnb-kubernetes-workflow
Ranher Submariner Multicluster
https://www.infoq.com/news/2019/03/rancher-submariner-multicluster
proper k8s cluster shutdown
https://serverfault.com/questions/893886/proper-shutdown-of-a-kubernetes-cluster
K8s Bible
- The K8s Bible for Beginners and developers:
https://docs.google.com/document/d/1O-BwDTuE4qI0ASE7iFp6qFpTj8uIVrl9F0HUrC4u_GQ/edit
Jaeger
https://opensource.com/article/19/3/getting-started-jaeger
- https://developers.redhat.com/blog/2019/11/14/tracing-kubernetes-applications-with-jaeger-and-eclipse-che/
- https://www.jaegertracing.io/
- Jaeger: open source, end-to-end distributed tracing
- Monitor and troubleshoot transactions in complex distributed systems
As on-the-ground microservice practitioners are quickly realizing, the
majority of operational problems that arise when moving to a distributed
architecture are ultimately grounded in two areas: networking and
observability. It is simply an orders of magnitude larger problem to network
and debug a set of intertwined distributed services versus a single
monolithic application.
pci/dss compliance
https://www.infoq.com/news/2019/04/kubernetes-pci-dss-compliance
Zero Downtime Migrations in Istio era
C&P from JBCNConf 2019:
By Alex Soto
Java Champion, Engineer @ Red Hat. Speaker, CoAuthor of Testing Java
Microservices book, Member of JSR374 and Java advocate
Zero Downtime Migrations in Istio era
- You joined the DevOps movement and want to release software even
faster and safer. You started reading about Advanced deployment
techniques like Blue-Green Deployment, Canary Releases or Dark Shadow
Technique. But how do you implement them without disrupting your
users in production? With Zero Downtime! This is easy with your code,
but what about ephemeral and persistent states? Most of the examples
out there do not tackle this scenario (which is the most common in
real systems). Come to this session and you’ll learn in a practical
way how you can achieve zero downtime deployments applying advanced
deployment techniques for maintaining the ephemeral and persistent
state with Istio
Litmus chaos engineering framework
https://www.infoq.com/news/2019/05/litmus-chaos-engineering-kube/
Chaos Engineering Kubernetes with the Litmus Framework
Litmus:
- open source chaos engineering framework for Kubernetes
environments running ºstateful applicationsº
- Litmus can be added to CI/CD pipelines
- designed to catch hard-to-detect bugs in Kubernetes
that might be missed by unit or integration tests.
- focus on application resilience:
- pre-existing tests for undesirable behavior such:
- container crash
- disk failure
- or network delay
- packet loss.
- can also be used to determine if a Kubernetes deployment
is suitable for stateful workloads.
cli reference
- Feature Gates, cloud-controller-manager, kube-apiserver, kube-controller-manager,
  kube-proxy, kube-scheduler, kubelet, Master-Node communication.
Kubermatic
https://www.loodse.com/
python + k8s api
https://www.redhat.com/sysadmin/create-kubernetes-cron-job-okd
12factor
https://12factor.net/
k8s cli utilities
https://developers.redhat.com/blog/2019/05/27/command-line-tools-for-kubernetes-kubectl-stern-kubectx-kubens/
-ºsternº
- display the tail end of logs for containers and multiple pods.
- the stern project comes from Wercker (acquired by Oracle in 2017).
- rather than viewing an entire log to see what happened most
recently, you can use stern to watch a log unfurl.
-ºkubectxº
- helpful for multi-cluster installations, where you need to
switch context between one cluster and another.
- Rather than typing a series of lengthy kubectl commands, kubectx
works its magic in one short command.
- It also allows you to alias a lengthy cluster name into an alias.
- For example (taken directly from the kubectx website),
kubectx eu=gke_ahmetb-samples-playground_europe-west1-b_dublin allows
you to switch to that cluster by running kubectx eu.
- Another slick trick is that kubectx remembers your previous context—
much like the “Previous” button on a television remote—and allows
you to switch back by running kubectx -.
-ºkubensº
- easily switch between Kubernetes namespaces.
$ kubens foo # activate namespace.
$ kubens - # back to previous value.
- Author: Ahmet Alp Balkan
-ºkubetailº
Bash script that enables you to aggregate (tail/follow) logs from
multiple pods into one stream. This is the same as running "kubectl
logs -f " but for multiple pods.
portworx Persistent Storage
https://portworx.com/
Scalable Persistent Storage for Kubernetes
Built from the ground up for containers, PX-Store provides cloud
native storage for applications running in the cloud, on-prem and in
hybrid/multi-cloud environments.
Tigera.io
https://www.tigera.io/
(authors of the Calico Zero-Trust Network with Policy-based micro-segmentation)
https://www.projectcalico.org/,
Why add another layer of overhead when you don't need it?
Sometimes, an overlay network (encapsulating packets inside an extra
IP header) is necessary. Often, though, it just adds unnecessary
overhead, resulting in multiple layers of nested packets, impacting
performance and complicating trouble-shooting. Wouldn't it be nice if
your virtual networking solution adapted to the underlying
infrastructure, using an overlay only when required? That's what
Calico does. In most environments, Calico simply routes packets from
the workload onto the underlying IP network without any extra
headers. Where an overlay is needed – for example when crossing
availability zone boundaries in public cloud – it can use
lightweight encapsulation including IP-in-IP and VxLAN. Project
Calico even supports both IPv4 and IPv6 networks!
K3s: Rancher Labs Lightweight Kubernetes
@[https://k3s.io/]
- Easy to install. A binary of less than 40 MB; only 512MB of RAM required.
Kube-router
https://github.com/cloudnativelabs/kube-router
kubectl-trace (bpftrace)
https://github.com/iovisor/kubectl-trace
Schedule bpftrace programs on your kubernetes cluster using the kubectl
kubeless
https://github.com/kubeless/kubeless
Kubernetes Native Serverless Framework https://kubeless.io
Automatically sync groups into Kubernetes RBAC @[https://github.com/cruise-automation/rbacsync]
Red Hat Quay 3.1
https://www.zdnet.com/article/red-hat-quay-3-1-a-highly-available-kubernetes-container-registry-arrives/
- Red Hat Quay 3.1, a highly available Kubernetes container registry, arrives
Traefik edge router
https://docs.traefik.io/
Traefik is an open-source Edge Router that makes publishing your
services a fun and easy experience. It receives requests on behalf of
your system and finds out which components are responsible for
handling them.
What sets Traefik apart, besides its many features, is that it
automatically discovers the right configuration for your services.
The magic happens when Traefik inspects your infrastructure, where it
finds relevant information and discovers which service serves which
request.
Traefik is natively compliant with every major cluster technology,
such as Kubernetes, Docker, Docker Swarm, AWS, Mesos, Marathon, and
the list goes on; and can handle many at the same time. (It even
works for legacy software running on bare metal.)
K8s Sec. Policies
- Security Policies: to define & control security aspects of Pods,
use Pod Security Policy (available on v1.15). According to the Kubernetes
documentation, it enables fine-grained authorization of pod
creation and updates. A PodSecurityPolicy defines a set of conditions that a pod
must run with in order to be accepted into the system, as well as defaults for
the related fields (a minimal example follows the list below). They allow an
administrator to control the following:
Running of privileged containers
Usage of host namespaces
Usage of host networking and ports
Usage of volume types
Usage of the host filesystem
Restricting escalation to root privileges
The user and group IDs of the container
AppArmor or seccomp or sysctl profile used by containers
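- A minimal "restricted" PodSecurityPolicy sketch covering the controls listed above
  (field values are illustrative; note that PSP was later deprecated in k8s 1.21 and
  removed in 1.25 in favor of Pod Security admission):

  apiVersion: policy/v1beta1
  kind: PodSecurityPolicy
  metadata:
    name: restricted
  spec:
    privileged: false                 # no privileged containers
    allowPrivilegeEscalation: false   # block escalation to root privileges
    hostNetwork: false                # no host networking/ports
    hostPID: false                    # no host namespaces
    hostIPC: false
    volumes:                          # allowed volume types only
    - configMap
    - secret
    - emptyDir
    - persistentVolumeClaim
    runAsUser:
      rule: MustRunAsNonRoot          # control the user ID of the container
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    fsGroup:
      rule: RunAsAny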
Automated Testing of IaC
https://qconsf.com/system/files/presentation-slides/qconsf2019-yevgeniy-brikman-automated-testing-for-terraform-docker-packer-kubernetes-and-more.pdf
https://www.infoq.com/news/2019/11/automated-testing-infrastructure/
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Microsoft Keda
https://www.infoq.com/news/2019/11/microsoft-keda-1-0-kubernetes/
Microsoft Announces 1.0 Release of Kubernetes-Based Event-Driven Autoscaling (KEDA)
k8s reality check
https://enterprisersproject.com/article/2019/11/kubernetes-reality-check-3-takeaways-kubecon
"Hard Way" datadog
https://www.infoq.com/news/2019/12/kubernetes-hard-way-datadog/
A traditional architecture approach is to have all Kubernetes master
components in one server, and have at least three servers for high
availability. However, these components have different
responsibilities and can’t or don't need to scale in the same way.
For instance, the scheduler and the controller are stateless
components, making them easy to rotate. But the etcd component is
stateful and needs to have redundant copies of the data. Also,
components like the scheduler work with an election mechanism where
only one of their instances is active. Bernaille said that it
doesn’t make sense to scale out the scheduler.
Falco Security
https://www.infoq.com/news/2020/01/falco-security-cncf/
BrewOPA: Admin Policy made easy
https://www.zdnet.com/article/kubernetes-administration-policy-made-easy-with-brewopa/
Kubernetes administration policy made easy with brewOPA
Administering policies across Kubernetes and other cloud native environments
isn't easy. Now, Cyral wants to take the trouble out of the job with brewOPA.
Kured reboot daemon
- Kured is an open-source reboot daemon for Kubernetes.
  It runs as a DaemonSet and monitors each node for the presence of
  a file indicating that a reboot is required.
Harbor Operator
https://www.infoq.com/news/2020/03/harbor-kubernetes-operator/
k8s the hard way
kubernetes-the-hard-way-virtualbox/README.md
https://github.com/sgargel/kubernetes-the-hard-way-virtualbox/blob/master/README.md
WebAssembly meets Kubernetes with Krustlet
https://cloudblogs.microsoft.com/opensource/2020/04/07/announcing-krustlet-kubernetes-rust-kubelet-webassembly-wasm/
App Manager for GCP
New Application Manager Brings GitOps to Google Kubernetes Engine
https://www.infoq.com/news/2020/03/Kubernetes-application-manager/
K8s operators
Other common misunderstandings, according to Deo: That Operators are
set-it-and-forget-it propositions (as I mentioned above), and that
Operators require a lot of heavy-duty development effort. For the
latter, yes, someone has to write the Operator, but it doesn’t
necessarily need to be you, thanks to repositories like
OperatorHub.io. Moreover, toolkits and SDKs like Operator Framework
and Kubebuilder can cut down development effort.
Gadi Naor, founder and CTO at Alcide, gives an example, from a
security perspective, of where an Operator might not be the ideal
fit: When a cron job will do the trick just fine.
By design, Naor explains, Operators can provision, update, and delete
Kubernetes resources - which means they are in-cluster privileged
components. “This means that Operators represent a persistent
in-cluster risk; therefore these are components that we must treat
appropriately as far as our threat and risk modeling,” Naor says.
According to Naor, if a task can be appropriately handled with a cron
job, that may be the better choice.
“This reduces the time a privileged component is running without
compromising on the required functionality,” Naor explains. “The
key questions before choosing an Operator are: Do we really need to
implement certain workflows or functionality as an Operator? Can we
achieve the same using a cron job, or using cluster-external automation?”
Cilium
https://github.com/cilium/cilium
Cilium is open source software for providing and transparently securing network connectivity and load balancing between application workloads, such as application containers or processes.
limits/request by example
Understanding Kubernetes limits and requests by example | Sysdig
https://sysdig.com/blog/kubernetes-limits-requests/
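- A minimal requests/limits sketch (values are arbitrary and the image is a
  placeholder):

  apiVersion: v1
  kind: Pod
  metadata:
    name: limited-app
  spec:
    containers:
    - name: app
      image: nginx               # placeholder image
      resources:
        requests:                # what the scheduler reserves for the container
          cpu: 250m              # 0.25 CPU core
          memory: 256Mi
        limits:                  # hard ceiling enforced at runtime
          cpu: 500m              # throttled above this
          memory: 512Mi          # OOM-killed above this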
Okteto
A Tool to Develop Applications in Kubernetes
https://github.com/okteto/okteto/blob/master/README.md
https://www.infoq.com/news/2020/01/qa-okteto-kubernetes-development/
We created a few guides to get you started with your favorite programming language:
- ASP.NET Core
- Golang
- Java Gradle
- Java Maven
- Node
- PHP
- Python
- Ruby
k8s new features:
https://lwn.net/Articles/806896/
Kogito
https://kogito.kie.org/get-started/
Kogito is a cloud-native business automation technology for building
cloud-ready business applications. The name Kogito derives from the
Latin "Cogito", as in "Cogito, ergo sum" ("I think, therefore I am"),
and is pronounced [ˈkoː.d͡ʒi.to] (KO-jee-to). The letter K has
reference to Kubernetes, the base for OpenShift as the target cloud
platform for Kogito, and to the Knowledge Is Everything (KIE) open
source business automation project from which Kogito originates.
Kogito is designed specifically to excel in a hybrid cloud
environment and to be adaptable to your domain and tooling needs. The
core objective of Kogito is to help you mold a set of business
processes and decisions into your own domain-specific cloud-native
set of services.
Figure 1. Business processes and decisions to cloud services
When you are using Kogito, you are building a cloud-native
application as a set of independent domain-specific services,
collaborating to achieve some business value. The processes and
decisions that you use to describe the target behavior are executed
as part of the services that you create. The resulting services are
highly distributed and scalable with no centralized orchestration
service, and the runtime that your service uses is optimized for what
your service needs.
Kogito includes components that are based on well-known business
automation KIE projects, specifically Drools, jBPM, and OptaPlanner,
to offer dependable, open source solutions for business rules,
business processes, and constraint solving.
The following list describes some of the examples provided with Kogito:
dmn-quarkus-example and dmn-springboot-example: A decision
service (on Quarkus or Spring Boot) that uses DMN to determine driver
penalty and suspension based on traffic violations
rules-quarkus-helloworld: A Hello World decision service on
Quarkus with a single DRL rule unit
ruleunit-quarkus-example and ruleunit-springboot-example: A
decision service (on Quarkus or Spring Boot) that uses DRL with rule
units to validate a loan application and that exposes REST operations
to view application status
process-quarkus-example and process-springboot-example: A process
service (on Quarkus or Spring Boot) for ordering items and that
exposes REST operations to create new orders or to list and delete
active orders
onboarding-example: A combination of a process service and two
decision services that use DMN and DRL for onboarding new employees
kogito-travel-agency: A combination of process services and
decision services that use DRL and XLS for travel booking, intended
for deployment on OpenShift
Cluster Access
@[https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/]
Kustomization Object
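- A minimal kustomization.yaml sketch (deployment.yaml/service.yaml are placeholder
  manifests sitting next to it):

  apiVersion: kustomize.config.k8s.io/v1beta1
  kind: Kustomization
  namespace: my-app           # applied to every rendered resource
  commonLabels:
    app: my-app
  resources:                  # plain manifests to render
  - deployment.yaml
  - service.yaml
  images:                     # override the image tag without touching the manifests
  - name: nginx
    newTag: "1.19"

  $º$ kubectl apply -k . º ← render and apply (kubectl 1.14+); or: kustomize build . | kubectl apply -f -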
Manage Mem./CPU/API Resrc.
- Configure Default Memory Requests and Limits for a Namespace
- Configure Default CPU Requests and Limits for a Namespace
- Configure Minimum and Maximum Memory Constraints for a Namespace
- Configure Minimum and Maximum CPU Constraints for a Namespace
- Configure Memory and CPU Quotas for a Namespace
- Configure a Pod Quota for a Namespace
- Namespaces Walkthrough
- Share a Cluster with Namespaces
Cluster Debug/Troubleshoot
See also:
- @[https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/]
- @[https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/]
RestartPolicy
Debug Pods and ReplicationControllers
Determine the Reason for Pod Failure
Config Pods/Cont.
- Assign Pods to Nodes
- Configure Pod Initialization
- Attach Container Lifecycle Handlers
- Share Process Namespace between Containers in a Pod
- Translate Docker-Compose to k8s Resources
- Pod Priority and Preemption
- Pod Security Policies
- Resource Quotas
Dynamic Volume Provisioning
@[https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/]
Node-specific Volume Limits
@[https://kubernetes.io/docs/concepts/storage/storage-limits/]
k8s+okHTTP Load balancing
https://medium.com/wandera-engineering/kubernetes-network-load-balancing-using-okhttp-client-54a2db8fc668
lightbitslab
Persistent storage for Kubernetes is readily available but persistent
storage that performs like local NVMe flash, not so much.
git-sync
https://github.com/kubernetes/git-sync
git-sync is a simple command that pulls a git repository into a local
directory. It is a perfect "sidecar" container in Kubernetes - it can
periodically pull files down from a repository so that an application
can consume them.
git-sync can pull one time, or on a regular interval. It can pull
from the HEAD of a branch, from a git tag, or from a specific git
hash. It will only re-pull if the target of the run has changed in
the upstream repository. When it re-pulls, it updates the destination
directory atomically. In order to do this, it uses a git worktree in
a subdirectory of the --root and flips a symlink.
git-sync can pull over HTTP(S) (with authentication or not) or SSH.
git-sync can also be configured to make a webhook call upon
successful git repo synchronization. The call is made after the
symlink is updated.
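- A hedged sidecar sketch (the repository is a placeholder; flag names follow
  git-sync v3 and changed in v4, e.g. --ref/--period, so verify the image tag and
  flags against the README for the version you use):

  apiVersion: v1
  kind: Pod
  metadata:
    name: web-with-git-sync
  spec:
    containers:
    - name: web
      image: nginx                                     # serves whatever git-sync pulls
      volumeMounts:
      - name: site-data
        mountPath: /usr/share/nginx/html               # repo appears under the "site" symlink
        readOnly: true
    - name: git-sync                                   # the sidecar
      image: registry.k8s.io/git-sync/git-sync:v3.6.2  # verify image path/tag against the releases
      args:
      - --repo=https://github.com/my-org/my-site.git   # placeholder repository
      - --branch=main
      - --root=/tmp/git                                # checkout root (symlink flipped atomically)
      - --dest=site                                    # symlink name under --root
      - --wait=60                                      # seconds between pulls (v3 flag)
      volumeMounts:
      - name: site-data
        mountPath: /tmp/git
    volumes:
    - name: site-data
      emptyDir: {}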
SideCars
https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/
https://banzaicloud.com/blog/k8s-sidecars/
VMware ingress
https://www.datacenterknowledge.com/vmware/vmware-hands-control-kubernetes-ingress-project-contour-over-cncf
Developing and Debugging locally
- Developing and debugging services locally
Access Applications in a Cluster
- Web UI (Dashboard)
- Accessing Clusters
- Configure Access to Multiple Clusters
- Use Port Forwarding to Access Applications in a Cluster
- Provide Load-Balanced Access to an Application in a Cluster
- Use a Service to Access an Application in a Cluster
- Connect a Front End to a Back End Using a Service
- Create an External Load Balancer
- Configure Your Cloud Provider's Firewalls
- List All Container Images Running in a Cluster
- Communicate Between Containers in the Same Pod Using a Shared Volume
- Configure DNS for a Cluster
Logging apps
- Logging Using Elasticsearch and Kibana
https://kubernetes.io/docs/tasks/debug-application-cluster/logging-elasticsearch-kibana/
- Logging Using Stackdriver
https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/
- Tools for Monitoring Resources: [app resources]
- https://kubernetes.io/docs/tasks/debug-application-cluster/resource-usage-monitoring/
6 Tips for Running Scalable Workloads
@[https://www.infoq.com/articles/tips-running-scalable-workloads-kubernetes]
Q⅋A with K8s...
@[https://www.infoq.com/news/2018/02/dist-system-patterns-burns]
Distributed Systems programming is not for the faint of heart, and despite
the evolution of platforms and tools from COM, CORBA, RMI, Java EE, Web
Services, Services Oriented Architecture (SOA) and so on, it's more of an art
than a science.
Brendan Burns outlined many of the patterns that enables distributed systems
programming in the blog he wrote in 2015. He and David Oppenheimer, both
original contributors for Kubernetes, presented a paper at Usenix based
around design patterns and containers shortly after.
InfoQ caught up with Burns, who recently authored an ebook titled Designing
Distributed Systems, Patterns and Paradigms for Scaleable Microservices. He
talks about distributed systems patterns and how containers enable it.
Atlassian escalator
@[https://www.infoq.com/news/2018/05/atlassian-kubernetes-autoscaler]
In Kubernetes, scaling can mean different things to different users. We
distinguish between two cases:
- Cluster scaling, sometimes called infrastructure-level scaling,
refers to the (automated) process of adding or removing worker nodes
based on cluster utilization.
- Application-level scaling, sometimes called pod scaling, refers to
the (automated) process of manipulating pod characteristics based on
a variety of metrics, from low-level signals such as CPU utilization
to higher-level ones, such as HTTP requests served per second, for a
given pod. Two kinds of pod-level scalers exist:
- Horizontal Pod Autoscalers (HPAs), which increase or decrease the
number of pod replicas depending on certain metrics.
- Vertical Pod Autoscalers (VPAs), which increase or decrease the
resource requirements of containers running in a pod.
- Atlassian released their in-house OSS Escalator tool providing
configuration-driven preemptive scale-up and faster scale-down for
k8s nodes.
@[https://developers.atlassian.com/blog/2018/05/introducing-escalator/]
@[https://github.com/atlassian/escalator/]
- Kubernetes has two autoscalers:
-ºhorizontal pod autoscalerº:
@[https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/]
Bºpods can scale down very quicklyº.
-ºcluster autoscalerº to scales the compute infrastructure itself.
@[https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler]
Understandably, it takes a longer time to scale up and down since
VMs must be created/recycled.
RºDelays in cluster autoscaler translate to delays in the pod autoscalerº.
- Atlassian's problem was very specific to batch workloads, with a low tolerance
  for delay in scaling up and down, which forced them to write their own
  autoscaling functionality on top of Kubernetes to solve these problems.
- "Escalator" configurable thresholds for upper and lower capacity of
the compute VMs.
@[https://github.com/atlassian/escalator/blob/master/docs/configuration/advanced-configuration.md]
- Some of the configuration properties work by modifying a Kubernetes
feature called º"taint"º:
@[https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/]
- A VM node can be ‘tainted’ (marked) with a certain value so
that pods with a related marker are not scheduled onto it. Unused
nodes would be brought down faster by the Kubernetes standard cluster
autoscaler when they are marked. The scale-up configuration parameter
is a threshold expressed as a percentage of utilization, usually less
than 100 so that there is a buffer. Escalator autoscales the compute
VMs when utilization reaches the threshold, thus making room for
containers that might come up later, and allowing them to boot up
fast.
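- A minimal taint/toleration sketch for the mechanism described above (node name,
  taint key and values are placeholders):

  # Taint a node so that only pods tolerating the taint get scheduled there, e.g.:
  #   kubectl taint nodes node1 dedicated=batch:NoSchedule
  # A pod that is allowed onto such a node carries a matching toleration:
  apiVersion: v1
  kind: Pod
  metadata:
    name: batch-worker
  spec:
    tolerations:
    - key: dedicated        # must match the taint key ...
      operator: Equal
      value: batch          # ... and value
      effect: NoSchedule    # ... and the taint effect
    containers:
    - name: worker
      image: busybox        # placeholder workload
      command: ["sleep", "3600"]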
ArgoCD
@[https://argoproj.github.io/argo-cd/]
Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes.
Application definitions, configurations, and environments should be declarative
and version controlled. Application deployment and lifecycle management should
be automated, auditable, and easy to understand.
Multi-Zone Clusters:
@[https://kubernetes.io/docs/setup/best-practices/multiple-zones/]
See also:
Kubernetes 1.12 introduces topology-aware dynamic provisioning beta,
which aims to improve the regional cluster experience for stateful
workloads. It means Kubernetes now understands the inherent zonal
restrictions of Compute Engine Persistent Disks (PDs) and Regional
PD, and provisions them in the zone that is best suited to run the
pod. Another addition to topology is the Container Storage Interface
(CSI) plugin, which is intended to make it easier for third party
developers to write and deploy volume plugins exposing new storage
systems in Kubernetes.
Kubelet TLS Bootstrap
(k8s 1.12+)
@[https://github.com/kubernetes/enhancements/issues/43]
kubelet generates a private key and a CSR for submission to a
cluster-level certificate signing process.
Volume Snapshot,Restore
- Volume Snapshot/Restore: k8s 1.12+
@[https://kubernetes.io/docs/concepts/storage/persistent-volumes/#volume-snapshot-and-restore-volume-from-snapshot-support]
- TODO: Storage/Persistence: @ma
@[https://kubernetes.io/docs/concepts/storage/volume-pvc-datasource/]
@[https://kubernetes.io/docs/concepts/storage/storage-classes/]
@[https://kubernetes.io/docs/concepts/storage/volume-snapshot-classes/]
@[https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/]
@[https://kubernetes.io/docs/concepts/storage/storage-limits/]
What's New 1.12
@[https://www.infoq.com/news/2018/10/kubernetes-1-12]
- Kubernetes 1.12 brings:
  - Horizontal Pod Autoscaler:
    @[https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/]
    The Horizontal Pod Autoscaler automatically scales the number of pods in a
    replication controller, deployment, replica set or stateful set based on
    observed CPU utilization (or, with custom metrics support, on some other
    application-provided metrics). Note that Horizontal Pod Autoscaling does not
    apply to objects that can't be scaled, for example DaemonSets.
  - Support for the Container Storage Interface (CSI) plugin:
    @[https://kubernetes.io/blog/2018/04/10/container-storage-interface-beta/]
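- A minimal HorizontalPodAutoscaler sketch against a placeholder Deployment
  (autoscaling/v1, CPU-based only):

  apiVersion: autoscaling/v1
  kind: HorizontalPodAutoscaler
  metadata:
    name: my-app
  spec:
    scaleTargetRef:                    # what to scale
      apiVersion: apps/v1
      kind: Deployment
      name: my-app                     # placeholder Deployment
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70 # scale out when average CPU crosses 70%

  $º$ kubectl autoscale deployment my-app --min=2 --max=10 --cpu-percent=70 º ← equivalent imperative form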
k8s on a Pine64 ARM (4GB RAM)
https://itnext.io/building-an-arm-kubernetes-cluster-ef31032636f9
about Local PV
Speaking about Local PV:
"""...Thanks to the Kubernetes scheduler’s intelligent handling of volume topology,
M3DB is able to programmatically evenly disperse its replicas across multiple
local persistent volumes in all available cloud zones, or, in the case of
on-prem clusters, across all available server racks."""
Monitoring .NET Apps
@[https://developers.redhat.com/blog/2020/08/05/monitoring-net-core-applications-on-kubernetes/]
JKube: Deploying Java Apps
Deploy your Java web application into the cloud using Eclipse JKube
@[https://developers.redhat.com/blog/2020/07/27/deploy-your-java-web-application-into-the-cloud-using-eclipse-jkube/]
Best Practices @ma
@[https://dzone.com/articles/best-practices-for-advanced-deployment-patterns]
(Rolling, Blue/Green, Canary, BigBan,...)
5 things to remember
Managing Kubernetes resources: 5 things to remember | The Enterprisers Project
https://enterprisersproject.com/article/2020/8/managing-kubernetes-resources-5-things-remember
flagger: Progressive k8s operator
@[https://github.com/weaveworks/flagger ]
weaveworks/flagger: Progressive delivery Kubernetes operator
Canary → A/B Testing → Blue/Green deployments
Flagger is a progressive delivery tool that automates the release
process for applications running on Kubernetes. It reduces the risk of
introducing a new software version in production by gradually shifting
traffic to the new version while measuring metrics and running
conformance tests.
AWS Controllers k8s
Amazon Announces the Preview of AWS Controllers for Kubernetes (ACK)
https://www.infoq.com/news/2020/09/aws-controllers-k8s-preview
TLDR
@[https://github.com/tldr-pages/tldr/tree/master/pages/common]
kubervisor
https://github.com/AmadeusITGroup/kubervisor
AmadeusITGroup/kubervisor: The Kubervisor allows you to control which
pods should receive traffic or not based on anomaly detection.
It is a new kind of health check system.
Canary Automation:
@[https://github.com/AmadeusITGroup/kanary]
The goal of Kanary project is to bring full automation of Canary
Deployment to kubernetes. So far only rolling update is automated and
fully integrated to Kubernetes. Using CRD (Custom Resource
Definition) and an associated controller, this project will allow you
to define your Canary deployment and chain it with a classic rolling
update in case of success.
All components of a typical k8s app!!
@[https://medium.com/geekculture/how-to-deploy-spring-boot-and-mongodb-to-kubernetes-minikube-71c92c273d5e]
Namespace How-to summary
- scope for names
- can be assigned max Memory/CPU quotas
  (see the ResourceQuota sketch below)
- useful with many users/teams/projects
  (not needed otherwise)
- prefer labels over extra namespaces to distinguish
  slightly different resources (e.g. versions of the same app)
- Services are created with the DNS entry
  "service-name"."namespace".svc.cluster.local
$º$ kubectl get namespaces º
NAME STATUS AGE
default Active 1d
kube-system Active 1d
$º$ kubectl create namespace namespc01 º
·
$º$ kubectl --namespace=namespc01 run \º ← Run on given N.S.
$º nginx --image=nginx º
$º$ kubectl config set-context \ º ← permanently save NS
$º $(kubectl config current-context) \º ← $(...) bash syntax
$º --namespace=MyFavouriteNS º
$º$ kubectl config view | \ º
$º grep namespace: # Validate º
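A hedged example of the Memory/CPU quota mentioned above, applied to the
namespace created before (values are purely illustrative):

  apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: namespc01-quota          # hypothetical name
    namespace: namespc01
  spec:
    hard:
      requests.cpu: "4"            # sum of CPU requests allowed in the namespace
      requests.memory: 8Gi
      limits.cpu: "8"
      limits.memory: 16Gi
      pods: "20"                   # max number of pods

$º$ kubectl apply -f quota.yaml º
$º$ kubectl describe quota -n namespc01 º ← check usage vs. limits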
About helm-charts pod templates
https://helm.sh/docs/chart_best_practices/pods/
"...A container image should use a fixed tag or the SHA of the image. It
should not use the tags latest, head, canary, or other tags that are
designed to be "floating"..."
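In a chart that usually means pinning the tag (or digest) through values; a
sketch with a hypothetical repository and tag:

  # values.yaml
  image:
    repository: registry.example.com/web
    tag: "1.2.3"                   # fixed tag, never "latest"

  # templates/deployment.yaml (fragment)
  #       image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"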
Knative
https://cloud.google.com/knative/
Kubernetes-based platform to build, deploy, and manage modern serverless workloads.
Essential base primitives for serverless workloads:
- set of middleware components to build source-centric, container-based apps
  that can run anywhere (on premises or in the cloud), applying best practices
  shared by successful real-world Kubernetes-based frameworks.
- lets developers focus on writing interesting code, without worrying          [low_code]
  about the "boring but difficult" parts of building, deploying, and managing
  an application.
- solves mundane but difficult tasks such as orchestrating source-to-container
  workflows, routing and managing traffic during deployment, auto-scaling
  workloads, or binding running services to eventing ecosystems.
- supports common development patterns such as GitOps, DockerOps, ManualOps, ...
  as well as frameworks like Django, Ruby on Rails, Spring, ...
- Operator-friendly
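A minimal Knative Service sketch (image is a placeholder); Knative Serving then
handles revisions, routing and scale-to-zero for it:

  apiVersion: serving.knative.dev/v1
  kind: Service
  metadata:
    name: hello                    # hypothetical name
  spec:
    template:
      spec:
        containers:
        - image: registry.example.com/hello:1.0.0   # hypothetical image
          env:
          - name: TARGET
            value: "world"

$º$ kubectl get ksvc hello º ← shows the URL once the revision is ready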
kubectl recipes
$º$ kubectl get events --all-namespaces º
NAMESPACE LAST SEEN TYPE REASON OBJECT MESSAGE
default 29d Normal NodeNotReady node/node1 Node node1 status is now: NodeNotReady
default 7d4h Normal Starting node/node1 Starting kubelet.
default 7d4h Normal NodeHasSufficientMemory node/node1 Node node1 status is now: NodeHasSufficientMemory
default 7d4h Normal NodeHasNoDiskPressure node/node1 Node node1 status is now: NodeHasNoDiskPressure
...
kube-system 7d4h Normal Pulled pod/calico-kube-controllers-856d44f6cc-cx2c9 Container image "docker.io/calico/kube-controllers:v3.15.1" already present on machine
kube-system 7d4h Normal Created pod/calico-kube-controllers-856d44f6cc-cx2c9 Created container calico-kube-controllers
kube-system 7d4h Normal Started pod/calico-kube-controllers-856d44f6cc-cx2c9 Started container calico-kube-controllers
...
kube-system 7d4h RºWarningº FailedCreatePodSandBox pod/dns-autoscaler-66498f5c5f-dn458 Failed to create pod sandbox: rpc error: code ...
...
kube-system 7d4h Normal LeaderElection endpoints/kube-controller-manager node1_e4a58997-c39f-430d-a942-31d53124c5d5 became leader
kube-system 7d4h Normal LeaderElection lease/kube-controller-manager node1_e4a58997-c39f-430d-a942-31d53124c5d5 became leader
...
kube-system 34m RºWarningº FailedMount pod/kubernetes-dashboard-57777fbdcb-6wrk7 MountVolume.SetUp failed for volume "kubernetes-dashboard-certs"...
$º$ kubectl get endpoints --all-namespaces º
NAMESPACE NAME ENDPOINTS AGE
default kubernetes 192.168.1.2:6443 202d
kube-system coredns 10.233.90.23:53,10.233.90.23:9153,10.233.90.23:53 202d
kube-system dashboard-metrics-scraper 10.233.90.24:8000 202d
kube-system kube-controller-manager ˂none˃ 202d
kube-system kube-scheduler ˂none˃ 202d
kube-system kubernetes-dashboard 10.233.90.22:8443 202d
$º$ kubectl get deployments --all-namespaces º
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system calico-kube-controllers 1/1 1 1 202d
kube-system coredns 1/2 2 1 202d
kube-system dns-autoscaler 1/1 1 1 202d
kube-system kubernetes-dashboard 1/1 1 1 202d
kube-system kubernetes-metrics-scraper 1/1 1 1 202d