A few months ago I installed Prometheus.
When I came back to use it after a long break, several pods were stuck in ImagePullBackOff.
    controller-0:~/Jane# kubectl get po -n jane-infra-monitoring
    NAME                                                      READY   STATUS             RESTARTS   AGE
    alertmanager-jane-prometheus-kube-promet-alertmanager-0   2/2     Running            0          24m
    jane-prometheus-grafana-5d7d5b55dd-hrdcn                  0/3     ImagePullBackOff   0          4m59s
    jane-prometheus-grafana-66b98448b8-ddb7d                  3/3     Running            0          23m
    jane-prometheus-kube-promet-operator-5579fff8bf-rbtcl     1/1     Running            0          23m
    jane-prometheus-kube-promet-operator-6c84f4b45c-f2p89     0/1     ImagePullBackOff   0          4m59s
    jane-prometheus-kube-state-metrics-59c698f9d6-zjl6f       1/1     Running            0          20m
    jane-prometheus-prometheus-node-exporter-rbmgx            1/1     Running            0          10d
    jane-prometheus-prometheus-node-exporter-szcnj            1/1     Running            0          157m
    jane-prometheus-prometheus-node-exporter-twlwt            1/1     Running            0          24d
    prometheus-jane-prometheus-kube-promet-prometheus-0       0/2     CrashLoopBackOff   0          4m51s
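In a long pod list, it can help to filter out everything that is healthy. The following is a minimal sketch, assuming the default `kubectl get po` column layout where STATUS is the third column. `not_running_pods` is a hypothetical helper name, not a kubectl feature.

```shell
# Hypothetical helper: read `kubectl get po` output on stdin and print
# the name and status of every row whose STATUS (3rd field) is not "Running".
# Assumes the default column layout: NAME READY STATUS RESTARTS AGE.
not_running_pods() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Usage (namespace taken from this post):
#   kubectl get po -n jane-infra-monitoring | not_running_pods
```

Note that this simple filter also lists pods in states such as Completed, so it is only a first pass.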
Describing one of the pods showed that the image could not be found.
    controller-0:~/Jane# kubectl describe po -n jane-infra-monitoring jane-prometheus-grafana-5d7d5b55dd-hrdcn
    Warning  Failed  89s (x3 over 2m9s)  kubelet, controller-1  Failed to pull image "registry.infra.jane.cluster.local:19092/quay.io/prometheus/prometheus:v2.33.1": rpc error: code = NotFound desc = failed to pull and unpack image "registry.infra.jane.cluster.local:19092/quay.io/prometheus/prometheus:v2.33.1": failed to resolve reference "registry.infra.jane.cluster.local:19092/quay.io/prometheus/prometheus:v2.33.1": registry.infra.jane.cluster.local:19092/quay.io/prometheus/prometheus:v2.33.1: not found
So I pulled the image from a different registry and updated the image tag in the Deployment to match.
(Related post: https://countrymouse.tistory.com/entry/docker )
However, the last pod in the list (prometheus-jane-prometheus-kube-promet-prometheus) was managed by a StatefulSet, not a Deployment.
So I edited that as well:
    controller-0:~$ kubectl edit statefulsets.apps -n jane-infra-monitoring prometheus-jane-prometheus-kube-promet-prometheus
    statefulset.apps/prometheus-jane-prometheus-kube-promet-prometheus edited
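No luck. As soon as I saved the edit, the image tag reverted to its previous value.
So the pod never came up.
Solution
In the end, the fix was to change the tag directly in the CR (custom resource). The StatefulSet is generated from the Prometheus CR by the Prometheus Operator, so any manual edit to the StatefulSet is overwritten the next time the operator reconciles it.
1. Check the StatefulSet's owner: Prometheus, jane-prometheus-kube-promet-prometheus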
    controller-0:~# kubectl get sts -n jane-infra-monitoring prometheus-jane-prometheus-kube-promet-prometheus -oyaml
      ownerReferences:
      - apiVersion: monitoring.coreos.com/v1
        blockOwnerDeletion: true
        controller: true
        kind: Prometheus
        name: jane-prometheus-kube-promet-prometheus
        uid: fb999cb7-55e2-4756-8f74-eeeba6e943c1
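Instead of scanning the full YAML, the owner can be pulled out with a jsonpath query. This is a rough sketch, assuming the controlling owner is the first entry in `ownerReferences` (as in the output above). `sts_owner` is a hypothetical helper name.

```shell
# Hypothetical helper: print "<kind>/<name>" of the first ownerReference
# of a StatefulSet. Arguments: $1 = namespace, $2 = StatefulSet name.
# Assumption: the controlling owner is the first (index 0) entry.
sts_owner() {
  kubectl get sts -n "$1" "$2" \
    -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}'
}

# Usage:
#   sts_owner jane-infra-monitoring prometheus-jane-prometheus-kube-promet-prometheus
```

If the owner kind is a custom resource such as Prometheus, that resource is the one to edit, not the StatefulSet.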
2. Edit the CR: change the image tag
    # kubectl edit Prometheus -n jane-infra-monitoring
    ...
    spec:
      alerting:
        alertmanagers:
        - apiVersion: v2
          name: jane-prometheus-kube-promet-alertmanager
          namespace: jane-infra-monitoring
          pathPrefix: /
          port: http-web
      enableAdminAPI: false
      externalUrl: http://jane-prometheus-kube-promet-prometheus.jane-infra-monitoring:9090
      image: registry.infra.jane.cluster.local:19092/quay.io/prometheus/prometheus:v2.22.1
      imagePullSecrets:
      - name: registrykey
      listenLocal: false
      logFormat: logfmt
      logLevel: info
      paused: false
      podMonitorNamespaceSelector: {}
      podMonitorSelector:
        matchLabels:
          release: jane-prometheus
      portName: http-web
      probeNamespaceSelector: {}
      probeSelector:
        matchLabels:
          release: jane-prometheus
      replicas: 1
      retention: 10d
      routePrefix: /
      ruleNamespaceSelector: {}
      ruleSelector:
        matchLabels:
          release: jane-prometheus
      securityContext:
        fsGroup: 2000
        runAsGroup: 2000
        runAsNonRoot: true
        runAsUser: 1000
      serviceAccountName: jane-prometheus-kube-promet-prometheus
      serviceMonitorNamespaceSelector: {}
      serviceMonitorSelector:
        matchLabels:
          release: jane-prometheus
      shards: 1
      version: v2.22.1   # change here
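The same change can probably be made non-interactively with a JSON merge patch instead of `kubectl edit`. This is a sketch under assumptions: `set_prometheus_image` is a hypothetical helper, and it assumes `spec.version` should equal the image tag, as it does in the CR above.

```shell
# Hypothetical helper: set spec.image and spec.version on a Prometheus CR
# using a JSON merge patch.
# Arguments: $1 = namespace, $2 = CR name, $3 = image repository, $4 = tag.
# Assumption: spec.version is kept identical to the image tag.
set_prometheus_image() {
  kubectl patch prometheus -n "$1" "$2" --type merge \
    -p "$(printf '{"spec":{"image":"%s:%s","version":"%s"}}' "$3" "$4" "$4")"
}

# Usage (values taken from this post):
#   set_prometheus_image jane-infra-monitoring jane-prometheus-kube-promet-prometheus \
#     registry.infra.jane.cluster.local:19092/quay.io/prometheus/prometheus v2.22.1
```

Because the operator regenerates the StatefulSet from the CR, a change made this way should persist where the direct StatefulSet edit did not.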