Switchover and Switchback of CloudNativePG Replica Clusters in a Distributed Topology (K8s) – Part 2
Swapnil Suryawanshi · May 26, 2026
This runbook details the operational procedure for performing a controlled switchover and switchback between two EDB Postgres Advanced 18 clusters managed by CloudNativePG (CNPG). It is Part 2 of a series. If you have not yet set up the distributed topology, start with Part 1: Deploying CloudNativePG PostgreSQL Replica Clusters with the Barman Cloud Plugin.
Objective
The objective is to safely rotate the primary role between the primary cluster (cluster-primary) and the replica cluster (cluster-replica) within a distributed topology. The procedure uses CNPG's native promotion and demotion workflow to guarantee zero data loss.
Environment Details
- Operator version: 1.27.0
- Database: EDB Postgres Advanced 18
- Backup: Barman Cloud Plugin
- Primary cluster: cluster-primary (namespace: primary)
- Replica cluster: cluster-replica (namespace: replica)
Phase 1: Initial Switchover (Primary → Replica)
Step 1: Check the initial state of both clusters
Before starting, verify the health and the LSN of both the primary and the replica cluster.
Primary:
user% kubectl cnp status cluster-primary -n primary
Cluster Summary
Name primary/cluster-primary
System ID: 7611448666720448538
PostgreSQL Image: docker.enterprisedb.com/k8s/edb-postgres-advanced:18-standard-ubi9
Primary instance: cluster-primary-1
Primary promotion time: 2026-02-27 07:48:28 +0000 UTC (310h54m54s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Size: 185M
Current Write LSN: 0/9000060 (Timeline: 1 - WAL File: 000000010000000000000009)
Continuous Backup status (Barman Cloud Plugin)
ObjectStore / Server name: s3-store/cluster-primary
First Point of Recoverability: 2026-02-27 13:20:08 IST
Last Successful Backup: 2026-02-27 13:20:08 IST
Last Failed Backup: -
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 000000010000000000000008 @ 2026-02-27T07:54:14.417091Z
Last Failed WAL: -
Streaming Replication status
Replication Slots Enabled
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority Replication Slot
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- ------------- ----------------
cluster-primary-2 0/9000060 0/9000060 0/9000060 0/9000060 00:00:00 00:00:00 00:00:00 streaming async 0 active
cluster-primary-3 0/9000060 0/9000060 0/9000060 0/9000060 00:00:00 00:00:00 00:00:00 streaming async 0 active
Instances status
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
cluster-primary-1 0/9000060 Primary OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-primary-2 0/9000060 Standby (async) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-primary-3 0/9000060 Standby (async) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
Plugins status
Name Version Status Reported Operator Capabilities
---- ------- ------ ------------------------------
barman-cloud.cloudnative-pg.io 0.11.0 N/A Reconciler Hooks, Lifecycle Service
Replica:
user% kubectl cnp status cluster-replica -n replica
Replica Cluster Summary
Name replica/cluster-replica
System ID: 7611448666720448538
PostgreSQL Image: docker.enterprisedb.com/k8s/edb-postgres-advanced:18-standard-ubi9
Designated primary: cluster-replica-1
Source cluster: cluster-primary
Primary promotion time: 2026-02-27 07:52:58 +0000 UTC (310h50m34s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Size: 104M
Continuous Backup status (Barman Cloud Plugin)
ObjectStore / Server name: s3-store/cluster-replica
First Point of Recoverability: -
Last Successful Backup: -
Last Failed Backup: -
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 000000010000000000000008 @ 2026-02-27T07:54:22.112507Z
Last Failed WAL: -
Instances status
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
cluster-replica-1 0/9000000 Designated primary OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-replica-2 0/9000000 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-replica-3 0/9000000 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
Plugins status
Name Version Status Reported Operator Capabilities
---- ------- ------ ------------------------------
barman-cloud.cloudnative-pg.io 0.11.0 N/A Reconciler Hooks, Lifecycle Service
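When comparing the LSNs reported for the two clusters, it can help to express the gap in bytes. The following is a minimal sketch, not part of the original runbook: the helper names lsn_to_bytes and lsn_diff are hypothetical, and the sample values are the primary's Current Write LSN and the replica's Current LSN from the status output in this step.

```shell
#!/bin/sh
# Convert a PostgreSQL LSN such as 0/9000060 into an absolute byte position.
# The part before the slash holds the high 32 bits, the part after it the low 32 bits.
lsn_to_bytes() {
    hi=${1%%/*}
    lo=${1#*/}
    echo $(( 0x$hi * 4294967296 + 0x$lo ))
}

# Byte distance between two LSNs (first minus second).
lsn_diff() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# Primary write LSN vs. replica LSN from the status output: prints 96
lsn_diff 0/9000060 0/9000000
```

A small positive difference such as this one only indicates that the replica cluster trails the primary by a few WAL records; what counts as acceptable before a switchover depends on your own workload.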
Step 2: Demote the primary cluster
Update the replica section of the primary cluster, then retrieve the demotion token.
Primary cluster configuration change:
Before:
replica:
  primary: cluster-primary
  source: cluster-primary
After:
replica:
  primary: cluster-replica
  source: cluster-replica
Retrieve the token and check the status:
user% kubectl get cluster cluster-primary -n primary \
-o jsonpath='{.status.demotionToken}'
eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjEiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAxMDAwMDAwMDAwMDAwMDAwQSIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTE0NDg2NjY3MjA0NDg1MzgiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC9BMDAwMDI4IiwidGltZU9mTGF0ZXN0Q2hlY2twb2ludCI6IlRodSBNYXIgMTIgMDY6NTA6NTEgMjAyNiIsIm9wZXJhdG9yVmVyc2lvbiI6IjEuMjcuMCJ9
user% kubectl cnp status cluster-primary -n primary
Replica Cluster Summary
Name primary/cluster-primary
System ID: 7611448666720448538
PostgreSQL Image: docker.enterprisedb.com/k8s/edb-postgres-advanced:18-standard-ubi9
Designated primary: cluster-primary-1
Source cluster: cluster-replica
Primary promotion time: 2026-03-12 06:50:52 +0000 UTC (1m25s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Size: 200M
Demotion token
Token eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjEiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAxMDAwMDAwMDAwMDAwMDAwQSIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTE0NDg2NjY3MjA0NDg1MzgiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC9BMDAwMDI4IiwidGltZU9mTGF0ZXN0Q2hlY2twb2ludCI6IlRodSBNYXIgMTIgMDY6NTA6NTEgMjAyNiIsIm9wZXJhdG9yVmVyc2lvbiI6IjEuMjcuMCJ9
Validity valid
Latest checkpoint's TimeLineID 1
Latest checkpoint's REDO WAL file 00000001000000000000000A
Latest checkpoint's REDO location 0/A000028
Database system identifier 7611448666720448538 (ok)
Time of latest checkpoint Thu Mar 12 06:50:51 2026
Version of the operator 1.27.0
Continuous Backup status (Barman Cloud Plugin)
ObjectStore / Server name: s3-store/cluster-primary
First Point of Recoverability: 2026-02-27 13:20:08 IST
Last Successful Backup: 2026-02-27 13:20:08 IST
Last Failed Backup: -
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 00000001000000000000000A @ 2026-03-12T06:52:15.544285Z
Last Failed WAL: -
Instances status
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
cluster-primary-1 0/A0000A0 Designated primary OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-primary-2 0/A0000A0 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-primary-3 0/A0000A0 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
Plugins status
Name Version Status Reported Operator Capabilities
---- ------- ------ ------------------------------
barman-cloud.cloudnative-pg.io 0.11.0 N/A Reconciler Hooks, Lifecycle Service
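The demotion token is base64-encoded JSON, and the fields listed under "Demotion token" in the status output are its decoded contents. As a rough sketch, the token can be inspected locally before it is pasted into another cluster's configuration. The helper name decode_token is hypothetical, and the example assumes a base64 command that accepts -d (as GNU coreutils does).

```shell
#!/bin/sh
# Decode a CNPG demotion token (base64-encoded JSON) so its fields can be inspected.
decode_token() {
    printf '%s' "$1" | base64 -d
}

# Token copied from the Step 2 output.
token='eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjEiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAxMDAwMDAwMDAwMDAwMDAwQSIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTE0NDg2NjY3MjA0NDg1MzgiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC9BMDAwMDI4IiwidGltZU9mTGF0ZXN0Q2hlY2twb2ludCI6IlRodSBNYXIgMTIgMDY6NTA6NTEgMjAyNiIsIm9wZXJhdG9yVmVyc2lvbiI6IjEuMjcuMCJ9'

# Print the JSON document followed by a newline.
decode_token "$token"
echo
```

Comparing the decoded databaseSystemIdentifier and redoWalFile with the values shown by kubectl cnp status is a quick way to confirm the token was copied intact.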
Step 3: Promote the replica cluster
Apply the demotion token obtained in Step 2 as the promotionToken in the replica cluster configuration.
Replica cluster configuration change:
Before:
replica:
  primary: cluster-primary
  source: cluster-primary
After:
replica:
  primary: cluster-replica
  promotionToken: eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjEiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAxMDAwMDAwMDAwMDAwMDAwQSIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTE0NDg2NjY3MjA0NDg1MzgiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC9BMDAwMDI4IiwidGltZU9mTGF0ZXN0Q2hlY2twb2ludCI6IlRodSBNYXIgMTIgMDY6NTA6NTEgMjAyNiIsIm9wZXJhdG9yVmVyc2lvbiI6IjEuMjcuMCJ9
  source: cluster-primary
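The runbook does not say how the configuration change is applied (editing the manifest and re-applying it works). One possible alternative, shown here only as a sketch, is a JSON merge patch against spec.replica. The helper name promotion_patch and the placeholder token are assumptions for illustration; the kubectl patch invocation is left as a comment so the snippet has no side effects.

```shell
#!/bin/sh
# Print a JSON merge patch for spec.replica that promotes a replica cluster.
# Arguments: <name of the cluster becoming primary> <source cluster> <promotion token>
promotion_patch() {
    printf '{"spec":{"replica":{"primary":"%s","source":"%s","promotionToken":"%s"}}}\n' \
        "$1" "$2" "$3"
}

# Example with a placeholder; substitute the real demotion token from Step 2.
promotion_patch cluster-replica cluster-primary '<DEMOTION_TOKEN>'

# The patch could then be applied with, for example:
#   kubectl patch cluster cluster-replica -n replica --type merge \
#     -p "$(promotion_patch cluster-replica cluster-primary "$TOKEN")"
```

Because the token consists only of base64 characters, it can be embedded in the JSON string without further escaping.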
Step 4: Check the state after the first switchover and promotion
Verify that the roles have been swapped correctly.
A] The replica cluster has been promoted to primary:
user% kubectl cnp status cluster-replica -n replica
Cluster Summary
Name replica/cluster-replica
System ID: 7611448666720448538
PostgreSQL Image: docker.enterprisedb.com/k8s/edb-postgres-advanced:18-standard-ubi9
Primary instance: cluster-replica-1
Primary promotion time: 2026-02-27 07:52:58 +0000 UTC (311h6m25s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Size: 169M
Current Write LSN: 0/B006E90 (Timeline: 2 - WAL File: 00000002000000000000000B)
Continuous Backup status (Barman Cloud Plugin)
ObjectStore / Server name: s3-store/cluster-replica
First Point of Recoverability: -
Last Successful Backup: -
Last Failed Backup: -
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 00000002000000000000000A @ 2026-03-12T06:57:33.229487Z
Last Failed WAL: -
Streaming Replication status
Replication Slots Enabled
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority Replication Slot
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- ------------- ----------------
cluster-replica-2 0/B006E90 0/B006E90 0/B006E90 0/B006E90 00:00:00 00:00:00 00:00:00 streaming async 0 active
cluster-replica-3 0/B006E90 0/B006E90 0/B006E90 0/B006E90 00:00:00 00:00:00 00:00:00 streaming async 0 active
Instances status
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
cluster-replica-1 0/B006E90 Primary OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-replica-2 0/B006E90 Standby (async) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-replica-3 0/B006E90 Standby (async) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
Plugins status
Name Version Status Reported Operator Capabilities
---- ------- ------ ------------------------------
barman-cloud.cloudnative-pg.io 0.11.0 N/A Reconciler Hooks, Lifecycle Service
B] The primary cluster has been demoted to replica:
user% kubectl cnp status cluster-primary -n primary
Replica Cluster Summary
Name primary/cluster-primary
System ID: 7611448666720448538
PostgreSQL Image: docker.enterprisedb.com/k8s/edb-postgres-advanced:18-standard-ubi9
Designated primary: cluster-primary-1
Source cluster: cluster-replica
Primary promotion time: 2026-03-12 06:50:52 +0000 UTC (9m32s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Size: 216M
Demotion token
Token eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjEiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAxMDAwMDAwMDAwMDAwMDAwQSIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTE0NDg2NjY3MjA0NDg1MzgiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC9BMDAwMDI4IiwidGltZU9mTGF0ZXN0Q2hlY2twb2ludCI6IlRodSBNYXIgMTIgMDY6NTA6NTEgMjAyNiIsIm9wZXJhdG9yVmVyc2lvbiI6IjEuMjcuMCJ9
Validity valid
Latest checkpoint's TimeLineID 1
Latest checkpoint's REDO WAL file 00000001000000000000000A
Latest checkpoint's REDO location 0/A000028
Database system identifier 7611448666720448538 (ok)
Time of latest checkpoint Thu Mar 12 06:50:51 2026
Version of the operator 1.27.0
Continuous Backup status (Barman Cloud Plugin)
ObjectStore / Server name: s3-store/cluster-primary
First Point of Recoverability: 2026-02-27 13:20:08 IST
Last Successful Backup: 2026-02-27 13:20:08 IST
Last Failed Backup: -
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 00000002000000000000000A @ 2026-03-12T06:57:39.660756Z
Last Failed WAL: -
Instances status
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
cluster-primary-1 0/B000000 Designated primary OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-primary-2 0/B000000 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-primary-3 0/B000000 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
Plugins status
Name Version Status Reported Operator Capabilities
---- ------- ------ ------------------------------
barman-cloud.cloudnative-pg.io 0.11.0 N/A Reconciler Hooks, Lifecycle Service
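One sign that the promotion happened is the timeline change visible in the WAL file names: the first eight hexadecimal digits of a WAL segment name encode the timeline ID. The sketch below extracts that value; the helper name wal_timeline is hypothetical, and the sample names are the WAL files reported before the switchover (Step 1) and after the promotion (this step).

```shell
#!/bin/sh
# Extract the timeline ID from a 24-character WAL segment file name.
# The first 8 hexadecimal digits of the name encode the timeline.
wal_timeline() {
    tli=$(printf '%s' "$1" | cut -c1-8)
    echo $(( 0x$tli ))
}

wal_timeline 000000010000000000000009   # before the switchover: prints 1
wal_timeline 00000002000000000000000B   # after the promotion: prints 2
```

In this run the timeline moves from 1 to 2, which matches the "Timeline: 2" shown in the promoted cluster's Current Write LSN line.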
Step 5: Verify replication
Create a table on the new primary, then check the replica. In this run the table had not yet appeared on the replica at this point; it is visible after the backup, in Step 7.
Promoted primary (cluster-replica):
user% kubectl cnp psql cluster-replica -n replica
psql (18.1.0)
Type "help" for help.
postgres=# \dt
Did not find any tables.
postgres=# create table test (id int);
CREATE TABLE
postgres=# insert into test values (1);
INSERT 0 1
postgres=# select * from test;
id
----
1
(1 row)
Demoted replica (cluster-primary):
user% kubectl cnp psql cluster-primary -n primary
psql (18.1.0)
Type "help" for help.
postgres=# \dt
Did not find any tables.
Step 6: Take a new backup of the current primary
user% kubectl cnp backup cluster-replica -n replica --method=plugin --plugin-name=barman-cloud.cloudnative-pg.io
backup/cluster-replica-20260312133305 created
user% kubectl get backup -n replica
NAME AGE CLUSTER METHOD PHASE ERROR
cluster-replica-20260312133305 58s cluster-replica plugin completed
Step 7: Verify replication after the backup
user% kubectl cnp psql cluster-primary -n primary
psql (18.1.0)
Type "help" for help.
postgres=# \dt
List of tables
Schema | Name | Type | Owner
--------+------+-------+----------
public | test | table | postgres
(1 row)
postgres=# select * from test;
id
----
1
(1 row)
Phase 2: Switchback (Replica → Primary)
Step 8: Demote the current primary (cluster-replica) to replica
Change the configuration and retrieve a new demotion token.
Replica cluster configuration change:
Before:
replica:
  primary: cluster-replica
  promotionToken: eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjEiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAxMDAwMDAwMDAwMDAwMDAwQSIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTYyNzEyOTAzMDUyODIwNzMiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC9BMDAwMDI4IiwidGltZU9mTGF0ZXN0Q2hlY2twb2ludCI6IlRodSBNYXIgMTIgMDc6NTE6MjMgMjAyNiIsIm9wZXJhdG9yVmVyc2lvbiI6IjEuMjcuMCJ9
  source: cluster-primary
After:
replica:
  primary: cluster-primary
  source: cluster-primary
Retrieve the token and check the status:
user% kubectl get cluster cluster-replica -n replica -o jsonpath='{.status.demotionToken}'
eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjIiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAyMDAwMDAwMDAwMDAwMDAxMCIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTYyNzEyOTAzMDUyODIwNzMiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC8xMDAwMDAyOCIsInRpbWVPZkxhdGVzdENoZWNrcG9pbnQiOiJUaHUgTWFyIDEyIDA4OjEwOjA4IDIwMjYiLCJvcGVyYXRvclZlcnNpb24iOiIxLjI3LjAifQ==
user% kubectl cnp status cluster-replica -n replica
Replica Cluster Summary
Name replica/cluster-replica
System ID: 7616271290305282073
PostgreSQL Image: docker.enterprisedb.com/k8s/edb-postgres-advanced:18-standard-ubi9
Designated primary: cluster-replica-1
Source cluster: cluster-primary
Primary promotion time: 2026-03-12 08:10:09 +0000 UTC (14s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Size: 248M
Demotion token
Token eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjIiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAyMDAwMDAwMDAwMDAwMDAxMCIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTYyNzEyOTAzMDUyODIwNzMiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC8xMDAwMDAyOCIsInRpbWVPZkxhdGVzdENoZWNrcG9pbnQiOiJUaHUgTWFyIDEyIDA4OjEwOjA4IDIwMjYiLCJvcGVyYXRvclZlcnNpb24iOiIxLjI3LjAifQ==
Validity valid
Latest checkpoint's TimeLineID 2
Latest checkpoint's REDO WAL file 000000020000000000000010
Latest checkpoint's REDO location 0/10000028
Database system identifier 7616271290305282073 (ok)
Time of latest checkpoint Thu Mar 12 08:10:08 2026
Version of the operator 1.27.0
Continuous Backup status (Barman Cloud Plugin)
ObjectStore / Server name: s3-store/cluster-replica
First Point of Recoverability: 2026-03-12 13:33:10 IST
Last Successful Backup: 2026-03-12 13:33:10 IST
Last Failed Backup: -
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 00000002.history @ 2026-03-12T08:10:15.149343Z
Last Failed WAL: -
Instances status
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
cluster-replica-1 0/100000A0 Designated primary OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-replica-2 0/100000A0 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-replica-3 0/100000A0 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
Plugins status
Name Version Status Reported Operator Capabilities
---- ------- ------ ------------------------------
barman-cloud.cloudnative-pg.io 0.11.0 N/A Reconciler Hooks, Lifecycle Service
Step 9: Promote the current replica (cluster-primary) to primary
Apply the demotion token obtained in Step 8 as the promotionToken on the original primary.
Primary cluster configuration change:
Before:
replica:
  primary: cluster-replica
  source: cluster-replica
After:
replica:
  primary: cluster-primary
  promotionToken: eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjIiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAyMDAwMDAwMDAwMDAwMDAxMCIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTYyNzEyOTAzMDUyODIwNzMiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC8xMDAwMDAyOCIsInRpbWVPZkxhdGVzdENoZWNrcG9pbnQiOiJUaHUgTWFyIDEyIDA4OjEwOjA4IDIwMjYiLCJvcGVyYXRvclZlcnNpb24iOiIxLjI3LjAifQ==
  source: cluster-replica
Step 10: Check the state of both clusters
Validate the final state after the switchback.
A] cluster-primary has been promoted back to primary:
user% kubectl cnp status cluster-primary -n primary
Cluster Summary
Name primary/cluster-primary
System ID: 7616271290305282073
PostgreSQL Image: docker.enterprisedb.com/k8s/edb-postgres-advanced:18-standard-ubi9
Primary instance: cluster-primary-1
Primary promotion time: 2026-03-12 07:51:23 +0000 UTC (25m20s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Size: 344M
Current Write LSN: 0/11001F18 (Timeline: 3 - WAL File: 000000030000000000000011)
Continuous Backup status (Barman Cloud Plugin)
ObjectStore / Server name: s3-store/cluster-primary
First Point of Recoverability: 2026-03-12 13:15:19 IST
Last Successful Backup: 2026-03-12 13:15:19 IST
Last Failed Backup: -
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 000000030000000000000010 @ 2026-03-12T08:16:05.880203Z
Last Failed WAL: -
Streaming Replication status
Replication Slots Enabled
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority Replication Slot
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- ------------- ----------------
cluster-primary-2 0/11001F18 0/11001F18 0/11001F18 0/11001F18 00:00:00 00:00:00 00:00:00 streaming async 0 active
cluster-primary-3 0/11001F18 0/11001F18 0/11001F18 0/11001F18 00:00:00 00:00:00 00:00:00 streaming async 0 active
Instances status
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
cluster-primary-1 0/11001F18 Primary OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-primary-2 0/11001F18 Standby (async) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-primary-3 0/11001F18 Standby (async) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
Plugins status
Name Version Status Reported Operator Capabilities
---- ------- ------ ------------------------------
barman-cloud.cloudnative-pg.io 0.11.0 N/A Reconciler Hooks, Lifecycle Service
B] cluster-replica has been demoted back to replica:
user% kubectl cnp status cluster-replica -n replica
Replica Cluster Summary
Name replica/cluster-replica
System ID: 7616271290305282073
PostgreSQL Image: docker.enterprisedb.com/k8s/edb-postgres-advanced:18-standard-ubi9
Designated primary: cluster-replica-1
Source cluster: cluster-primary
Primary promotion time: 2026-03-12 08:10:09 +0000 UTC (8m7s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Size: 264M
Demotion token
Token eyJsYXRlc3RDaGVja3BvaW50VGltZWxpbmVJRCI6IjIiLCJyZWRvV2FsRmlsZSI6IjAwMDAwMDAyMDAwMDAwMDAwMDAwMDAxMCIsImRhdGFiYXNlU3lzdGVtSWRlbnRpZmllciI6Ijc2MTYyNzEyOTAzMDUyODIwNzMiLCJsYXRlc3RDaGVja3BvaW50UkVET0xvY2F0aW9uIjoiMC8xMDAwMDAyOCIsInRpbWVPZkxhdGVzdENoZWNrcG9pbnQiOiJUaHUgTWFyIDEyIDA4OjEwOjA4IDIwMjYiLCJvcGVyYXRvclZlcnNpb24iOiIxLjI3LjAifQ==
Validity valid
Latest checkpoint's TimeLineID 2
Latest checkpoint's REDO WAL file 000000020000000000000010
Latest checkpoint's REDO location 0/10000028
Database system identifier 7616271290305282073 (ok)
Time of latest checkpoint Thu Mar 12 08:10:08 2026
Version of the operator 1.27.0
Continuous Backup status (Barman Cloud Plugin)
ObjectStore / Server name: s3-store/cluster-replica
First Point of Recoverability: 2026-03-12 13:33:10 IST
Last Successful Backup: 2026-03-12 13:33:10 IST
Last Failed Backup: -
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 000000030000000000000010 @ 2026-03-12T08:16:08.35668Z
Last Failed WAL: -
Instances status
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
cluster-replica-1 0/11000000 Designated primary OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-replica-2 0/11000000 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
cluster-replica-3 0/11000000 Standby (in Replica Cluster) OK BestEffort 1.27.0 cnp-1.27.0-control-plane
Plugins status
Name Version Status Reported Operator Capabilities
---- ------- ------ ------------------------------
barman-cloud.cloudnative-pg.io 0.11.0 N/A Reconciler Hooks, Lifecycle Service
Step 11: Validate the final replication state
Confirm data consistency and test write and archive functionality on the original primary.
Primary (cluster-primary):
user% kubectl cnp psql cluster-primary -n primary
psql (18.1.0)
Type "help" for help.
postgres=# \dt
List of tables
Schema | Name | Type | Owner
--------+------+-------+----------
public | test | table | postgres
(1 row)
postgres=# select * from test;
id
----
1
2
(2 rows)
postgres=# insert into test values (3);
INSERT 0 1
postgres=# checkpoint;
CHECKPOINT
postgres=# select * from pg_switch_wal();
pg_switch_wal
---------------
0/11002180
(1 row)
Replica (cluster-replica):
user% kubectl cnp psql cluster-replica -n replica
psql (18.1.0)
Type "help" for help.
postgres=# \dt
List of tables
Schema | Name | Type | Owner
--------+------+-------+----------
public | test | table | postgres
(1 row)
postgres=# select * from test;
id
----
1
2
3
(3 rows)
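The pg_switch_wal() call in this step closes the current WAL segment so that it can be archived. To see which segment file an LSN belongs to, the name can be derived from the timeline and the LSN. The following is a sketch that assumes the default 16 MiB WAL segment size; the helper name wal_file_name is hypothetical. The sample input is timeline 3 (from Step 10) and the LSN returned by pg_switch_wal() above.

```shell
#!/bin/sh
# Compute the WAL segment file name for a timeline and an LSN,
# assuming the default 16 MiB (16777216-byte) WAL segment size.
wal_file_name() {
    tli=$1
    hi=$(( 0x${2%%/*} ))           # high 32 bits of the LSN
    lo=$(( 0x${2#*/} ))            # low 32 bits of the LSN
    seg=$(( lo / 16777216 ))       # segment number within the high word
    printf '%08X%08X%08X\n' "$tli" "$hi" "$seg"
}

# Timeline 3 and the LSN returned by pg_switch_wal(): prints 000000030000000000000011
wal_file_name 3 0/11002180
```

The result matches the WAL file reported next to the Current Write LSN in Step 10, so this is the segment to look for in the archive once the switch has completed.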
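Conclusion
This runbook demonstrated a complete zero-data-loss switchover and switchback cycle between two CloudNativePG clusters, using the CNPG-native demotion token and promotion token workflow.
The structured two-phase approach, rotating the primary role first to cluster-replica and then back to cluster-primary, confirmed the following:
- The demotion token ensures that the old primary has been safely fenced before promotion begins.
- The promotion token ensures that the new primary picks up exactly where the previous primary stopped, preserving WAL continuity across timeline switches.
- All data written during each phase was replicated to the standby side and verified.
This pattern is fully compatible with the Barman Cloud Plugin for WAL archiving and backups, which makes it suitable for production distributed topologies on Kubernetes.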
Original article: Switchover and Switchback of CloudNativePG Replica Clusters in a Distributed Topology (K8s) – Part 2 (EDB Blog)

