甄天祥-Linux-个人小站

Deploying redis-cluster on Kubernetes

Posted 2025-06-04 Updated 2025-06-19
By Administrator
122~157 min read

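I. About Redis

1. Master-replica mode

  • Core purpose: data redundancy and read/write splitting (writes go to the master, reads can go to the replicas).

  • Architecture:

    • 1 master + N replicas.

    • The master handles writes; the replicas copy its data asynchronously.

  • Pros:

    • Simple to configure; a good fit for read-heavy workloads.

    • Replicas take read load off the master.

  • Cons:

    • No automatic failover: if the master goes down, you have to switch over manually.

    • Risk of stale reads: because replication is asynchronous, replicas can lag behind the master.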

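2. Sentinel mode

  • Core purpose: monitor the master and its replicas and fail over automatically (promote a replica to master when the master goes down).

  • Architecture:

    • Several Sentinel nodes (an odd number, for example 3) form a distributed monitoring system.

    • The Sentinels watch the state of the master and replicas and carry out failover by voting.

  • Pros:

    • High availability: the master is switched automatically, without manual intervention.

    • Clients can ask Sentinel for the address of the current master.

  • Cons:

    • No data sharding: write throughput is still limited to a single master.

    • More complex to configure, and clients need Sentinel support.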

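3. Cluster mode

  • Core purpose: data sharding plus high availability, introduced in Redis 3.0.

  • Architecture:

    • Several masters, each owning a share of the 16384 hash slots. A cluster needs at least 3 masters; with one replica per master that makes 6 nodes (3 masters, 3 replicas).

    • Data is distributed by slot. Multi-key commands such as MGET only work when all their keys hash to the same slot.

    • When a master goes down, one of its replicas takes over automatically.

  • Pros:

    • Horizontal scaling: handles large data sets and high concurrency.

    • High availability: failover is built in.

  • Cons:

    • Clients must support the cluster protocol (redis-cli, for example, needs the -c flag to follow redirections).

    • Transactions and multi-key operations across slots are not supported (the keys must share a slot).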
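
To make the slot mapping concrete, here is a minimal sketch of how a key is assigned to one of the 16384 slots: CRC16 (XMODEM variant) of the key, modulo 16384, where only the text inside the first non-empty {...} is hashed if the key contains one. The function names crc16_xmodem and key_slot are made up for this illustration; on a live cluster, CLUSTER KEYSLOT gives the authoritative answer.

```shell
# CRC16-XMODEM (polynomial 0x1021, initial value 0), computed bit by bit.
crc16_xmodem() (
  crc=0
  # od prints every byte of the input as an unsigned decimal number.
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ (byte << 8) ))
    bit=0
    while [ "$bit" -lt 8 ]; do
      if [ $(( crc & 0x8000 )) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      bit=$(( bit + 1 ))
    done
  done
  echo "$crc"
)

# Slot of a key. If the key contains a non-empty {tag}, only the tag is hashed.
key_slot() (
  key=$1
  case $key in
    *"{"*"}"*)
      tag=${key#*\{}     # text after the first "{"
      tag=${tag%%\}*}    # cut at the first "}" that follows it
      if [ -n "$tag" ]; then key=$tag; fi
      ;;
  esac
  echo $(( $(crc16_xmodem "$key") % 16384 ))
)

key_slot '{user1000}.following'
key_slot '{user1000}.followers'
```

Both calls print the same slot number, which is why keys that share a hash tag can be used together in MGET or a transaction.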

II. Deployment

1. Deploy 3 masters and 3 replicas

1.1 StatefulSet

[root@k8s-master1 redis-sts-cluster]# cat redis-sts-cluster.yaml
---
apiVersion: v1
kind: Service                                     # Create the headless Service first
metadata:
  name: redis-sts-cluster-headless                # Service name; the StatefulSet below references it via serviceName
  namespace: redis-cluster                        # Namespace the resource belongs to
  labels:                                         # Labels on the Service itself
    app: redis-sts-cluster
spec:
  ports:
  - port: 6379                                    # Service port
    protocol: TCP
    targetPort: 6379                              # Target port; 6379 is the Redis default
  selector:
    app: redis-sts-cluster                        # Must match the labels of the pods created below
  type: ClusterIP
  clusterIP: None                                 # clusterIP: None makes this a headless Service
---
apiVersion: apps/v1
kind: StatefulSet                                 # Create the StatefulSet
metadata:
  name: redis-sts-cluster                         # Resource name
  namespace: redis-cluster                        # Namespace the resource belongs to
  labels:                                         # Labels on the StatefulSet itself
    app: redis-sts-cluster
spec:
  selector:                                       # Must match the pod labels defined in the template below
    matchLabels:
      app: redis-sts-cluster
  replicas: 6                                     # 6 replicas: 3 masters with 1 replica each
  serviceName: redis-sts-cluster-headless         # Name of the headless Service created above
  template:
    metadata:
      labels:                                     # Pod labels; the headless Service selector and the StatefulSet selector must both match these
        app: redis-sts-cluster
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: middleware
                    operator: In
                    values:
                      - "true"
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: redis-sts-cluster
                topologyKey: kubernetes.io/hostname
      tolerations:
        - key: middleware
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
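        - name: redis                             # Main container: Redis server
          image: redis:7.2.4
          command: ["redis-server"]
          args:
            - "/etc/redis/redis.conf"
            - "--cluster-announce-ip"         # Together with the next argument, announces the pod's current IP,
            - "$(POD_IP)"                     # so the cluster can pick up the new address after a pod restart changes its IP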
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.podIP
          ports:
            - containerPort: 6379
          volumeMounts:
            - name: redis-conf            # Redis configuration file for the main container
              mountPath: /etc/redis
            - name: redis-data            # Data directory (must match "dir" in redis.conf)
              mountPath: /data
            - name: localtime             # Mount the host's local time
              mountPath: /etc/localtime
              readOnly: true
        - name: cluster-manager           # Sidecar container
          image: harbor.meta42-uat.com/library/redis-cluster-manager:7.2.4
          command: ["/bin/sh", "-c"]
          args:
            - |
              exec /scripts/cluster-manager.sh
          env:
            - name: CURRENT_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: MY_POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: SERVER_NAME
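              value: redis-sts-cluster  # StatefulSet name, without "-headless"; the script appends that suffix itself
            - name: REDIS_PASSWORD
              value: "iloveyou"  # Must match requirepass in the redis-conf ConfigMap
            - name: INITIAL_REPLICAS
              value: "6"     # Replica count of the StatefulSet
            - name: MAX_RETRIES # Number of 1-second retries while waiting for all pods to show up in CoreDNS
              value: "180"  # Adjust to the cluster size and image pull speed
            - name: DNS_SERVER
              value: "10.96.0.10"  # CoreDNS Service address (the default in many clusters)
          volumeMounts:
            - name: cluster-script        # Script for the sidecar container
              mountPath: /scripts
            - name: localtime             # Mount the host's local time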
              mountPath: /etc/localtime
              readOnly: true
      volumes:
        - name: "redis-conf"
          configMap:
            name: "redis-conf"
            items:
              - key: "redis.conf"
                path: "redis.conf"
        - name: "cluster-script"
          configMap:
            name: "redis-cluster-manager-script"
            items:
              - key: "cluster-manager.sh"
                path: "cluster-manager.sh"
            defaultMode: 0744  # Make the script executable
        - name: localtime
          hostPath:
            path: /etc/localtime
            type: File
      restartPolicy: Always
  volumeClaimTemplates:                           # Template for the PVC created for each pod
    - metadata:
        name: "redis-data"                        # Template name
      spec:
        accessModes:
        - ReadWriteOnce                           # Access mode: RWO
        resources:                                # Resource requests
          requests:
            storage: 100Gi                        # Request 100Gi of storage
        storageClassName: openebs-hostpath        # StorageClass used for dynamic PV provisioning
        volumeMode: Filesystem
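
Because the StatefulSet is bound to a headless Service, every pod also gets a stable DNS name in addition to its changing IP. The sketch below prints those names for the six replicas. The helper name pod_fqdn is hypothetical, and the cluster domain cluster.local is an assumption (it is the usual default, but a cluster can be configured differently).

```shell
# Build the stable DNS name of one StatefulSet pod.
# $1 = StatefulSet name, $2 = pod ordinal, $3 = headless Service, $4 = namespace
pod_fqdn() {
  printf '%s-%s.%s.%s.svc.cluster.local\n' "$1" "$2" "$3" "$4"
}

i=0
while [ "$i" -lt 6 ]; do
  pod_fqdn redis-sts-cluster "$i" redis-sts-cluster-headless redis-cluster
  i=$(( i + 1 ))
done
```

Querying the Service name itself (without a pod prefix) returns the A records of all ready pods, which is what the sidecar script relies on.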

1.2 Redis configuration file

[root@k8s-master1 redis-sts-cluster]# cat redis-sts-config.yaml 
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-conf
  namespace: redis-cluster
data:
  redis.conf: |
    # Listen address
    bind 0.0.0.0
    protected-mode yes
    # Redis port
    port 6379
    tcp-backlog 511
    timeout 0
    tcp-keepalive 300
    # Whether Redis runs as a daemon; must be "no" inside a container
    daemonize no
    supervised no
    # Redis PID file, placed under /data
    pidfile /data/redis.pid
    loglevel notice
    # Redis log file under /data (disabled here; logfile "" below sends logs to stdout)
    #logfile /data/redis_log
    logfile ""
    databases 16
    always-show-logo yes
    save 900 1
    save 300 10
    save 60 10000
    stop-writes-on-bgsave-error yes
    rdbcompression yes
    rdbchecksum yes
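    # This file is written to the /data directory defined by "dir"
    dbfilename dump.rdb
    # Data directory
    dir /data
    # Password the nodes use to authenticate to each other; must be the same as requirepass below
    masterauth iloveyou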
    replica-serve-stale-data yes
    replica-read-only yes
    repl-diskless-sync no
    repl-diskless-sync-delay 5
    repl-disable-tcp-nodelay no
    replica-priority 100
    # Redis server password
    requirepass iloveyou
    lazyfree-lazy-eviction no
    lazyfree-lazy-expire no
    lazyfree-lazy-server-del no
    replica-lazy-flush no
    appendonly no
    # This file is written to the /data directory defined by "dir"
    appendfilename "appendonly.aof"
    appendfsync everysec
    no-appendfsync-on-rewrite no
    auto-aof-rewrite-percentage 100
    auto-aof-rewrite-min-size 64mb
    aof-load-truncated yes
    aof-use-rdb-preamble yes
    lua-time-limit 5000
    # Enable cluster mode; this must be uncommented and set to yes
    cluster-enabled yes
    # This file is written to the /data directory defined by "dir"
    cluster-config-file nodes.conf
    cluster-node-timeout 15000
    slowlog-log-slower-than 10000
    slowlog-max-len 128
    latency-monitor-threshold 0
    notify-keyspace-events ""
    hash-max-ziplist-entries 512
    hash-max-ziplist-value 64
    list-max-ziplist-size -2
    list-compress-depth 0
    set-max-intset-entries 512
    zset-max-ziplist-entries 128
    zset-max-ziplist-value 64
    hll-sparse-max-bytes 3000
    stream-node-max-bytes 4096
    stream-node-max-entries 100
    activerehashing yes
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit replica 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60
    hz 10
    dynamic-hz yes
    aof-rewrite-incremental-fsync yes
    rdb-save-incremental-fsync yes
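
Since masterauth and requirepass have to carry the same value, a quick check before applying the ConfigMap can catch a mismatch. This is a minimal sketch, not part of the original setup: the function name auth_matches is made up, and it assumes one directive per line with the password as the second field.

```shell
# Read a redis.conf on stdin; succeed only if masterauth is set and equals requirepass.
auth_matches() {
  awk '
    $1 == "masterauth"  { m = $2 }
    $1 == "requirepass" { r = $2 }
    END { exit !(m != "" && m == r) }
  '
}

if printf 'masterauth iloveyou\nrequirepass iloveyou\n' | auth_matches; then
  echo "masterauth and requirepass match"
else
  echo "masterauth and requirepass differ"
fi
```

Commented-out directives are ignored because their first field starts with "#".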

1.3 Cluster management script

[root@k8s-master1 redis-sts-cluster]# cat redis-sts-manager-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster-manager-script
  namespace: redis-cluster
data:
  cluster-manager.sh: |
    #!/bin/bash
    
    set -euo pipefail
    
    # Logging helper: messages containing 错误 (error) or 失败 (failure) are tagged ERROR
    log() {
      local level="INFO"
      if [[ "$1" == *"错误"* || "$1" == *"失败"* ]]; then
        level="ERROR"
      fi
      echo "[$(date '+%Y-%m-%d %H:%M:%S')][$level][$POD_NAME] $1" >&2
    }
    
    # Trap termination signals
    trap "log '收到终止信号,正在清理...'; exit 0" SIGINT SIGTERM
    
    # Environment variables
    CURRENT_IP="${CURRENT_IP:-}"
    INITIAL_REPLICAS="${INITIAL_REPLICAS:-6}"
    MAX_RETRIES=${MAX_RETRIES:-180}      # Maximum number of retries
    RETRY_DELAY=1       # Delay between retries, in seconds
    TIMEOUT=$(( MAX_RETRIES * RETRY_DELAY ))  # Total timeout, in seconds
    POD_NAME="${POD_NAME:-}"
    NAMESPACE="${MY_POD_NAMESPACE:-default}"
    SERVER_NAME="${SERVER_NAME:-redis-sts-cluster}"
    REDIS_PASSWORD="${REDIS_PASSWORD:-iloveyou}"
    DNS_SERVER="${DNS_SERVER:-10.96.0.10}"  # CoreDNS address (the default in many clusters)
    REDIS_PORT="${REDIS_PORT:-6379}"

    # Resolve the pod IPs behind the headless Service
    get_redis_ips() {
      local headless_domain="${SERVER_NAME}-headless.${NAMESPACE}.svc.cluster.local"
      local attempt=0
      local count=0
      local ips=""
    
      while (( attempt < MAX_RETRIES )); do
        ips=$(dig -t A "${headless_domain}" @"${DNS_SERVER}" +short)
        # Count non-empty lines only: "echo '' | wc -l" reports 1 for an empty answer
        count=$(printf '%s\n' "$ips" | grep -c . || true)
    
        if (( count >= INITIAL_REPLICAS )); then
          echo "$ips" | awk -v port="$REDIS_PORT" '{printf "%s:%s ", $1, port}'
          return 0
        fi
    
        log "等待 Redis 实例全部注册到 DNS(已获取 ${count} 个,剩余重试次数 $((MAX_RETRIES - attempt)))..."
        sleep "$RETRY_DELAY"
        # Avoids ((attempt++)), whose exit status is non-zero when attempt is 0
        attempt=$((attempt + 1))
      done
    
      log "错误:获取 IP 超时(${TIMEOUT} 秒),最终仅获取到 ${count} 个 IP"
      exit 1
    }

    # Verify that the number of instances is even
    validate_even_replicas() {
      local count
      count=$(get_redis_ips | wc -w)
      if (( count % 2 != 0 )); then
        log "错误:实例数量必须是偶数,当前为 ${count} 个,退出。"
        exit 1
      fi
      log "当前实例数量为偶数:${count},满足条件。"
    }
    
    # Initialize the Redis cluster (only runs on ${SERVER_NAME}-0)
    initialize_cluster() {
      log "当前 Pod 为 ${POD_NAME},正在检查是否为主节点..."
      if [[ "$POD_NAME" == "${SERVER_NAME}-0" ]]; then
        local redis_ips
        redis_ips=$(get_redis_ips)
    
        log "开始初始化 Redis 集群..."
        log "使用节点:${redis_ips}"
    
        if redis-cli --cluster create --cluster-replicas 1 \
            ${redis_ips} -a "${REDIS_PASSWORD}" --cluster-yes; then
          log "Redis 集群初始化完成。"
        else
          log "错误:Redis 集群初始化失败。"
          exit 1
        fi
      else
        log "当前 Pod 不是主节点,跳过集群初始化。"
      fi
    }

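    # Join the cluster if this node is not already a member
    join_cluster_if_needed() {
      local self_ip="${CURRENT_IP}:${REDIS_PORT}"
      local any_node_ip nodes
      # Pick a peer other than this pod: a node that is asked about itself lists itself and can look "already joined"
      any_node_ip=$(get_redis_ips | awk -v self="$self_ip" '{ for (i = 1; i <= NF; i++) if ($i != self) { print $i; exit } }')
    
      # Check whether this node is already in the cluster
      nodes=$(redis-cli -h "${any_node_ip%:*}" -p "$REDIS_PORT" -a "$REDIS_PASSWORD" cluster nodes)
      # Match " ip:port@" so that one IP cannot match as a prefix of another
      if grep -qF " ${self_ip}@" <<<"$nodes"; then
        log "当前节点 $self_ip 已存在于 Redis 集群中。"
        return 0
      fi
    
      log "当前节点 $self_ip 不在 Redis 集群中,尝试加入..."
      if redis-cli --cluster add-node "$self_ip" "$any_node_ip" -a "$REDIS_PASSWORD"; then
        log "成功将 $self_ip 加入 Redis 集群。"
      else
        log "错误:无法将 $self_ip 加入 Redis 集群。"
        exit 1
      fi
    }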

    # Watch for scaling events (placeholder)
    watch_scaling() {
      log "开启扩缩容监听(占位逻辑,可自定义扩展)..."
      while true; do
        sleep 10
        # Add automatic rebalance checks here if needed
      done
    }
    
    main() {
      validate_even_replicas
    
      if [[ "$POD_NAME" == "${SERVER_NAME}-0" ]]; then
        initialize_cluster
      else
        join_cluster_if_needed
      fi
    
      watch_scaling
    }
 
    main
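
One detail in get_redis_ips deserves a closer look: counting the lines of a `dig +short` answer. Piping an empty string through `wc -l` still yields 1, because echo emits a newline, so an empty DNS answer is reported as one IP. A minimal sketch of a count that treats an empty answer as zero (the helper name count_lines is made up for this example):

```shell
# Count the non-empty lines in a string.
count_lines() {
  # grep -c prints 0 and exits non-zero when nothing matches;
  # "|| true" keeps that from aborting a script that runs under "set -e".
  printf '%s\n' "$1" | grep -c . || true
}

empty_answer=""
two_answers="10.0.0.1
10.0.0.2"

echo "wc -l on an empty answer: $(printf '%s\n' "$empty_answer" | wc -l | tr -d ' ')"
echo "count_lines on an empty answer: $(count_lines "$empty_answer")"
echo "count_lines on two records: $(count_lines "$two_answers")"
```

The IP addresses above are placeholders, not addresses from this cluster.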

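2. Start the services

2.1 Notes

Variables used by the sidecar container:

SERVER_NAME: the StatefulSet name; the headless Service must be named ${SERVER_NAME}-headless
REDIS_PASSWORD: must match the password in the configuration file
INITIAL_REPLICAS: must match the actual replica count
MAX_RETRIES: how long to wait for all pods to reach Running; increase it if image pulls are slow
DNS_SERVER: address of the CoreDNS Service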
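
If you would rather not hard-code DNS_SERVER, one option is to read it from the pod's own resolver configuration: with the default ClusterFirst DNS policy, the kubelet writes the cluster DNS address into /etc/resolv.conf. The helper below is a sketch under that assumption, and first_nameserver is a hypothetical name, not something the original script defines.

```shell
# Print the first nameserver from a resolv.conf-style file.
# $1 = path to the file (defaults to /etc/resolv.conf)
first_nameserver() {
  awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Demo on a sample file shaped like a pod's resolv.conf.
sample=$(mktemp)
printf 'search redis-cluster.svc.cluster.local svc.cluster.local cluster.local\nnameserver 10.96.0.10\noptions ndots:5\n' > "$sample"
first_nameserver "$sample"
rm -f "$sample"
```

In the script this could be wired in as DNS_SERVER="${DNS_SERVER:-$(first_nameserver)}", so an explicitly set variable still wins.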

2.2 Custom Redis image

The sidecar needs the dig command, so it uses a custom image with dnsutils installed.

[root@k8s-master1 redis-sts-cluster]#  cat Dockerfile 
# Start from the official Redis image
FROM redis:7.2.4

# Switch to the Aliyun Debian mirror and install the dependencies
RUN sed -i 's|deb.debian.org|mirrors.aliyun.com|g' /etc/apt/sources.list.d/debian.sources && \
    sed -i 's|security.debian.org|mirrors.aliyun.com|g' /etc/apt/sources.list.d/debian.sources && \
    apt-get update && \
    apt-get install -y --no-install-recommends dnsutils && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

Build and push the image

[root@k8s-master1 redis-sts-cluster]# docker build . -t harbor.meta42-uat.com/library/redis-cluster-manager:7.2.4
[root@k8s-master1 redis-sts-cluster]# docker push harbor.meta42-uat.com/library/redis-cluster-manager:7.2.4
The push refers to repository [harbor.meta42-uat.com/library/redis-cluster-manager]
a2683e2c476e: Pushed 
1e5c32803ebe: Pushed 
5f70bf18a086: Mounted from kubesphere/examples-bookinfo-ratings-v1 
8bda18e2b70f: Pushed 
a59762f8ecda: Pushed 
3b86ccb39e58: Pushed 
6fa3d2d4aa11: Pushed 
7238a2a7c554: Pushed 
5d4427064ecc: Pushed 
7.2.4: digest: sha256:f54d1ec77f5190b1fd41740be0ac8a9a3bb0df1c627b32a6dfd8b99a4ac32c37 size: 2199

2.3 Start

[root@k8s-master1 redis-sts-cluster]# kubectl apply -f .
service/redis-sts-cluster-headless created
statefulset.apps/redis-sts-cluster created
configmap/redis-conf created
configmap/redis-cluster-manager-script created

2.4 Check the logs

[root@k8s-master1 redis-sts-cluster]# kubectl -n redis-cluster logs --tail=20 -f redis-sts-cluster-0 cluster-manager
[2025-06-04 17:31:08][INFO][redis-sts-cluster-0] 等待 Redis 实例全部注册到 DNS(已获取 1 个,剩余重试次数 180)...
[2025-06-04 17:31:09][INFO][redis-sts-cluster-0] 等待 Redis 实例全部注册到 DNS(已获取 1 个,剩余重试次数 179)...
[2025-06-04 17:31:10][INFO][redis-sts-cluster-0] 等待 Redis 实例全部注册到 DNS(已获取 1 个,剩余重试次数 178)...
[2025-06-04 17:32:07][INFO][redis-sts-cluster-0] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 123)...
[2025-06-04 17:32:08][INFO][redis-sts-cluster-0] 当前实例数量为偶数:6,满足条件。
[2025-06-04 17:32:08][INFO][redis-sts-cluster-0] 当前 Pod 为 redis-sts-cluster-0,正在检查是否为主节点...
[2025-06-04 17:32:09][INFO][redis-sts-cluster-0] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 180)...
[2025-06-04 17:32:10][INFO][redis-sts-cluster-0] 开始初始化 Redis 集群...
[2025-06-04 17:32:10][INFO][redis-sts-cluster-0] 使用节点:172.18.186.103:6379 172.18.169.131:6379 172.18.36.103:6379 172.18.189.38:6379 172.18.107.248:6379 172.18.122.105:6379 
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 172.18.107.248:6379 to 172.18.186.103:6379
Adding replica 172.18.122.105:6379 to 172.18.169.131:6379
Adding replica 172.18.189.38:6379 to 172.18.36.103:6379
M: 6385efc0d2fcd47158e8fa42d054d25483b1639d 172.18.186.103:6379
   slots:[0-5460] (5461 slots) master
M: a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1 172.18.169.131:6379
   slots:[5461-10922] (5462 slots) master
M: 46dfe276b86126ff6ca0db8c24307b98cce6ce15 172.18.36.103:6379
   slots:[10923-16383] (5461 slots) master
S: 2e8e3de5b7a8c36840a2fe256aa51addf9d6fbda 172.18.189.38:6379
   replicates 46dfe276b86126ff6ca0db8c24307b98cce6ce15
S: f1dbc33582f0188d4b1e13595b3b7282225aa275 172.18.107.248:6379
   replicates 6385efc0d2fcd47158e8fa42d054d25483b1639d
S: d62059259d6f3772e3d53f7ac47caaa4b0485ad2 172.18.122.105:6379
   replicates a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node 172.18.186.103:6379)
M: 6385efc0d2fcd47158e8fa42d054d25483b1639d 172.18.186.103:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 46dfe276b86126ff6ca0db8c24307b98cce6ce15 172.18.36.103:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 2e8e3de5b7a8c36840a2fe256aa51addf9d6fbda 172.18.189.38:6379
   slots: (0 slots) slave
   replicates 46dfe276b86126ff6ca0db8c24307b98cce6ce15
M: a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1 172.18.169.131:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: d62059259d6f3772e3d53f7ac47caaa4b0485ad2 172.18.122.105:6379
   slots: (0 slots) slave
   replicates a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1
S: f1dbc33582f0188d4b1e13595b3b7282225aa275 172.18.107.248:6379
   slots: (0 slots) slave
   replicates 6385efc0d2fcd47158e8fa42d054d25483b1639d
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[2025-06-04 17:32:12][INFO][redis-sts-cluster-0] Redis 集群初始化完成。
[2025-06-04 17:32:12][INFO][redis-sts-cluster-0] 开启扩缩容监听(占位逻辑,可自定义扩展)...

Logs from the other pods

[root@k8s-master1 redis-sts-cluster]# kubectl -n redis-cluster logs --tail=20 -f redis-sts-cluster-1 cluster-manager
[2025-06-04 17:31:52][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 144)...
[2025-06-04 17:31:53][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 143)...
[2025-06-04 17:31:54][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 142)...
[2025-06-04 17:31:55][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 141)...
[2025-06-04 17:31:56][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 140)...
[2025-06-04 17:31:57][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 139)...
[2025-06-04 17:31:58][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 138)...
[2025-06-04 17:31:59][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 137)...
[2025-06-04 17:32:00][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 136)...
[2025-06-04 17:32:01][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 135)...
[2025-06-04 17:32:02][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 134)...
[2025-06-04 17:32:03][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 133)...
[2025-06-04 17:32:04][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 132)...
[2025-06-04 17:32:06][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 131)...
[2025-06-04 17:32:07][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 130)...
[2025-06-04 17:32:08][INFO][redis-sts-cluster-1] 等待 Redis 实例全部注册到 DNS(已获取 5 个,剩余重试次数 129)...
[2025-06-04 17:32:09][INFO][redis-sts-cluster-1] 当前实例数量为偶数:6,满足条件。
[2025-06-04 17:32:09][INFO][redis-sts-cluster-1] 当前 Pod 为 redis-sts-cluster-1,正在检查是否为主节点...
[2025-06-04 17:32:09][INFO][redis-sts-cluster-1] 当前 Pod 不是主节点,跳过集群初始化。
[2025-06-04 17:32:09][INFO][redis-sts-cluster-1] 开启扩缩容监听(占位逻辑,可自定义扩展)...

Final result

[root@k8s-master1 redis-sts-cluster]# kubectl get pod -n redis-cluster -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP               NODE        NOMINATED NODE   READINESS GATES
redis-sts-cluster-0   2/2     Running   0          17m   172.18.36.103    k8s-node1   <none>           <none>
redis-sts-cluster-1   2/2     Running   0          17m   172.18.169.131   k8s-node2   <none>           <none>
redis-sts-cluster-2   2/2     Running   0          17m   172.18.107.248   k8s-node3   <none>           <none>
redis-sts-cluster-3   2/2     Running   0          17m   172.18.186.103   k8s-node6   <none>           <none>
redis-sts-cluster-4   2/2     Running   0          15m   172.18.122.123   k8s-node4   <none>           <none>
redis-sts-cluster-5   2/2     Running   0          17m   172.18.189.38    k8s-node7   <none>           <none>

3. Verify the cluster

3.1 Check the cluster status

[root@k8s-master1 redis-sts-cluster]# kubectl exec -it -n redis-cluster redis-sts-cluster-0 -c redis -- bash

root@redis-sts-cluster-0:/data# export REDISCLI_AUTH="${REDIS_PASSWORD:-iloveyou}"

root@redis-sts-cluster-0:/data# redis-cli cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:3
cluster_stats_messages_ping_sent:1084
cluster_stats_messages_pong_sent:984
cluster_stats_messages_meet_sent:1
cluster_stats_messages_sent:2069
cluster_stats_messages_ping_received:984
cluster_stats_messages_pong_received:1084
cluster_stats_messages_received:2068
total_cluster_links_buffer_limit_exceeded:0

root@redis-sts-cluster-0:/data# redis-cli cluster nodes
f1dbc33582f0188d4b1e13595b3b7282225aa275 172.18.107.248:6379@16379 slave 6385efc0d2fcd47158e8fa42d054d25483b1639d 0 1749030619551 1 connected
a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1 172.18.169.131:6379@16379 master - 0 1749030616539 2 connected 5461-10922
46dfe276b86126ff6ca0db8c24307b98cce6ce15 172.18.36.103:6379@16379 myself,master - 0 1749030619000 3 connected 10923-16383
6385efc0d2fcd47158e8fa42d054d25483b1639d 172.18.186.103:6379@16379 master - 0 1749030618548 1 connected 0-5460
2e8e3de5b7a8c36840a2fe256aa51addf9d6fbda 172.18.189.38:6379@16379 slave 46dfe276b86126ff6ca0db8c24307b98cce6ce15 0 1749030620556 3 connected
d62059259d6f3772e3d53f7ac47caaa4b0485ad2 172.18.122.123:6379@16379 slave a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1 0 1749030618000 2 connected
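
For a quick sanity check, the CLUSTER NODES output can be summarized instead of read line by line. The sketch below counts masters and replicas from the flags in the third field. The function name summarize_nodes is made up, and the three sample lines are copied from the output above.

```shell
# Read CLUSTER NODES output on stdin and count nodes by role.
summarize_nodes() {
  awk '
    $3 ~ /master/        { masters++ }
    $3 ~ /slave|replica/ { replicas++ }
    END { printf "masters=%d replicas=%d\n", masters, replicas }
  '
}

summarize_nodes <<'EOF'
f1dbc33582f0188d4b1e13595b3b7282225aa275 172.18.107.248:6379@16379 slave 6385efc0d2fcd47158e8fa42d054d25483b1639d 0 1749030619551 1 connected
a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1 172.18.169.131:6379@16379 master - 0 1749030616539 2 connected 5461-10922
46dfe276b86126ff6ca0db8c24307b98cce6ce15 172.18.36.103:6379@16379 myself,master - 0 1749030619000 3 connected 10923-16383
EOF
```

Against a live node the same function can be fed with `redis-cli cluster nodes | summarize_nodes`; for the healthy 6-node cluster it should report 3 masters and 3 replicas.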

3.2 Test a write

[root@k8s-master1 redis-sts-cluster]# kubectl exec -it -n redis-cluster redis-sts-cluster-0 -c redis -- bash

root@redis-sts-cluster-0:/data# export REDISCLI_AUTH="${REDIS_PASSWORD:-iloveyou}"

root@redis-sts-cluster-0:/data# redis-cli -c set test_key "hello"
OK
root@redis-sts-cluster-0:/data# redis-cli -c get test_key
"hello"

Verify from another pod

[root@k8s-master1 redis-sts-cluster]# kubectl exec -it -n redis-cluster redis-sts-cluster-4 -c redis -- bash

root@redis-sts-cluster-4:/data# export REDISCLI_AUTH="${REDIS_PASSWORD:-iloveyou}"

root@redis-sts-cluster-4:/data# redis-cli -c get test_key
"hello"

4. Scale out

4.1 Scale to 8 replicas

[root@k8s-master1 redis-sts-cluster]# kubectl scale statefulset -n redis-cluster redis-sts-cluster --replicas=8

Check the two new pods

[root@k8s-master1 redis-sts-cluster]# kubectl get pod -n redis-cluster -o wide
NAME                  READY   STATUS    RESTARTS        AGE     IP               NODE        NOMINATED NODE   READINESS GATES
redis-sts-cluster-0   2/2     Running   0               3m5s    172.18.36.74     k8s-node1   <none>           <none>
redis-sts-cluster-1   2/2     Running   2 (2m16s ago)   2m59s   172.18.169.167   k8s-node2   <none>           <none>
redis-sts-cluster-2   2/2     Running   1 (2m20s ago)   2m53s   172.18.107.209   k8s-node3   <none>           <none>
redis-sts-cluster-3   2/2     Running   2 (2m19s ago)   2m46s   172.18.186.113   k8s-node6   <none>           <none>
redis-sts-cluster-4   2/2     Running   2 (2m19s ago)   2m41s   172.18.122.92    k8s-node4   <none>           <none>
redis-sts-cluster-5   2/2     Running   2 (2m19s ago)   2m34s   172.18.189.53    k8s-node7   <none>           <none>
redis-sts-cluster-6   2/2     Running   0               20s     172.18.195.234   k8s-node5   <none>           <none>
redis-sts-cluster-7   2/2     Running   0               14s     172.18.36.68     k8s-node1   <none>           <none>

4.2 Check the logs

The new nodes joined the cluster automatically, but no rebalance was run.

[root@k8s-master1 redis-sts-cluster]# kubectl -n redis-cluster logs -f redis-sts-cluster-7 cluster-manager
[2025-06-09 15:03:35][INFO][redis-sts-cluster-7] 当前实例数量为偶数:6,满足条件。
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
[2025-06-09 15:03:35][INFO][redis-sts-cluster-7] 当前节点 172.18.36.68:6379 不在 Redis 集群中,尝试加入...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 172.18.36.68:6379 to cluster 172.18.107.209:6379
>>> Performing Cluster Check (using node 172.18.107.209:6379)
M: 2892403400cee040776e6e0dbdc822fbc16897e3 172.18.107.209:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: ec2592aacef6cac35663905667c70580750f812e 172.18.36.74:6379
   slots: (0 slots) slave
   replicates 2892403400cee040776e6e0dbdc822fbc16897e3
M: 5b676bdcf61eff0b08b533ad4dcbc8bf16962946 172.18.186.113:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 883785f5fe6861a206c678d666c69728251865d4 172.18.195.234:6379
   slots: (0 slots) master
M: 53aa1105d765ca71adbcf2c9f36423fc2837a619 172.18.122.92:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 68cd19b6240e863eaa0b6e771e6ed0eed57ccef6 172.18.189.53:6379
   slots: (0 slots) slave
   replicates 53aa1105d765ca71adbcf2c9f36423fc2837a619
S: 00ea13e5c369dd5b4a86cdf3144275851a295881 172.18.169.167:6379
   slots: (0 slots) slave
   replicates 5b676bdcf61eff0b08b533ad4dcbc8bf16962946
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Getting functions from cluster
>>> Send FUNCTION LIST to 172.18.36.68:6379 to verify there is no functions in it
>>> Send FUNCTION RESTORE to 172.18.36.68:6379
>>> Send CLUSTER MEET to node 172.18.36.68:6379 to make it join the cluster.
[OK] New node added correctly.
[2025-06-09 15:03:35][INFO][redis-sts-cluster-7] 成功将 172.18.36.68:6379 加入 Redis 集群。
[2025-06-09 15:03:35][INFO][redis-sts-cluster-7] 开启扩缩容监听(占位逻辑,可自定义扩展)...

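Redis Cluster spreads the 16384 slots across the masters, and each master is responsible for part of the data. When the cluster is created with

redis-cli --cluster create ... --cluster-replicas 1

the slots are assigned to the masters automatically (with 3 masters, each gets about 5461 slots).

The problem:

When you add masters (for example going from 3 to 5), Redis does not redistribute slots to the new nodes on its own. A newly added master therefore:

  • owns no slots;

  • stores no data;

  • is effectively an empty master that does no work.

What rebalance does:

redis-cli --cluster rebalance moves slots from the existing masters to the new ones, so that the new nodes actually take on load. Masters that own no slots are only included when --cluster-use-empty-masters is passed.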
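
As a rough guide to what an even distribution looks like, the arithmetic is simply 16384 divided by the number of masters. The sketch below prints that idealized share; slots_per_master is a made-up helper, and the layout redis-cli actually produces can deviate from it, because rebalance only moves slots when the imbalance exceeds its threshold.

```shell
# $1 = number of masters
# Prints "<base> <extra>": every master gets <base> slots,
# and <extra> slots are left over to hand out one by one.
slots_per_master() {
  echo "$(( 16384 / $1 )) $(( 16384 % $1 ))"
}

slots_per_master 3
slots_per_master 5
```

With 3 masters the base share is 5461 slots with 1 left over, which matches the 5461/5462/5461 split in the creation log; with 5 masters it is 3276 with 4 left over.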

4.3 Check the cluster information

cluster_size-old.png

4.4 Run rebalance

root@redis-sts-cluster-7:/data# export REDISCLI_AUTH="${REDIS_PASSWORD:-iloveyou}"
root@redis-sts-cluster-7:/data# redis-cli --cluster rebalance 172.18.36.74:6379 --cluster-use-empty-masters

cluster_size-new.png

4.5 Check the cluster information again

5个master.png

5. Scale in

5.1 Check the current nodes

[root@k8s-master1 redis-sts-cluster]# kubectl get pod -n redis-cluster -o wide
NAME                  READY   STATUS    RESTARTS        AGE     IP               NODE        NOMINATED NODE   READINESS GATES
redis-sts-cluster-0   2/2     Running   0               3m5s    172.18.36.74     k8s-node1   <none>           <none>
redis-sts-cluster-1   2/2     Running   2 (2m16s ago)   2m59s   172.18.169.167   k8s-node2   <none>           <none>
redis-sts-cluster-2   2/2     Running   1 (2m20s ago)   2m53s   172.18.107.209   k8s-node3   <none>           <none>
redis-sts-cluster-3   2/2     Running   2 (2m19s ago)   2m46s   172.18.186.113   k8s-node6   <none>           <none>
redis-sts-cluster-4   2/2     Running   2 (2m19s ago)   2m41s   172.18.122.92    k8s-node4   <none>           <none>
redis-sts-cluster-5   2/2     Running   2 (2m19s ago)   2m34s   172.18.189.53    k8s-node7   <none>           <none>
redis-sts-cluster-6   2/2     Running   0               20s     172.18.195.234   k8s-node5   <none>           <none>
redis-sts-cluster-7   2/2     Running   0               14s     172.18.36.68     k8s-node1   <none>           <none>

There are currently 8 nodes. redis-sts-cluster-6 and redis-sts-cluster-7 are the two newly added masters, and each holds part of the slots:

883785f5fe6861a206c678d666c69728251865d4 172.18.195.234:6379 master - slots: 6553-7646, 10923-12014  ← redis-sts-cluster-6
91ab95a8a087b51c6d6ec930bb57c371b4d9bd18 172.18.36.68:6379 master - slots: 1093-2184, 12015-13107     ← redis-sts-cluster-7

5.2 Count the slots held by the masters to be removed

Redis Cluster has 16384 hash slots in total, numbered 0 to 16383. They are currently split into these ranges:

1) 0       - 1092     → 1093 slots
2) 1093    - 2184     → 1092 slots
3) 2185    - 6552     → 4368 slots
4) 6553    - 7646     → 1094 slots
5) 7647    - 10922    → 3276 slots
6) 10923   - 12014    → 1092 slots
7) 12015   - 13107    → 1093 slots
8) 13108   - 16383    → 3276 slots

Run the following command to list all slot ranges and their owners:

root@redis-sts-cluster-7:/data# redis-cli cluster slots
1) 1) (integer) 0
   2) (integer) 1092
   3) 1) "172.18.107.209"
      2) (integer) 6379
      3) "2892403400cee040776e6e0dbdc822fbc16897e3"
      4) (empty array)
   4) 1) "172.18.36.74"
      2) (integer) 6379
      3) "ec2592aacef6cac35663905667c70580750f812e"
      4) (empty array)
2) 1) (integer) 1093
   2) (integer) 2184
   3) 1) "172.18.36.68"
      2) (integer) 6379
      3) "91ab95a8a087b51c6d6ec930bb57c371b4d9bd18"
      4) (empty array)
3) 1) (integer) 2185
   2) (integer) 6552
   3) 1) "172.18.186.113"
      2) (integer) 6379
      3) "5b676bdcf61eff0b08b533ad4dcbc8bf16962946"
      4) (empty array)
   4) 1) "172.18.169.167"
      2) (integer) 6379
      3) "00ea13e5c369dd5b4a86cdf3144275851a295881"
      4) (empty array)
4) 1) (integer) 6553
   2) (integer) 7646
   3) 1) "172.18.195.234"
      2) (integer) 6379
      3) "883785f5fe6861a206c678d666c69728251865d4"
      4) (empty array)
5) 1) (integer) 7647
   2) (integer) 10922
   3) 1) "172.18.107.209"
      2) (integer) 6379
      3) "2892403400cee040776e6e0dbdc822fbc16897e3"
      4) (empty array)
   4) 1) "172.18.36.74"
      2) (integer) 6379
      3) "ec2592aacef6cac35663905667c70580750f812e"
      4) (empty array)
6) 1) (integer) 10923
   2) (integer) 12014
   3) 1) "172.18.195.234"
      2) (integer) 6379
      3) "883785f5fe6861a206c678d666c69728251865d4"
      4) (empty array)
7) 1) (integer) 12015
   2) (integer) 13107
   3) 1) "172.18.36.68"
      2) (integer) 6379
      3) "91ab95a8a087b51c6d6ec930bb57c371b4d9bd18"
      4) (empty array)
8) 1) (integer) 13108
   2) (integer) 16383
   3) 1) "172.18.122.92"
      2) (integer) 6379
      3) "53aa1105d765ca71adbcf2c9f36423fc2837a619"
      4) (empty array)
   4) 1) "172.18.189.53"
      2) (integer) 6379
      3) "68cd19b6240e863eaa0b6e771e6ed0eed57ccef6"
      4) (empty array)

You can work out each count by hand as end - start + 1:

  • 7646 - 6553 + 1 = 1094

  • 12014 - 10923 + 1 = 1092

  • 13107 - 12015 + 1 = 1093

The two nodes to remove are:

IP               Node ID (prefix)       Slot ranges               Slot count
172.18.195.234   883785f5fe6861a20...   6553-7646, 10923-12014    1094 + 1092 = 2186
172.18.36.68     91ab95a8a087b51c6...   1093-2184, 12015-13107    1092 + 1093 = 2185
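
The same end - start + 1 arithmetic can be scripted so that the totals passed to reshard are not computed by hand. A minimal sketch (count_slots is a made-up helper name) that sums any number of "start-end" ranges:

```shell
# Sum the number of slots in the given "start-end" ranges.
count_slots() (
  total=0
  for range in "$@"; do
    start=${range%-*}   # text before the dash
    end=${range#*-}     # text after the dash
    total=$(( total + end - start + 1 ))
  done
  echo "$total"
)

count_slots 6553-7646 10923-12014   # ranges held by 172.18.195.234
count_slots 1093-2184 12015-13107   # ranges held by 172.18.36.68
```

The two calls print 2186 and 2185, the same totals as in the table above.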

5.3 Migrate their slots to other masters

Move the slots off redis-sts-cluster-6 (172.18.195.234); 2186 is the number of slots it currently holds:

root@redis-sts-cluster-7:/data# export REDISCLI_AUTH="${REDIS_PASSWORD:-iloveyou}"
root@redis-sts-cluster-7:/data# redis-cli --cluster reshard 172.18.195.234:6379 \
  --cluster-from 883785f5fe6861a206c678d666c69728251865d4 \
  --cluster-to 5b676bdcf61eff0b08b533ad4dcbc8bf16962946 \
  --cluster-slots 2186 \
  --cluster-yes

Move the slots off redis-sts-cluster-7 (172.18.36.68); 2185 is the number of slots it currently holds:

root@redis-sts-cluster-7:/data# redis-cli --cluster reshard 172.18.36.68:6379 \
  --cluster-from 91ab95a8a087b51c6d6ec930bb57c371b4d9bd18 \
  --cluster-to 2892403400cee040776e6e0dbdc822fbc16897e3 \
  --cluster-slots 2185 \
  --cluster-yes
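
The --cluster-from and --cluster-to arguments take node IDs, which are tedious to copy by hand. As a sketch, the ID for a given address can be pulled out of CLUSTER NODES output. The function name node_id_for_addr is hypothetical, and the sample line is taken from the earlier output in this post.

```shell
# $1 = "ip:port"; reads CLUSTER NODES output on stdin and prints the matching node ID.
# The address field looks like "ip:port@cport", so the match includes the "@"
# to keep one IP from matching as a prefix of another.
node_id_for_addr() {
  awk -v addr="$1@" 'index($2, addr) == 1 { print $1 }'
}

node_id_for_addr 172.18.169.131:6379 <<'EOF'
a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1 172.18.169.131:6379@16379 master - 0 1749030616539 2 connected 5461-10922
EOF
```

On a live cluster this would be used as `redis-cli cluster nodes | node_id_for_addr 172.18.195.234:6379`.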

5.4 Remove the nodes

root@redis-sts-cluster-7:/data# redis-cli --cluster del-node 172.18.36.74:6379 \
  883785f5fe6861a206c678d666c69728251865d4
root@redis-sts-cluster-7:/data# redis-cli --cluster del-node 172.18.36.74:6379 \
  91ab95a8a087b51c6d6ec930bb57c371b4d9bd18
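
Before running del-node it is worth confirming that the node really is empty, since a master that still owns slots has to be resharded first. The sketch below lists the slot fields (field 9 onward in CLUSTER NODES output) for one node ID; no output means the node holds no slots. The name slots_owned_by is made up, and the sample lines come from the earlier output in this post.

```shell
# $1 = node ID; reads CLUSTER NODES output on stdin.
# Prints each slot range owned by that node on its own line.
slots_owned_by() {
  awk -v id="$1" '$1 == id { for (i = 9; i <= NF; i++) print $i }'
}

nodes='a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1 172.18.169.131:6379@16379 master - 0 1749030616539 2 connected 5461-10922
f1dbc33582f0188d4b1e13595b3b7282225aa275 172.18.107.248:6379@16379 slave 6385efc0d2fcd47158e8fa42d054d25483b1639d 0 1749030619551 1 connected'

# The master still owns a range, the replica owns nothing.
printf '%s\n' "$nodes" | slots_owned_by a3a463f7a54b1ac3c8b7c42a9b9f869df8686ab1
if [ -z "$(printf '%s\n' "$nodes" | slots_owned_by f1dbc33582f0188d4b1e13595b3b7282225aa275)" ]; then
  echo "node is empty, safe to delete"
fi
```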

5.5 Log in to redis-sts-cluster-0 and verify

[root@k8s-master1 redis-sts-cluster]# kubectl exec -it -n redis-cluster redis-sts-cluster-0 -c redis -- bash
root@redis-sts-cluster-0:/data# export REDISCLI_AUTH="${REDIS_PASSWORD:-iloveyou}"
root@redis-sts-cluster-0:/data# redis-cli cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:10
cluster_my_epoch:10
cluster_stats_messages_ping_sent:2764
cluster_stats_messages_pong_sent:2820
cluster_stats_messages_meet_sent:1
cluster_stats_messages_update_sent:4
cluster_stats_messages_sent:5589
cluster_stats_messages_ping_received:2819
cluster_stats_messages_pong_received:15877
cluster_stats_messages_meet_received:1
cluster_stats_messages_update_received:1
cluster_stats_messages_received:18698
total_cluster_links_buffer_limit_exceeded:0

5.6 Scale down the replica count

[root@k8s-master1 redis-sts-cluster]# kubectl scale -n redis-cluster statefulset redis-sts-cluster --replicas=6
statefulset.apps/redis-sts-cluster scaled
[root@k8s-master1 redis-sts-cluster]# kubectl get pod -n redis-cluster -o wide
NAME                  READY   STATUS    RESTARTS      AGE   IP               NODE        NOMINATED NODE   READINESS GATES
redis-sts-cluster-0   2/2     Running   0             58m   172.18.36.74     k8s-node1   <none>           <none>
redis-sts-cluster-1   2/2     Running   2 (57m ago)   58m   172.18.169.167   k8s-node2   <none>           <none>
redis-sts-cluster-2   2/2     Running   1 (57m ago)   58m   172.18.107.209   k8s-node3   <none>           <none>
redis-sts-cluster-3   2/2     Running   2 (57m ago)   58m   172.18.186.113   k8s-node6   <none>           <none>
redis-sts-cluster-4   2/2     Running   2 (57m ago)   57m   172.18.122.92    k8s-node4   <none>           <none>
redis-sts-cluster-5   2/2     Running   2 (57m ago)   57m   172.18.189.53    k8s-node7   <none>           <none>

Optionally, delete the PVCs of the two removed Pods:

[root@k8s-master1 redis-sts-cluster]# kubectl -n redis-cluster delete pvc redis-data-redis-sts-cluster-6 redis-data-redis-sts-cluster-7
persistentvolumeclaim "redis-data-redis-sts-cluster-6" deleted
persistentvolumeclaim "redis-data-redis-sts-cluster-7" deleted

6. Deploy a standalone Redis

Main service

[root@k8s-master1 redis-stand]# cat redis-stand-sts.yaml 
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-stand
  namespace: middleware
  labels:
    app: redis-stand
spec:
  replicas: 1
  serviceName: redis-stand-headless
  selector:
    matchLabels:
      app: redis-stand
  template:
    metadata:
      labels:
        app: redis-stand
    spec:
      initContainers:
      - name: system-init
        image: busybox:1.32
        imagePullPolicy: IfNotPresent
        command:
        - "/bin/sh"
        - "-c"
        - |
          echo "Raising somaxconn and disabling transparent huge pages"
          echo 2048 > /proc/sys/net/core/somaxconn && echo never > /sys/kernel/mm/transparent_hugepage/enabled
        securityContext:
          privileged: true
          runAsUser: 0
        volumeMounts:
        - name: sys
          mountPath: /sys
      containers:
      - name: redis-stand
        image: redis:6.2
        imagePullPolicy: IfNotPresent
        command: ["redis-server", "/usr/local/etc/redis/redis.conf"]  # start with the specified config file
        ports:
        - name: redis
          containerPort: 6379
          protocol: TCP
        resources:
          limits:
            cpu: 1000m
            memory: 2Gi
          requests:
            cpu: 500m
            memory: 500Mi
        # Liveness probe
        livenessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 30
          timeoutSeconds: 5
        # Readiness probe
        readinessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 30
          timeoutSeconds: 1
        volumeMounts:
          - name: host-time
            mountPath: /etc/localtime
            readOnly: true
          - name: redis-data
            mountPath: /data
          - name: redis-stand-config
            mountPath: /usr/local/etc/redis/redis.conf
            subPath: redis.conf
            readOnly: true
      volumes:
      - name: host-time
        hostPath:
          path: /etc/localtime
      - name: sys
        hostPath:
          path: /sys
      - name: redis-stand-config
        configMap:
          name: redis-stand-config
          items:
          - key: redis.conf
            path: redis.conf
      restartPolicy: Always
  volumeClaimTemplates:                           # template used to create the PVCs
    - metadata:
        name: "redis-data"                        # template name
      spec:
        accessModes:
        - ReadWriteOnce                           # access mode: RWO
        resources:                                # resource requests
          requests:
            storage: 100Gi                        # request 100Gi of storage
        storageClassName: openebs-hostpath        # storage class used for dynamic PV provisioning
        volumeMode: Filesystem
---
apiVersion: v1
kind: Service
metadata:
  name: redis-stand-headless
  namespace: middleware
  labels:
    app: redis-stand
spec:
  clusterIP: None                                 # headless Service, as referenced by the StatefulSet's serviceName
  ports:
    - name: redis
      port: 6379
      targetPort: 6379
  selector:
    app: redis-stand

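Configuration file

[root@k8s-master1 redis-stand]# cat redis.conf
# Example configuration file for a standalone Redis

# Basic settings
# Allow connections from any IP; in production, bind to specific IPs instead
bind 0.0.0.0 
# Disable protected mode to allow external connections
protected-mode no
# Default port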
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300

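# General settings
# Do not run as a daemon (should be no inside a container)
daemonize no
# Not managed by upstart or systemd
supervised no
pidfile /var/run/redis_6379.pid
# Log level: debug/verbose/notice/warning
loglevel notice
# Log file path; an empty string logs to standard output
logfile ""
# Number of databases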
databases 16
always-show-logo no

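# Snapshot persistence (RDB)
# Save if at least 1 key changed within 900 seconds
save 900 1
# Save if at least 10 keys changed within 300 seconds
save 300 10
# Save if at least 10000 keys changed within 60 seconds
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
# Directory for persistence files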
dir /data

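# Replication (not needed for a standalone instance)
# replicaof <masterip> <masterport>

# Security
# requirepass foobared

# Client limits
maxclients 10000

# Memory management
# Maximum memory; adjust to your environment, leaving headroom below the container memory limit
maxmemory 2gb
# Eviction policy when memory is full
maxmemory-policy volatile-lru

# Append-only file persistence (AOF)
# AOF is disabled by default
appendonly no
appendfilename "appendonly.aof"
# fsync every second
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# Lua scripts
lua-time-limit 5000

# Slow log
# Log queries slower than 10 milliseconds (the value is in microseconds)
slowlog-log-slower-than 10000
slowlog-max-len 128

# Keyspace event notifications
notify-keyspace-events ""

# Advanced settings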
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
stream-node-max-bytes 4096
stream-node-max-entries 100
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
dynamic-hz yes
aof-rewrite-incremental-fsync yes
rdb-save-incremental-fsync yes

Start the service

[root@k8s-master1 redis-stand]# kubectl -n middleware create configmap redis-stand-config --from-file=redis.conf
[root@k8s-master1 redis-stand]# kubectl apply -f .
statefulset.apps/redis-stand created
service/redis-stand-headless created
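Once the resources are created, a quick check along the following lines can confirm that the instance is up. This is a sketch, not part of the original procedure: the function name `verify_redis_stand` is hypothetical, the Pod name `redis-stand-0` is assumed from the StatefulSet naming convention, and it relies on the configuration above having no password set.

```shell
#!/usr/bin/env bash
# verify_redis_stand [NAMESPACE] [POD]: wait for the StatefulSet rollout,
# then ping Redis and print the effective maxmemory setting.
verify_redis_stand() {
  local ns="${1:-middleware}" pod="${2:-redis-stand-0}"
  kubectl -n "$ns" rollout status statefulset/redis-stand --timeout=120s &&
  kubectl -n "$ns" exec "$pod" -c redis-stand -- redis-cli ping &&
  kubectl -n "$ns" exec "$pod" -c redis-stand -- redis-cli config get maxmemory
}

# Example usage (requires kubectl access to the cluster):
#   verify_redis_stand middleware redis-stand-0
```

If `requirepass` is later enabled in redis.conf, the `redis-cli` calls here and in the probes would need credentials as well.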

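III. Cross-cluster data synchronization

1. Basic features

redis-shake is a tool that its developers built as an improvement on redis-port. It supports four functions: decode, restore, dump (backup), and sync. The following focuses on sync.

  • Restore (restore): restores an RDB file into the target Redis database.

  • Backup (dump): backs up the full data set of the source Redis as an RDB file.

  • Decode (decode): reads an RDB file and stores the parsed result in JSON format.

  • Sync (sync): synchronizes data between a source and a target Redis, supporting both full and incremental migration. It supports synchronization from on-premises environments to Alibaba Cloud as well as between different off-cloud environments, and works between standalone, master-replica, and cluster deployments. Note that if the source is a cluster, a single RedisShake instance can pull from the different DB nodes, and the move slot operation must not be running on the source; if the target is a cluster, data can be written to one or more DB nodes.

  • Sync (rump): synchronizes data between a source and a target Redis, full migration only. It migrates data with the scan and restore commands and supports migration across cloud vendors and Redis versions.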

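2. How it works

The basic principle of redis-shake is to pose as a replica joining the source Redis cluster: it first pulls the full data set and replays it, then pulls incremental data (via the psync command), as shown in the figure below:

[Figure: redis-shake.png]

If the source is in cluster mode, a single redis-shake instance is enough to pull the data, and the move slot operation must not be performed on the source in the meantime. If the target is in cluster mode, data can be written to a single node and the slots migrated afterwards, or written many-to-many.

Currently, redis-shake writes to the target over a single link. Under normal load this is not a bottleneck, but in extreme cases with very high QPS it may become one, and the developers may optimize it later. In addition, redis-shake synchronizes data to the target asynchronously, with reading and writing handled in two separate threads, which reduces the slowdown caused by network latency.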

1. Download redis-shake

$ wget https://github.com/tair-opensource/RedisShake/releases/download/v4.4.0/redis-shake-v4.4.0-linux-amd64.tar.gz

2. Extract the archive

$ tar xvf redis-shake-v4.4.0-linux-amd64.tar.gz

3. Configuration file

$ cat >shake.toml<<'EOF'
[sync_reader]
cluster = true             # the source is a Redis Cluster
address = "172.18.36.80:6379" 
username = ""              # leave empty if there is no ACL user
password = "iloveyou"      # source Redis password
tls = false                
sync_rdb = true            # sync the RDB snapshot
sync_aof = true            # sync incremental AOF data
prefer_replica = false     
try_diskless = false       

[redis_writer]
cluster = true             # the target is a Redis Cluster
master = ""                
address = "172.18.36.74:6379" # can be the address of any node in the cluster
username = ""              
password = "iloveyou"      # target Redis password
tls = false
off_reply = false          

[advanced]
dir = "data"               # log storage directory
ncpu = 4                   # adjust to the number of CPU cores
pprof_port = 0  
status_port = 0 

log_file = "shake.log"
log_level = "info"     
log_interval = 5       

rdb_restore_command_behavior = "rewrite" # overwrite keys that already exist

pipeline_count_limit = 1024
target_redis_client_max_querybuf_len = 1024_000_000
target_redis_proto_max_bulk_len = 512_000_000   # may need tuning for large keys
EOF

4. Start

$ ./redis-shake shake.toml

..........................
..........................
2025-06-10 11:35:06 INF read_count=[1786107], read_ops=[43959.34], write_count=[1786106], write_ops=[43959.34], src-2, syncing rdb, size=[26 MiB/29 MiB]
2025-06-10 11:35:11 INF read_count=[1977422], read_ops=[47005.58], write_count=[1977421], write_ops=[47005.58], src-0, syncing aof, diff=[53132360]
2025-06-10 11:35:16 INF read_count=[2000000], read_ops=[0.00], write_count=[2000000], write_ops=[0.00], src-1, syncing aof, diff=[0]
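The status lines above can be checked mechanically to see whether every source node has caught up. The sketch below assumes the log format shown in the sample output and reads `diff=[0]` in the `syncing aof` phase as "no backlog". That reading is an assumption drawn from the sample, not a documented guarantee, and the function name `shake_caught_up` is hypothetical.

```shell
#!/usr/bin/env bash
# shake_caught_up: read redis-shake status lines on stdin and exit 0 only if
# the most recent line seen for every source (src-N) is "syncing aof, diff=[0]".
shake_caught_up() {
  awk '
    match($0, /src-[0-9]+/) {
      src = substr($0, RSTART, RLENGTH)
      ok[src] = ($0 ~ /syncing aof, diff=\[0\]/) ? 1 : 0   # keep the latest state per source
    }
    END {
      n = 0; pending = 0
      for (s in ok) { n++; if (!ok[s]) pending++ }
      exit ((n > 0 && pending == 0) ? 0 : 1)
    }
  '
}

# Example usage (log path assumed from dir = "data" and log_file = "shake.log"):
#   tail -n 100 data/shake.log | shake_caught_up && echo "all sources caught up"
```

For the three sample lines above the check fails, since src-2 is still syncing its RDB and src-0 still reports a non-zero diff.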

License:  CC BY 4.0