Linux HA Cluster Principles and Configuration

2024-06-23 05:53 | Source: compiled from the web

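This article describes what the stonith module does in a Linux HA cluster.

Stonith, short for "Shoot The Other Node In The Head", prevents split-brain in a cluster. Put simply, once the nodes in a cluster lose communication with each other and can no longer determine each other's state, each node tries to fence (isolate, or "shoot") the nodes it has lost contact with, so that those nodes no longer compete for resources. Only then does it go on to start the service resources and serve clients.

1. Installing Stonith and an overview of its agents

Install the fence-agents package on the 3 cluster hosts.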

# yum -y install fence-agents

After installation, the stonith device types supported by the system can be listed:

[root@ha-host1 ~]# pcs stonith list
fence_apc - Fence agent for APC over telnet/ssh
fence_apc_snmp - Fence agent for APC, Tripplite PDU over SNMP
fence_bladecenter - Fence agent for IBM BladeCenter
fence_brocade - Fence agent for HP Brocade over telnet/ssh
fence_cisco_mds - Fence agent for Cisco MDS
fence_cisco_ucs - Fence agent for Cisco UCS
fence_compute - Fence agent for the automatic resurrection of OpenStack compute instances
fence_drac5 - Fence agent for Dell DRAC CMC/5
fence_eaton_snmp - Fence agent for Eaton over SNMP
fence_emerson - Fence agent for Emerson over SNMP
fence_eps - Fence agent for ePowerSwitch
fence_evacuate - Fence agent for the automatic resurrection of OpenStack compute instances
fence_hpblade - Fence agent for HP BladeSystem
fence_ibmblade - Fence agent for IBM BladeCenter over SNMP
fence_idrac - Fence agent for IPMI
fence_ifmib - Fence agent for IF MIB
fence_ilo - Fence agent for HP iLO
fence_ilo2 - Fence agent for HP iLO
fence_ilo3 - Fence agent for IPMI
fence_ilo3_ssh - Fence agent for HP iLO over SSH
fence_ilo4 - Fence agent for IPMI
fence_ilo4_ssh - Fence agent for HP iLO over SSH
fence_ilo_moonshot - Fence agent for HP Moonshot iLO
fence_ilo_mp - Fence agent for HP iLO MP
fence_ilo_ssh - Fence agent for HP iLO over SSH
fence_imm - Fence agent for IPMI
fence_intelmodular - Fence agent for Intel Modular
fence_ipdu - Fence agent for iPDU over SNMP
fence_ipmilan - Fence agent for IPMI
fence_kdump - Fence agent for use with kdump
fence_mpath - Fence agent for multipath persistent reservation
fence_rhevm - Fence agent for RHEV-M REST API
fence_rsa - Fence agent for IBM RSA
fence_rsb - I/O Fencing agent for Fujitsu-Siemens RSB
fence_sbd - Fence agent for sbd
fence_scsi - Fence agent for SCSI persistent reservation
fence_virt - Fence agent for virtual machines
fence_vmware_soap - Fence agent for VMWare over SOAP API
fence_wti - Fence agent for WTI
fence_xvm - Fence agent for virtual machines

Each fence agent in the output above is one kind of Stonith device. As the suffixes of the names suggest, the agents fall into the following classes:

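- Agents that power off the fenced node through the server's management port, such as ilo, ipmi and drac. Most agents belong to this class; they control physical server nodes.
- Agents that shut down the fenced node through the hypervisor layer or a cloud platform, such as virt, vmware, xvm and compute. They control virtual machine nodes.
- Agents that keep the fenced node from starting services by denying it access to a specific resource, such as scsi, mpath and brocade.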
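The three classes can be sketched as a small lookup on the agent name. This is only an illustration: `classify_agent` is a hypothetical helper, and its suffix patterns cover just the examples named in the list, not every agent that `pcs stonith list` prints.

```shell
#!/bin/sh
# classify_agent NAME: print the class of a fence agent, based only on the
# example suffixes given in the text (hypothetical helper, not part of pcs).
classify_agent() {
    case "${1#fence_}" in
        ilo*|ipmi*|drac*|idrac)
            echo "power (management port)" ;;
        virt|vmware*|xvm|compute)
            echo "power (hypervisor or cloud)" ;;
        scsi|mpath|brocade)
            echo "non-power (access control)" ;;
        *)
            echo "unknown" ;;
    esac
}

classify_agent fence_ipmilan
classify_agent fence_xvm
classify_agent fence_scsi
```

Agents that match none of the patterns are reported as unknown rather than guessed.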

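The first two classes are power-type Stonith devices, while the third has nothing to do with power. The distinction matters because:

- With a non-power Stonith device, the fenced node is not powered off; its services simply are not started. The node must be unfenced before it is restarted, otherwise it cannot come back normally. A Stonith device of this type must therefore be created with the parameter meta provides=unfencing.
- With a power-type stonith device the parameter is unnecessary, because the fenced node has already been powered off and starting the node is itself the unfence operation.

2. Creating a stonith device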

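The experiment below uses fence_scsi as the example.

2.1 Creating shared storage

Following the method in "Configuring iSCSI on CentOS 7" (《在CentOS7上配置iSCSI》), a dedicated storage node, ha-disks, provides shared storage to the 3 cluster hosts (that is, iSCSI disks are created on ha-disks and then mapped to the 3 cluster hosts).

On the storage node, create three 100 MB disks named fen1, fen2 and fen3. Once attached to the hosts, they appear as sdb, sdc and sdd: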
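Because fence_scsi is a non-power agent, the rule about meta provides=unfencing applies to it. A minimal sketch of that rule follows; `stonith_create_cmd` is a hypothetical helper that only prints the pcs command instead of running it, and the list of non-power agents is taken from the examples in this article.

```shell
#!/bin/sh
# stonith_create_cmd NAME AGENT [OPTIONS...]: print (do not run) a
# "pcs stonith create" command, appending "meta provides=unfencing"
# only for the non-power agents named in the text.
stonith_create_cmd() {
    name=$1
    agent=$2
    shift 2
    cmd="pcs stonith create $name $agent"
    if [ $# -gt 0 ]; then
        cmd="$cmd $*"
    fi
    case "$agent" in
        fence_scsi|fence_mpath|fence_brocade)
            cmd="$cmd meta provides=unfencing" ;;
    esac
    printf '%s\n' "$cmd"
}

stonith_create_cmd scsi-shooter fence_scsi \
    'pcmk_host_list="ha-host1 ha-host2 ha-host3"' devices=/dev/sdb
```

Printing the command first makes it easy to review before pasting it into a root shell on a cluster node.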

[root@ha-host1 ~]# fdisk -l | grep dev
Disk /dev/sda: 42.9 GB, 42949672960 bytes, 83886080 sectors
/dev/sda1 2048 4095 1024 83 Linux
/dev/sda2 * 4096 2101247 1048576 83 Linux
/dev/sda3 2101248 83886079 40892416 8e Linux LVM
Disk /dev/mapper/VolGroup00-LogVol00: 40.2 GB, 40231763968 bytes, 78577664 sectors
Disk /dev/mapper/VolGroup00-LogVol01: 1610 MB, 1610612736 bytes, 3145728 sectors
Disk /dev/sdb: 104 MB, 104857600 bytes, 204800 sectors
Disk /dev/sdc: 104 MB, 104857600 bytes, 204800 sectors
Disk /dev/sdd: 104 MB, 104857600 bytes, 204800 sectors

[root@ha-host2 ~]# fdisk -l | grep dev
Disk /dev/sda: 42.9 GB, 42949672960 bytes, 83886080 sectors
/dev/sda1 2048 4095 1024 83 Linux
/dev/sda2 * 4096 2101247 1048576 83 Linux
/dev/sda3 2101248 83886079 40892416 8e Linux LVM
Disk /dev/mapper/VolGroup00-LogVol00: 40.2 GB, 40231763968 bytes, 78577664 sectors
Disk /dev/mapper/VolGroup00-LogVol01: 1610 MB, 1610612736 bytes, 3145728 sectors
Disk /dev/sdb: 104 MB, 104857600 bytes, 204800 sectors
Disk /dev/sdc: 104 MB, 104857600 bytes, 204800 sectors
Disk /dev/sdd: 104 MB, 104857600 bytes, 204800 sectors

[root@ha-host3 ~]# fdisk -l | grep dev
Disk /dev/sda: 42.9 GB, 42949672960 bytes, 83886080 sectors
/dev/sda1 2048 4095 1024 83 Linux
/dev/sda2 * 4096 2101247 1048576 83 Linux
/dev/sda3 2101248 83886079 40892416 8e Linux LVM
Disk /dev/mapper/VolGroup00-LogVol00: 40.2 GB, 40231763968 bytes, 78577664 sectors
Disk /dev/mapper/VolGroup00-LogVol01: 1610 MB, 1610612736 bytes, 3145728 sectors
Disk /dev/sdb: 104 MB, 104857600 bytes, 204800 sectors
Disk /dev/sdc: 104 MB, 104857600 bytes, 204800 sectors
Disk /dev/sdd: 104 MB, 104857600 bytes, 204800 sectors

Check whether these disks support SCSI persistent reservation (PR) keys:

[root@ha-host1 ~]# sg_persist /dev/sdc
>> No service action given; assume Persistent Reserve In command
>> with Read Keys service action
  LIO-ORG   fen2              4.0
  Peripheral device type: disk
  PR generation=0x5, there are NO registered reservation keys

2.2 Creating the stonith device

Start the experiment with a single fence disk, /dev/sdb:

[root@ha-host1 ~]# pcs stonith create scsi-shooter fence_scsi pcmk_host_list="ha-host1 ha-host2 ha-host3" devices=/dev/sdb meta provides=unfencing
[root@ha-host1 ~]# pcs status
Cluster name: linuxha
Stack: corosync
Current DC: ha-host2 (version 1.1.16-12.el7_4.8-94ff4df) - partition with quorum
Last updated: Fri May 4 07:10:33 2018
Last change: Fri May 4 07:07:14 2018 by root via cibadmin on ha-host1

3 nodes configured
2 resources configured

Online: [ ha-host1 ha-host2 ha-host3 ]

Full list of resources:

 vip (ocf::heartbeat:IPaddr2): Started ha-host1
 scsi-shooter (stonith:fence_scsi): Started ha-host2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Use sg_persist with the -s option to read the full persistent reservation status of /dev/sdb:

[root@ha-host1 ~]# sg_persist -s /dev/sdb
  LIO-ORG   fen1              4.0
  Peripheral device type: disk
  PR generation=0xa
    Key=0x35fc0000
      All target ports bit clear
      Relative port address: 0x1
      << Reservation holder >>
      scope: LU_SCOPE, type: Write Exclusive, registrants only
      Transport Id of initiator:
        iSCSI name and session id: iqn.2016-06.com.ha-host1:iscsi-host1
    Key=0x35fc0001
      All target ports bit clear
      Relative port address: 0x1
      not reservation holder
      Transport Id of initiator:
        iSCSI name and session id: iqn.2016-06.com.ha-host2:iscsi-host2
    Key=0x35fc0002
      All target ports bit clear
      Relative port address: 0x1
      not reservation holder
      Transport Id of initiator:
        iSCSI name and session id: iqn.2016-06.com.ha-host3:iscsi-host3

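As shown, the 3 nodes have registered on this disk with different PR keys, and ha-host1 holds the reservation, whose type is "Write Exclusive, registrants only". With this reservation type only registered nodes may write to the disk, so a node whose key is removed loses write access.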
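To read the registrations at a glance, the status output can be condensed to one line per key. The sketch below assumes the output layout shown in this article (a `Key=` line, a `scope:` line only under the reservation holder, and an `iSCSI name and session id:` line); `pr_summary` is a hypothetical helper name.

```shell
#!/bin/sh
# pr_summary: read "sg_persist -s" output on stdin and print
# "<key> <initiator> holder=<yes|no>" for every registered key.
# The key that carries a "scope:" line is treated as the reservation holder.
pr_summary() {
    awk '
        /Key=/   { key = $1; sub(/^Key=/, "", key); holder = "no" }
        /scope:/ { holder = "yes" }
        /iSCSI name and session id:/ { print key, $NF, "holder=" holder }
    '
}

# Intended use on a cluster node (needs root and the sg3_utils package):
#   sg_persist -s /dev/sdb | pr_summary
```

Comparing two such summaries taken before and after a failure shows directly which node's key has been removed.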

Now break the connection between two of the nodes, for example ha-host1 and ha-host3:

[root@ha-host1 ~]# pcs status
Cluster name: linuxha
Stack: corosync
Current DC: ha-host2 (version 1.1.16-12.el7_4.8-94ff4df) - partition with quorum
Last updated: Fri May 4 07:30:53 2018
Last change: Fri May 4 07:07:13 2018 by root via cibadmin on ha-host1

3 nodes configured
2 resources configured

Node ha-host3: UNCLEAN (offline)
Online: [ ha-host1 ha-host2 ]

Full list of resources:

 vip (ocf::heartbeat:IPaddr2): Started ha-host1
 scsi-shooter (stonith:fence_scsi): Started ha-host2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

[root@ha-host1 ~]# sg_persist -s /dev/sdb
  LIO-ORG   fen1              4.0
  Peripheral device type: disk
  PR generation=0xb
    Key=0x35fc0000
      All target ports bit clear
      Relative port address: 0x1
      << Reservation holder >>
      scope: LU_SCOPE, type: Write Exclusive, registrants only
      Transport Id of initiator:
        iSCSI name and session id: iqn.2016-06.com.ha-host1:iscsi-host1
    Key=0x35fc0001
      All target ports bit clear
      Relative port address: 0x1
      not reservation holder
      Transport Id of initiator:
        iSCSI name and session id: iqn.2016-06.com.ha-host2:iscsi-host2

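As shown, after negotiation ha-host3 has left the cluster and its registration has been removed from the fencing disk. Because the stonith resource runs on ha-host2, the log on ha-host2 shows how ha-host3 was fenced: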

[root@ha-host2 ~]# tail -1000 /var/log/cluster/corosync.log | grep ha-host3
May 04 07:30:51 [1437] ha-host2 pengine: notice: LogNodeActions: * Fence (reboot) ha-host3 'peer is no longer part of the cluster'
May 04 07:30:51 [1438] ha-host2 crmd: notice: te_fence_node: Requesting fencing (reboot) of node ha-host3 | action=1 timeout=60000
May 04 07:30:51 [1434] ha-host2 stonith-ng: notice: handle_request: Client crmd.1438.0cea319b wants to fence (reboot) 'ha-host3' with device '(any)'
May 04 07:30:51 [1434] ha-host2 stonith-ng: notice: initiate_remote_stonith_op: Requesting peer fencing (reboot) of ha-host3 | id=0cf426c7-666f-4299-8285-fa500fa5ac09 state=0
May 04 07:30:52 [1434] ha-host2 stonith-ng: notice: can_fence_host_with_device: scsi-shooter can fence (reboot) ha-host3: static-list
May 04 07:30:52 [1434] ha-host2 stonith-ng: info: process_remote_stonith_query: Query result 1 of 2 from ha-host2 for ha-host3/reboot (1 devices) 0cf426c7-666f-4299-8285-fa500fa5ac09
May 04 07:30:52 [1434] ha-host2 stonith-ng: info: call_remote_stonith: Total timeout set to 60 for peer's fencing of ha-host3 for crmd.1438|id=0cf426c7-666f-4299-8285-fa500fa5ac09
May 04 07:30:52 [1434] ha-host2 stonith-ng: info: call_remote_stonith: Requesting that 'ha-host2' perform op 'ha-host3 reboot' for crmd.1438 (72s, 0s)
May 04 07:30:52 [1434] ha-host2 stonith-ng: info: process_remote_stonith_query: Query result 2 of 2 from ha-host1 for ha-host3/reboot (1 devices) 0cf426c7-666f-4299-8285-fa500fa5ac09
May 04 07:30:52 [1434] ha-host2 stonith-ng: notice: can_fence_host_with_device: scsi-shooter can fence (reboot) ha-host3: static-list
May 04 07:30:52 [1434] ha-host2 stonith-ng: info: stonith_fence_get_devices_cb: Found 1 matching devices for 'ha-host3'
May 04 07:30:53 [1434] ha-host2 stonith-ng: warning: log_action: fence_scsi[2603] stderr: [ WARNING:root:Parse error: Ignoring unknown option 'port=ha-host3' ]
May 04 07:30:53 [1434] ha-host2 stonith-ng: notice: log_operation: Operation 'reboot' [2603] (call 6 from crmd.1438) for host 'ha-host3' with device 'scsi-shooter' returned: 0 (OK)
May 04 07:30:53 [1434] ha-host2 stonith-ng: notice: remote_op_done: Operation reboot of ha-host3 by ha-host2 for [email protected]: OK
May 04 07:30:53 [1438] ha-host2 crmd: info: tengine_stonith_callback: Stonith operation 6 for ha-host3 passed
May 04 07:30:53 [1438] ha-host2 crmd: info: crm_update_peer_expected: crmd_peer_down: Node ha-host3[3] - expected state is now down (was member)
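When the log is long, the final outcome of each fencing operation is the line worth extracting. The sketch below assumes the `remote_op_done` message format seen in this log (Pacemaker 1.1.16) and a single-word result such as OK; other versions may word the message differently. `fence_results` is a hypothetical helper name.

```shell
#!/bin/sh
# fence_results: read cluster log lines on stdin and print one summary line
# per completed fencing operation, taken from "remote_op_done" messages of the
# form "Operation <op> of <target> by <executor> for <client>: <result>".
fence_results() {
    awk '/remote_op_done:/ {
        target = ""; executor = ""
        for (i = 1; i < NF; i++) {
            if ($i == "of") target = $(i + 1)
            if ($i == "by") executor = $(i + 1)
        }
        print target, "fenced by", executor ":", $NF
    }'
}

# Intended use on the node that ran the stonith resource:
#   fence_results < /var/log/cluster/corosync.log
```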

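Once ha-host3 has been fenced, it must be rebooted before it can register its PR key again. Otherwise, even after the network recovers, it cannot run resources that depend on stonith.

Question: the quorum mechanism already ensures that only a partition holding more than half of the nodes can start resources, so why is a stonith device still needed?

First, when a cluster has only two nodes, a partition containing a single node must be allowed to start resources, and in that case a Stonith device is mandatory. Second, quorum relies on communication between the cluster daemons (corosync and pacemaker) on each host. Those daemons may have stopped while the resource processes they managed are still running (the resources are left unmanaged); such a node has left the cluster but still competes for resources.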
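The two-node case follows from simple majority arithmetic. The sketch below assumes one vote per node and a plain "more than half" rule; `quorum_needed` and `has_quorum` are hypothetical helper names, not cluster commands.

```shell
#!/bin/sh
# quorum_needed TOTAL: smallest partition size that is more than half of TOTAL.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum TOTAL SIZE: succeed if a partition of SIZE nodes has a majority.
has_quorum() {
    [ "$2" -ge "$(quorum_needed "$1")" ]
}

for total in 2 3; do
    echo "nodes=$total quorum=$(quorum_needed "$total")"
done

if has_quorum 2 1; then
    echo "a single node of a two-node cluster has quorum"
else
    echo "a single node of a two-node cluster has no quorum"
fi
```

With two nodes the majority is 2, so a lone survivor can never reach it. Letting it run anyway means relaxing the quorum rule, and fencing is then the only protection against both nodes running the same resource.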
