pve-ha-manager (4.0.2) bookworm; urgency=medium

  * cluster resource manager: clear stale maintenance node, which can be
    caused by simultaneous cluster shutdown

 -- Proxmox Support Team  Tue, 13 Jun 2023 08:35:52 +0200

pve-ha-manager (4.0.1) bookworm; urgency=medium

  * test, simulator: make it possible to add an already running service
  * lrm: do not migrate via rebalance-on-start if the service is already
    running
  * api: fix/add return description for status endpoint
  * resources: pve: avoid relying on internal configuration details, use new
    helpers in pve-container and qemu-server

 -- Proxmox Support Team  Fri, 09 Jun 2023 10:41:06 +0200

pve-ha-manager (4.0.0) bookworm; urgency=medium

  * re-build for Proxmox VE 8 / Debian 12 Bookworm

 -- Proxmox Support Team  Wed, 24 May 2023 19:26:51 +0200

pve-ha-manager (3.6.1) bullseye; urgency=medium

  * cli: assert that the node exists when changing CRS request state, to
    avoid creating a phantom node by mistake
  * manager: ensure node-request state gets transferred to the new active
    CRM, so that the request for (manual) maintenance mode is upheld even if
    the node that is in maintenance mode is also the current active CRM and
    gets rebooted.
  * lrm: ignore shutdown policy if (manual) maintenance mode is requested, to
    avoid exiting from maintenance mode too early.

 -- Proxmox Support Team  Thu, 20 Apr 2023 14:16:14 +0200

pve-ha-manager (3.6.0) bullseye; urgency=medium

  * fix #4371: add CRM command to switch an online node manually into
    maintenance (without reboot), moving away all active services, but
    automatically migrating them back once maintenance mode is disabled
    again.
  * manager: service start: make EWRONG_NODE a non-fatal error and try to
    find the actual node the service is residing on
  * manager: add new intermediate 'request_started' state for stop->start
    transitions
  * request start: optionally enable automatic selection of the best-rated
    node by the CRS on service start-up, bypassing the very high priority of
    the current node on which a service is located.

 -- Proxmox Support Team  Mon, 20 Mar 2023 13:38:26 +0100

pve-ha-manager (3.5.1) bullseye; urgency=medium

  * manager: update CRS scheduling mode once per round to avoid the need for
    a restart of the currently active manager.
  * api: status: add CRS info to manager if not set to default

 -- Proxmox Support Team  Sat, 19 Nov 2022 15:51:11 +0100

pve-ha-manager (3.5.0) bullseye; urgency=medium

  * env: datacenter config: include crs (cluster-resource-scheduling) setting
  * manager: use static resource scheduler when configured
  * manager: avoid scoring nodes if maintenance fallback node is valid
  * manager: avoid scoring nodes when not trying next and current node is
    valid
  * usage: static: use service count on nodes as a fallback

 -- Proxmox Support Team  Fri, 18 Nov 2022 15:02:55 +0100

pve-ha-manager (3.4.0) bullseye; urgency=medium

  * switch to native version formatting
  * fix accounting of online services when moving services due to their
    source node going gracefully nonoperational (maintenance mode). This
    ensures a better balance of services on the cluster after such an
    operation.
 -- Proxmox Support Team  Fri, 22 Jul 2022 09:21:20 +0200

pve-ha-manager (3.3-4) bullseye; urgency=medium

  * lrm: fix getting stuck on restart due to finished worker state not being
    collected

 -- Proxmox Support Team  Wed, 27 Apr 2022 14:01:55 +0200

pve-ha-manager (3.3-3) bullseye; urgency=medium

  * lrm: avoid possible job starvation on huge workloads
  * lrm: increase run_worker loop time for doing actual work to an 80% duty
    cycle

 -- Proxmox Support Team  Thu, 20 Jan 2022 18:05:33 +0100

pve-ha-manager (3.3-2) bullseye; urgency=medium

  * fix #3826: fix restarting LRM/CRM when triggered by the package
    management system due to other updates
  * lrm: also check the CRM node-status to determine whether there is a
    fence request, and avoid starting up in that case, to ensure that the
    current manager can get our lock and do a clean fence -> unknown ->
    online FSM transition. This avoids a problematic edge case where an
    admin manually removed all services of a to-be-fenced node and re-added
    them again before the manager could actually get that node's LRM lock.
  * manager: handle edge case where a node gets seemingly stuck in the
    'fence' state if all its services got manually removed by an admin
    before the fence transition could be finished. While the LRM could come
    up again in previous versions (it won't now, see the point above) and
    start/stop services got executed, the node was seen as unavailable for
    all recovery, relocation and migrate actions.

 -- Proxmox Support Team  Wed, 19 Jan 2022 14:30:15 +0100

pve-ha-manager (3.3-1) bullseye; urgency=medium

  * LRM: release lock and close watchdog if no service is configured for
    more than 10 minutes
  * manager: make recovery an actual state in the finite state machine,
    showing a clear transition from fence -> recovery.
  * fix #3415: never switch into the error state on recovery, try harder to
    find a new node. This improves using the HA manager for services with
    local resources (e.g., local storage) to ensure they always get started,
    which is an OK use case as long as the service is restricted to a group
    with only that node. Previously, failure of that node had a high
    probability of the service going into the error state, as no new node
    could be found. Now it will retry finding a new node, and if one of the
    restricted set, e.g., the node it was previously on, comes back up, the
    service will start there again.
  * recovery: allow disabling an in-recovery service manually

 -- Proxmox Support Team  Fri, 02 Jul 2021 20:03:29 +0200

pve-ha-manager (3.2-2) bullseye; urgency=medium

  * fix systemd service restart behavior on package upgrade with Debian
    Bullseye

 -- Proxmox Support Team  Mon, 24 May 2021 11:38:42 +0200

pve-ha-manager (3.2-1) bullseye; urgency=medium

  * Re-build for Debian Bullseye / PVE 7

 -- Proxmox Support Team  Wed, 12 May 2021 20:55:53 +0200

pve-ha-manager (3.1-1) pve; urgency=medium

  * allow 'with-local-disks' migration for replicated guests

 -- Proxmox Support Team  Mon, 31 Aug 2020 10:52:23 +0200

pve-ha-manager (3.0-9) pve; urgency=medium

  * factor out service configured/delete helpers
  * typo and grammar fixes

 -- Proxmox Support Team  Thu, 12 Mar 2020 13:17:36 +0100

pve-ha-manager (3.0-8) pve; urgency=medium

  * bump LRM stop wait time to an hour
  * do not mark nodes in maintenance mode as unknown
  * api/status: extra handling of maintenance mode

 -- Proxmox Support Team  Mon, 02 Dec 2019 10:33:03 +0100

pve-ha-manager (3.0-6) pve; urgency=medium

  * add 'migrate' node shutdown policy
  * do simple fallback if node comes back online from maintenance
  * account service to both source and target during migration
  * add 'After' ordering for SSH and pveproxy to the LRM service, ensuring
    the node stays accessible until HA services got moved or shut down,
    depending on the policy.

 -- Proxmox Support Team  Tue, 26 Nov 2019 18:03:26 +0100

pve-ha-manager (3.0-5) pve; urgency=medium

  * fix #1339: remove more locks from services IF the node got fenced
  * adapt to qemu-server code refactoring

 -- Proxmox Support Team  Wed, 20 Nov 2019 20:12:49 +0100

pve-ha-manager (3.0-4) pve; urgency=medium

  * use PVE::DataCenterConfig from the new split-out cluster library package

 -- Proxmox Support Team  Mon, 18 Nov 2019 12:16:29 +0100

pve-ha-manager (3.0-3) pve; urgency=medium

  * fix #1919, #1920: improve handling of zombie (without node) services
  * fix #2241: VM resource: allow migration with local device when not
    running
  * HA status: render removal transition of a service as 'deleting'
  * fix #1140: add crm command 'stop', which allows requesting immediate
    service hard-stops if a timeout of zero (0) is passed

 -- Proxmox Support Team  Mon, 11 Nov 2019 17:04:35 +0100

pve-ha-manager (3.0-2) pve; urgency=medium

  * services: update PIDFile to point directly to /run
  * fix #2234: fix typo in service description
  * add missing dependencies to pve-ha-simulator

 -- Proxmox Support Team  Thu, 11 Jul 2019 19:26:03 +0200

# Older entries have been removed from this changelog.
# To read the complete changelog use `apt changelog pve-ha-manager`.