File bsc#1131353-bsc#1131356-0006-High-pacemakerd-vs.-IPC-procfs-confused-deputy-authe-1.1.patch of Package pacemaker.16898 (project openSUSE:Leap:15.5:Update)
From 4d6f6e01b309cda7b3f8fe791247566d247d8028 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jan=20Pokorn=C3=BD?= <jpokorny@redhat.com>
Date: Tue, 16 Apr 2019 00:08:28 +0200
Subject: [PATCH 6/7] High: pacemakerd vs. IPC/procfs confused deputy
 authenticity issue (4/4)

[4/4: CPG users to be careful about now-more-probable rival processes]

In essence, this comes down to pacemaker confusing at-node CPG members
with the only process plausibly expected to co-exist at a particular
node, which doesn't hold, and asks for a wider reconciliation of this
reality-check.

However, in practical terms, two factors lower the priority of doing so:

1/ possibly the only non-self-inflicted scenario is that some of the
   cluster stack processes fail -- this is the problem that shall rather
   be deferred to arranged node disarming/fencing to stay on the safe
   side with 100% certainty, at the cost of a possibly long-lasting
   failover process at other nodes
   (for the other possibility, someone running some of these by accident
   so they effectively become rival processes, it's like getting hands
   cut when playing with a lawnmower in an unintended way)

2/ for state tracking of the peer nodes, it may possibly cause troubles
   in case the process observed as left wasn't the last one for the
   particular node, even if presumably just temporary, since the
   situation may eventually resolve with imposed serialization of the
   rival processes via the API end-point singleton restriction (this is
   also the most likely cause of why such a non-final leave gets
   observed in the first place), except in one case -- the legitimate
   API end-point carrier won't possibly be acknowledged as returned by
   its peers, at least not immediately, unless it tries to join anew,
   which verges on undefined behaviour (at least per corosync
   documentation)

so we make do just with a light code change:

* limit 1/ some more with an in-daemon self-check for a pre-existing
  end-point (this complements the checks already made in the parent
  daemon prior to spawning new instances, only some moments later; note
  that we don't have any lock file etc. mechanism to prevent parallel
  runs of the same daemons, and people could run these on their own
  deliberation), and

* guard against interference from rivals at the same node per 2/ by
  ignoring their non-final leave messages altogether.

Note that CPG at this point is already expected to be authenticity-safe.

Regarding the now-more-probable part, we actually traded the inherently
racy procfs scanning for something (exactly that singleton mentioned
above) rather firm (and unfakeable), but we admittedly lost track of
processes that are after CPG membership (that is, another form of a
shared state) prior to (or in non-deterministic order allowing for the
same) caring about publishing the end-point.

Big thanks is owed to Yan Gao of SUSE, for early discovery and reporting
of this discrepancy arising from the earlier commits in the set.
---
 attrd/main.c      | 19 ++++++++++-
 cib/main.c        | 35 ++++++++++++---------
 crmd/main.c       | 35 ++++++++++++---------
 fencing/main.c    | 32 +++++++++++--------
 lib/cluster/cpg.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 5 files changed, 163 insertions(+), 52 deletions(-)

Index: pacemaker-1.1.18+20180430.b12c320f5/attrd/main.c
===================================================================
--- pacemaker-1.1.18+20180430.b12c320f5.orig/attrd/main.c
+++ pacemaker-1.1.18+20180430.b12c320f5/attrd/main.c
@@ -1,5 +1,7 @@
 /*
- * Copyright (C) 2013 Andrew Beekhof <andrew@beekhof.net>
+ * Copyright 2013-2019 the Pacemaker project contributors
+ *
+ * The version control history for this file may have further details.
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public
@@ -336,6 +338,7 @@ main(int argc, char **argv)
     int index = 0;
     int argerr = 0;
     qb_ipcs_service_t *ipcs = NULL;
+    crm_ipc_t *old_instance = NULL;
 
     attrd_init_mainloop();
     crm_log_preinit(NULL, argc, argv);
@@ -372,6 +375,20 @@ main(int argc, char **argv)
     crm_log_init(T_ATTRD, LOG_INFO, TRUE, FALSE, argc, argv, FALSE);
     crm_info("Starting up");
+
+    old_instance = crm_ipc_new(T_ATTRD, 0);
+    if (crm_ipc_connect(old_instance)) {
+        /* IPC end-point already up */
+        crm_ipc_close(old_instance);
+        crm_ipc_destroy(old_instance);
+        crm_err("attrd is already active, aborting startup");
+        crm_exit(CRM_EX_OK);
+    } else {
+        /* not up or not authentic, we'll proceed either way */
+        crm_ipc_destroy(old_instance);
+        old_instance = NULL;
+    }
+
     attributes = g_hash_table_new_full(crm_str_hash, g_str_equal, NULL, free_attribute);
 
     if (attrd_cluster_connect() != pcmk_ok) {
Index: pacemaker-1.1.18+20180430.b12c320f5/cib/main.c
===================================================================
--- pacemaker-1.1.18+20180430.b12c320f5.orig/cib/main.c
+++ pacemaker-1.1.18+20180430.b12c320f5/cib/main.c
@@ -1,19 +1,10 @@
 /*
- * Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
+ * Copyright 2004-2019 the Pacemaker project contributors
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License as published by the Free Software Foundation; either
- * version 2 of the License, or (at your option) any later version.
+ * The version control history for this file may have further details.
  *
- * This software is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this library; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ * This source code is licensed under the GNU General Public License version 2
+ * or later (GPLv2+) WITHOUT ANY WARRANTY.
  */
 
 #include <crm_internal.h>
@@ -103,13 +94,12 @@ main(int argc, char **argv)
     int index = 0;
     int argerr = 0;
     struct passwd *pwentry = NULL;
+    crm_ipc_t *old_instance = NULL;
 
     crm_log_preinit(NULL, argc, argv);
     crm_set_options(NULL, "[options]", long_options,
                     "Daemon for storing and replicating the cluster configuration");
 
-    crm_peer_init();
-
     mainloop_add_signal(SIGTERM, cib_shutdown);
     mainloop_add_signal(SIGPIPE, cib_enable_writes);
@@ -184,6 +174,19 @@ main(int argc, char **argv)
 
     crm_log_init(NULL, LOG_INFO, TRUE, FALSE, argc, argv, FALSE);
 
+    old_instance = crm_ipc_new(cib_channel_ro, 0);
+    if (crm_ipc_connect(old_instance)) {
+        /* IPC end-point already up */
+        crm_ipc_close(old_instance);
+        crm_ipc_destroy(old_instance);
+        crm_err("cib is already active, aborting startup");
+        crm_exit(CRM_EX_OK);
+    } else {
+        /* not up or not authentic, we'll proceed either way */
+        crm_ipc_destroy(old_instance);
+        old_instance = NULL;
+    }
+
     if (cib_root == NULL) {
         cib_root = CRM_CONFIG_DIR;
     } else {
@@ -198,6 +201,8 @@ main(int argc, char **argv)
         return CRM_EX_FATAL;
     }
 
+    crm_peer_init();
+
     /* read local config file */
     cib_init();
Index: pacemaker-1.1.18+20180430.b12c320f5/crmd/main.c
===================================================================
--- pacemaker-1.1.18+20180430.b12c320f5.orig/crmd/main.c
+++ pacemaker-1.1.18+20180430.b12c320f5/crmd/main.c
@@ -1,19 +1,10 @@
 /*
- * Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License as published by the Free Software Foundation; either
- * version 2 of the License, or (at your option) any later version.
- *
- * This software is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this library; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ * Copyright 2004-2019 the Pacemaker project contributors
+ *
+ * The version control history for this file may have further details.
+ *
+ * This source code is licensed under the GNU General Public License version 2
+ * or later (GPLv2+) WITHOUT ANY WARRANTY.
  */
 
 #include <crm_internal.h>
@@ -61,6 +52,7 @@ main(int argc, char **argv)
     int flag;
     int index = 0;
     int argerr = 0;
+    crm_ipc_t *old_instance = NULL;
 
     crmd_mainloop = g_main_loop_new(NULL, FALSE);
     crm_log_preinit(NULL, argc, argv);
@@ -104,6 +96,19 @@ main(int argc, char **argv)
         crm_help('?', CRM_EX_USAGE);
     }
 
+    old_instance = crm_ipc_new(CRM_SYSTEM_CRMD, 0);
+    if (crm_ipc_connect(old_instance)) {
+        /* IPC end-point already up */
+        crm_ipc_close(old_instance);
+        crm_ipc_destroy(old_instance);
+        crm_err("crmd is already active, aborting startup");
+        crm_exit(CRM_EX_OK);
+    } else {
+        /* not up or not authentic, we'll proceed either way */
+        crm_ipc_destroy(old_instance);
+        old_instance = NULL;
+    }
+
     if (pcmk__daemon_can_write(PE_STATE_DIR, NULL) == FALSE) {
         crm_err("Terminating due to bad permissions on " PE_STATE_DIR);
         fprintf(stderr,
Index: pacemaker-1.1.18+20180430.b12c320f5/fencing/main.c
===================================================================
--- pacemaker-1.1.18+20180430.b12c320f5.orig/fencing/main.c
+++ pacemaker-1.1.18+20180430.b12c320f5/fencing/main.c
@@ -1,5 +1,7 @@
 /*
- * Copyright 2009-2018 Andrew Beekhof <andrew@beekhof.net>
+ * Copyright 2009-2019 the Pacemaker project contributors
+ *
+ * The version control history for this file may have further details.
  *
  * This source code is licensed under the GNU General Public License version 2
  * or later (GPLv2+) WITHOUT ANY WARRANTY.
@@ -1252,6 +1254,7 @@ main(int argc, char **argv)
     int option_index = 0;
     crm_cluster_t cluster;
     const char *actions[] = { "reboot", "off", "on", "list", "monitor", "status" };
+    crm_ipc_t *old_instance = NULL;
 
     crm_log_preinit("stonith-ng", argc, argv);
     crm_set_options(NULL, "mode [options]", long_options,
@@ -1417,6 +1420,20 @@ main(int argc, char **argv)
     }
 
     crm_log_init("stonith-ng", LOG_INFO, TRUE, FALSE, argc, argv, FALSE);
+
+    old_instance = crm_ipc_new("stonith-ng", 0);
+    if (crm_ipc_connect(old_instance)) {
+        /* IPC end-point already up */
+        crm_ipc_close(old_instance);
+        crm_ipc_destroy(old_instance);
+        crm_err("stonithd is already active, aborting startup");
+        crm_exit(CRM_EX_OK);
+    } else {
+        /* not up or not authentic, we'll proceed either way */
+        crm_ipc_destroy(old_instance);
+        old_instance = NULL;
+    }
+
     mainloop_add_signal(SIGTERM, stonith_shutdown);
 
     crm_peer_init();
Index: pacemaker-1.1.18+20180430.b12c320f5/lib/cluster/cpg.c
===================================================================
--- pacemaker-1.1.18+20180430.b12c320f5.orig/lib/cluster/cpg.c
+++ pacemaker-1.1.18+20180430.b12c320f5/lib/cluster/cpg.c
@@ -371,6 +371,20 @@ pcmk_message_common_cs(cpg_handle_t hand
     return NULL;
 }
 
+static int cmp_member_list_nodeid(const void *first,
+                                  const void *second)
+{
+    const struct cpg_address *const a = *((const struct cpg_address **) first),
+                             *const b = *((const struct cpg_address **) second);
+    if (a->nodeid < b->nodeid) {
+        return -1;
+    } else if (a->nodeid > b->nodeid) {
+        return 1;
+    }
+    /* don't bother with "reason" nor "pid" */
+    return 0;
+}
+
 void
 pcmk_cpg_membership(cpg_handle_t handle,
                     const struct cpg_name *groupName,
@@ -382,29 +396,91 @@ pcmk_cpg_membership(cpg_handle_t handle,
     gboolean found = FALSE;
     static int counter = 0;
     uint32_t local_nodeid = get_local_nodeid(handle);
+    const struct cpg_address *key, **rival, **sorted;
+
+    sorted = malloc(member_list_entries * sizeof(const struct cpg_address *));
+    CRM_ASSERT(sorted != NULL);
+
+    for (size_t iter = 0; iter < member_list_entries; iter++) {
+        sorted[iter] = member_list + iter;
+    }
+    /* so that the cross-matching multiply-subscribed nodes is then cheap */
+    qsort(sorted, member_list_entries, sizeof(const struct cpg_address *),
+          cmp_member_list_nodeid);
 
     for (i = 0; i < left_list_entries; i++) {
         crm_node_t *peer = crm_find_peer(left_list[i].nodeid, NULL);
 
-        crm_info("Node %u left group %s (peer=%s, counter=%d.%d)",
+        crm_info("Node %u left group %s (peer=%s:%llu, counter=%d.%d)",
                  left_list[i].nodeid, groupName->value,
-                 (peer? peer->uname : "<none>"), counter, i);
+                 (peer? peer->uname : "<none>"),
+                 (unsigned long long) left_list[i].pid, counter, i);
+
+        /* in CPG world, NODE:PROCESS-IN-MEMBERSHIP-OF-G is an 1:N relation
+           and not playing by this rule may go wild in case of multiple
+           residual instances of the same pacemaker daemon at the same node
+           -- we must ensure that the possible local rival(s) won't make us
+           cry out and bail (e.g. when they quit themselves), since all the
+           surrounding logic denies this simple fact that the full membership
+           is discriminated also per the PID of the process beside mere node
+           ID (and implicitly, group ID); practically, this will be sound in
+           terms of not preventing progress, since all the CPG joiners are
+           also API end-point carriers, and that's what matters locally
+           (who's the winner);
+           remotely, we will just compare leave_list and member_list and if
+           the left process has it's node retained in member_list (under some
+           other PID, anyway) we will just ignore it as well
+           XXX: long-term fix is to establish in-out PID-aware tracking? */
         if (peer) {
-            crm_update_peer_proc(__FUNCTION__, peer, crm_proc_cpg, OFFLINESTATUS);
+            key = &left_list[i];
+            rival = bsearch(&key, sorted, member_list_entries,
+                            sizeof(const struct cpg_address *),
+                            cmp_member_list_nodeid);
+            if (rival == NULL) {
+                crm_update_peer_proc(__FUNCTION__, peer, crm_proc_cpg,
+                                     OFFLINESTATUS);
+            } else if (left_list[i].nodeid == local_nodeid) {
+                crm_info("Ignoring the above event %s.%d, comes from a local"
+                         " rival process (presumably not us): %llu",
+                         groupName->value, counter,
+                         (unsigned long long) left_list[i].pid);
+            } else {
+                crm_info("Ignoring the above event %s.%d, comes from"
+                         " a rival-rich node: %llu (e.g. %llu process"
+                         " carries on)",
+                         groupName->value, counter,
+                         (unsigned long long) left_list[i].pid,
+                         (unsigned long long) (*rival)->pid);
+            }
         }
     }
+    free(sorted);
+    sorted = NULL;
 
     for (i = 0; i < joined_list_entries; i++) {
-        crm_info("Node %u joined group %s (counter=%d.%d)",
-                 joined_list[i].nodeid, groupName->value, counter, i);
+        crm_info("Node %u joined group %s (counter=%d.%d, pid=%llu,"
+                 " unchecked for rivals)",
+                 joined_list[i].nodeid, groupName->value, counter, i,
+                 (unsigned long long) left_list[i].pid);
     }
 
     for (i = 0; i < member_list_entries; i++) {
         crm_node_t *peer = crm_get_peer(member_list[i].nodeid, NULL);
 
-        crm_info("Node %u still member of group %s (peer=%s, counter=%d.%d)",
+        crm_info("Node %u still member of group %s (peer=%s:%llu,"
+                 " counter=%d.%d, at least once)",
                  member_list[i].nodeid, groupName->value,
-                 (peer? peer->uname : "<none>"), counter, i);
+                 (peer? peer->uname : "<none>"), member_list[i].pid,
+                 counter, i);
+
+        if (member_list[i].nodeid == local_nodeid
+                && member_list[i].pid != getpid()) {
+            /* see the note above */
+            crm_info("Ignoring the above event %s.%d, comes from a local rival"
+                     " process: %llu", groupName->value, counter,
+                     (unsigned long long) member_list[i].pid);
+            continue;
+        }
 
         /* Anyone that is sending us CPG messages must also be a _CPG_ member.
          * But it's _not_ safe to assume it's in the quorum membership.
@@ -423,7 +499,9 @@ pcmk_cpg_membership(cpg_handle_t handle,
              * certain point we need to acknowledge our internal cache is
              * probably wrong. Use 1 minute.
              */
-            crm_err("Node %s[%u] appears to be online even though we think it is dead", peer->uname, peer->id);
+            crm_err("Node %s[%u] appears to be online even though we think"
+                    " it is dead (unchecked for rivals)",
+                    peer->uname, peer->id);
             if (crm_update_peer_state(__FUNCTION__, peer, CRM_NODE_MEMBER, 0)) {
                 peer->votes = 0;
             }
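For reference, the operative piece repeated in each daemon's main() above is the same small guard: probe the daemon's own IPC end-point and refuse to start if an (authenticated) server already answers. Below is a minimal standalone sketch of that pattern, factored into a hypothetical helper for illustration; guard_against_prior_instance and its ipc_name parameter are not part of the patch, while the crm_ipc_* calls and crm_exit() are the ones used verbatim in the hunks above.

    #include <crm_internal.h>   /* crm_ipc_new(), crm_ipc_connect(), crm_exit(), ... */

    /* Probe the named IPC end-point this daemon would itself publish; if a
     * live (and authentic) server already answers, a prior instance holds the
     * singleton, so exit instead of becoming a rival process. */
    static void
    guard_against_prior_instance(const char *ipc_name)
    {
        crm_ipc_t *old_instance = crm_ipc_new(ipc_name, 0);

        if (crm_ipc_connect(old_instance)) {
            /* IPC end-point already up: another instance is active */
            crm_ipc_close(old_instance);
            crm_ipc_destroy(old_instance);
            crm_err("%s is already active, aborting startup", ipc_name);
            crm_exit(CRM_EX_OK);
        }

        /* not up or not authentic: the name is free, proceed with startup */
        crm_ipc_destroy(old_instance);
    }

Each daemon inlines this check early in its main(), passing its own end-point name (T_ATTRD, cib_channel_ro, CRM_SYSTEM_CRMD, "stonith-ng"), which is exactly what the added blocks in the patch do.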