If you are running stable32 from git, can you please revert:
commit 8975bd6341b2d94c1f89279b1b00d4360da1f5ff
and see if it's still a problem?
Thanks
Fabio
On 7/10/2012 1:33 PM, Dietmar Maurer wrote:
> I just updated from 3.1.8 to the latest STABLE32.
>
> I use this cluster.conf:
>
> # cat /etc/cluster/cluster.conf
> <?xml version="1.0"?>
> <cluster config_version="235" name="test">
>   <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu"/>
>   <clusternodes>
>     <clusternode name="maui" nodeid="3" votes="1"/>
>     <clusternode name="cnode1" nodeid="1" votes="1"/>
>   </clusternodes>
>   <rm>
>     <pvevm autostart="0" vmid="100"/>
>   </rm>
> </cluster>
>
> The cman service starts without problems:
>
> # /etc/init.d/cman start
> Starting cluster:
> Checking if cluster has been disabled at boot... [ OK ]
> Checking Network Manager... [ OK ]
> Global setup... [ OK ]
> Loading kernel modules... [ OK ]
> Mounting configfs... [ OK ]
> Starting cman... [ OK ]
> Waiting for quorum... [ OK ]
> Starting fenced... [ OK ]
> Starting dlm_controld... [ OK ]
> Starting GFS2 Control Daemon: gfs_controld.
> Unfencing self... [ OK ]
> Joining fence domain... [ OK ]
>
> And the corosync objdb contains:
>
> # corosync-objctl|grep cluster.cman
> cluster.cman.keyfile=/var/lib/pve-cluster/corosync.authkey
> cluster.cman.transport=udpu
> cluster.cman.nodename=maui
> cluster.cman.cluster_id=1678
>
> Note: there are values for ‘nodename’ and ‘cluster_id’.
>
> Now I simply increase the version inside cluster.conf (on both nodes):
>
> # cat /etc/cluster/cluster.conf
> <?xml version="1.0"?>
> <cluster config_version="236" name="test">
>   <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu"/>
>   <clusternodes>
>     <clusternode name="maui" nodeid="3" votes="1"/>
>     <clusternode name="cnode1" nodeid="1" votes="1"/>
>   </clusternodes>
>   <rm>
>     <pvevm autostart="0" vmid="100"/>
>   </rm>
> </cluster>
>
> And trigger a reload:
>
> # cman_tool version -r -S
> cman_tool: Error loading configuration in corosync/cman
>
> And the syslog has more details:
>
> Jul 10 13:28:25 maui corosync[488675]: [CMAN ] cman was unable to determine our node name!
> Jul 10 13:28:25 maui corosync[488675]: [CMAN ] Can't get updated config version: Successfully read config from /etc/cluster/cluster.conf#012.
> Jul 10 13:28:25 maui corosync[488675]: [CMAN ] Continuing activity with old configuration
>
> Somehow the nodename and cluster_id values are removed from the corosync
> objdb:
>
> # corosync-objctl|grep cluster.cman
> cluster.cman.keyfile=/var/lib/pve-cluster/corosync.authkey
> cluster.cman.transport=udpu
>
> Any idea why that happens?
>
> - Dietmar
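The report above bumps config_version from 235 to 236 before asking cman to reload. Before triggering a reload it can be worth confirming that the number really increased. What follows is only a minimal sketch and not something posted in the thread: the helper name get_config_version is made up for illustration, and the two sample lines are copied from the cluster.conf listings in the report.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the config_version
# attribute of the <cluster> element read from standard input.
get_config_version() {
    sed -n 's/.*<cluster[^>]*config_version="\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Sample <cluster> lines copied from the two cluster.conf listings in the report.
old_conf='<cluster config_version="235" name="test">'
new_conf='<cluster config_version="236" name="test">'

old_version=$(printf '%s\n' "$old_conf" | get_config_version)
new_version=$(printf '%s\n' "$new_conf" | get_config_version)

# Compare the two versions numerically before attempting a reload.
if [ "$new_version" -gt "$old_version" ]; then
    echo "config_version increased: $old_version -> $new_version"
else
    echo "config_version did not increase: $old_version -> $new_version"
fi
```

On a real node the helper would presumably be fed the file itself, for example `get_config_version < /etc/cluster/cluster.conf`, once before and once after editing.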
07-10-2012, 12:09 PM
Dietmar Maurer
cluster.cman.nodename vanish on config reload
> If you are running stable32 from git, can you please revert:
>
> commit 8975bd6341b2d94c1f89279b1b00d4360da1f5ff
>
> and see if it's still a problem?
Yes, same problem.
- Dietmar
07-10-2012, 12:26 PM
"Fabio M. Di Nitto"
cluster.cman.nodename vanish on config reload
On 7/10/2012 2:09 PM, Dietmar Maurer wrote:
>> If you are running stable32 from git, can you please revert:
>>
>> commit 8975bd6341b2d94c1f89279b1b00d4360da1f5ff
>>
>> and see if it's still a problem?
>
> Yes, same problem.
>
> - Dietmar
>
>
OK, then please file a bugzilla. I'll need to bisect and see when the
problem was introduced (unless you want to give bisect a shot).
Fabio
07-11-2012, 07:36 AM
Dietmar Maurer
cluster.cman.nodename vanish on config reload
> >> If you are running stable32 from git, can you please revert:
> >>
> >> commit 8975bd6341b2d94c1f89279b1b00d4360da1f5ff
> >>
> >> and see if it's still a problem?
> >
> > Yes, same problem.
> >
> > - Dietmar
> >
> >
>
>
> OK, then please file a bugzilla. I'll need to bisect and see when the problem
> was introduced (unless you want to give bisect a shot).
OK, I bisected it myself.
This led directly to commit f3f4499d4ace7a3bf5fe09ce6d9f04ed6d8958f6.
But this is just the check you introduced. If I revert that patch, everything works
as before, but I noticed that it still deletes the values from the corosync objdb after config
reload - even in 3.1.8!
Both cluster.cman.nodename and cluster.cman.cluster_id get removed.
Testing with earlier versions now.
- Dietmar
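One way to see exactly which objdb keys disappear across a reload is to capture the `corosync-objctl|grep cluster.cman` output before and after and compare the key names. The script below is a minimal sketch under stated assumptions, not something posted in the thread: the function name removed_keys is invented for illustration, and the two snapshots are the outputs quoted in the original report.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): list the keys that appear in the
# "before" snapshot but are missing from the "after" snapshot.
# Both arguments are text in the key=value form printed by corosync-objctl.
removed_keys() {
    before_keys=$(printf '%s\n' "$1" | cut -d= -f1 | sort -u)
    after_keys=$(printf '%s\n' "$2" | cut -d= -f1 | sort -u)
    # Print each "before" key that has no exact whole-line match in "after".
    printf '%s\n' "$before_keys" | grep -Fxv -e "$after_keys"
}

# Snapshots copied from the report: objdb contents before and after the reload.
before='cluster.cman.keyfile=/var/lib/pve-cluster/corosync.authkey
cluster.cman.transport=udpu
cluster.cman.nodename=maui
cluster.cman.cluster_id=1678'

after='cluster.cman.keyfile=/var/lib/pve-cluster/corosync.authkey
cluster.cman.transport=udpu'

removed=$(removed_keys "$before" "$after")
printf '%s\n' "$removed"
```

With the snapshots from the report this prints cluster.cman.cluster_id and cluster.cman.nodename, matching the observation above. On a live node the two variables could instead be filled with `$(corosync-objctl | grep cluster.cman)` before and after running the reload.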
07-11-2012, 07:37 AM
Dietmar Maurer
cluster.cman.nodename vanish on config reload
> OK, I bisected it myself.
>
> This led directly to commit f3f4499d4ace7a3bf5fe09ce6d9f04ed6d8958f6.
>
> But this is just the check you introduced. If I revert that patch, everything
> works as before, but I noticed that it still deletes the values from the
> corosync objdb after config reload - even in 3.1.8!
>
> Both cluster.cman.nodename and cluster.cman.cluster_id get removed.
>
> Testing with earlier versions now.
That even happens with 3.1.4 (I can't easily test with older versions).
Any ideas?
- Dietmar
07-11-2012, 08:14 AM
"Fabio M. Di Nitto"
cluster.cman.nodename vanish on config reload
On 7/11/2012 9:37 AM, Dietmar Maurer wrote:
>> OK, I bisected it myself.
>>
>> This led directly to commit f3f4499d4ace7a3bf5fe09ce6d9f04ed6d8958f6.
>>
>> But this is just the check you introduced. If I revert that patch, everything
>> works as before, but I noticed that it still deletes the values from the
>> corosync objdb after config reload - even in 3.1.8!
>>
>> Both cluster.cman.nodename and cluster.cman.cluster_id get removed.
>>
>> Testing with earlier versions now.
>
> That even happens with 3.1.4 (I can't easily test with older versions).
>
> Any ideas?
No, not yet, but what kind of operational problem do you get? Does it
affect runtime? If so, how?
Fabio
07-11-2012, 08:20 AM
"Fabio M. Di Nitto"
cluster.cman.nodename vanish on config reload
On 7/11/2012 10:14 AM, Fabio M. Di Nitto wrote:
> On 7/11/2012 9:37 AM, Dietmar Maurer wrote:
>>> OK, I bisected it myself.
>>>
>>> This led directly to commit f3f4499d4ace7a3bf5fe09ce6d9f04ed6d8958f6.
>>>
>>> But this is just the check you introduced. If I revert that patch, everything
>>> works as before, but I noticed that it still deletes the values from the
>>> corosync objdb after config reload - even in 3.1.8!
>>>
>>> Both cluster.cman.nodename and cluster.cman.cluster_id get removed.
>>>
>>> Testing with earlier versions now.
>>
>> That even happens with 3.1.4 (I can't easily test with older versions).
>>
>> Any ideas?
>
> No, not yet, but what kind of operational problem do you get? Does it
> affect runtime? If so, how?
>
> Fabio
>
Never mind, I answered my own question.
Fabio
07-11-2012, 08:21 AM
Dietmar Maurer
cluster.cman.nodename vanish on config reload
> >> This led directly to commit f3f4499d4ace7a3bf5fe09ce6d9f04ed6d8958f6.
> >>
> >> But this is just the check you introduced. If I revert that patch,
> >> everything works as before, but I noticed that it still deletes the
> >> values from the corosync objdb after config reload - even in 3.1.8!
> >>
> >> Both cluster.cman.nodename and cluster.cman.cluster_id get removed.
> >>
> >> Testing with earlier versions now.
> >
> > That even happens with 3.1.4 (I can't easily test with older versions).
> >
> > Any ideas?
>
> No, not yet, but what kind of operational problem do you get? Does it affect
> runtime? If so, how?
I cannot change/reload the configuration with commit f3f4499d4ace7a3bf5fe09ce6d9f04ed6d8958f6.
When I revert that commit, everything works fine.
I just wonder why those values get removed from the corosync objdb.
Note: You added that check, so I guess there are negative side effects when there is no nodename. Why did you add that check?
- Dietmar
07-11-2012, 08:27 AM
"Fabio M. Di Nitto"
cluster.cman.nodename vanish on config reload
On 7/11/2012 10:21 AM, Dietmar Maurer wrote:
>>>> This led directly to commit f3f4499d4ace7a3bf5fe09ce6d9f04ed6d8958f6.
>>>>
>>>> But this is just the check you introduced. If I revert that patch,
>>>> everything works as before, but I noticed that it still deletes the
>>>> values from the corosync objdb after config reload - even in 3.1.8!
>>>>
>>>> Both cluster.cman.nodename and cluster.cman.cluster_id get removed.
>>>>
>>>> Testing with earlier versions now.
>>>
>>> That even happens with 3.1.4 (I can't easily test with older versions).
>>>
>>> Any ideas?
>>
>> No, not yet, but what kind of operational problem do you get? Does it affect
>> runtime? If so, how?
>
> I cannot change/reload the configuration with commit f3f4499d4ace7a3bf5fe09ce6d9f04ed6d8958f6.
>
> When I revert that commit, everything works fine.
>
> I just wonder why those values get removed from the corosync objdb.
That's the root cause of the issue.
>
> Note: You added that check, so I guess there are negative side effects when there is no nodename. Why did you add that check?
Well yes, it is an error if we can't determine our nodename.
The issue now is to understand why it fails for you but doesn't fail for
me using git.