The problem with the /root/.ssh credentials (zero-length files, 12-minute timeouts)
was that adding GNOME to the image pulled in and enabled "firewalld".
That blocked access on ports 3001 .....
Disabling firewalld in the image resolved that problem.
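
For reference, a minimal sketch of one way to disable the unit in an offline root image. The helper name disable_in_image is made up for illustration, and the sketch assumes a systemd recent enough to support "systemctl --root"; this may not be how the image was actually fixed.

```shell
#!/bin/sh
# disable_in_image ROOTIMG UNIT
# Disable a systemd unit inside an offline root image, without chrooting
# into it. (Hypothetical helper; not part of xCAT.)
disable_in_image() {
    rootimg=$1
    unit=$2
    if [ ! -d "$rootimg" ]; then
        echo "disable_in_image: $rootimg is not a directory" >&2
        return 1
    fi
    # --root makes systemctl operate on the unit files under ROOTIMG
    systemctl --root="$rootimg" disable "$unit"
}
```

Assuming the image tree lives in a rootimg subdirectory of the rootimgdir shown by lsdef, usage would look like "disable_in_image /install/netboot/rhels7.2/x86_64/cave/rootimg firewalld.service", followed by "packimage rhels7.2-x86_64-netboot-cave" to rebuild the compressed image.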
At the moment the only problem is that the mmsdrrestore command, which always
worked for us in xCAT 2.8 when run from the sitespecific postscript, no longer works in xCAT 2.13.1.
This is because the GPFS command mmsdrfsdef, called from mmsdrrestore, does not try
to copy the credentials, because adminMode=central and fd 0 is not a tty.
I found the line using grep in /usr/lpp/mmfs/bin for "passwordless".
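
The search described above can be sketched as follows; scripts_mentioning is a made-up helper name, and the pattern and directory are the ones mentioned in this message.

```shell
#!/bin/sh
# scripts_mentioning PATTERN DIR
# List the files under DIR that contain PATTERN (recursive search).
scripts_mentioning() {
    grep -rl -- "$1" "$2"
}
```

For example, "scripts_mentioning passwordless /usr/lpp/mmfs/bin" lists the candidate files, and "grep -n passwordless <file>" then shows the matching lines with line numbers.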
We had changed the GPFS adminMode years ago in anticipation of tightening access down,
but never got around to actually revoking "all to all" access on the nodes, so it still works.
I have a feeling older versions of xCAT ran postscripts in a pseudo-tty environment.
Not familiar enough with the internals to know for sure.
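
One way to test the pseudo-tty theory is to have the postscript record whether its standard input is a terminal. A minimal sketch, using the standard "[ -t 0 ]" test; the function name stdin_kind is made up for illustration.

```shell
#!/bin/sh
# stdin_kind
# Print "tty" if standard input (fd 0) is attached to a terminal,
# otherwise print "notty".
stdin_kind() {
    if [ -t 0 ]; then
        echo tty
    else
        echo notty
    fi
}
```

Calling something like 'logger -t sitespecific "stdin is $(stdin_kind)"' near the top of the postscript on both the 2.8 and 2.13.1 systems would show whether the environment really differs. The util-linux command "script -qc '<command>' /dev/null" is a common way to give a command a pseudo-tty, but whether that satisfies the GPFS check is untested here.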
I'm trying to come up with other ways of sucking down the SDR, maybe wget, that don't need ssh as root.
The goal is for the diskless node to be able to reboot and rejoin the GPFS cluster without any manual intervention.
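
A rough sketch of the wget idea, under stated assumptions: that a copy of the mmsdrfs file is published somewhere the node can reach over HTTP (the URL used below is hypothetical), and that a zero-length download should be treated as a failure, as with the empty credential files mentioned above. The names fetch and stage_sdr are made up for illustration.

```shell
#!/bin/sh
# fetch URL DEST
# Download URL to DEST. wget is the tool suggested above; another
# downloader could be swapped in here.
fetch() {
    wget -q -O "$2" "$1"
}

# stage_sdr URL DEST
# Download to a temporary file next to DEST, reject empty results, and
# only then move the file into place, so DEST is never left half-written.
stage_sdr() {
    url=$1
    dest=$2
    tmp=$(mktemp "${dest}.XXXXXX") || return 1
    if fetch "$url" "$tmp" && [ -s "$tmp" ]; then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        echo "stage_sdr: download of $url failed or was empty" >&2
        return 1
    fi
}
```

Usage might look like "stage_sdr http://172.20.0.6/install/custom/mmsdrfs /tmp/mmsdrfs" (hypothetical path on the server). Whether GPFS will then accept the staged file without root ssh, for example through the -F option of mmsdrrestore, has not been verified here.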
Thanks,
-- ddj
# lsdef gpu002 --osimage
Object name: gpu002
arch=x86_64
bmc=gpu002-bmc
bmcport=0
currstate=netboot rhels7.2-x86_64-cave
groups=gpu,debug,ipmi,rackA4,rackA4B,all,chs
initrd=xcat/osimage/rhels7.2-x86_64-netboot-cave/initrd-stateless.gz
installnic=eno1
ip=172.20.105.82
kcmdline=imgurl=http://172.20.0.6:80//install/netboot/rhels7.2/x86_64/cave/rootimg.cpio.gz XCAT=!myipfn!:3001 NODE=gpu002 FC=0 netdev=eno1 console=tty0 console=ttyS0,115200
kernel=xcat/osimage/rhels7.2-x86_64-netboot-cave/kernel
mac=e4:1f:13:84:55:f2!gpu002
mgt=ipmi
netboot=xnba
nfsserver=172.20.0.6
os=rhels7.2
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles,setupntp,ipoib,sitespecific
power=ipmi
primarynic=eno1
profile=cave
provmethod=rhels7.2-x86_64-netboot-cave
serialport=0
serialspeed=115200
status=netbooting
statustime=02-07-2017 13:19:41
tftpserver=172.20.0.6
profile=cave
pkglist=/install/custom/netboot/rh/cave.rhels7.x86_64.pkglist
osname=Linux
postinstall=/install/custom/netboot/rh/cave.rhels7.x86_64.postinstall
exlist=/install/custom/netboot/rh/cave.rhels7.x86_64.exlist
osdistroname=rhels7.2-x86_64
osvers=rhels7.2
objtype=osimage
osarch=x86_64
provmethod=netboot
rootimgdir=/install/netboot/rhels7.2/x86_64/cave
imagetype=linux
otherpkgdir=/install/post/otherpkgs/rhels7.2/x86_64
pkgdir=/install/rhels7.2/x86_64
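
To make the ordering question easier to read from output like the above, here is a small sketch that numbers the entries of the postscripts and postbootscripts attributes in the order they are listed. It assumes the scripts are run in listed order, and postscript_order is a made-up helper name.

```shell
#!/bin/sh
# postscript_order
# Read lsdef-style "key=value" lines on stdin and print one line per
# script as "<attribute> <position> <script>".
postscript_order() {
    awk -F= '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # lsdef may indent attribute lines
        }
        key == "postscripts" || key == "postbootscripts" {
            n = split($2, names, ",")
            for (i = 1; i <= n; i++)
                printf "%s %d %s\n", key, i, names[i]
        }
    '
}
```

For example, "lsdef gpu002 -i postscripts,postbootscripts | postscript_order" would show remoteshell ahead of sitespecific for the definition above.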
-- ddj
Dave Johnson
> On Feb 8, 2017, at 4:52 AM, Er Tao Zhao <***@cn.ibm.com> wrote:
>
> Hi, David
>
> Can you show me the node definition?
> The postbootscripts will be run one by one on the CN, regardless of alphabetical order.
>
> Thx!
> Best Regards,
> -----------------------------------
> Zhao Er Tao
>
> IBM China System and Technology Laboratory, Beijing
> Tel:(86-10)82450485
> Email: ***@cn.ibm.com
> Address: 1/F, 28 Building,ZhongGuanCun Software Park,
> No.8 DongBeiWang West Road, Haidian District,
> Beijing, 100193, P.R.China
>
>
> ----- Original message -----
> From: David D Johnson <***@brown.edu>
> To: xCAT Users Mailing list <xcat-***@lists.sourceforge.net>
> Cc:
> Subject: Re: [xcat-user] upgrading xCAT onto new servers
> Date: Tue, Feb 7, 2017 8:04 PM
>
> That was already the case (IP of mgt1 and IP of mgt[2] are the forwarders).
> I don't believe it will forward requests within the zones for which it is authoritative.
> I ended up using tabdump to recreate the hosts and nodelist tables. Mostly good.
>
> Now the problem of the day is fixing the SSH credentials so that all the diskless nodes booting off the
> new frontend can get root access to all the nodes still booted off the old frontend. Need this
> especially for GPFS. I've been trying to follow what's going on in the remoteshell postscript,
> and I'm wondering if my "sitespecific" postscript is running before "remoteshell" has completed.
> Is there a way to determine/force the order in which the postscripts are executed? Sitespecific comes after
> remoteshell both alphabetically and in the lsdef output.
> The basic problem is that mmsdrrestore fails during sitespecific, but works fine when I try it again later by hand.
>
> -- ddj
> Dave Johnson
> Brown University
>
>> On Feb 7, 2017, at 4:32 AM, Er Tao Zhao <***@cn.ibm.com> wrote:
>>
>> Hi, David
>>
>> Will you pls try 'chdef -t site forwarders=<ip_of_mgt1>' and then 'makedns' to use mgt1 as your remote DNS server.
>> Pls feel free to let me know if there are any more issues.
>>
>> Thx!
>> Best Regards,
>> -----------------------------------
>> Zhao Er Tao
>>
>> IBM China System and Technology Laboratory, Beijing
>> Tel:(86-10)82450485
>> Email: ***@cn.ibm.com
>> Address: 1/F, 28 Building,ZhongGuanCun Software Park,
>> No.8 DongBeiWang West Road, Haidian District,
>> Beijing, 100193, P.R.China
>>
>>
>> ----- Original message -----
>> From: "David D. Johnson" <***@brown.edu>
>> To: xcat-***@lists.sourceforge.net
>> Cc:
>> Subject: [xcat-user] upgrading xCAT onto new servers
>> Date: Sat, Feb 4, 2017 3:04 AM
>>
>> We're upgrading cluster mgt node hardware and software at the same time, going from 2.8.3 to 2.13.1,
>> and from centos6.7 to rhels7.2. I have the new frontend installed and somewhat functional.
>> Right now I'm needing to clone the DNS / named from "mgt1" that is still authoritative for the production cluster.
>> I could just tabdump hosts and nodelist and do makedns on "mgt5", or I'm thinking there might be a way to make
>> the new mgt5 a slave to the existing named running on mgt1. Any pros/cons? What would you do?
>>
>> Thanks,
>>
>> -- ddj
>> ------------------------------------------------------------------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, SlashDot.org <http://slashdot.org/>! http://sdm.link/slashdot <http://sdm.link/slashdot>
>> _______________________________________________
>> xCAT-user mailing list
>> xCAT-***@lists.sourceforge.net <mailto:xCAT-***@lists.sourceforge.net>
>> https://lists.sourceforge.net/lists/listinfo/xcat-user <https://lists.sourceforge.net/lists/listinfo/xcat-user>
>>
>>
>