
SSH

How I got host-based SSH working, a tale of troubleshooting:

First, a run-through of how to set things up: to get host-based authentication working, I edited ssh_config and sshd_config on both hosts. Changes to ssh_config:

    HostbasedAuthentication yes
    EnableSSHKeysign yes
    PreferredAuthentications hostbased,publickey,keyboard-interactive,password

sshd_config:

    HostbasedAuthentication yes

Also set up /etc/ssh/ssh_known_hosts, containing the keys of all hosts, and /etc/hosts.equiv. To get the keys, just take them from .ssh/known_hosts. When adding a new host, get the key for that host, add it to mimi's /etc/ssh/ssh_known_hosts file, then copy that file to the cluster image and to bettye. Do the same for /etc/hosts.equiv.

Finally, make sure the host is in bind on ella, in all three database files.
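Since the two files have to stay in step, a quick consistency check can help. What follows is only a sketch: missing_host_keys is a made-up helper name, and it assumes plain one-hostname-per-line equiv entries and unhashed known_hosts entries. It lists every host named in an equiv file that has no key line in a known_hosts file.

```shell
# Hypothetical helper: print hosts listed in an equiv file that have no
# matching entry in a known_hosts file. Paths are arguments so the check
# can be tried on copies of the real files first.
missing_host_keys() {
    equiv_file=$1
    known_hosts_file=$2
    # Take the first field of every non-blank, non-comment equiv line.
    awk '!/^[[:space:]]*(#|$)/ { print $1 }' "$equiv_file" |
    while read -r host; do
        # The first field of a known_hosts line is a comma-separated
        # list of names and addresses for one key.
        if ! awk -v h="$host" '
            !/^[[:space:]]*(#|$)/ {
                n = split($1, names, ",")
                for (i = 1; i <= n; i++) if (names[i] == h) found = 1
            }
            END { exit !found }' "$known_hosts_file"
        then
            echo "$host"
        fi
    done
}
```

For the setup above this would be run as missing_host_keys /etc/hosts.equiv /etc/ssh/ssh_known_hosts; no output means every listed host has a key.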

Useful tools:

ssh -vvv (sometimes useful to ssh to same machine you are on)
tcpdump -vv host 10.208.108.17
perl -MSocket -e 'print gethostbyaddr(inet_aton("10.208.108.17"),AF_INET)."\n"'

The tale of troubleshooting:

I originally used /etc/ssh/shosts.equiv with hostnames and IPs, but otherwise had above setup. Here is how I figured out what had gone wrong:

When I test, I get one of three results.

  1. it works
  2. it ignores host based authentication and asks for a password
  3. it fails completely, no route to host

original login host (from outside with password) to internal host:

  herbie    ---->    mimi            asks for password  
  herbie    ---->    oscar           asks for password
  mimi      ---->    herbie          works
  mimi      ---->    oscar           fails completely
  oscar     ---->    mimi            asks for password
  oscar     ---->    herbie          asks for password
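A table like this can be rebuilt without typing passwords by telling ssh to try host-based authentication only and never prompt. This is a sketch, not what I ran at the time: check_hostbased is a made-up name, and it cannot tell "asks for password" apart from "no route to host", since both simply show up as a failure.

```shell
# Hypothetical helper: attempt a login using host-based authentication
# only, without prompting, and report whether it succeeded.
# Run it on each source host against each target.
check_hostbased() {
    target=$1
    if ssh -n \
           -o BatchMode=yes \
           -o PreferredAuthentications=hostbased \
           -o ConnectTimeout=5 \
           "$target" true >/dev/null 2>&1
    then
        echo "$target: works"
    else
        echo "$target: host-based login failed (rerun with ssh -vvv for details)"
    fi
}
```

On herbie, for example: for h in mimi oscar; do check_hostbased "$h"; done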

If I do a tcpdump on mimi when it is trying to contact oscar (complete fail), I get:

13:09:51.992893 arp who-has oscar.shadlen.org tell mimi.shadlen.org
13:09:52.992885 arp who-has oscar.shadlen.org tell mimi.shadlen.org
13:09:53.992886 arp who-has oscar.shadlen.org tell mimi.shadlen.org

but with mimi to herbie (succeeds):

13:10:14.616384 IP herbie.shadlen.org.4153576441 > mimi.shadlen.org.nfs: 116 getattr [|nfs]
13:10:14.616442 IP mimi.shadlen.org.nfs > herbie.shadlen.org.4153576441: reply ok 116 getattr [|nfs]
13:10:14.616579 IP herbie.shadlen.org.977 > mimi.shadlen.org.nfs: . ack 565964103 win 65535 <nop,nop,timestamp 1552177950 1444529736>

and further communication looks normal

and herbie to mimi (asks for password)

listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
13:58:05.942989 IP 10.208.108.24.977 > 10.208.108.17.nfs: . ack 651656783 win 65535 <nop,nop,timestamp 1552531476 1444883207>
13:58:05.943142 IP 10.208.108.24.1479118073 > 10.208.108.17.nfs: 128 getattr [|nfs]
13:58:05.943256 IP 10.208.108.17.nfs > 10.208.108.24.1479118073: reply ok 116 getattr [|nfs]
13:58:05.943278 IP 10.208.108.24.977 > 10.208.108.17.nfs: . ack 117 win 65535 <nop,nop,timestamp 1552531476 1444883216>

The total fail seems to happen because mimi can't resolve oscar, even though I thought host, dig and ping all looked the same from every host. Nope: ping did not work from mimi either. It turns out mimi had an /etc/hosts file with an old (wrong) IP for oscar that was no longer in use. That solves the total fail, and also clues me in to what is going wrong with the other connections. It seems that /etc/ssh/shosts.equiv is being ignored. I'm not sure why the DNS server is not sufficient, but on mimi the critical thing seems to be the /etc/hosts file.

Hmm. The reason I have a DNS server is so that I don't have to update /etc/hosts files on lots of machines. I was reluctant to use the /etc/ssh/shosts.equiv file for the same reason, and had created it mostly for troubleshooting, hoping to get rid of it eventually. There was a reason for having /etc/hosts on mimi, probably that it is the host for the etherboot machines.

Found this tool. Run on the command line, it works as expected for all machines on mimi, now that /etc/hosts is fixed, but it returns nothing from herbie.
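A stale entry like this is easy to miss because host and dig ask the DNS server directly, while ping goes through the normal system resolver, which usually reads /etc/hosts first. The sketch below compares the two answers; the function names are made up, and it assumes dig is installed and that the first line of dig +short output is the address.

```shell
# Hypothetical helper: print the address a hosts-format file gives for
# a name (prints nothing if the name is not listed).
hosts_file_ip() {
    name=$1
    hosts_file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }    # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$hosts_file"
}

# Hypothetical helper: warn when the hosts file and DNS disagree.
# dig +short queries DNS only and never consults /etc/hosts.
check_stale_entry() {
    name=$1
    file_ip=$(hosts_file_ip "$name" "${2:-/etc/hosts}")
    dns_ip=$(dig +short "$name" A | head -n 1)
    if [ -n "$file_ip" ] && [ "$file_ip" != "$dns_ip" ]; then
        echo "$name: hosts file says $file_ip, DNS says ${dns_ip:-nothing}"
    fi
}
```

Running check_stale_entry oscar.shadlen.org on mimi would have pointed at the old address straight away.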

perl -MSocket -e 'print gethostbyaddr(inet_aton("10.208.108.17"),AF_INET)."\n"'
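The same check can be done without perl using getent, which goes through the system resolver (so /etc/hosts and nsswitch.conf both count). This sketch also resolves the returned name forward again to confirm it leads back to the same address. check_reverse is a made-up name, and getent ahostsv4 assumes a glibc system.

```shell
# Hypothetical helper: reverse-resolve an IPv4 address, then resolve the
# resulting name forward and confirm it maps back to the same address.
check_reverse() {
    ip=$1
    # getent hosts <addr> prints "<addr> <name> [aliases...]".
    name=$(getent hosts "$ip" | awk '{ print $2; exit }')
    if [ -z "$name" ]; then
        echo "$ip: no reverse lookup"
        return 1
    fi
    # getent ahostsv4 <name> prints one line per IPv4 address, address first.
    if getent ahostsv4 "$name" |
       awk -v ip="$ip" '$1 == ip { ok = 1 } END { exit !ok }'
    then
        echo "$ip -> $name -> $ip: consistent"
    else
        echo "$ip -> $name: forward lookup does not return $ip"
        return 1
    fi
}
```

An empty result from the perl one-liner corresponds to the "no reverse lookup" branch here.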

check out this webpage:

http://book.opensourceproject.org.cn/security/securitycook/opensource/0596003919_linuxsckbk-chp-6-sect-8.html

From this, I learned that there seemed to be a problem with the DNS server. So I took a look at my bind configuration and discovered that we did not have a reverse lookup file. I created one, restarted bind, and bingo, now I can use host-based ssh authentication! I suspect this will have solved my original problem with mimi as well, so I will try removing stuff from the /etc/hosts file and see if everything still works. Then I will have one less place to worry about propagating IP/hostname changes to!

Okay, this is strange. It seems that I do not need the /etc/ssh/shosts.equiv file on herbie, but I do on mimi. I wonder if having IPs in the /etc/hosts file causes this? Or maybe nsswitch.conf is different? It isn't nsswitch.conf. Let's try temporarily removing stuff from /etc/hosts on mimi and see if we can then get rid of the /etc/ssh/shosts.equiv file. That didn't work: herbie to mimi still asked for a password, while mimi to herbie, of course, still works. Let's look at the communication again, leaving /etc/hosts stripped and with no /etc/ssh/shosts.equiv file:
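For checking the new reverse file, it helps to know which name the server is actually asked for. A reverse lookup for an IPv4 address queries the PTR record under in-addr.arpa with the octets reversed. A small sketch (both function names are made up; dig -x does the reversal itself, the first function just makes it visible):

```shell
# Hypothetical helper: build the in-addr.arpa name under which the PTR
# record for an IPv4 address lives.
reverse_name() {
    echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# Hypothetical helper: ask DNS directly for the PTR record of an
# address, bypassing /etc/hosts.
ptr_lookup() {
    dig +short -x "$1"
}
```

So reverse_name 10.208.108.17 gives 17.108.208.10.in-addr.arpa, and once the reverse file is loaded, ptr_lookup 10.208.108.17 should come back with mimi's name. Adding @ella to the dig command would query the bind server specifically, assuming that name resolves from where you run it.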

19:02:51.520163 IP mimi.shadlen.org.nfs > herbie.shadlen.org.4175339513: reply ok 116 getattr [|nfs]
19:02:51.520243 IP herbie.shadlen.org.977 > mimi.shadlen.org.nfs: . ack 241 win 65535 <nop,nop,timestamp 1600673458 1493018962>
19:02:51.520342 IP herbie.shadlen.org.4192116729 > mimi.shadlen.org.nfs: 120 access [|nfs]
19:02:51.520357 IP herbie.shadlen.org.4208893945 > mimi.shadlen.org.nfs: 116 getattr [|nfs]
19:02:51.520414 IP mimi.shadlen.org.nfs > herbie.shadlen.org.4192116729: reply ok 124 access [|nfs]
19:02:51.520462 IP mimi.shadlen.org.nfs > herbie.shadlen.org.4208893945: reply ok 116 getattr [|nfs]
19:02:51.520541 IP herbie.shadlen.org.977 > mimi.shadlen.org.nfs: . ack 481 win 65535 <nop,nop,timestamp 1600673458 1493018962>

Well, that looks fine. The little perl script is still giving good info, and looks the same on both:

maria@mimi:~$ perl -MSocket -e 'print gethostbyaddr(inet_aton("10.208.108.24"),AF_INET)."\n"'
herbie.shadlen.org

maria@herbie:~$ perl -MSocket -e 'print gethostbyaddr(inet_aton("10.208.108.17"),AF_INET)."\n"'
mimi.shadlen.org

Wtf? What else could possibly be different between mimi and herbie that I need an shosts.equiv file on mimi?

Okay, I had an /etc/hosts.equiv file on herbie. Silliness! Well, I need to choose one and stick with it, and see if I can keep /etc/hosts on mimi stripped down. I think I like the /etc/hosts.equiv file better than /etc/ssh/shosts.equiv, so I will keep that on both mimi and herbie.