I recently set out to run a modern Linux kernel (newer than 3.5.x) on a Sun Netra T1 I had lying around, and it turned into a rather in-depth process. I do have a soft spot for Sparc and Sun hardware in general, and have a few of these machines that have been running FreeBSD for ages without issue. However, I was hoping to use this particular machine with a Linux kernel driver I wrote that interfaces with my Geiger counter, seeding the kernel random number generator with events.
In any event, I started by seeing if the Netra would netboot using DHCP, as I already had my dhcpd pointing to a TFTP server. I dropped a Sparc Debian netboot image into the TFTP directory under the expected filename (the machine's intended IP in hex, 0A00000E in my case) but alas:
lom>poweron
lom>
LOM event: power on
Netra t1 (UltraSPARC-IIi 440MHz), No Keyboard
OpenBoot 3.10.25 ME, 256 MB memory installed, Serial #14265310.
Ethernet address 8:0:20:d9:ab:de, Host ID: 80d9abde.
Drive not ready
Boot device: net File and args:
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
A packet capture confirms that the Netra never sends a DHCP request, only RARP:
root@elan /etc/dhcp # tcpdump -n ether host 08:00:20:d9:ab:de
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:00:16.860634 ARP, Reverse Request who-is 08:00:20:d9:ab:de tell 08:00:20:d9:ab:de, length 50
10:00:19.519799 ARP, Reverse Request who-is 08:00:20:d9:ab:de tell 08:00:20:d9:ab:de, length 50
10:00:24.180254 ARP, Reverse Request who-is 08:00:20:d9:ab:de tell 08:00:20:d9:ab:de, length 50
OK, no problem. I installed, configured, and started rarpd:
root@elan /tftpboot/sparc # cat /etc/ethers
08:00:20:d9:ab:de sparc0
root@elan /tftpboot/sparc # grep sparc0 /etc/hosts
11.0.0.14 sparc0.priv.crepinc.com sparc0
root@elan /tftpboot/sparc # rarpd -A -v -d -b /tftpboot/sparc
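For anyone repeating this, the two fiddly strings are the zero-padded MAC for /etc/ethers (OpenBoot prints it as 8:0:20:d9:ab:de, without leading zeros) and the hex filename in the TFTP directory. Here is a minimal sketch that derives both. The function names normalize_mac and ip_to_hex are made up for illustration, and 10.0.0.14 is simply the address that 0A00000E decodes to.

```shell
# Zero-pad each octet of a MAC address as OpenBoot prints it,
# giving the form that tcpdump and /etc/ethers use.
normalize_mac() {
    old_ifs=$IFS; IFS=:
    set -- $1                      # split on ':' into six octets
    IFS=$old_ifs
    printf '%02x:%02x:%02x:%02x:%02x:%02x\n' \
        "0x$1" "0x$2" "0x$3" "0x$4" "0x$5" "0x$6"
}

# Turn a dotted-quad IPv4 address into the 8-digit uppercase hex
# string used as the netboot image filename.
ip_to_hex() {
    old_ifs=$IFS; IFS=.
    set -- $1                      # split on '.' into four octets
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

# Example:
#   normalize_mac 8:0:20:d9:ab:de   # prints 08:00:20:d9:ab:de
#   ip_to_hex 10.0.0.14             # prints 0A00000E
```

The image would then be copied (or symlinked) to that filename inside the directory passed to rarpd with -b.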
Then I booted the machine again:
ok boot net
Boot device: /pci@1f,0/pci@1,1/network@1,1 File and args:
a00000 Fast Data Access MMU Miss
Hmmmm. The machine begins loading the image, but stops at byte 0xa00000 and prints the above. Googling shows that this is a known bug on this hardware (see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=658588): apparently the Debian netboot images got too large after Lenny, and the Netra is no longer supported.
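Since the failure comes down to image size, a quick sanity check before dropping an image into TFTP is to compare its size against the point where the load stopped. This is a rough sketch only: treating 0xa00000 (10 MiB) as the ceiling is an assumption inferred from the output above, not a documented firmware limit, and the helper names are made up.

```shell
# Assumed ceiling: the load counter stopped at 0xa00000 (10485760 bytes).
OBP_LOAD_CEILING=10485760

# Convert a hex load counter, as printed by OpenBoot, to decimal bytes.
hex_to_bytes() {
    printf '%d\n' "0x$1"
}

# Succeed if a size in bytes is below the assumed ceiling.
fits_netboot() {
    [ "$1" -lt "$OBP_LOAD_CEILING" ]
}

# Succeed if the file at $1 is below the assumed ceiling.
image_fits() {
    size=$(( $(wc -c < "$1") ))    # arithmetic strips any padding from wc
    fits_netboot "$size"
}

# Example:
#   hex_to_bytes a00000                      # prints 10485760
#   image_fits /tftpboot/sparc/0A00000E && echo "worth trying"
```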
OK, let's try a Lenny image, and we can build the fresh kernel after install:
[ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.10.25 2000/01/17 21:26'
[ 0.000000] PROMLIB: Root node compatible: sun4u
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 2.6.26-2-sparc64 (Debian 2.6.26-29) (dannf@debian.org) (gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)) #1 Sun Mar 4 21:::2:17:20 UTC 2012
Nice! The kernel boots and the installer loads, but there is another known bug: the SCSI controller isn't detected. Sigh. I then tried a few Ubuntu versions, all of which were too large and triggered the MMU miss above.
Next on the list of possible distros (even just to use as a jumping-off point to bootstrap the system) was Gentoo. I nabbed the netboot image, dropped it in the TFTP dir, and we were in business:
ok boot net
Boot device: /pci@1f,0/pci@1,1/network@1,1 File and args:
846200
PROMLIB: Sun IEEE Boot Prom 'OBP 3.10.25 2000/01/17 21:26'
PROMLIB: Root node compatible: sun4u
Linux version 2.6.32-gentoo-r7 (root@bender) (gcc version 4.3.3 (Gentoo 4.3.3 p1.0) ) #1 SMP Tue Apr 13 22:46:39 UTC 2010
[...]
Gentoo/SPARC Netboot for Sun UltraSparc Systems
Build Date: April 13, 2014
Nice! Given how many years it had been since I last built a Gentoo box, I followed the handbook for Sparc: http://www.gentoo.org/doc/en/handbook/handbook-sparc.xml. A couple of notes:
(1) Due to the small size of my disk and the enormous number of files in recent portage, I kept running out of inodes while untarring. I ended up making /usr much larger than it needed to be and asking mkfs for the maximum number of inodes (-i 1024, which for mke2fs means one inode per 1024 bytes of disk) to solve it.
(2) For some reason, the permissions of items in /dev on the install chroot were not to emerge's liking. chmod a+rw /dev/* provided the quick fix (for obvious reasons, don't do that on any real system).
(3) The kernel I built was a bit too large, but, as the handbook describes, I was able to strip it down to the required size:
(chroot) netboot linux # du -hs vmlinux
7.8M vmlinux
(chroot) netboot linux # strip -R .comment -R .note vmlinux
(chroot) netboot linux # du -hs vmlinux
5.7M vmlinux
(4) I had originally compiled OpenBoot PROM support into the kernel; however, probing it at boot would lock up the machine:
[ 40.238603] /pci@1f,0/pci@1,1/ebus@1/flashprom@10,0: OBP Flash, RD 1fff0000000[100000] WR 1fff0000000[100000]
[ 40.369577] /pci@1f,0/pci@1,1/ebus@1/flashprom@10,400000: OBP Flash, RD 1fff0400000[200000] WR 1fff0400000[200000]
[ 40.505971] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 40.619440] flash: probe of f0084fe0 failed with error -16
[ 40.691610] /pci@1f,0/pci@1,1/ebus@1/flashprom@10,800000: OBP Flash, RD 1fff0800000[200000] WR 1fff0800000[200000]
[ 40.827856] flash: probe of f0085178 failed with error -16
[ 64.439127] BUG: soft lockup - CPU#0 stuck for 23s! [swapper:1]
[...]
[ 65.911690] I7:
[ 65.963206] Call Trace:
[ 65.995345] [000000000099b2ec] openprom_init+0x8/0x78
[ 66.062946] [0000000000426bcc] do_one_initcall+0xec/0x140
[ 66.135131] [0000000000976914] kernel_init_freeable+0xfc/0x1a0
[ 66.213000] [00000000007dce44] kernel_init+0x4/0x100
[ 66.279456] [0000000000405f84] ret_from_fork+0x1c/0x2c
[ 66.348175] [0000000000000000] (null)
Disabling "/dev/openprom Device Support", "openprom /proc Entry", and "OBP Flash Device Support" in the kernel config solved the issue. I didn't bother tracking down exactly which one was the problem, as I don't need openprom support, but presumably it could be determined quickly (the openprom_init frame in the call trace is the obvious place to start).
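To double-check that the options really are off before rebuilding, something like the following can scan the .config. Treat it as a sketch: the symbol names below are my best guess at what those menu entries map to and are not taken from the original config, so confirm them against the help text in menuconfig. The function name enabled_symbols is also made up.

```shell
# ASSUMPTION: these Kconfig symbols correspond to the openprom-related
# menu entries; verify them in "make menuconfig" before relying on this.
OPENPROM_SYMBOLS="CONFIG_SUN_OPENPROMIO CONFIG_SUN_OPENPROMFS CONFIG_OBP_FLASH"

# Print every symbol from the list that is built in (=y) or modular (=m)
# in the kernel config file given as $1. No output means all are disabled.
enabled_symbols() {
    for sym in $OPENPROM_SYMBOLS; do
        if grep -Eq "^${sym}=(y|m)\$" "$1"; then
            echo "$sym"
        fi
    done
}

# Example (path is the usual Gentoo kernel source location):
#   enabled_symbols /usr/src/linux/.config
```

Re-enabling the options one at a time and rebooting would be one way to narrow down which of them triggers the lockup.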
After rebuilding the kernel and initramfs, it boots successfully!
jackc@sparc0 ~ $ uname -a
Linux sparc0 3.12.13-jackc-v2 #2 Wed Apr 9 21:27:52 EDT 2014 sparc64 sun4u TI UltraSparc IIi (Sabre) GNU/Linux
One strange thing of note is that recent versions of Gentoo use a multilib setup on Sparc, that is, a 64-bit kernel with a 32-bit userland (see http://www.gentoo.org/proj/en/base/sparc/multilib.xml).
jackc@sparc0 ~ $ file /boot/kernel-3.12.13-jackc-v2
/boot/kernel-3.12.13-jackc-v2: ELF 64-bit MSB executable, SPARC V9, Sun UltraSPARC1 Extensions Required, relaxed memory ordering, version 1 (SYSV), statically linked, BuildID[sha1]=46a9535b4e050fbbdfdfe71cba5795502d127eb2, with unknown capability 0x410000000f676e75 = 0x1000000070433, not stripped
jackc@sparc0 ~ $ file /bin/ls
/bin/ls: ELF 32-bit MSB executable, SPARC32PLUS, V8+ Required, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.16, stripped
Both 32- and 64-bit binaries can be compiled and run by passing the -m32 or -m64 flag, respectively, to sparc-unknown-linux-gnu-gcc.
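When mixing the two, it helps to confirm which flavour a given binary actually is. Here is a small sketch that classifies the description printed by file, as in the output above. The function name elf_bits is made up, and it only looks for the "ELF 32-bit" / "ELF 64-bit" prefix.

```shell
# Print 32 or 64 depending on the ELF class named in a description
# string produced by file(1); print "unknown" and fail otherwise.
elf_bits() {
    case "$1" in
        *"ELF 32-bit"*) echo 32 ;;
        *"ELF 64-bit"*) echo 64 ;;
        *) echo unknown; return 1 ;;
    esac
}

# Example:
#   elf_bits "$(file -b /bin/ls)"
#   sparc-unknown-linux-gnu-gcc -m64 -o hello64 hello.c
#   elf_bits "$(file -b hello64)"
```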
Another thing of note is that the machine will hard lock under high load, such as compiling kernels. It just so happens that a thread about a memory management bug came up on the sparclinux kernel mailing list recently, with a few patches slated to be merged into mainline: see http://marc.info/?t=139934944100001&r=1&w=2. I'll hopefully get around to applying and testing the patches in the near future.