Installing an OpenSSI 1.1.1 Cluster on Fedora Core 2
These instructions describe how to install an OpenSSI cluster on Fedora Core
2 (``FC2'') with minimal hardware requirements. All you need is two or
more computers, connected with a private ethernet switch. This network is
called the "interconnect", and should be private for security
and performance reasons. Each individual computer in the cluster is called
a "node".
In this basic configuration, the first node's root filesystem is shared with
the rest of the cluster via the interconnect. This works well for many users.
To learn more about how filesystems are shared over the interconnect, please
see README.cfs.
You can make your filesystems highly-available (``HA'') if they are installed
on (or copied to) shared disk hardware that is attached to two or more nodes.
This can be done with Fibre Channel or some other Storage Area Network (``SAN'').
Please see README.hardmounts for more information.
Note that any time another document is referenced, you can find it in the
docs/ directory of this release tarball, as well as in your system's
/usr/share/doc/openssi-tools/ directory after you install OpenSSI.
A good document that explains more about OpenSSI clustering in general is
Introduction-to-SSI. It was written by Bruce Walker, who is the
project leader for OpenSSI.
This installation guide is provided in multiple formats for your convenience:
- ASCII (INSTALL)
- PDF (pdf/INSTALL.pdf)
- HTML (html/INSTALL.html)
- Lyx source (lyx/INSTALL.lyx)
These instructions assume you are doing a fresh install of OpenSSI. If you
are upgrading from 1.1.0 for Fedora Core 2, please read docs/README.upgrade.
- Install FC2 on the first node. There's no need to install a distribution
on any node other than the first one.
- Feel free to either let the installer automatically partition your filesystems
or do it yourself using the provided tools. The ext3 filesystem is preferred
over ext2, because of its journalling capabilities.
/boot can either be its own partition or a directory on the root
filesystem. Regardless of this choice, these instructions assume that /boot
is located on the first partition of the first drive (e.g., /dev/hda1
or /dev/sda1).
- Configure GRUB as the boot loader, rather than LILO. The OpenSSI project
no longer supports LILO.
The Fedora installer gives you the option to install the boot loader on the
Master Boot Record (``MBR'') of a particular disk, or on the boot block
of a particular partition. It is recommended that you install the boot loader
on the MBR of your first internal disk (e.g., /dev/hda or /dev/sda).
- Configure the cluster interconnect interface with a static IP address. The
interconnect should be on a private switch for security reasons, so hopefully
this requirement does not cause much trouble, even in a networking environment
with dynamic addresses.
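As an illustration only: on a Fedora system, a static address for the interconnect interface is normally stored in /etc/sysconfig/network-scripts/ifcfg-<device>. The device name eth1 and the 10.0.0.x addresses below are assumptions made for this example, not values that OpenSSI requires.

```shell
# Hypothetical contents of /etc/sysconfig/network-scripts/ifcfg-eth1.
# eth1 and the 10.0.0.x values are example choices; substitute your own.
DEVICE=eth1
BOOTPROTO=static
IPADDR=10.0.0.1
NETMASK=255.255.255.0
ONBOOT=yes
```

The installer's own network configuration screen writes an equivalent file, so editing it by hand is only needed if the interface was left unconfigured.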
- When configuring your firewall, do one of the following:
- designate as ``trusted'' the interface for the cluster interconnect
- disable the firewall
- /etc/modprobe.conf contains a list of all
local network interfaces and their network drivers. If you plan to add a
node that will need a different network driver to connect to the cluster
than what is already listed, add a line for it:
- alias eth-extra 8139cp
Add as many eth-extra lines as necessary to specify the various
network drivers you'll need.
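If several drivers need to be listed, the edit can be scripted. The following is a minimal sketch, not part of OpenSSI: the helper name add_eth_extra is hypothetical, and the driver names in the usage line are examples only.

```shell
# Append "alias eth-extra <driver>" lines to a modprobe configuration file,
# skipping any driver that is already listed under the eth-extra alias.
# Usage: add_eth_extra <conf-file> <driver>...
add_eth_extra() {
    conf=$1
    shift
    for driver in "$@"; do
        if ! grep -Eq "^alias[[:space:]]+eth-extra[[:space:]]+${driver}[[:space:]]*\$" "$conf" 2>/dev/null; then
            echo "alias eth-extra ${driver}" >> "$conf"
        fi
    done
}

# Example (as root on the first node; driver names are examples):
# add_eth_extra /etc/modprobe.conf 8139cp e100
```

Running the helper twice with the same driver leaves a single line in place, so it is safe to rerun before adding each new node.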
- If you wish to run X Windows and you have a PS/2 mouse, there is a change
you need to make to /etc/X11/xorg.conf, otherwise X will not be
able to start with OpenSSI's FC1-based kernel. See ``Using a PS/2 mouse
with X'' in README.X-Windows for more details.
- Run the ./install script. After it installs your packages, it will
ask you a few questions about how you want to configure your cluster and
your first node:
- Enter a node number between 1 and 125. Every node in the cluster must have
a unique node number. The first node is usually 1, although you might want
to choose another number for a reason such as where the machine is physically
located.
- Select a Network Interface Card (``NIC'') for the cluster interconnect.
It must already be configured with an IP address and netmask before it will
appear in the list. If the desired card has not been configured, do so in
another terminal, then select (R)escan.
The NIC should be connected to a private network for better security and
performance. It should also be capable of network booting, in case anything
ever happens to the boot partition on the local hard drive. To be network
boot capable, the NIC must have a chipset supported by PXE or Etherboot.
- Select (P)XE or (E)therboot as the network boot protocol for this node. PXE
is an Intel standard for network booting, and many professional grade NICs
have a PXE implementation pre-installed on them. You can probably enable
PXE with your BIOS configuration tool. If you do not have a NIC with PXE,
you can use the open-source project Etherboot, which lets you generate a
floppy or ROM image for a variety of different NICs.
- OpenSSI includes an integrated version of Linux Virtual Server (``LVS''),
which lets you configure a Cluster Virtual IP (``CVIP'') address that
automatically load balances TCP connections across various nodes. This CVIP
is highly available and can be configured to move to another node in the
event of a failure. For more information, please see README.CVIP.
- Enter a clustername. It should resolve to your CVIP address, either in DNS
or the cluster's /etc/hosts file, if you choose to configure a CVIP.
This is required if you want to run an NFS server. For more information, please
see README.nfs-server.
The current hostname will automatically become the nodename for this node.
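To check that a clustername resolves through the cluster's /etc/hosts file, a small lookup such as the one below may help. It is a sketch under stated assumptions: the function name hosts_lookup, the clustername mycluster, and the address in the usage line are all hypothetical, and the function inspects only a hosts-format file, not DNS.

```shell
# Print the first IP address that a hosts-format file maps to a name.
# Usage: hosts_lookup <hosts-file> <name>
hosts_lookup() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }                 # skip comment lines
        {
            sub(/#.*/, "")                  # strip trailing comments
            for (i = 2; i <= NF; i++)       # field 1 is the address
                if ($i == name) { print $1; exit }
        }
    ' "$1"
}

# Example (clustername and address are hypothetical):
# [ "$(hosts_lookup /etc/hosts mycluster)" = "192.168.1.50" ] && echo "clustername resolves to the CVIP"
```

On a running system, getent hosts <name> performs a similar check and also consults DNS if the resolver is configured to do so.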
- Select whether you want to enable root filesystem failover. The root must
be installed on (or copied to) shared disk hardware, in order to answer yes
to this question. If you do answer yes, then each time you add a new node,
you will be asked if the node is physically attached to the root filesystem
and if it should be configured as a root failover node (see openssi-config-node
later). You can learn more about filesystem failover in README.hardmounts.
- A simple mechanism for synchronizing time across the cluster will be installed.
Any time a node boots, it will synchronize its system clock with the initnode
(the node where init is running). You can also run the ssi-timesync command
at any time to force all nodes to synchronize with the initnode.
This timesync mechanism synchronizes nodes to within a second or two of each
other. If you need a higher degree of synchronization, you can configure
Network Time Protocol (``NTP'') across the cluster. Instructions for
how to do this are available in README.ntp.
- Automatic process load balancing will be installed as part of OpenSSI. By
default, only programs launched from the bash-ll shell will be load
balanced. The bash-ll shell is identical to bash, except
for having load balancing enabled.
To enable load-balancing for a program without launching it from bash-ll,
add its program name to /etc/sysconfig/loadlevellist and run service
loadlevel restart. For more information, please see README-mosixll.
- If you want to run X Windows, please see README.X-Windows.
- After your cluster is configured, you will be prompted to reboot your first
node. You must reboot to run the OpenSSI code.
- A new node is added to an OpenSSI cluster using network booting. This lets
you avoid having to install a distribution on more than one node. To network
boot a new node, first select one of its NICs for the cluster interconnect.
It must have a chipset supported by PXE or Etherboot.
The DHCP server will be automatically configured and started to allow the
new nodes to join the cluster.
- If the selected NIC does not support PXE booting, download an appropriate
Etherboot image from the following URL:
- http://rom-o-matic.net/5.2.4/
Choose the appropriate chipset. Under Configure it is recommended
that ASK_BOOT be set to 0. Floppy Bootable ROM
Image is the easiest format to use. Just follow the instructions for writing
it to a floppy.
- If the node requires a network driver not already in /etc/modprobe.conf,
follow step 1 of Installing OpenSSI Software
(Section 2) to add it. Then rebuild the ramdisk to include the driver and
update the network boot images:
- # mkinitrd --cfs -f <initrd-image> <kernel-version>
# ssi-ksync
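The placeholders in the commands above can be filled in from the running kernel. The helper below only prints the commands so they can be reviewed before being run as root. It is a sketch: the name print_rebuild_cmds is hypothetical, and it assumes the ramdisk follows the usual Fedora naming of /boot/initrd-<kernel-version>.img, which you should confirm against the initrd line in your GRUB configuration.

```shell
# Print (do not run) the ramdisk rebuild commands for a kernel version.
# Assumption: the initrd image is /boot/initrd-<kernel-version>.img.
# Usage: print_rebuild_cmds [kernel-version]   (defaults to the running kernel)
print_rebuild_cmds() {
    kver=${1:-$(uname -r)}
    echo "mkinitrd --cfs -f /boot/initrd-${kver}.img ${kver}"
    echo "ssi-ksync"
}

# Example:
# print_rebuild_cmds
```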
- Connect the selected NIC to the cluster interconnect,
insert an Etherboot floppy (if needed), and boot the computer. It should
display the hardware address of the NIC it is attempting to boot with, then
hang while it waits for a DHCP server to answer its request.
- On the first node (or any node already in the cluster), run openssi-config-node.
Select ``Add a new node''. It will ask you a few questions about how
you want to configure your new node:
- Enter a unique node number between 1 and 125.
- Select the hardware address of the new node's cluster interconnect NIC. The
list shows all unknown hardware addresses that have recently probed the cluster's
DHCP server. Choose the one that is displayed on the console of the new node
when it attempts to network boot. If the desired NIC is not listed, make
sure the node has attempted to network boot on the cluster interconnect,
then select (R)escan.
- Enter a static IP address for the NIC. It must be unique and it must be on
the same subnet as the cluster interconnect NICs for the other nodes. If
you want, this program can scan the subnet for IP addresses that seem to
be available. It does this by pinging every address in the subnet, which
can take a while for larger subnets. You can safely skip this feature if you
know one or more available IP addresses.
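The idea behind the subnet scan can be sketched in a few lines of shell. This is not the code that openssi-config-node runs; the function names probe and scan_free and the 10.0.0.x range are assumptions made for the example. Note that an address that does not answer a ping is only probably free, since a host may be powered off or may drop ICMP requests.

```shell
# Probe one address; succeeds if something answers within one second.
# Redefine this function to use a different kind of probe.
probe() {
    ping -c 1 -w 1 "$1" >/dev/null 2>&1
}

# Print every address from <prefix>.<first> to <prefix>.<last> that does
# not answer the probe.
# Usage: scan_free <prefix> <first> <last>
scan_free() {
    prefix=$1
    host=$2
    last=$3
    while [ "$host" -le "$last" ]; do
        probe "${prefix}.${host}" || echo "${prefix}.${host}"
        host=$((host + 1))
    done
}

# Example (range is hypothetical):
# scan_free 10.0.0 2 20
```

Because each silent address costs about a second, scanning a full /24 this way takes several minutes, which is why skipping the scan is reasonable when you already know a free address.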
- Select (P)XE or (E)therboot as the network boot protocol for this node. PXE
is an Intel standard for network booting, and many professional grade NICs
have a PXE implementation pre-installed on them. You can probably enable
PXE with your BIOS configuration tool. If you do not have a NIC with PXE,
you can use the open-source project Etherboot, which lets you generate a
floppy or ROM image for a variety of different NICs.
- Enter a nodename. It should be unique in the cluster and it should resolve
to one of this node's IP addresses, so that client NFS can work correctly.
The nodename can resolve to either the IP address you configured above for
the interconnect, or to one of the external IP addresses that you might configure
below. The nodename can resolve to the IP address either in DNS or in the
cluster's /etc/hosts file.
The nodename is stored in /etc/nodename, which is a context-dependent
symlink (``CDSL''). In this case, the context is node number, which means
each node you add will have its own view of /etc/nodename containing
its own hostname. To learn more about CDSLs, please see the document entitled
cdsl.
- If you enabled root failover during the first node's installation, you will
be asked if this node should be a root failover node. This node must
have access to the root filesystem on a shared disk in order to answer yes.
If you answer yes, then this node can boot first as a root node, so you should
configure it with a local boot device. This is done after this node joins
the cluster and is described in step 7b.
- Save the configuration.
- The program will now do all the work to admit the node into the cluster.
Wait for the new node to join. A ``nodeup'' message on the first node's
console will indicate this. You can confirm its membership with the cluster
command:
- # cluster -v
If the new node is hung searching for the DHCP server, try manually restarting
the dhcpd daemon on the cluster:
- # /sbin/service dhcpd reload
If the new node is still hung, reboot it. It should boot properly and join.
- The following steps tell you how to configure the
new node's hardware, including its swap space, local boot device (optional
unless configured for root failover), and external NICs. Sorry for the complexity
of some of the steps. The Fedora installer automates most of it for you during
the installation of the first node, but it's not much help when adding new
nodes.
- Configure the new node with one or more swap devices using fdisk
(or a similar tool) and mkswap:
- # onnode <node_number> fdisk /dev/hda
partition disk
# onnode <node_number> mkswap /dev/hda3
Add the device name(s) to the file /etc/fstab, as documented in
README.clusterfstab.
Either reboot the node or manually activate the swap device(s) with the swapon
command:
- # onnode <node_number> swapon <swap_device>
- If you have enabled root failover you MUST
configure a local boot device on the new node. Otherwise, configuring a local
boot device is optional. If you are going to configure a local boot device,
it is highly recommended that the boot device have the same name as the first
node's boot device. Remember that we assumed at the beginning of these instructions
that the first node's boot device is located on the first partition of the
first drive (e.g., /dev/hda1 or /dev/sda1).
Assuming you have already created a suitable partition with fdisk,
format your boot device with an ordinary Linux filesystem, such as ext3:
- # onnode <node_number> mkfs.ext3 /dev/hda1
Now run ssi-chnode anywhere in the cluster (no need to use onnode
with this command). Select the new node, enter its local boot device name,
and ssi-chnode will copy over the necessary files.
Finally, you need to manually install a GRUB boot block on the new node:
- # onnode <node_number> grub --device-map=/boot/grub/device.map
grub> root (hd0,0)
grub> setup (hd0)
grub> quit
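The root (hd0,0) line above corresponds to the first partition of the first drive. If your boot device is different, the GRUB name can be worked out as in the sketch below. The function name to_grub is hypothetical, and the sketch assumes that drive letter "a" is hd0, "b" is hd1, and so on; the authoritative mapping is whatever /boot/grub/device.map contains on that node, so check the result against it.

```shell
# Translate a Linux partition name such as /dev/hda1 into GRUB legacy
# notation such as (hd0,0). GRUB counts both disks and partitions from 0.
# Assumption: drive letter a is hd0, b is hd1, and so on.
# Usage: to_grub <partition-device>
to_grub() {
    dev=${1#/dev/}                              # e.g. hda1
    letter=$(printf '%s' "$dev" | cut -c3)      # e.g. a
    part=${dev#???}                             # e.g. 1
    disk=$(( $(printf '%d' "'$letter") - 97 ))  # a -> 0, b -> 1, ...
    echo "(hd${disk},$((part - 1)))"
}

# Example:
# to_grub /dev/hda1
```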
- You can configure external NICs using the system-config-network
command:
- # onnode <node_number> system-config-network
If you do not have system-config-network installed (such as with
a non-GUI installation), you can use the older netconfig command:
- # onnode <node_number> netconfig --device=ethX
Do not attempt to configure the NIC you chose for the cluster interconnect;
you may configure any other NIC. The system-config-network command
protects you from making this mistake, but the netconfig command
does not.
- Repeat steps 4-7 at any time
to add other nodes to the cluster.
- Enjoy your new OpenSSI cluster!!!
To learn more about OpenSSI, please read Introduction-to-SSI.
One of the first things you can try is running the demo Bruce, Scott and
I have done at recent trade shows. It illustrates some of the features of
OpenSSI clusters. You can find it here, along with older demos:
- http://OpenSSI.org/#demos
Recently, a scalable LTSP server has been tested on an OpenSSI cluster. To
learn more about how to set this up, please see README.ltsp.
If you have questions or comments that are not addressed on the website,
do not hesitate to send a message to the users' discussion forum:
- ssic-linux-users@lists.sf.net