Pre-Upgrade-Work
OSG provides very detailed documents
What Operating System should my GateKeeper run?
The installation described here is done as root even though services will not necessarily run as root
Once you are ready to upgrade your cluster, please let the
"GOC" (Grid Operations Center) know your resource
will be unavailable by using the "Maintenance Scheduling Tool".
There is a link at the bottom of the page.
Shutdown current services
We have to shut down the existing OSG software stack:
vdt-control --off
NOTE - some of these will fail as they may not be started! Stop/Kill any
process which is running in $VDT_LOCATION (/opt/grid) or the install will
not work correctly.
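One way to find leftover processes is to filter the process table for the install prefix. The sketch below is a hypothetical helper (procs_under is not part of the VDT). It assumes input lines of the form "PID COMMAND" and matches any command line that references a path under the prefix.

```shell
# Hypothetical helper: print the PIDs of processes whose command line
# references a path under the given install prefix.
# Reads "PID COMMAND ARGS..." lines on stdin so it can be fed from ps.
procs_under() {
    awk -v prefix="$1" 'index($0, prefix "/") > 0 { print $1 }'
}

# Usage (as root), assuming VDT_LOCATION is set without a trailing slash:
#   ps -e -o pid= -o args= | procs_under "$VDT_LOCATION"
```

Because the match requires a "/" right after the prefix, a renamed copy such as /opt/grid-11-24-2008 is not reported. Review the list before stopping or killing anything.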
Move current install out of the way
We want to preserve the previous install by moving it to a new location.
cd /opt/
mv grid grid-11-24-2008
mkdir grid
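The move can also be wrapped in a small function that stamps the backup with today's date and refuses to overwrite an earlier backup. This is only a sketch: move_aside is a hypothetical name, and the MM-DD-YYYY stamp simply mirrors the directory name used in this example.

```shell
# Hypothetical helper: move <parent>/<name> to <parent>/<name>-<stamp> and
# recreate an empty <parent>/<name>. Refuses to overwrite an existing backup.
move_aside() {
    parent="$1"; name="$2"
    stamp="${3:-$(date +%m-%d-%Y)}"   # default: today's date, e.g. 11-24-2008
    target="$parent/$name-$stamp"
    if [ ! -d "$parent/$name" ]; then
        echo "nothing to move: $parent/$name" >&2; return 1
    fi
    if [ -e "$target" ]; then
        echo "backup already exists: $target" >&2; return 1
    fi
    mv "$parent/$name" "$target" && mkdir "$parent/$name" && echo "$target"
}

# Usage (as root):
#   move_aside /opt grid
```

On success the function prints the backup path, which is the value to use later for OLD_VDT_LOCATION.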
Pacman
Installs are done with Pacman.
Remove old Pacman, Get Pacman 3.26
rm -rf pacman-3.21* (or previous version of pacman)
wget http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-3.26.tar.gz
tar --no-same-owner -xzvf pacman-3.26.tar.gz
cd pacman-3.26
For sh and bash shells:
> source setup.sh
For csh and tcsh shells:
> source setup.csh
Install OSG:ce
Use pacman to install compute element
cd /opt/grid
pacman -trust-all-caches -get OSG:ce
Source setup files
After each install/configure step, it is a good idea to refresh your environment by
sourcing the setup.sh or setup.csh files.
bash-3.1# source setup.sh
Job Manager
If you are using a jobmanager, install the appropriate one.
Only install one job Manager
Select the one which applies:
pacman -get OSG:Globus-Condor-Setup
pacman -get OSG:Globus-PBS-Setup
pacman -get OSG:Globus-LSF-Setup
pacman -get OSG:Globus-SGE-Setup
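If the install is scripted, the choice can be made explicit so that exactly one package is requested. The sketch below uses a hypothetical function name and hypothetical lowercase batch system keys; the package names are the four listed above.

```shell
# Hypothetical helper: map a batch system name to its Pacman package.
jobmanager_package() {
    case "$1" in
        condor) echo "OSG:Globus-Condor-Setup" ;;
        pbs)    echo "OSG:Globus-PBS-Setup" ;;
        lsf)    echo "OSG:Globus-LSF-Setup" ;;
        sge)    echo "OSG:Globus-SGE-Setup" ;;
        *)      echo "unknown batch system: $1" >&2; return 1 ;;
    esac
}

# Usage (from /opt/grid, after sourcing setup.sh):
#   pkg=$(jobmanager_package pbs) && pacman -get "$pkg"
```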
Managed Fork
Managed Fork is an optional service which replaces the default fork
jobmanager with Condor to manage incoming fork requests.
source setup.sh
pacman -get OSG:ManagedFork
NOTE:
During the update you may be asked if you would like to run Condor.
You will need to answer y to this because ManagedFork uses Condor
to handle fork jobs on the CE.
Enable Managed Fork
source setup.sh
$VDT_LOCATION/vdt/setup/configure_globus_gatekeeper --managed-fork y --server y
*** NEW *** Local Auth Pre Config *** NEW ***
During installation, a default OSG configuration file, $VDT_LOCATION/edg/etc/edg-mkgridmap.conf, is created. It provides access to the VOMS servers of all registered and approved OSG VOs. Review this configuration file:
- If the list is acceptable to you, then you're done with this step.
- If not, then you will need to edit the file and comment out the line(s) for each VO you wish to disable.
- Any user mappings should be done in this file
You will NEED "nysgrid" and "mis" enabled.
vdt-control --enable edg-mkgridmap
vdt-control --on edg-mkgridmap
*** NEW *** WS-GRAM services & sudoers file *** NEW ***
WS-GRAM is a CE service that submits user jobs from the grid to an underlying local
batch system. Users present their grid identities (user proxy) to WS-GRAM.
This identity is mapped to a local user.
For example, jbednasz is mapped to NYSGRID.
The pre-WS GRAM processes run as a privileged user (root) and can, therefore,
change to any local unprivileged user. This mechanism, however, may present security
risks: bugs in the code, which runs as root, may be exploited to gain privileged access to the machine.
To mitigate this risk, WS-GRAM processes run as an unprivileged user
(either globus or daemon, depending on the local configuration).
In order for these users to be able to switch to another local unprivileged user,
though, the local sudo service must be appropriately configured.
The configuration requires editing the /etc/sudoers file manually.
IMPORTANT : for this example $VDT_LOCATION is /opt/grid.
Please use appropriate paths for this configuration.
vi /etc/sudoers and add
Runas_Alias GLOBUSUSERS = ALL, !root
daemon ALL=(GLOBUSUSERS) \
NOPASSWD: \
/opt/grid/globus/libexec/globus-job-manager-script.pl *
daemon ALL=(GLOBUSUSERS) \
NOPASSWD: \
/opt/grid/globus/libexec/globus-gram-local-proxy-tool *
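Since the paths depend on $VDT_LOCATION and the account depends on the local configuration, the stanza can be generated instead of typed by hand. The following is a sketch only; sudoers_stanza is a hypothetical helper that prints the same entries as above for a given user and install location.

```shell
# Hypothetical helper: print the WS-GRAM sudoers entries for the given
# service user (globus or daemon) and VDT install location.
sudoers_stanza() {
    user="$1"; loc="$2"
    printf 'Runas_Alias GLOBUSUSERS = ALL, !root\n'
    for tool in globus-job-manager-script.pl globus-gram-local-proxy-tool; do
        printf '%s ALL=(GLOBUSUSERS) \\\n' "$user"
        printf '    NOPASSWD: \\\n'
        printf '    %s/globus/libexec/%s *\n' "$loc" "$tool"
    done
}

# Usage: write the stanza to a scratch file and let visudo check the syntax
# before copying it into /etc/sudoers:
#   sudoers_stanza daemon /opt/grid > /tmp/wsgram.sudoers
#   visudo -c -f /tmp/wsgram.sudoers
```

Checking the fragment with visudo before merging it avoids locking yourself out of sudo with a syntax error.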
Configure OSG
PLEASE UPDATE ENTRIES TO MATCH YOUR SITE
Moving/upgrading an existing installation (e.g. from OSG-0.8 to OSG-1.0)
- Source setup.sh from the new installation
source setup.sh
Export OLD_VDT_LOCATION so that it points to the old installation (use the path of your specific old install):
export OLD_VDT_LOCATION=/opt/grid-11-24-2008/
Extract the old configuration by running configure-osg.py -e. This will produce a configuration file, extracted-config.ini.
cd monitoring
./configure-osg.py -e
Edit extracted-config.ini to correct any mistakes, change any references to the original installation, and add any missing options.
Test the configuration by running configure-osg.py -v -f ./extracted-config.ini
./configure-osg.py -v -f ./extracted-config.ini
NOTE: You might have to copy some configuration files over, such as grid3-user-vo-map.txt:
cp /opt/grid-11-24-2008/monitoring/grid3-user-vo-map.txt /opt/grid/monitoring/grid3-user-vo-map.txt
then rerun:
./configure-osg.py -v -f ./extracted-config.ini
Configure the new installation by running configure-osg.py -c -f ./extracted-config.ini
./configure-osg.py -c -f ./extracted-config.ini
Choosing and Installing a CA Distribution:
OSG 1.0.0 no longer automatically pulls CAs.
To pull the OSG recommended CA distribution, edit the cacerts_url in the configuration file $VDT_LOCATION/vdt/etc/vdt-update-certs.conf:
cd $VDT_LOCATION
source setup.sh
vi $VDT_LOCATION/vdt/etc/vdt-update-certs.conf
This file contains URLs to CA Certificate distributions including the OSG GOC distribution with
certificates recommended by the OSG Security Team, as well as the VDT convenience distribution.
Uncomment the second option:
cacerts_url = http://software.grid.iu.edu/pacman/cadist/ca-certs-version
# source $VDT_LOCATION/vdt-questions.sh; $VDT_LOCATION/vdt/sbin/vdt-setup-ca-certificates
# vdt-control --enable vdt-update-certs
# vdt-control --on vdt-update-certs
Create a link to the certs
cd /etc/grid-security
ln -s /opt/grid/globus/share/certificates certificates
Host Cert Check
This can be skipped if this is an upgrade.
cert-request -ou s -ca doegrids \
  -name "Jon Bednasz" -email jbednasz@yahoo.com -phone 716-881-8910 \
  -sponsor_name "Steven M. Gallo" -sponsor_email smgallo@ccr.buffalo.edu -sponsor_phone 716-881-8960 \
  -reason "u2-grid.ccr.buffalo.edu server certificate" \
  -affiliation osg -vo nysgrid \
  -dir ~/.globus/u2-grid_hostcert -host u2-grid.ccr.buffalo.edu
sudo bash
cd /opt/grid
source setup.sh
cert-retrieve -dir . -certnum XXXXXX (serial number from DOEGrids-CA-1 email)
mv ./usercert.pem /etc/grid-security/hostcert.pem
mv ./userkey.pem /etc/grid-security/hostkey.pem
chmod 444 /etc/grid-security/hostcert.pem
chmod 400 /etc/grid-security/hostkey.pem
Now check the Cert
openssl x509 -text -noout -in /etc/grid-security/hostcert.pem
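The permissions set by the chmod commands can be confirmed with a small check. This is a sketch using a hypothetical helper, has_mode, built on the standard find -perm test, which matches only when the permission bits are exactly the given octal mode.

```shell
# Hypothetical helper: succeed only if <file> exists and its permission
# bits are exactly the given octal <mode>.
has_mode() {
    [ -n "$(find "$1" -prune -perm "$2" 2>/dev/null)" ]
}

# Usage:
#   has_mode /etc/grid-security/hostcert.pem 444 || echo "hostcert.pem: unexpected mode"
#   has_mode /etc/grid-security/hostkey.pem  400 || echo "hostkey.pem: unexpected mode"
```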
Port Ranges
Add appropriate port range for globus:
vi /opt/grid/globus/etc/globus-job-manager.conf
ADD -globus-tcp-port-range "15500,19999"
Add the following to /opt/grid/vdt/etc/vdt-local-setup.csh
setenv GLOBUS_TCP_PORT_RANGE "15500,19999"
setenv GLOBUS_TCP_SOURCE_RANGE "15500,19999"
Add the following to /opt/grid/vdt/etc/vdt-local-setup.sh
export GLOBUS_TCP_PORT_RANGE="15500,19999"
export GLOBUS_TCP_SOURCE_RANGE="15500,19999"
Add port range to xinetd.d files: /etc/xinetd.d/gsiftp /etc/xinetd.d/globus-gatekeeper
env = GLOBUS_TCP_PORT_RANGE=15500,19999
env = GLOBUS_TCP_SOURCE_RANGE=15500,19999
Firewall Rules
Setup Firewall:
FIREWALL:
#
# Globus Gatekeeper
#
-A INPUT -m tcp -p tcp --dport 2119 -s 0/0 -j ACCEPT
#
# GSI FTP
#
-A INPUT -m tcp -p tcp --dport 2811 -s 0/0 -j ACCEPT
#
# Globus TCP Port Range
# Set the environment variable GLOBUS_TCP_PORT_RANGE=<min>,<max> (here 15500,19999),
# so that the various Globus components will know that connections are allowed
# on that port range
#
-A INPUT -m state --state NEW -m tcp -p tcp --dport 15500:19999 -s 0/0 -j ACCEPT
#
# Globus Monitoring and Discovery Service (MDS)
#
-A INPUT -m tcp -p tcp --dport 2135 -s 0/0 -j ACCEPT
#
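The Globus variables write the range with a comma (15500,19999) while iptables expects a colon (15500:19999), which is easy to mix up when the range changes. The sketch below derives the firewall rule from the variable's value. The function name port_range_rule is hypothetical, and the rule text is the one used above.

```shell
# Hypothetical helper: turn a GLOBUS_TCP_PORT_RANGE value ("min,max")
# into the matching iptables rule, which uses "min:max".
port_range_rule() {
    case "$1" in
        *[!0-9,]* | *,*,* | ,* | *,) echo "bad range: $1" >&2; return 1 ;;
        *,*) ;;   # two numbers separated by exactly one comma
        *)   echo "bad range: $1" >&2; return 1 ;;
    esac
    min="${1%%,*}"; max="${1##*,}"
    if [ "$min" -gt "$max" ]; then
        echo "bad range: $1" >&2; return 1
    fi
    echo "-A INPUT -m state --state NEW -m tcp -p tcp --dport $min:$max -s 0/0 -j ACCEPT"
}

# Usage:
#   port_range_rule "$GLOBUS_TCP_PORT_RANGE"
```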
Check Users
At a minimum, we need to support nysgrid, engage and mis
Add users to cluster
"mis" (monitoring user)
"nysgrid" (nysgrid user)
"engage" (additional VO NYSgrid is supporting)
Setup the GridMap config to only grab NYSGRID users:
sudo vi /opt/grid/edg/etc/edg-mkgridmap.conf
(comment out all but nysgrid/mis/engage)
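Commenting out the unwanted VOs can be done with a filter instead of by hand. The sketch below rests on an assumption: that each VO entry in edg-mkgridmap.conf is a line of the form "group <VOMS URL> <local account>". Check your file before relying on it. The function name keep_vos is hypothetical.

```shell
# Hypothetical helper: comment out every "group" line whose last field
# (assumed to be the local account) is not in the comma-separated keep list.
# Reads the configuration on stdin and writes the result to stdout.
keep_vos() {
    awk -v keep="$1" '
        BEGIN { n = split(keep, k, ","); for (i = 1; i <= n; i++) ok[k[i]] = 1 }
        $1 == "group" && !($NF in ok) { print "#" $0; next }
        { print }
    '
}

# Usage: write to a new file, review the result, then replace the original:
#   keep_vos nysgrid,mis,engage < /opt/grid/edg/etc/edg-mkgridmap.conf > /tmp/edg-mkgridmap.conf.new
```

Lines that are already comments, and non-group lines such as gmf_local, pass through unchanged.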
Turn on the OSG services
Let's fire up all the services
# source setup.sh
# vdt-control --on
Optional Configuration
By default, the Managed Fork jobmanager will behave just like the fork jobmanager.
If you wish to restrict it, you need to modify your local Condor configuration.
If you're using Condor from the VDT this can be done by editing
$VDT_LOCATION/condor/local.<hostname>/condor_config.local
To set a hard limit on most jobs, but always let grid monitor jobs run (strongly recommended), add this to $VDT_LOCATION/condor/local.<hostname>/condor_config.local:
START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 10 || GridMonitorJob =?= TRUE
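If this step is repeated across upgrades, the setting should be appended only when it is not already present. The following sketch uses a hypothetical helper, append_once, that adds a line to a file unless an identical line already exists.

```shell
# Hypothetical helper: append <line> to <file> unless an identical line
# is already there (the file is created if it does not exist).
append_once() {
    file="$1"; line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (substitute your host's directory for <hostname>):
#   append_once "$VDT_LOCATION/condor/local.<hostname>/condor_config.local" \
#       'START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 10 || GridMonitorJob =?= TRUE'
```

The comparison is on the whole line, so a setting with a different limit is not detected and would need to be edited by hand.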
Run Site Verify
You will need a personal cert for this step.
If you want me to test, please ask jbednasz@ccr.buffalo.edu
cd /opt/grid
source setup.sh
grid-proxy-init
cd verify
./site_verify.pl
Setup RSV
Follow these instructions for setting up RSV to report your site's status back to OSG:
Create an rsvuser user account on your cluster. A locked Unix account is acceptable.
Test with:
su -c "/bin/date" rsvuser
Prepare for using a host cert by allowing local accounts (rsvuser) in the gridmap file
vi /opt/grid/edg/etc/edg-mkgridmap.conf
Add the following to the bottom of the file:
# LOCAL
gmf_local /etc/grid-security/grid-mapfile-local
Request the httpd and rsv "service" certs:
cert-request -ou s -service rsv -host $HOSTNAME -label rsv-fqdn
cert-request -ou s -service http -host $HOSTNAME -label httpd-fqdn
Once these certs have been approved, place the RSV service cert files in /etc/grid-security/rsv/ as rsvkey.pem and rsvcert.pem,
and place the HTTP service cert files in /etc/grid-security/http/ as httpkey.pem and httpcert.pem.
Add your rsv service cert to /etc/grid-security/grid-mapfile-local:
"/DC=org/DC=doegrids/OU=Services/CN=rsv/F.Q.D.N" nysgrid
sudo bash
cd /opt/grid/monitoring
vi config.ini (or whichever file holds your configuration)
Go to the "RSV" section and make the appropriate changes:
[RSV]
; The enable option indicates whether rsv should be enabled or disabled. It should
; be set to True or False
enabled = %(enable)s
; The rsv_user option gives the user that the rsv service should use. It must
; be a valid unix user account
;
; If rsv is enabled, and this is blank or set to unavailable it will default to
; rsvuser
rsv_user = rsvuser
; The enable_ce_probes option enables or disables the RSV CE probes. If you enable this,
; you should also set the ce_hosts option as well.
;
; Set this to true or false.
enable_ce_probes = %(enable)s
; The ce_hosts options lists the FQDN of the CEs that the RSV CE probes should check.
; This should be a list of FQDNs separated by a comma (e.g. my.host,my.host2,my.host3)
;
; This must be set if the enable_ce_probes option is enabled. If this is set to
; UNAVAILABLE or left blank, then it will default to the hostname setting for this CE
ce_hosts = %(localhost)s
; The enable_gridftp_probes option enables or disables the RSV gridftp probes. If
; you enable this, you must also set the ce_hosts or gridftp_hosts option as well.
;
; Set this to True or False.
enable_gridftp_probes = %(enable)s
; The gridftp_hosts options lists the FQDN of the gridftp servers that the RSV CE
; probes should check. This should be a list of FQDNs separated by a comma
; (e.g. my.host,my.host2,my.host3)
;
; This or ce_hosts must be set if the enable_gridftp_probes option is enabled. If
; this is set to UNAVAILABLE or left blank, then it will default to the hostname
; setting for this CE
gridftp_hosts = %(localhost)s
; The gridftp_dir options gives the directory on the gridftp servers that the
; RSV CE probes should try to write and read from.
;
; This should be set if the enable_gridftp_probes option is enabled. It will default
; to /tmp if left blank or set to UNAVAILABLE
gridftp_dir = %(unavailable)s
; The enable_gums_probes option enables or disables the RSV gums probes. If
; you enable this, you must also set the ce_hosts or gums_hosts option as well.
;
; Set this to True or False.
enable_gums_probes = %(disable)s
; The enable_srm_probes option enables or disables the RSV srm probes. If
; you enable this, you must also set the srm_hosts option as well.
;
; Set this to True or False.
enable_srm_probes = %(disable)s
; The use_service_cert option indicates whether to use a service
; certificate with rsv
;
; NOTE: This can't be used if you specify multiple CEs or GUMS hosts
use_service_cert = %(enable)s
; You'll need to set this if you have enabled the use_service_cert.
; This should point to the public key file (pem) for your service
; certificate
;
; If this is left blank or set to UNAVAILABLE and the use_service_cert
; setting is enabled, it will default to /etc/grid-security/rsvcert.pem
rsv_cert_file = /etc/grid-security/rsv/rsvcert.pem
; You'll need to set this if you have enabled the use_service_cert.
; This should point to the private key file (pem) for your service
; certificate
;
; If this is left blank or set to UNAVAILABLE and the use_service_cert
; setting is enabled, it will default to /etc/grid-security/rsvkey.pem
rsv_key_file = /etc/grid-security/rsv/rsvkey.pem
; You'll need to set this if you have enabled the use_service_cert. This
; should point to the location of the rsv proxy file.
;
; If this is left blank or set to UNAVAILABLE and the use_service_cert
; setting is enabled, it will default to /tmp/rsvproxy
rsv_proxy_out_file = %(unavailable)s
; If you don't use a service certificate for rsv, you will need to specify a
; proxy file that RSV should use in the proxy_file setting.
; This needs to be set if use_service_cert is disabled
proxy_file = %(unavailable)s
; This option will enable RSV record uploading to central RSV collector at the GOC
;
; Set this to True or False
enable_gratia = %(enable)s
; The print_local_time option indicates whether rsv should use local times instead of
; GMT times in the local web pages produced (NOTE: records uploaded to central RSV
; collector will still have UTC timestamps)
;
; Set this to True or False
print_local_time = %(disable)s
; The setup_rsv_nagios option indicates whether rsv should try to connect to a local
; nagios instance and report information to it as well
;
; Set this to True or False
setup_rsv_nagios = %(disable)s
; The setup_for_apache option indicates whether rsv should try to create a webpage
; that can be used to view the status of the rsv tests. Enabling this is
; highly encouraged.
;
; Set this to True or False
setup_for_apache = %(enable)s
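Because the section is mostly comments, it helps to list only the active settings before reconfiguring. The sketch below is a hypothetical helper, ini_section, that prints the non-comment, non-blank lines of one section. It assumes section headers appear alone on a line as [Name] and that comments start with ";" or "#".

```shell
# Hypothetical helper: print the active (non-comment, non-blank) lines of
# one section of an ini file.
ini_section() {
    awk -v want="[$2]" '
        /^\[/ { in_sec = ($0 == want); next }   # entering a new section
        in_sec && $0 !~ /^[;#]/ && NF { print }
    ' "$1"
}

# Usage:
#   ini_section /opt/grid/monitoring/config.ini RSV
```

The output shows values such as %(enable)s verbatim. The helper does not expand these references.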
Now reconfigure OSG to handle RSV (use config.ini or whichever file holds your configuration). Correct any errors.
${VDT_LOCATION}/monitoring/configure-osg.py -c -f config.ini
Assuming no errors, turn on the services
vdt-control --on condor-cron osg-rsv apache
After waiting about 15 minutes, check out the local status web page that is created with results of RSV probe runs each day. By default, this page is $VDT_LOCATION/osg-rsv/output/html/index.html

