1. Describe where we are.
resource allocation and user validation
scheduling
partitioning of the system
checkpoint/restart at system level
disk quotas, archiving, migration
External media
statistics, accounting
performance monitoring & tuning
security
Which of these are (critical, necessary, useful)?
Which are: (included in OS, add-on to OS, 3rd party, developed on our own)?
What is Distributed Computing? Heterogeneous environment, spanning administrative domains
Which are unique to Distributed Computing?
2. Identify problems
All solutions need to span Unix and NT domains - heterogeneity in general
System Admin Tools that span the whole domain
add user, change quotas, modify resource pool, ps/kill
Meta-scheduling - coupling existing schedulers; locally developed scheduler run at all sites
Fault Tolerance - automatic detection, recovery/repair, notification
Common Program Development Environment - common set of tools and libraries (Apps Group)
Heterogeneity between sites, e.g. C compilers, debuggers, ...
Conferencing - commercial products exist
Notebooks - useful tool, just set it up, not a big issue
3. Strategies to eliminate problems
System Administration Tools
Web-based resource management/monitoring tools