Distributed Computing Infrastructure Working Group
 
During the workshop this group was led by Al Geist.  Participants included Michel Jaunin, Martin Frey, Guy Cormier, and Juan Meza.  Notes are courtesy of Al Geist.

1. Describe where we are.
    resource allocation and user validation
    scheduling
    partitioning of the system
    checkpoint/restart at system level
    disk quotas, archiving, migration
    external media
    statistics and accounting
    performance monitoring & tuning
    security

    Which of these are: (critical, necessary, useful)?
    Which are: (included in OS, add-on to OS, 3rd party, developed on our own)?
    What is Distributed Computing? A heterogeneous environment that spans administrative domains
    Which are unique to Distributed Computing?

2. Identify problems
    All solutions need to span Unix and NT domains - heterogeneity in general
    System Admin Tools that span the whole domain
    add user, change quotas, modify resource pool, ps, kill
    Meta-scheduling - coupling existing schedulers, locally developed scheduler run at all sites
    Fault Tolerance - automatic detection, recovery/repair, notification
    Common Program Development Environment - common set of tools and libraries. (Apps Group)
    Heterogeneity between sites, e.g. C compilers, debuggers, ...
    Conferencing - commercial products exist
    Notebooks - useful tool, just set it up, not a big issue
 
3. Strategies to eliminate problems

    System Administration Tools
    Web-based resource management/monitoring tools

    Meta-scheduling
    Fault Tolerance
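
    As an illustration only, the meta-scheduling idea (coupling the
    schedulers that already exist at each site) might be sketched as
    follows. This is not something the group specified: the Site class,
    its fields, and the "most free CPUs" placement rule are all
    hypothetical, and submit() merely stands in for a call into a site's
    real local scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical site that runs its own local scheduler."""
    name: str
    os: str                 # e.g. "unix" or "nt" (assumed labels)
    free_cpus: int
    queue: list = field(default_factory=list)

    def submit(self, job_name, cpus):
        # Stand-in for handing the job to the site's existing scheduler.
        self.free_cpus -= cpus
        self.queue.append(job_name)


def pick_site(sites, cpus, os=None):
    """Return the eligible site with the most free CPUs, or None."""
    eligible = [
        s for s in sites
        if s.free_cpus >= cpus and (os is None or s.os == os)
    ]
    return max(eligible, key=lambda s: s.free_cpus, default=None)


def meta_submit(sites, job_name, cpus, os=None):
    """Place a job on one site; return the site name, or None if none fits."""
    site = pick_site(sites, cpus, os)
    if site is None:
        return None
    site.submit(job_name, cpus)
    return site.name


if __name__ == "__main__":
    pool = [Site("A", "unix", 8), Site("B", "nt", 16), Site("C", "unix", 4)]
    print(meta_submit(pool, "job1", 4))             # any OS
    print(meta_submit(pool, "job2", 6, os="unix"))  # Unix sites only
```

    The meta-level only chooses a site; queueing and execution stay with
    each site's own scheduler, so the sketch sits on top of the Unix/NT
    mix rather than replacing anything local.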