Whatever conduces to universal fellowship,
that is to say, whatever causes people
to live in harmony with one another,
is profitable. -- Spinoza
Summary
The use of an LDAP directory service as a groups registry limits our present groups service (GWS). One proposed improvement is a new, locally developed registry using a PostgreSQL database. The GWS webservice can work directly with this registry and, by automatic processes, feed the present LDAP directories, which remain unchanged. Because the GWS webservice and the LDAP directory service would continue to present the same APIs as before, this approach has the advantage that no existing clients would be interrupted or affected. As a second, though some might say first, approach the Architect suggests instead that this is a good time to install Grouper as the engine for our group registry.
Present groups registry
The UW Groups Service consists of a Groups Directory (GDS), a webservice (GWS) providing a user interface for browser clients and a RESTful API for programmed clients, and several automatic feeds of group memberships from student services and administrative sources.
Directory
The directory, a cluster of OpenLDAP services, provides:
- A flat space. Group naming conventions give the appearance of a hierarchy, but none is enforced.
- Recursive group membership.
- Read-only for all but IAM clients.
- Read and view access control of group visibility.
- Course memberships. These are maintained in a separate OU, with additional attributes relevant to courses.
Webservice
The webservice, directly using the directory as a registry, provides in addition:
- A simple, certificate authenticated, RESTful web service for automated access.
- An interactive, uwnetid authenticated GUI for browser access.
Enterprise groups
Automatic, generally nightly, feeds of group information include:
- Course students and instructors,
- Internal UW Technology email lists,
- Budget number memberships,
- Affiliations (student, staff, faculty, ...),
- Student majors
Grouper
Grouper is a group management toolkit funded by the NSF Middleware Initiative. It is a sibling project to Shibboleth, although it does not enjoy the same level of developer support and lags behind in its installed base. Grouper provides a user interface, a command line shell, a Java API and a webservice API. The shell includes import and export tools and a mechanism to provision an LDAP directory.
Grouper installation: progress, trials, tribulations
The migration plan assumes that we want to keep the user interface of the groups service as much as possible unchanged.
- The LDAP directory will keep its present schema and authorization mechanisms. It will appear as much as possible as it does now.
- We will continue to provide the RESTful webservice.
- We may like to provide the GWS UI.
Demonstration installation
In order to understand the capabilities and challenges of Grouper, we have installed a working service, complete with the 'u_' and 'uw_' groups from the present GDS (as of 04/02/09). Anyone with access to the existing GWS UI can access and enjoy the corresponding UW Grouper user interface.
Local extensions and tools
For the demonstration site, and probably for any production installation, we provide a few extensions to the distributed code.
New LDAP Subject provider
Grouper allows subjects (members, etc.) to come from various sources, one of which is an LDAP directory. Thus we can work with anyone in our PDS directory. However, the LDAP subject provider used by the distribution, the standard JNDI from Sun Microsystems, does not work efficiently with a directory that requires SASL EXTERNAL authentication. In particular it provides neither continuous connections nor connection pooling. We implemented a new subject provider using the ldap library from Virginia Tech. This is also the library used by Shibboleth.
Grouper uses the REMOTE_USER environment variable as the browser user's identity. We are protecting our site with Shibboleth, as that seems likely to be the mechanism we would use in a production setup. Shibboleth REMOTE_USER values include a domain specification, e.g., "@washington.edu". Our custom subject provider trims these suffixes to match our PDS directory entries.
We also added sorting and searching features to the provider.
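The suffix trimming can be illustrated with a minimal sketch. This is not the provider's code (the provider is a Grouper extension); the function name and the choice to trim only the local domain, leaving other scopes untouched, are assumptions made for illustration.

```python
def normalize_remote_user(remote_user: str, domain: str = "washington.edu") -> str:
    """Strip a trailing '@domain' scope so the value matches a directory id.

    Hypothetical helper: only the given domain is trimmed; values scoped to
    any other domain are returned unchanged.
    """
    suffix = "@" + domain
    if remote_user.lower().endswith(suffix):
        return remote_user[: -len(suffix)]
    return remote_user
```

With this sketch, "spud@washington.edu" is reduced to "spud", while an already bare "spud" passes through as it is.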
If we want to allow non-uwnetid membership we will have to come up with an ePPN subject provider. That has not been done.
Data import from GDS
Our translation tool from GDS to Grouper utilizes the apparent hierarchy of GDS to create a real hierarchy within Grouper.
- GDS's "u_spud1_spud2_potato" becomes in Grouper "u:spud1:spud2:potato"
- "Departments.C&C." is translated to "u_cac_"
- "Admin." is translated to "u_cac_internal"
- ePPN members are dropped
The translations of 'Departments.C&C.' and 'Admin.' to their 'u_' names are gratuitous and need not be carried forward to a production system.
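As a rough sketch of the core of this translation, the two functions below turn a flat GDS name into a Grouper path and drop ePPN members. They are illustrative only, not the translation tool itself. They assume that every underscore in a GDS name is a hierarchy separator and that an ePPN member can be recognized by the "@" in its id; the gratuitous prefix rewrites are left out.

```python
from typing import Iterable, List


def gds_to_grouper_name(gds_name: str) -> str:
    """Turn the apparent GDS hierarchy (underscores) into Grouper stems (colons)."""
    return gds_name.replace("_", ":")


def drop_eppn_members(member_ids: Iterable[str]) -> List[str]:
    """Keep only members without an '@' scope (assumed to mark an ePPN)."""
    return [member for member in member_ids if "@" not in member]
```

For example, gds_to_grouper_name("u_spud1_spud2_potato") gives "u:spud1:spud2:potato".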
This export-import operation may have lost a group or member or two. My goal was to capture most groups and members and admins. Any losses are being investigated, but any delay should not inhibit other testing of Grouper.
RESTful APIs
A RESTful API, matching the existing API and UI of GWS, will be developed. There are some issues:
- The GWS API translates Grouper concepts into simpler forms.
- GWS has a flat namespace. A GWS group name translates to a Grouper stem-stem-...-groupname. There has to be a basic stem template that GWS can use when automatically creating groups.
- GWS works with identity 'types', netids, eppns, groups, and dns names; Grouper works with identity sources. Groups and netids map directly between the two views. Grouper will need a custom source for the eppn and dns types.
- Grouper maintains more attributes about groups than does GWS. These will need default values when GWS creates a group, and will have to be preserved when GWS updates a group.
- Naming conventions: GWS (and LDAP) uses underscore as a separator; Grouper uses a colon. This might cause confusion if someone used both the GWS and the native Grouper GUIs.
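The attribute issue above (defaults on create, preservation on update) might be handled along the following lines. This is a sketch under stated assumptions: the attribute names in GROUPER_ONLY_DEFAULTS are hypothetical placeholders, not Grouper's actual attribute set, and both functions are invented for illustration.

```python
from typing import Any, Dict

# Hypothetical attributes that Grouper keeps but GWS does not know about.
GROUPER_ONLY_DEFAULTS: Dict[str, Any] = {
    "display_name": "",
    "group_type": "base",
}


def attributes_for_create(gws_attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Start from the defaults, then apply whatever GWS supplied."""
    merged = dict(GROUPER_ONLY_DEFAULTS)
    merged.update(gws_attrs)
    return merged


def attributes_for_update(existing: Dict[str, Any], gws_attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Overwrite only what GWS supplied; every other stored attribute survives."""
    merged = dict(existing)
    merged.update(gws_attrs)
    return merged
```

The point of the design is that a GWS update never writes a full attribute set back; it only overlays the attributes GWS understands on whatever Grouper already holds.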
Reconciliation
Grouper has an import tool (xmlImport) that can accept group data in a format similar to that of the present-day tegea import of GWS. However, it may be too slow to be directly useful. Timing tests show it to import new groups at a rate of about 500 subjects per minute. It imports existing groups by first deleting the old and then importing the new, at a rate closer to 300 subjects per minute. That's too many hours for a nightly reconciliation of several hundred thousand subjects.
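To make "too many hours" concrete, here is the arithmetic as a small sketch. The figure of 300,000 subjects is only an illustrative stand-in for "several hundred thousand"; the rates are the ones measured above.

```python
def import_hours(subjects: int, subjects_per_minute: float) -> float:
    """Hours needed to import `subjects` at a steady per-minute rate."""
    return subjects / subjects_per_minute / 60


# 300,000 is an assumed, illustrative subject count.
new_groups_hours = import_hours(300_000, 500)       # 10.0 hours
existing_groups_hours = import_hours(300_000, 300)  # about 16.7 hours
```

Even at the faster rate the run would take most of a night, and the delete-then-import rate would not finish before the next working day.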
We can write a reconciliation tool that works more efficiently.
As a demonstration and test we ran a couple of trials on a large group (u_subman_ezproxy, 100K members).
- xmlImport: 567 minutes. Uses the R&R (delete, then re-import) method. I don't know why it took so long. 14:32 4/8 - 00:46 4/9
- reconciler: 7 minutes. Compare and update method. Of this, six minutes were spent loading the XML document. That's an inefficient way to do this sort of task. Serial processing of the input should bring this run down to a couple of minutes.
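The compare and update method, together with serial processing of the input, can be sketched as follows. This is not the reconciler used in the trial. The XML layout (a "member" element with an "id" attribute) is a hypothetical stand-in for the real import format, and both function names are invented for illustration.

```python
import io
import xml.etree.ElementTree as ET
from typing import Iterable, Iterator, Set, Tuple


def stream_member_ids(xml_source, tag: str = "member") -> Iterator[str]:
    """Yield member ids as they are parsed, instead of loading the whole document first."""
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == tag:
            member_id = elem.get("id")
            if member_id is not None:
                yield member_id
            elem.clear()  # drop the element's contents once it has been read


def reconcile(current: Iterable[str], desired: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (ids to add, ids to remove) needed to turn `current` into `desired`."""
    current_set, desired_set = set(current), set(desired)
    return desired_set - current_set, current_set - desired_set


# Usage with a tiny in-memory document in the assumed format.
sample = io.BytesIO(b'<group name="g"><member id="a"/><member id="b"/><member id="c"/></group>')
to_add, to_remove = reconcile(["b", "c", "d"], stream_member_ids(sample))
```

Only the differences are written back, so a nightly run on a mostly unchanged group touches a handful of memberships rather than deleting and re-adding all of them.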
LDAP provisioning
Grouper includes provisioning tools that should be able to keep an LDAP directory (or directories) up-to-date. I haven't yet been able to make that work, though I expect it will not be too difficult. Some issues to consider:
- Keep recursive memberships, or expand them? Many clients would benefit from a non-recursive membership.
- How to ensure authz protection is correctly imported into LDAP? This may be difficult when complex groups are imported.
- How to allow selective provisioning: to groups.u? to UWWI? to Google?
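If we choose to expand recursive memberships at provisioning time, the expansion could look something like the sketch below. It is independent of Grouper's own provisioning tools. It assumes direct memberships are available as a mapping from group name to member ids, and that a member is itself a group exactly when it appears as a key in that mapping.

```python
from typing import Dict, Set


def expand_members(group: str, direct: Dict[str, Set[str]]) -> Set[str]:
    """Return the non-group members reachable from `group` through nested groups."""
    effective: Set[str] = set()
    seen: Set[str] = set()
    stack = [group]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already expanded; this also guards against membership cycles
        seen.add(current)
        for member in direct.get(current, set()):
            if member in direct:  # the member is itself a group
                stack.append(member)
            else:
                effective.add(member)
    return effective
```

A client reading the provisioned directory would then see a flat member list and would not need to follow group references itself.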