
This describes the adoption of Grouper as a registry for our Groups Webservice (GWS).

The GWS service supports institutional groups (maintained automatically by external processes) and ad-hoc groups (managed by individuals and client applications of GWS). The latter includes the browser application. The service began its life using an LDAP directory, Groups Directory Service (GDS), as its database. That is no longer tenable.

Grouper is a group management toolkit funded by the NSF Middleware Initiative. It is a sibling project to Shibboleth. It provides a user interface, a command line shell, a Java API and a webservice API. The shell includes import and export tools and a mechanism to provision an LDAP directory. Grouper can provide the registry for GWS.

For a history of this effort see GWS with Grouper - A history.


Discussion

GWS API

We want to keep our present webservice API (both the plain version for program clients and the decorated version for browser users). The plain version is concise, well documented, and in use. The browser version is more intuitive and easier to use than the various APIs that come with Grouper.

One obvious GWS implementation keeps most of the old application, a CGI program, but replaces the LDAP library with something that interacts with Grouper. However, the interface between Java and C is always awkward and often inefficient, so this seems an inauspicious course.

A more propitious path employs a lightweight Spring MVC Java application as a facade to Grouper. The application, its template engine (Velocity), and Grouper are all Java, which makes the interfaces between them simpler and more efficient.

Responsiveness, reliability, redundancy

Old GWS had no single point of failure. Both the web and directory services were supported by clusters. The most common query, "what groups is a user in?", takes between 50 and 100 msec. Grouper uses a relational database, which is much more difficult to cluster. It also tends to be about an order of magnitude slower. A response time of half a second to a second is acceptable for a browser user but not for program clients, especially when the client is our IdP.

To provide our customary level of service, a hybrid system seems advisable. All browser interactions and all updates from programs are handled by the new GWS (gws-grouper), while some program reads (is-member and groups-for-user) are handled by the old GWS (gws-ldap).
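The split between the two services can be sketched as a small routing rule. This is only an illustration of the division described above: the class, the enum, and the operation labels are hypothetical names, not identifiers from GWS.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical labels for the two services the text calls gws-ldap and gws-grouper. */
enum Backend { GWS_LDAP, GWS_GROUPER }

final class HybridRouter {
    // Reads that the text assigns to the old LDAP-backed service.
    // The string labels are illustrative, not the service's real operation names.
    private static final Set<String> FAST_READS =
            new HashSet<>(Arrays.asList("is-member", "groups-for-user"));

    /** Chooses a backend for one request. */
    static Backend route(String operation, boolean fromBrowser) {
        // All browser interactions go to the new service, whatever the operation.
        if (fromBrowser) {
            return Backend.GWS_GROUPER;
        }
        // Program clients: only the two fast reads stay on the old service.
        return FAST_READS.contains(operation) ? Backend.GWS_LDAP : Backend.GWS_GROUPER;
    }
}
```

Under this rule an IdP asking groups-for-user is answered by gws-ldap, while any update, and anything from a browser, reaches gws-grouper.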

Groups and stems

Grouper uses a colon to separate the parts of stem and group names. The new GWS will continue to use the underscore as its separator and will translate as needed.
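A minimal sketch of that translation, assuming that individual name parts never contain an underscore or a colon themselves (the text does not say how such a case would be handled). The class and method names are hypothetical.

```java
final class NameTranslator {
    /** Converts a GWS name such as "u_cac_staff" to Grouper form "u:cac:staff". */
    static String toGrouper(String gwsName) {
        return gwsName.replace('_', ':');
    }

    /** Converts a Grouper name such as "u:cac:staff" back to GWS form "u_cac_staff". */
    static String toGws(String grouperName) {
        return grouperName.replace(':', '_');
    }
}
```

The group name "u_cac_staff" is an invented example; under the stated assumption the two methods are exact inverses, so names survive a round trip unchanged.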

Grouper has a strict distinction between stems (aka folders) and groups. GWS downplayed the stem concept and never required intermediate stems to actually exist. A group also acts as a stem for its child groups. The new GWS will preserve this appearance by automatically managing stems as needed.

  • A stem with the same name as its group will be created to hold subgroups.
  • Create privilege is assigned to the stem (as CREATE and STEM), not to the group.
  • Intermediate stems are created as needed.
  • Removal of a group and all its subgroups will remove the hidden stem as well.
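The stem bookkeeping in the list above can be sketched as a pure naming computation: given a GWS group name, which Grouper stems must exist before the group is created, and what the hidden stem for its subgroups is called. This sketch makes no Grouper API calls; the class and method names are hypothetical, and it assumes name parts contain no underscores of their own.

```java
import java.util.ArrayList;
import java.util.List;

final class StemPlanner {
    /**
     * Returns the Grouper names of the intermediate stems that must exist
     * before the given group can be created, outermost first.
     */
    static List<String> stemsFor(String gwsGroupName) {
        String[] parts = gwsGroupName.split("_");
        List<String> stems = new ArrayList<>();
        StringBuilder path = new StringBuilder();
        // Every part except the last is a stem on the way to the group.
        for (int i = 0; i < parts.length - 1; i++) {
            if (i > 0) {
                path.append(':');
            }
            path.append(parts[i]);
            stems.add(path.toString());
        }
        return stems;
    }

    /** The hidden stem that holds subgroups carries the same name as the group. */
    static String hiddenStemFor(String gwsGroupName) {
        return gwsGroupName.replace('_', ':');
    }
}
```

A caller would create any missing stems from `stemsFor` in order, create the group, and create the stem from `hiddenStemFor` when the first subgroup is added (or eagerly; the text does not say which).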

We will have to migrate from historical names, Departments.C&C.all, for example, to new names more consonant with our naming policy, for example:

  • "Departments.C&C." translates to "u_cac_".
  • "Admin." translates to "uw_uwtech_internal"

Multiple names for groups

Old GWS allowed as many names on a group as the owner wanted. Grouper generally allows only one name. However, Grouper now allows a moved (renamed) group to keep its old name.

This is an inadvisable policy for two reasons. First, it requires a legacy 'ownership' of the old stem's namespace. Second, owners tend never to remove old names, so we end up with twice as many names as are necessary.

Subject Issues

Old GWS did not require that a group's member or administrator actually exist anywhere. Whatever string of characters looked like a UWNetID or some other acceptable name was fine. While lax, this policy did allow quick and efficient loading of memberships.

Grouper requires one or more separate Subject databases in which any member or administrator must exist. The collection of groups is itself one such database, provided with Grouper. Two external subject interfaces are provided: one to an LDAP directory, the other to an SQL database. The LDAP subject adapter is inefficient, does not work conveniently with a directory that requires SASL EXTERNAL authentication, and provides neither continuous connections nor connection pooling. We can access the subjects in PDS with an improved LDAP adapter. However, since we are moving away from an LDAP group registry because LDAP is neither the most capable nor the most flexible DBMS, we might consider whether LDAP should remain part of the new registry at all. Are we slow learners?

More to the point, this bifurcation forms a serious structural defect in Grouper. There is no ability to use efficient database joins when searching or updating the registry. Every reference to any subject requires an external procedure. By caching the information we need from PDS in a table within Grouper's database, we can provide quicker, more efficient and more efficacious registry queries and updates. Until and unless Grouper resolves this defect internally, these improved access methods will require direct access to Grouper tables, outside the provided API. Similar tables and procedures can provide subject sources for DNS names and ePPNs.

Institutional groups

We automatically provide many institutional groups: Affiliation, Budget, Classes, etc. These are kept up-to-date by periodic reconciliation, wherein the various sources of membership present to GWS a current and complete membership. Use of Grouper's loader (xmlImport) results in the group being removed and replaced. This is much too inefficient and is not feasible for large groups. A reconciliation approach is needed.

Reconciliation

The reconciliation process of old GDS (and GWS) directly updated the LDAP database. The new system reconciles through GWS's PUT membership interface. This allows convenient remote access, where clients would not have direct access to Grouper's native API. Grouper records a membership by the subject's netid-uuid, not by its UWNetID or name. The latter must be looked up by the separate subject system. Reconciliation is therefore much more efficient if memberships are presented as netid-uuid. The new system will support that.
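The core of a reconciliation, as opposed to remove-and-replace, is a set difference between the current membership and the complete membership presented by the source. The text does not describe how GWS computes this, so the following is only one way to do it; the class name is hypothetical and the member strings stand in for netid-uuid values.

```java
import java.util.Set;
import java.util.TreeSet;

final class MembershipDiff {
    final Set<String> toAdd;
    final Set<String> toRemove;

    private MembershipDiff(Set<String> toAdd, Set<String> toRemove) {
        this.toAdd = toAdd;
        this.toRemove = toRemove;
    }

    /** Computes the minimal changes that turn the current membership into the desired one. */
    static MembershipDiff between(Set<String> current, Set<String> desired) {
        // Members the source lists but the registry lacks.
        Set<String> add = new TreeSet<>(desired);
        add.removeAll(current);
        // Members the registry holds but the source no longer lists.
        Set<String> remove = new TreeSet<>(current);
        remove.removeAll(desired);
        return new MembershipDiff(add, remove);
    }
}
```

For a large group whose membership changes by a handful of people, only those few adds and removes touch the registry; the unchanged members are never rewritten.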

Tasks

  • Modify recon process (tegea) to use GWS.
    • cache of netid-regid relationships
    • processing of PDS ldif in lieu of the xml file now distributed by EDS

Directory provisioning

Grouper includes provisioning tools that should be able to keep an LDAP directory (or directories) up-to-date. There are two such provisioners: the old, not very capable version and the new, quite complex beta edition.

We have some specific requirements.

  • We want updates to the directories as soon as possible. Delays of even a few minutes are undesirable.
  • Membership changes to large groups (tens of thousands of members) tend to take a long time on an OpenLDAP server, which rebuilds its indices on each update. Thus these changes need to be batched.
  • Our course groups, on the directories, have a very different structure and are in a separate OU. In addition, they acquire attributes from Grouper extended attributes.
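The batching requirement can be illustrated by coalescing pending membership changes per group, so that each group receives one directory update instead of one per change. This is a sketch under assumptions: the `Change` record and `ChangeBatcher` class are hypothetical, and how a batch is actually written to the directory is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** One pending membership change; a hypothetical shape for illustration. */
final class Change {
    final String group;
    final String member;
    final boolean add; // true = add member, false = remove member

    Change(String group, String member, boolean add) {
        this.group = group;
        this.member = member;
        this.add = add;
    }
}

final class ChangeBatcher {
    /** Groups pending changes by group name, preserving first-seen order. */
    static Map<String, List<Change>> byGroup(List<Change> pending) {
        Map<String, List<Change>> batches = new LinkedHashMap<>();
        for (Change c : pending) {
            batches.computeIfAbsent(c.group, g -> new ArrayList<>()).add(c);
        }
        return batches;
    }
}
```

Each map entry can then be applied to the directory as a single modification of that group, which keeps the number of index rebuilds proportional to the number of groups touched rather than the number of changes.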

Usage statistics

See the usage statistics page.

People picking

Grouper's GUI uses a people picker mechanism to add members to a group. There is no other way. How they expect to comply with the many restrictive clauses of FERPA is beyond me. New GWS allows addition of members by UWNetID, but provides a person picker for authorized users as a convenience.

Small details

Groups as members of groups

Because GWS did not verify the existence of members as they were added, an administrator could add a group as member when he did not have permission to read the membership of that member group. Access was regulated as the group was referenced. Grouper requires that a user has read permission of a membership before a group can be added as a member of another group. We can probably adopt Grouper's policy without affecting existing clients.

Course members see course membership

Should members of a course, students and instructors, be able to see the membership of that course?


Implementation steps

Webservice application

Developed a new webservice that mimics the existing service and uses Grouper as a registry.

  • A Java application running on Tomcat
  • Uses Spring MVC, Velocity, and the Grouper API
  • Translates UW name separators (underscore) to and from Grouper separators (colon)
  • Automatically and silently manages stems

LDAP subject source

Developed an LDAP subject source, based on Grouper's JNDI source, that:

  • Uses the LDAP library from Virginia Tech
  • Uses persistent connections and connection pooling
  • Allows convenient certificate (SASL) authentication
  • Works with UW's two-OU directory structure
    • Account uuid (the subject's id) is in a separate OU from the subject's other attributes.

Cached subject source

Develop tools to keep cached information from PDS up-to-date. These are external to Grouper.

Implemented ePPN, DNS, and entityId subjects in RDBMS sources.

  • Provide tools to maintain subjects
  • Allow DNS and ePPN subjects to be automatically populated at first use.
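Populating a subject at first use amounts to a get-or-create lookup. The sketch below uses an in-memory map as a stand-in for the RDBMS subject table; the class name, the use of a random UUID as the subject id, and the absence of any validation of the name are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class AutoSubjectSource {
    // In-memory stand-in for the subject table: subject name -> subject id.
    private final Map<String, String> idsByName = new ConcurrentHashMap<>();

    /** Returns the id for the name, creating the subject the first time it is seen. */
    String resolve(String name) {
        // computeIfAbsent creates the entry at most once per name, even under concurrency.
        return idsByName.computeIfAbsent(name, k -> UUID.randomUUID().toString());
    }

    /** Number of subjects known so far. */
    int size() {
        return idsByName.size();
    }
}
```

A real source would likely check that the string is a plausible DNS name or ePPN before creating a row; that check is omitted here because the text does not specify it.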

Reconciliation

Modified the institutional group reconciliation processes to:

  • Access GWS rather than the LDAP directory
  • Specify account uuid (subject by id) memberships.

Activity history

Add capability to the new GWS webservice to access Grouper's built-in audit logs.

Directory provisioning

  • Add hook classes to catch activity relevant to LDAP and write it to an activity table in the database.
  • Add out-of-process tasks that use the activity table to keep the directories up-to-date. These can utilize the old GWS LDAP tools.

Fix present GWS to return course groups in "member-of" searches.

Task status

Move to Jira (link?)
