Note: this case is not itself a scenario, but rather a common theme extracted from the scenarios above.
Consideration of the previous two examples raises the issue of identity management, which is intimately connected with the FAM requirements for some of the use cases, in particular those that involve a long duration in some way. The current approach in the UK federation addresses identity only for relatively short durations, which is sufficient for the scenarios that it was originally intended to support. There seems to be a need for personal identifiers that are persistent (they always refer to the same individual), unique within the federation, and isolated as far as possible from changes in the user's personal attributes (such as a change of host institution).
The AAF (Australian Access Federation), which is the Australian equivalent of the UK Access Management Federation, has been looking into this issue. It has adopted an attribute auEduPersonSharedToken, which functions as an opaque and resolvable (by the owning IdP) identifier (thus far like eduPersonTargetedID), but which seems to fulfil the additional requirements identified above: it is unique (and cannot be reassigned) – no other user within the federation, at any other institution or time, can have the same identifier; it is not targeted – it does not depend on the SP; it is persistent – each time a user returns to an SP, his/her identifier is the same; it is immutable – a user’s attribute does not change once assigned; it is (in principle) portable, i.e. when a user moves institutions the attribute can be transferred.
In the AAF, the values of the token can be generated either by an IdP or by a federated service; both methods will be used by the federation, and the result will be the same in either case. Although these identifiers are in principle portable, portability has not yet been implemented in practice. This is a procedural issue rather than a technical one, as the technical solution described above makes portability possible. Initially, identifiers will be ported manually, although automated procedures will be investigated if needed. The roll-out of the AAF in Australia is underway but is not yet far advanced. It would be useful to investigate further at a later date, to see whether/how the implementation of auEduPersonSharedToken works in practice.
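The properties listed above (opaque, not targeted, persistent, immutable, portable) can be illustrated with a minimal Python sketch. It assumes a hash-based derivation from a local user identifier, the IdP's entity ID and a private seed; the function and class names, the choice of inputs and the choice of SHA-1 are illustrative assumptions, not a statement of the AAF specification.

```python
import base64
import hashlib


def shared_token(user_id: str, idp_entity_id: str, private_seed: str) -> str:
    """Derive an opaque identifier that is the same for every SP (hypothetical scheme)."""
    # The SP's identity is deliberately absent from the inputs, so the value
    # is "not targeted": every SP sees the same token for the same person.
    digest = hashlib.sha1(
        (user_id + idp_entity_id + private_seed).encode("utf-8")
    ).digest()
    # URL-safe Base64 of a 20-byte digest is 28 characters including one '='
    # of padding; stripping the padding leaves 27 characters.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


class TokenStore:
    """Stores each person's token once, so it never changes after assignment."""

    def __init__(self):
        self._tokens = {}

    def get_or_create(self, person_key, user_id, idp_entity_id, private_seed):
        # Immutability and portability: the token is computed only on first
        # use and then looked up, even if the inputs change later (for
        # example after a move to another institution).
        if person_key not in self._tokens:
            self._tokens[person_key] = shared_token(
                user_id, idp_entity_id, private_seed
            )
        return self._tokens[person_key]
```

The design point is that the value is generated once and then stored rather than recomputed: recomputing it after a change of institution would change the inputs and therefore the identifier, which is why porting is described above as a procedural matter of transferring the stored value.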
Some issues with transferring identifiers between institutions are: when an institution makes available a personal identifier that has been used (supposedly) by an individual at previous institutions, will it be asserting things about that individual’s previous history? Or conversely, will it be making assertions about future uses? Does this require them to trust or vet previous employers? Will this have any legal consequences for the institution (e.g. legal liability) if the assertion turns out to be false? Should a named authority service be used?
Several projects are of relevance here:
• the Names project, which is addressing name authority services and persistent, unique user identifiers in HE, using ID data sourced from ZETOC and UKPMC records. The availability of such a service, providing a persistent and reliable association of IDs with individuals, could be exploited to make the federation a source of authoritative claims to identity. Shibboleth could pass this information as an attribute; the issue is to get policies and processes defined.
• the recent JISC-funded e-portfolio report.
• GFIVO.
• MIAP (Managing Information Across Partners) and the Unique Learner Number (ULN) project. A ULN is a unique, opaque identifier that can be associated with a learner throughout their lifetime; MIAP provides a service for generating ULNs. There has already been discussion on including ULNs within the federation attributes.
• the National Information Standards Organization’s (NISO) work on persistent user identifiers.
• the current JISC call for an Identity Management Toolkit.
CERIF (Common European Research Information Format) is an EU-supported standard for modelling and exchanging research information, and is maintained by the not-for-profit organisation euroCRIS, which promotes the use of CRIS (Current Research Information Systems). CERIF/CRIS do not in themselves have anything to do with access management; however, the data model naturally incorporates a Person entity that persists independently of any affiliation. It is not clear whether any production CRIS implementations are underway, but there is a clear overlap of interests between CRIS/CERIF, identity management, and repositories of research outputs.
Other identity management issues worth investigation by JISC are:
• User-centric identity management, in which the assertions are made by the individual. With regard to this, look at the recent JIIE (JISC Integrated Information Environment Committee) presentation on self-assertion, and the recent SDSS study on OpenID, which is increasingly being used in social networking communities, and may also be applicable in HE and research environments. There are problems with the levels of assurance provided by an OpenID (see discussion of Scenario 7, below). However, it may be interesting to disaggregate the use of OpenID to identify oneself from the authentication/assurance aspects; the OpenID could then be used as an attribute within the federation, but with federation methods used to provide the trust. Alternatively, one could use OpenIDs supplied only by providers that conform to additional conditions, although this may undermine the benefits of OpenID. JISC may be able to use the SDSS study to influence take-up of OpenID (or a competitor) among resource providers.
• Allowing people to have multiple identities for different environments, e.g. using tools such as Cardspace.
i) The issue of persistent personal identification is important for JISC. JISC should commission a small feasibility study (desk research) to investigate whether, in the current (or near current) environment, there is a token that could be used to provide functionality similar to auEduPersonSharedToken. The study could focus on the Names project as the appropriate attribute store providing the values for this token. The study should be driven by examining the requirements of existing or potential SPs, and of stakeholders such as HEFCE and research funding bodies, and should investigate the rules around the token and its values.
ii) At a later date (when the AAF and auEduPersonSharedToken have been in active use for some time), carry out an assessment of auEduPersonSharedToken and of how it has worked in practice.
Some repository access will involve changes to the content of the repository, for example:
deposit of an object;
editorial activities such as modification of an object’s metadata;
annotation of an object;
administrative and maintenance activities.
The question arises: in such a case, is it important for repository managers to be able to determine who carried out the action at a later date? This would place requirements on the persistence of user identifiers, and on traceability of users. Typically, IdPs in the Federation are not required to keep records beyond six months, and eduPersonTargetedID may in any case be recycled. However, the need for detailed user-related provenance data may be an unusual case, the exception rather than the norm.
The following distinct cases were identified:
(a) In some situations there is a legal/regulatory requirement for tracking user-related provenance at the level of the individual over the long term, for example, when dealing with medical data. The question arises as to whether the federation level is the best approach to dealing with such cases. The medical case is probably beyond the current federation model because of the need to identify individuals; if the institution were liable for legal infringements, then it might be expected to keep records of who made changes to such data. On the other hand, in other cases it may be better to avoid “non-standard” mechanisms that circumvent the federation.
(b) In many cases the repository managers will not care exactly who took a particular action. It will be enough to know that the action taken was, when it took place, under the control of a trusted institution, and managed in a reliable way (for example, it is unlikely that anyone would want to track minor changes to metadata at this level). In addition, many of these changes will take place within an institution, and can be tracked by other mechanisms, outside FAM.
(c) However, there are cases in which a repository would want to track exactly who made changes to an object. One example is changes to a document that is being edited by several people (look at what Google Docs does, for example). Another is where objects are annotated during the research process; for various reasons it would be desirable to know who made particular annotations. For example, an annotation would be viewed differently if made by a professor than if made by a PhD student (although this particular example could be handled by using sufficiently fine-grained eduPersonScopedAffiliation).
Note: This Scenario is closely associated with issues of personal identity management – see Scenario 5 below.
It was agreed that JISC should concentrate on investigating cases (b) and (c), to determine real scenarios and their implications. Case (a) should be considered out of scope for the moment.
i) The scenario has two general implications for the federation: user anonymity and the (lack of) persistence for user IDs. The federation is amenable to accommodating persistent user IDs, and this scenario provides concrete use cases to support the inclusion of ID persistence in the federation rules (further such use cases arise in grid environments). However, there is a problem within the institutions, which say that they are unable to guarantee that IDs won’t be recycled. JISC needs to determine whether (and how) HE IdPs can be helped to address this issue.
ii) Carry out a detailed and systematic analysis of use cases, to identify those in which these issues (anonymity and persistence) are important, and to determine what IDs would need to be kept, how long they would need to be kept for, whether this is a necessity or a ‘nice to have’, whether users would be willing to reveal their identity, what the DPA implications are, and so forth. This is important as most repositories will have to address this eventually. At a later date, JISC will need to work with institutions on implementing these use cases.
iii) Develop a general vocabulary for expressing “active” access to repositories, that is doing things in repositories other than just search/browse/read access, with the aim of achieving consistency in expressing these actions as entitlements. There is a definite need for such a vocabulary.
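As an illustration of recommendation iii), the following Python sketch shows one way a small controlled vocabulary of "active" repository actions could be expressed as eduPersonEntitlement values. The action names are taken from the examples in this scenario; the URN prefix and all function names are hypothetical, since no such vocabulary has yet been agreed.

```python
from enum import Enum


class RepositoryAction(Enum):
    """Actions beyond search/browse/read, taken from the examples above."""
    DEPOSIT = "deposit"
    EDIT_METADATA = "edit-metadata"
    ANNOTATE = "annotate"
    ADMINISTER = "administer"


# Hypothetical namespace; a real vocabulary would need an agreed, registered prefix.
ENTITLEMENT_PREFIX = "urn:example:repository:action:"


def entitlement_for(action: RepositoryAction) -> str:
    """Return the entitlement value that expresses permission for an action."""
    return ENTITLEMENT_PREFIX + action.value


def permitted_actions(entitlements):
    """Map the entitlement values released by an IdP onto known actions."""
    by_value = {entitlement_for(a): a for a in RepositoryAction}
    # Values outside the vocabulary are ignored rather than treated as errors.
    return {by_value[e] for e in entitlements if e in by_value}
```

For example, `permitted_actions(["urn:example:repository:action:deposit"])` yields the single action `RepositoryAction.DEPOSIT`, which an SP could then check before accepting a deposit.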
A similar use case to 2A is where a resource’s owner has attached additional conditions around accessing/using a resource, and individual users must agree to a licence before they are granted access. For example, users may be obliged to agree to use a resource only for educational or non-profit purposes, or to observe copyright.
This can be implemented using eduPersonTargetedID; that is, this attribute can be used to identify repeat visits by the same user (subject to the restrictions on attribute re-assignment described above). SARoNGS uses such an approach, based on a registration process where users provide additional information beyond that supplied by the IdP. However, it may still be worth JISC negotiating at the national level particular values on eduPersonEntitlement in these cases.
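The idea of recognising repeat visits through eduPersonTargetedID can be sketched as follows. This is a minimal in-memory illustration in Python, not a description of how SARoNGS or any other service implements it; the class, method and licence names are hypothetical, and a real SP would persist the records and would have to allow for the attribute being re-assigned.

```python
class LicenceRegistry:
    """Remembers which opaque user identifiers have accepted which licences."""

    def __init__(self):
        # Maps eduPersonTargetedID value -> set of accepted licence identifiers.
        self._accepted = {}

    def record_acceptance(self, targeted_id: str, licence_id: str) -> None:
        self._accepted.setdefault(targeted_id, set()).add(licence_id)

    def has_accepted(self, targeted_id: str, licence_id: str) -> bool:
        return licence_id in self._accepted.get(targeted_id, set())


def may_access(registry: LicenceRegistry, targeted_id: str, licence_id: str) -> bool:
    """Grant access only when this identifier has already accepted the licence."""
    # A first-time visitor is refused here; the SP would then show the licence
    # and call record_acceptance() once the user agrees.
    return registry.has_accepted(targeted_id, licence_id)
```

Because the SP only ever sees the opaque identifier, the user can remain anonymous to the SP while still being recognised on a return visit.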
The UKDA (together with EDINA and MIMAS) has implemented a process whereby users are allowed to access distributed data collections when they have signed appropriate terms and conditions. In this mechanism, a central SP sets the eduPersonEntitlement value for the other SPs, which is conveyed to other SPs without the need for SPs to share attributes (which is not allowed). Ross agreed to share this solution.
None at present.
Related but more problematic scenarios occur where access crosses national jurisdictions. Again, we should identify separate cases that may require different treatment: (a) access within an HE institution that has international campuses; (b) access between different institutions, whether this is a matter of long-term cooperation between institutions, or shorter-term international research projects that require cross-border access to restricted material in distributed locations.
The UK Federation is focussed (naturally) on the UK, although technically this is not an absolute restriction; for example, if an international publisher were to become a member it would have the same rights as UK institutional members. Some countries, but not as yet all, have set up Shibboleth-based access management federations of their own. Future identity management and access management strategies must be able to work in such globalised, cross-federation environments.
International federated access gives rise to data protection issues, in cases which require cross-border transfer of personal data (when it is possible to avoid exporting personal data, these issues can be avoided entirely). Data protection legislation is quite well aligned within Europe, but outside it is more difficult, particularly where personally identifiable information is involved.
In case (a), there is a single institution, so users at international campuses may have identities provided by the home IdP; the University of London, for example, has used Shibboleth for exactly this purpose. Even here, however, there may be data protection restrictions in moving data between different national jurisdictions. The situation becomes even more complex if some operations and services are sub-contracted to local companies within the overseas jurisdiction, as these may not be under the institution’s control, and it may be difficult to apply sanctions in case of breaches. Depending on the nature of the data, it may be acceptable for there to be some “leakage” of restricted information, so long as there is a policy that is managed proactively, and violation is kept within reasonable limits (e.g. in cases of copyright infringement).
Some work has been done on inter-federation agreements, both between different US federations (state federations and the national InCommon federation), and between the UK and US federations. In Europe, there has been discussion within TERENA about federating European federations, and the Kalmar Union has been established as a cross-federation of the national academic identity federations for the Nordic countries. From a technical perspective, there should be little problem within the EU as the member states follow European law and data protection legislation is quite well aligned within Europe. The questions here concern risk and the fabric of trust – how far are SPs willing to go in accepting attributes from international IdPs, and thus will it be possible to obtain equivalent levels of assurance across the board? There is in addition the issue of consistency in publishing attributes across Europe; this has turned out to be hard enough even within the UK. Of course, work on inter-federation agreements does not help for those countries that do not have a federation.
JISC Legal has recently completed work on issues raised by moving data across borders: Feasibility of a cross-jurisdiction Common Access Management Federation Agreement. Also relevant here is ongoing work by the Article 29 Working Party, which is addressing the protection and processing of personal data across the EU, and work by Andrew Cormack from JANET.
These developments are of interest to JISC and should be monitored. JISC intends to look at inter-federation issues in a forthcoming programme, initially focussing on getting agreement for UK-US federations, then testing this process more widely.
It would be useful if JISC Legal could provide some guidance to HEIs on what they can and cannot do (Note: they are not allowed to give advice, only general information).
By “non-published” material in the repository, we mean material that is only accessible to an individual or to an identified group of people. Examples are: (i) “work in progress”, where the author may only wish a small group of users to have access; (ii) research data that is stored in a repository while it is still being worked with, and where access is restricted to the members of the research group; (iii) research funding bodies that want access to project outputs for management and reporting purposes; (iv) external examiners requiring access to material in a local repository; (v) e-portfolio use cases, where an external assessor needs to access repository material (e.g. for a job application). In all these examples the access may cross institutional boundaries.
In examples (iii)-(v), a number of external individuals are granted access to certain resources held in the repository. These individuals are in general all independent of one another. Assuming that their home institutions belong to the federation, and that they are employed by the hosting institution on a relatively regular basis, the simplest solution appears to be an eduPersonEntitlement provided by their home IdPs.
The situation in examples (i) and (ii) is more complex; here, a repository forms (part of) a working environment, or “virtual research environment”. Some may deny that such a role is appropriate for repositories; however, such systems are being developed and are likely to become more common in the future. A more specific example is collaborative word processing, such as Scribd and GoogleDocs; although the outputs are not at present stored in repositories, this may change – indeed, a JISC-funded project is already doing this for GoogleDocs. The research data scenario is common in grid environments, which are frequently used to manage data from large scientific projects. In some cases different data environments are used in parallel for different roles (raw data, derived data, associated material, backup, etc.), for example in the LHC model.
Role-based access management, and delegation rights for access management, are of relevance here. Much technical work in this area has been done – for example, the PERMIS and DyVOSE (Dynamic Virtual Organisations for e-Science Education) projects. This approach requires some actor in the scenario to look after the policies: in the “research group” scenarios (examples (i)-(ii) above) this might be the PI; in the “assessment” scenarios (examples (iii)-(v)) this might be the head of assessment.
A closely related issue is group management, and “virtual organisations”, in which access rights are given to groups of individuals (e.g. in VREs). In principle, groups could be defined by using eduPersonEntitlement to encode this information, but this would require the IdPs to be involved in group management, by setting up bilateral agreements between the SP and each of the IdPs. This is undesirable for a number of reasons: it would require coordination of the IdPs, which are external to the group, and whose managers may not appreciate the additional work; it would require the group to place significant trust in the IdPs, who would have full power to add the entitlement to individuals in their organisation.
An alternative is a system like VOMS (Virtual Organisation Management System), which is being used in the SARoNGS (Shibboleth Access to Resources on the NGS) project for grid access management. However, this is a heavyweight mechanism that may not be flexible enough to support all these use cases. It takes a significant amount of work to set up groups in this way (perhaps a week to set up a new VO). Although this is not too onerous for long-term research projects, in many cases it is necessary to set up ad hoc groups quickly from the bottom up. The VPMAN project has made some progress here, by bridging role-based privileges and VOMS, and this may be integrated into SARoNGS. The GFIVO project is addressing these issues using Grouper. Note: this is also relevant to Scenario 10.
Commission some early adopter projects piloting approaches to collaborative work in IRs. These projects should involve real users (teachers, learners and researchers) and should use existing collaborative tools. The projects could look either at collaborative authoring (e.g. of documents) or at collaborative work on research data.
In the case of research data, important issues to address are: (i) the granularity of the access decisions that need to be made; (ii) the need for lightweight group/VO management. These projects should build on previous work such as Grouper and GFIVO.
It would also be useful to produce a summary of relevant use cases and technologies, and to produce a synthesis.
A user’s interaction with a repository may be personalised, for example by saved searches or tailored notifications of new material. Personalisation is by no means specific to repository environments, and it is not clear that repositories raise any particular personalisation issues that do not arise in other environments. However, it is still worth mentioning in this context. Within JISC, the DPIE2 (Developing Personalisation for the Information Environment 2) project investigated how the JISC Information Environment (IE) could make use of personalisation for users of JISC services, and the results of the study are contained in the final report. In particular, the report looks at how the UK Federation infrastructure could support personalisation, for example with extended attributes (Section 2.7 of the report), and some potential privacy and legal issues raised (Sections 4 and 5). The report also proposes some demonstrator projects (Section 7).
There is an issue created by the federation rules, which ‘prevent’ the sharing of attributes between SPs. In cases where users can access resources via multiple routes (i.e. via multiple SPs), this makes personalisation difficult. However, SPs are permitted to share attributes when they have a valid reason and they have obtained consent; this boils down to a decision based on balancing the need to manage risk with the potential benefit to the user.
JISC to keep a watching brief on this area, which in any case is a focus for the JISC IE.
It may be necessary to make a resource available to different degrees in the case of different users. Examples of this are: (i) medical data containing personal information may need to be anonymised for one set of users, but not for (say) the patient’s own doctor(s); (ii) some parts of a thesis may be made available immediately, others only after a certain period has elapsed; (iii) a thesis may contain copyrighted material that cannot be made available, such as extracts of audio, sheet music or lyrics in a music thesis.
It may be required to hide datasets at different levels of granularity: hiding entire datasets, hiding rows, hiding columns. Some grid data management systems allow access control to be defined for individual rows, although this is harder for columns.
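The three levels of granularity can be illustrated with a short Python sketch that treats a dataset as a list of dictionaries. It is only a toy model under stated assumptions: the function name, the field names and the idea of passing a row predicate are hypothetical, and it does not represent how any particular grid data management system enforces access control.

```python
def visible_view(rows, hidden_columns, row_is_visible):
    """Return the part of a dataset that a given user is allowed to see.

    rows           -- list of dicts, one per row of the dataset
    hidden_columns -- set of column names to withhold from this user
    row_is_visible -- function(row) -> bool deciding row-level access
    """
    return [
        # Column-level hiding: drop withheld fields from each surviving row.
        {name: value for name, value in row.items() if name not in hidden_columns}
        for row in rows
        # Row-level hiding: skip rows the user may not see at all.
        if row_is_visible(row)
    ]


def hide_everything(row):
    """Row predicate for a user who may not see the dataset at all."""
    return False
```

Hiding the entire dataset is then the special case in which no row passes the predicate, for example `visible_view(rows, set(), hide_everything)`.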
A number of projects are looking at “marking up” data in some way to define what people can access: ASPiS and iREAD (investigating this in context of iRODS data grids); SPIDER (investigating creation of perimeters around certain subsets of data); AGAST (using RDF to define restrictions).
The metadata may also be subject to access control, both for humans and for machines, for example web crawler robots. EGEE projects such as AMGA can mark up metadata in this way. A special case of this is when even the knowledge that particular material exists is subject to restrictions, for example in the case of certain types of medical material. Consequently, access management needs to be applied to metadata as well as to the resources themselves, and must be taken into account when carrying out (federated) searches.
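A minimal Python sketch of applying access control to metadata during a search follows. The record structure, the entitlement value and the function name are assumptions made for illustration; the point is only that a restricted record is filtered out before results are returned, so that its existence is not revealed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    identifier: str
    title: str
    # Entitlement needed to see that this record exists at all;
    # None means the metadata is open to everyone (including crawlers).
    metadata_entitlement: Optional[str] = None


def search(records, term, user_entitlements):
    """Return matching records whose metadata the user is allowed to see."""
    term = term.lower()
    return [
        r for r in records
        if term in r.title.lower()
        and (
            r.metadata_entitlement is None
            or r.metadata_entitlement in user_entitlements
        )
    ]
```

In a federated search the same filter would need to be applied by each target repository, or by the search service on their behalf, before results are merged.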
Not all access to repositories will involve a researcher sitting at a web browser. Other possibilities include the following, neither of which is currently handled easily by Shibboleth:
(a) Access from a desktop client, which may allow updates in some form (e.g. a metadata editor, a client performing multiple inserts). Examples of relevant projects and approaches include: HERMES (see below); SWITCH (the Swiss access federation), which provides access to EGEE (grid) resources; Endnote (Z39.50); use of myProxy for access.
(b) Access that does not immediately involve humans at all, for example from a workflow that consumes digital objects from a repository and inserts new digital objects (although ultimately there will be a person involved somewhere, e.g. a researcher executing the workflow). Typically these operations on a repository will occur via web services that are exposed by the repository. Other relevant approaches: n-tier authentication and authorisation.
HERMES is part of the ARCHER project (see above), and is a desktop client tool for browsing research data and metadata held in SRB-based data grids. What is interesting from our point of view is that, although it is not a web-based tool, it supports Shibboleth-based authorisation by using Identity Selector technologies, specifically CardSpace (on Windows) or DigitalMe (on Linux or Mac OS), to interface to Shibboleth IdPs.
This solution is however highly specific to the particular task. It is not clear whether such workarounds as are used in these cases offer generalisable solutions, nor whether the newer version of Shibboleth – 2.0 – offers any advantages for these scenarios; the UK federation currently uses version 1.3. A comparison in this regard would be useful.
Complexities can also arise when carrying out federated searches across distributed licensed/restricted resources that require AuthN/AuthZ, for example using MetaLib. MIMAS has to combine a number of forms of AuthN/AuthZ (including, say, different usernames/passwords) to handle such searches, as not all resources will have http/browser access. The situation can also be further complicated when a hosted MetaLib instance is used and the same IP(s) are shared by different institutions with differing entitlements.
There is significant interest in and need for an approach to Shibboleth that works outside http; this is demonstrated by the number of ad hoc solutions and workarounds that are being developed. The fragmented, ad hoc nature of these solutions makes the need for a generic approach all the more urgent. Internet2 is interested in supporting non-browser access, and may take this interest further if sufficient demand could be demonstrated. There are various relevant activities ongoing at Internet2, including REST interfaces, Kerberos, etc., but it may be better not to wait for Internet2 to take the initiative.
Document the various ad hoc solutions and workarounds, and on the basis of these examples identify and specify best-practice workarounds. Also, define some detailed use cases to help spell out the requirements in this area more clearly. SDSS may be in a good position to carry out this work and pass it on to the Internet2 team as a basis for development work.
This scenario covers two cases: (a) users at institutions that are not in the federation (or which do not possess an IdP); (b) users that are not affiliated with an institution at all. Different solutions may be appropriate for these cases.
An example of (a) is provided by academic research groups that cross over into non-academic environments. A case worth highlighting is in medical research, where a group needing access to a dataset (and in particular a dataset that has particular security and privacy issues) may well include academics and employees of NHS trusts that are associated with the university, but are not members of the university. Another example: users from commercial organisations, who may, for example, be collaborators in a research project; the issue is particularly pressing when the collaboration is sporadic or short-term and the partners are SMEs, rather than for continuous collaboration with large industrial partners. The JISC-commissioned study on access and identity management in BCE (Business and Community Engagement) contexts is of importance here, as it contains several relevant use cases involving commercial organisations.
For long-term relationships, such as that between a university medical school and an NHS hospital, the easiest solution may be to set up one’s own federation to cover the users involved. Although there is some administrative burden involved, this can be achieved easily within the current framework, as has been done at Kidderminster College and Cardiff University, each of which has its own internal IdP and SP. Such arrangements are also quite common in the US.
Shorter-term projects could also use this approach, although the overheads in setting it up would be proportionally greater. A simpler approach would be to use collaboration mechanisms that lie outside the federation (e.g. Google Apps), then rely on staff within the institution for interaction with the “real” repository (e.g. depositing data). This approach is more informal, but we need to avoid developing overcomplicated solutions that do not offer real benefit. It would be useful to identify and promote good practice for using such third-party tools and services.
Examples of (b) are private researchers/scholars (not unusual in some disciplines, such as the humanities and astronomy), and members of the general public or specific communities who are submitting information in a Web 2.0-type environment, e.g. for cultural, anthropological or social history programmes.
Various ways of approaching such cases were discussed:
• It may be possible to incorporate independent researchers by allowing someone in a federated institution, or via a professional society (such as the British Academy), to vouch for them and take responsibility for them. However, it is not clear that professional societies do a high level of identity checking, so the level of assurance would be low.
• OpenID is sometimes proposed as a potential solution to this sort of problem; while it has its uses in the wider world, it may have limitations regarding security and privacy that make it unsuitable for at least some of the cases that we have in mind. It offers a low level of assurance, but could be useful for repositories where this is sufficient, e.g. social networking extensions to repositories. Signing up for a federation requires an organisation to agree to certain rules; on the other hand, obtaining an OpenID does not constrain a user to agree to anything much. A JISC-commissioned report by EDINA on OpenID has just been published, and although a number of vendors are now issuing OpenIDs, it turns out that no IS/IT director would accept an OpenID for access to any of their services, including their repository.
• There are a number of third-party services that offer user IDs, for example ProtectNetwork and TypeKey, but it is not clear how many SPs accept these IDs – they are used mainly for blogs and wikis.
• Adopt the “home for the homeless” approach used by SWITCH in Switzerland, whereby independent users can register with a special IdP that is included in the federation. This has some similarities with OpenID, and the level of assurance may be low, depending on the level of vetting that is applied to applicants.
• One way of obtaining greater assurance would be to adopt an approach similar to that used for obtaining certificates, where local Registration Authorities require personal attendance and high-assurance photo IDs (such as passports) from applicants. Perhaps non-institutional researchers could register at a Post Office, or some other widely distributed and accessible body. There could also be a digital equivalent of the mechanism whereby university libraries allow access to non-members from other institutions.
• It may be possible to exploit the Government Gateway scheme, which can be used to access online government services such as the Inland Revenue, as an independent IdP. This scheme would provide a high level of assurance that the ID was genuine, as applicants require an NI number and lots of checks are carried out, and SPs might thus be more inclined to accept it.
It was concluded that some form of external IdP scheme would be appropriate. Such a scheme has been proposed for the federation before, but until now there has been no proven demand for it. It would be easy to set up but would require significant effort to maintain – we would need to identify potential bodies to run these IdPs, bodies that are trusted enough to provide sufficient assurance for most use cases. We must also determine the level of need for this – among NGS users, approximately 96% of certificated use of the NGS could be covered by federation IdPs; however, these are probably not typical of the users that we want to cover. Note that JISC has tried to commission work around this before, but received no response to the ITT.
Identify the available tools and services supporting collaboration (outside the federation), determine their advantages and disadvantages, and document best practice for using them.
Commission a study into external IdP schemes, as described above. This study would need to: (i) examine and quantify the need for such a scheme in the community; (ii) scope out use cases based on real user needs; (iii) identify the associated sustainability issues; (iv) identify bodies that may be able to maintain the IdP.
Many of the digital objects in a repository will continue to exist over the long term, possibly with modifications relating to preservation requirements. It is essential from a preservation point of view that provenance/audit information is preserved about these objects (who did what, when and how); otherwise the authenticity of the objects and the trust placed in the repository cannot be guaranteed. These scenarios also require personal identification that is either persistent and immutable, or at least that accommodates changes to the personal identifiers (e.g. by allowing identifiers for a common individual to be linked).
This scenario may be regarded as a special case of Scenario 3 (b)/(c) above, and actions for JISC would be special cases of the actions arising from Scenario 3. Specifically, we would need to develop a model of the actions that can be taken and the events that can occur in curation, and analyse the corresponding curation use cases. JISC could use its contacts with the curation/preservation community to determine to what extent it is important to record “who” (person name, role, etc.) took particular actions.
This website aims to provide initial points and facilities to stimulate discussion, not to carry out a complete survey of all related issues. It contains the following sections:
• A broad statement of the issues being addressed.
• An overview of the current approach to Federated Access Management (FAM) in the UK HE environment.
• A set of brief scenarios describing different aspects of FAM in digital repository environments. The list does not aim at completeness; it is an initial list to provoke discussion.
The scope of the consultation assumes that federated access management is a requirement for repositories; it does not infer this as a solution to previously-identified requirements. This assumption was made by JISC as a pragmatic approach to developing a tentative programme of work, which will be subject to further iterative development. The aim of the consultation is to identify scenarios related to teaching, research and administration, in which repositories are involved and where access management is an issue; and then to gain feedback from people experienced in FAM and repositories about the issues raised. Some of the scenarios relate to identity management rather than to access management per se; however, these scenarios arose naturally out of FAM considerations, and of repositories as service providers within the federation. A key question is how to prioritise the scenarios with a view to future funding programmes by JISC, taking into account criteria such as: importance and urgency, feasibility of implementation, and whether solutions are better implemented within or beyond the scope of the federation.
In addition, there are various projects looking at issues of federated access management, some developing software, others looking at procedures and policies. These projects do not all address repositories in the narrow sense; they do, however, address in various contexts issues that are also of relevance to repositories, and it is useful to consider how we can make best use of the results of this work, and determine which outputs are usable, sustainable, open, standards-conformant, of high quality, generalisable, and so forth. In particular, some projects produce a solution by getting round a restriction imposed by the federation, which leads us to consider not only what can be achieved within the federation as it currently stands, but also how requirements arising from the demands of repositories may influence the future direction of the federation.
The results of this consultation will be collated into a fuller report on the issues surrounding FAM in digital repositories, which will inform JISC funding decisions.
2. Problem statement
Digital repositories are changing, both in the type of content that they hold and in the ways in which they are used. Earlier exemplars managed relatively simple objects, such as pre-prints and publications, but increasingly institutions are starting to use repositories to manage complex research data in a variety of disciplines, in part as a result of various programmes funded by the JISC. Whereas a major motivation in setting up and populating repositories has been (and is still) to make the results of research available to a wider audience, where possible following open access principles, this broadening of content raises important issues of access management.
In addition, repositories are moving beyond the stand-alone model, towards more sophisticated scenarios in which repositories are integrated components of wider research and teaching infrastructures that incorporate a variety of services, tools and workflows. Digital repositories, although often managed on an institutional basis, are no longer separate silos of data, but form part of a wider ecosystem of dispersed data, services and actors supporting collaborative research and education. New hybrid networks are emerging, where formal documents in institutional repositories stand alongside more informal documents such as blogs and wikis.
As repositories become integrated into such wider networks, we are faced with new access management challenges. Repository architectures have to control access not only to single, isolated systems, but to cross-institutional federations of data and services. Repositories have to be ‘sure’ of the users who access the services they provide, and also of the authenticity of the digital objects they contain. Authentication, authorisation and identity management require policies and mechanisms that work across institutions and jurisdictions. There is a need to examine these requirements, and uses made of repositories and repository services, in the context of the UK Access Management Federation.
3. FAM in the UK: the current situation
Federated access management in the UK higher education community is provided by the UK Access Management Federation for Education and Research, which is supported by JISC and Becta and operated by JANET (UK). The federation is based on Shibboleth, a solution that has also been adopted by other national federations, e.g. in the USA, Australia, Germany and Switzerland. Developed by Internet2/MACE, Shibboleth is based to a great extent on the OASIS Security Assertion Markup Language (SAML). SAML 1.0 became an OASIS standard in November 2002, and a major revision (SAML 2.0) was released in March 2005. In essence, SAML is an XML-based language for defining the exchange of authentication and authorisation data, and it also defines functions to create and manage federated networks that combine and appropriately share pre-existing identity information.
Shibboleth supports cross-domain attribute-based authorisation while preserving user privacy. The UK federation allows a participant organisation to join as an Identity Provider (IdP), allowing its users to access resources throughout the federation while managing their identity locally, and/or as a Service Provider (SP), allowing resource managers to control access to restricted resources to users from both within and outside their home institutions.
At a technical level, the Shibboleth mechanism and information flow are essentially linked to the http protocol and to browser-based access. An IdP issues SAML assertions to SPs upon request, and an SP consumes SAML assertions obtained from IdPs for the purpose of making access control decisions. A third component, a WAYF (“Where Are You From?”) service, is set up to broker trust between participating organisations with common needs. In technical terms, the WAYF defines the federation: the SPs that the WAYF will redirect, and the IdPs that the WAYF provides as options.
On a policy level, the federation is wedded to the concept of authorised but anonymous access to resources. The only attribute that an IdP in the UK Federation is obliged to release is eduPersonScopedAffiliation, which, for example, may indicate that a person is a staff member at King’s College London, although already some services require more information to be released from an IdP. There is some requirement on traceability from the SP to the IdP: it is possible for an SP manager to identify, from the Session Identifier associated with an access, the IdP with which the user is associated. The federation rules require the IdP to take action if the user misuses privileges, but still the user’s anonymity is maintained. Optionally, if the institution subscribes to Section 6 of the Federation rules, institutions can offer stronger identity assurance.
As well as being subject to release restrictions, with a view to preserving anonymity, personal identifiers used within the federation are also not persistent. There are two commonly used attributes that can identify an individual: eduPersonPrincipalName and eduPersonTargetedID. The former is commonly regarded as the federation equivalent of a username, and typically resembles an e-mail address, combining a locally unique identifier with a domain to indicate the scope of the identifier. For the privacy reasons mentioned above, this attribute is not in general released to SPs. The latter is an opaque identifier that provides an SP with a unique identifier for an individual while maintaining that individual’s privacy. The eduPersonTargetedID attribute is unique for each combination of person and SP, and will be the same each time that person returns to the same SP.
However, in neither case is the attribute persistent in the long term: the values can be re-assigned to other users after 24 months, or even earlier if the user’s organisation does not subscribe to Section 6 of the Federation rules. Thus, at different times, the same attribute value may correspond to different people. Moreover, the identifiers are local to the individual’s institution: if a user moves institutions, their identifiers will change, and there is no way of linking the old and new values. While this is not an issue when managing access to library-style resources, it is important in other scenarios. The federation is aware of the need for a rule preventing such attribute re-use, but currently many institutions would be unable to meet such a requirement.
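The behaviour of an opaque, per-SP identifier such as eduPersonTargetedID can be illustrated with a minimal sketch. This is not Shibboleth's actual implementation: the function `targeted_id`, the use of an HMAC, the secret and the example.ac.uk entity IDs are all assumptions made for illustration. The sketch shows that the value is stable for one person at one SP, differs between SPs, and changes when the person moves to another institution.

```python
import hashlib
import hmac


def targeted_id(local_user: str, sp_entity_id: str, idp_secret: bytes) -> str:
    """Derive an opaque identifier for one (person, SP) pair.

    Hypothetical helper: a keyed hash, so the SP cannot recover the
    local username and two SPs cannot correlate their identifiers.
    """
    message = f"{sp_entity_id}!{local_user}".encode("utf-8")
    return hmac.new(idp_secret, message, hashlib.sha256).hexdigest()


# All values below are illustrative assumptions.
secret = b"secret-held-by-the-home-idp"
repo_a = "https://repository-a.example.ac.uk/shibboleth"
repo_b = "https://repository-b.example.ac.uk/shibboleth"

first_visit = targeted_id("jsmith", repo_a, secret)
second_visit = targeted_id("jsmith", repo_a, secret)  # same person, same SP
other_sp = targeted_id("jsmith", repo_b, secret)      # same person, other SP

# After a move, a different IdP (different secret, different local name)
# produces an unrelated value for the same SP.
after_move = targeted_id("john.smith", repo_a, b"secret-at-new-institution")
```

Here `first_visit` and `second_visit` are equal, while `other_sp` and `after_move` differ from them, which is why a repository cannot on its own link a returning user who has changed institution.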
The Federation encourages publication only of the four standard attributes that it defines as core: eduPersonTargetedID, eduPersonPrincipalName, eduPersonEntitlement, and eduPersonAffiliation. Note also that the values assigned to eduPersonAffiliation can have different levels of granularity, e.g. an institution can publish “staff” and “student” or just “member”. The values of eduPersonPrincipalName are to some extent standardised but require common interpretation within the Federation. The values of eduPersonEntitlement are more open-ended (but see (ii) in next section).
Initially, much of the use made of the UK federation was based around the model of managing web browser (read) access to online, library-type resources (often from publishers), as a federated successor to the Athens system. Shibboleth was, however, developed (in the US) with a view to using it in a variety of institutional applications, and there are now a number of other sorts of application using Shibboleth, for example blogs, wikis, video-conferencing and e-mail systems; although, at least in the UK, such applications are not made available via SPs within the federation, but are used purely within the institution. There is also increasing support for Shibboleth-managed access to other, non-publisher, resources, for example JISCmail, JORUM, Google Apps, iTunes U (supplying access to online course content) and DreamSpark (supplying Microsoft development tools to students).
However, digital repositories were possibly not high priority when the federation was set up, and people are increasingly struggling against the set-up of the federation to make it do what they want to do. In these circumstances, it seems to be an appropriate time to identify and spell out these use cases and requirements.
One potential way of resolving the conflict between privacy requirements and the need to identify users is to investigate scenarios in which users are asked for their consent to the release of personal information. The DPIE2 (Developing Personalisation for the Information Environment 2) project has considered this (see below), and there are some related current and upcoming JISC ITTs: the Identity Management Toolkit and Personalisation: developing practice for user consent for data, which will look at how user consent might work in a UK HE context.
In SARoNGS, a project that is looking at Shibbolising access to the NGS, the anonymous eduPersonTargetedID supplied by the IdP is linked to an identifier that identifies the user, specifically (in this case) the user’s email address. However, this is requested from the user, not from the IdP (which would in general be unable to supply such personal data).
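The linking step described above can be sketched as follows. This is a hedged illustration only, not SARoNGS code: the class `RegistrationStore`, its methods and the example address are hypothetical. The point it shows is that the SP keeps its own mapping from the opaque identifier to an e-mail address that the user, not the IdP, supplied on first access.

```python
from typing import Dict, Optional


class RegistrationStore:
    """Hypothetical SP-side store linking opaque IDs to user-supplied e-mail."""

    def __init__(self) -> None:
        self._email_by_targeted_id: Dict[str, str] = {}

    def lookup(self, targeted_id: str) -> Optional[str]:
        """Return the registered e-mail address, or None on a first visit."""
        return self._email_by_targeted_id.get(targeted_id)

    def register(self, targeted_id: str, email: str) -> None:
        """Record the address the user typed in; the IdP never releases it."""
        if "@" not in email:
            raise ValueError("not a plausible e-mail address")
        self._email_by_targeted_id[targeted_id] = email


store = RegistrationStore()
opaque = "opaque-id-0001"  # stands in for a released eduPersonTargetedID

first = store.lookup(opaque)       # first access: nothing known yet
store.register(opaque, "j.smith@example.ac.uk")
returning = store.lookup(opaque)   # later access: the link is found
```

Because the address is self-asserted, the SP has to trust the user to enter it correctly and takes on the burden of maintaining it.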
On a technical note, Shibboleth 2.2 will support attribute release with user consent, based on the SWITCH ArpViewer (ARP = Attribute Release Policy).
None. This scenario is covered adequately by the current and upcoming JISC ITTs mentioned above.
Organic access management, where the resource owner is directly involved in determining who has access to resources under their control. The central idea here is that people are willing to share in self-selected communities. This scenario is likely to be most applicable to more informal resources (e.g. presentations, videos) in “social” repositories (e.g. Yahoo groups, MyFlickr). This is not to say that they are any less serious from an academic perspective – for example, the repositories of scientific workflows facilitated by the myExperiment project. Similar approaches have been taken up in the arts and humanities, where small, informal communities are using (e.g.) a wiki or blog as a “repository” and annotation environment, with access control managed locally by the administrator – examples are arts-humanities.net and The Digital Classicist. It would be useful to be able to trust people and to provide an ad hoc and lightweight mechanism for delegating the ability to grant access for particular resources.
If researchers are doing things for themselves, not via their home institutions, then this may fall outside the scope of what JISC needs to support. In any case there may be sustainability risks incurred by material being ‘out there’ on Google Docs, wikis, etc. If researchers are working through their institutional IdPs, this scenario is covered elsewhere, e.g. the group management work described in Scenario 6 – although more lightweight mechanisms are desirable.
Projects relevant to this issue include Grouper, GFIVO, the SWITCH (Swiss access management federation) group management tool, and OAuth.
It may be a useful exercise to: (i) document these requirements in more detail, and (ii) determine how much such systems are used in HE. However, this is not of high priority for JISC as the scenario is for the moment sufficiently well covered outside the federation and in other scenarios.
Much work on digital repositories has focused on issues surrounding Open Access to research outputs (generally pre- or post-prints). If access to an object is truly open, then it may seem that access management is not an issue. However, we may distinguish two cases:
• The SP has no interest in who is accessing a resource.
• Access to resources is open, but the SP is still interested in capturing information about usage.
In the latter case, FAM becomes an issue. On the one hand, personal information may be used purely for statistical purposes rather than for tracking individuals; on the other hand, information relating to specific individuals may also be of interest. Two examples of the latter are: (i) recording usage data for accounting and audit purposes in grid environments such as the NGS (National Grid Service); (ii) arXiv, which uses IP addresses to block users who download material indiscriminately. Systematic collection of usage data for OA material may also be useful for marketing an IR, providing feedback to users, and development and dissemination of the arguments for OA among researchers and others.
A common theme which arises here is the concept of “user registration”, where users are required to provide information additional to the standard Shibboleth attributes the first time that they access a resource: examples include the SARoNGS (Shibboleth Access to Resources on the NGS) project for the NGS, and the EThOS system, which requires registration for access to Open Access e-theses.
In addition, personal information may also be useful for personalisation, as in the GoldDust project. People are willing to trade personal information for benefits so long as this is done openly and up front. If users are required to register and log in then the repository is no longer Open Access, but with the carrot of added functionality, users may be encouraged to register on a voluntary basis. If this is not the case, people are likely to object; indeed, the current federation rules place strict restrictions on the uses that may be made of personal data, as anonymity is a current priority.
A general issue here is that the current federation rules place strict restrictions on how long logs can be kept for and on the uses that may be made of personal data, as anonymity is a current priority. However, there is some flexibility – logs can be kept for longer than the standard 6 months if the IdP agrees – and the rules are open to modification if an appropriate justification can be made.
For some use cases, the limitation on how long logs can be kept is not an issue; the raw information is needed only in the short term, and once the logs have been used to generate statistical information it is not clear that they are needed further. In any case, the logs will be of little use in the long term as the IdP will not maintain in perpetuity the mapping between eduPersonTargetedID and named individuals. On the other hand, if the repository management wants to look at longer-term trends, and the effects triggered by external events (such as declarations and other publicity related to OA), then it is difficult to predict what raw data may be needed for analysis in the future. These considerations are closely linked to recent JISC work on Usage Statistics.
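The "aggregate, then discard" pattern mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed procedure: the record layout, the function names and the approximation of the standard 6 months as 183 days are all hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

RETENTION = timedelta(days=183)  # assumption: roughly the standard 6 months


def monthly_downloads(records):
    """Count accesses per (year, month, item); no identifiers are kept."""
    return Counter((r["day"].year, r["day"].month, r["item"]) for r in records)


def purge_expired(records, today):
    """Keep only raw records that are still inside the retention window."""
    return [r for r in records if today - r["day"] <= RETENTION]


# Illustrative raw log entries keyed by an opaque identifier.
raw_log = [
    {"day": date(2009, 1, 10), "targeted_id": "a1f3", "item": "thesis-17"},
    {"day": date(2009, 1, 22), "targeted_id": "9c0e", "item": "thesis-17"},
    {"day": date(2009, 8, 3), "targeted_id": "a1f3", "item": "dataset-4"},
]

stats = monthly_downloads(raw_log)                   # aggregate first
raw_log = purge_expired(raw_log, date(2009, 9, 1))   # then drop old raw data
```

The limitation noted in the text is visible here: once the January records are purged, any later question that was not anticipated by `monthly_downloads` can no longer be answered for that period.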
Anonymity can work to the disadvantage of users. For example, users may not always interact with repositories in the most effective way; in such circumstances, the repository managers would be able to help the users if they were able to trace and contact them, but this will not be possible unless the IdP agrees to put the SP in touch with the user. It would be possible to ask users whether they want this sort of feedback the first time that they access the repository, but this means that the SP would have to trust them to enter their information correctly, and would take upon itself the burden of maintaining the data – it would be better to push this back onto the IdP where it belongs.
An important theme that arose was ‘registration’ – getting people to provide additional personal information the first time that they access a repository. This was recognised as an area of work common to several scenarios, e.g. personalisation, consent management.
A resource may be published and widely available yet still subject to access restrictions within a digital repository (SHERPA/RoMEO maintains an extensive database of the copyright and self-archiving policies of publishers). For example, a publisher may restrict access to a post-print in an institutional repository to subscribers to the journal in which it was included. This is analogous to a library allowing its members to access online journals to which it has subscribed; however, the situation is more complex, as the resources are scattered in IRs and not grouped according to publishers’ access regimes (in a conventional library situation they are grouped within journals). An IR cannot be expected to manage information about which institutions have access to a particular journal.
It may be possible to support this sort of scenario by using the eduPersonEntitlement attribute – the value of this attribute must be a URI indicating a (set of) right(s) to a particular resource. To be useful in a federated environment, however, the appropriate values would have to be agreed at a federation level, otherwise such an SP would have to negotiate bilateral agreements with all its subscriber IdPs individually to get the extra attributes, which is possible but probably impractical. A way of avoiding this would be for JISC to negotiate standard entitlements at a national level to enable repositories to share material, by including T&Cs for (e.g.) NESLI (National e-Journals Initiative) material. Currently, JANET makes no statements about any standard values for eduPersonEntitlement.
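A minimal sketch of how a repository might use eduPersonEntitlement for this purpose is given below. The entitlement URI, the item identifiers and the function `may_access` are hypothetical; as noted above, no standard values have been agreed, so a real deployment would depend on values negotiated at federation or national level.

```python
from typing import Dict, Iterable

# Hypothetical mapping from restricted items to the entitlement URI that a
# publisher requires; items absent from the mapping are treated as open.
REQUIRED_ENTITLEMENT: Dict[str, str] = {
    "postprint-123": "https://entitlements.example.org/journals/j-0001/subscriber",
}


def may_access(item_id: str, released_entitlements: Iterable[str]) -> bool:
    """Decide access from the eduPersonEntitlement values the IdP released."""
    required = REQUIRED_ENTITLEMENT.get(item_id)
    if required is None:
        return True  # no publisher restriction recorded for this item
    return required in set(released_entitlements)


subscriber = ["https://entitlements.example.org/journals/j-0001/subscriber"]
non_subscriber = ["https://entitlements.example.org/journals/j-0002/subscriber"]

allowed = may_access("postprint-123", subscriber)
refused = may_access("postprint-123", non_subscriber)
open_item = may_access("preprint-9", [])
```

With this design the IR only records which entitlement each item requires; knowledge of which institutions subscribe to which journal stays with the IdPs that assert the values.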