Scenario 3: Updates to repository

Some repository access will involve changes to the content of the repository, for example:

 deposit of an object;

 editorial activities such as modification of an object’s metadata;

 annotation of an object;

 administrative and maintenance activities.

The question arises: in such cases, is it important for repository managers to be able to determine, at a later date, who carried out the action? This would place requirements on the persistence of user identifiers and on the traceability of users. Typically, IdPs in the Federation are not required to keep records beyond six months, and eduPersonTargetedID may in any case be recycled. However, the need for detailed user-related provenance data may be the exception rather than the norm.
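The traceability constraint can be made concrete with a minimal sketch. It assumes a hypothetical provenance record kept by the repository and treats the six-month retention period as roughly 183 days; the class, field names, and example identifiers are illustrative and not part of any federation specification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: "six months" approximated as 183 days.
IDP_LOG_RETENTION = timedelta(days=183)


@dataclass(frozen=True)
class ProvenanceRecord:
    """What a repository might store for each change (hypothetical schema)."""
    object_id: str
    action: str          # e.g. "deposit", "edit-metadata", "annotate"
    idp_entity_id: str   # the institution that vouched for the user
    targeted_id: str     # opaque eduPersonTargetedID value as received
    timestamp: datetime


def individual_still_traceable(record: ProvenanceRecord, now: datetime) -> bool:
    """True while the IdP can be expected to still hold logs for this action."""
    return now - record.timestamp <= IDP_LOG_RETENTION


record = ProvenanceRecord(
    object_id="oai:repo.example.org:1234",                 # made-up identifier
    action="annotate",
    idp_entity_id="https://idp.example.ac.uk/shibboleth",  # made-up IdP
    targeted_id="opaque-value-from-idp",
    timestamp=datetime(2009, 1, 15, tzinfo=timezone.utc),
)

# Six weeks later the IdP should still be able to resolve the user.
print(individual_still_traceable(record, datetime(2009, 3, 1, tzinfo=timezone.utc)))
# A year later only the institution, not the individual, can be relied upon.
print(individual_still_traceable(record, datetime(2010, 1, 15, tzinfo=timezone.utc)))
```

Under these assumptions the record always identifies the responsible institution, while the link to an individual lasts only as long as the IdP keeps its logs and does not recycle the identifier.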

The following distinct cases were identified:

(a) In some situations there is a legal/regulatory requirement for tracking user-related provenance at the level of the individual over the long term, for example, when dealing with medical data. The question arises as to whether the federation level is the best approach to dealing with such cases. The medical case is probably beyond the current federation model because of the need to identify individuals; if the institution were liable for legal infringements, then it might be expected to keep records of who made changes to such data. On the other hand, in other cases it may be better to avoid “non-standard” mechanisms that circumvent the federation.

(b) In many cases the repository managers will not care exactly who took a particular action. It will be enough to know that the action was, when it took place, under the control of a trusted institution and managed in a reliable way (for example, it is unlikely that anyone would want to track minor changes to metadata at the level of the individual). In addition, many of these changes will take place within an institution, and can be tracked by other mechanisms, outside FAM.

(c) However, there are cases in which a repository would want to track exactly who made changes to an object. One example is changes to a document that is being edited by several people (consider what Google Docs does, for example). Another is where objects are annotated during the research process; for various reasons it would be desirable to know who made particular annotations. For example, an annotation would be viewed differently if made by a professor than if made by a PhD student (although this particular example could be handled by using sufficiently fine-grained eduPersonScopedAffiliation).

Note: This Scenario is closely associated with issues of personal identity management – see Scenario 5 below.

Proposed actions:

It was agreed that JISC should concentrate on investigating cases (b) and (c), to determine real scenarios and their implications. Case (a) should be considered out of scope for the moment.

i) The scenario has two general implications for the federation: user anonymity and the (lack of) persistence for user IDs. The federation is amenable to accommodating persistent user IDs, and this scenario provides concrete use cases to support the inclusion of ID persistence in the federation rules (further such use cases arise in grid environments). However, there is a problem within the institutions, which say that they are unable to guarantee that IDs won’t be recycled. JISC needs to determine whether (and how) HE IdPs can be helped to address this issue.

ii) Carry out a detailed and systematic analysis of use cases, to identify those in which these issues (anonymity and persistence) are important, and to determine what IDs would need to be kept, how long they would need to be kept for, whether this is a necessity or a ‘nice to have’, whether users would be willing to reveal their identity, what the DPA implications are, and so forth. This is important as most repositories will have to address this eventually. At a later date, JISC will need to work with institutions on implementing these use cases.

iii) Develop a general vocabulary for expressing “active” access to repositories, that is, doing things in repositories other than just search/browse/read access, with the aim of achieving consistency in expressing these actions as entitlements. There is a definite need for such a vocabulary.
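As an illustration of action iii), the following is a minimal sketch of how such a vocabulary might be expressed as entitlement values and checked by a repository. The URN prefix, the action names, and both helper functions are hypothetical; no agreed vocabulary exists in the text above, and a real one would need a properly registered namespace.

```python
from enum import Enum
from typing import Iterable

# Hypothetical prefix, for illustration only.
PREFIX = "urn:example:repository:action:"


class RepositoryAction(Enum):
    """The 'active' kinds of access listed in the scenario."""
    DEPOSIT = "deposit"
    EDIT_METADATA = "edit-metadata"
    ANNOTATE = "annotate"
    ADMINISTER = "administer"


def entitlement_for(action: RepositoryAction, collection: str) -> str:
    """Build the entitlement value granting an action on one collection."""
    return f"{PREFIX}{action.value}:{collection}"


def is_permitted(entitlements: Iterable[str],
                 action: RepositoryAction,
                 collection: str) -> bool:
    """Check whether the values released for a user include the needed one."""
    return entitlement_for(action, collection) in set(entitlements)


# Values an IdP might release for one user (assumed, e.g. via eduPersonEntitlement).
released = [entitlement_for(RepositoryAction.DEPOSIT, "theses")]

print(is_permitted(released, RepositoryAction.DEPOSIT, "theses"))
print(is_permitted(released, RepositoryAction.ANNOTATE, "theses"))
```

Keeping the action names in one enumerated list is the point of the sketch: every repository and IdP that shares the list expresses "deposit" or "annotate" in the same way.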

Submitted by Neil Jacobs 5 years ago

Vote Activity

  1. Agreed
    5 years ago
  2. Agreed
    5 years ago
  3. Agreed
    5 years ago

Comments (4)

  1. I think this scenario (or an alternative one) should just consider the much simpler (and much more common!) use-case of deposit, effectively asking the question:

    Does the UK Federation, as currently manifest, provide a useful infrastructure for managing federated access to a distributed network of repositories for the purposes of depositing research papers?

    Issues to be investigated might include: Does a researcher have to maintain multiple profiles at multiple repositories? How does the system handle situations such as one researcher being the author (and possibly the depositor) of multiple papers in multiple repositories?

    I think this is a sufficiently complex scenario to warrant further work without getting into all the complexities outlined in this scenario as it is currently worded.

    5 years ago
  2. Unsubscribed User

    This is quite similar to scenario six, as we are talking about role-based access; the roles are just read, write, deposit and annotate rather than group-based roles.

    5 years ago
  3. Answering Nicole's comment, there's a difference between groups (such as people studying a particular course) and privileges (being able to deposit) from an administrative POV, because it's often easier to use existing, already defined groups and assign them a privilege in the repository software, or in the SP, rather than giving them a new group membership (those who can deposit) in a system which is often relatively difficult to manage. This is especially true if you want to manage privileges at a fine-grained level: giving students on a course plus two named individuals, but no one else, the right to deposit in a particular collection is much simpler to set up at the SP end than at the IdP end. But it's a minor point, and the two are pretty much interchangeable.

    5 years ago
  4. I think that it would be less confusing to have one scenario for the actions which do require identification of an individual and actions which can be anonymised. What these actions are would depend on the purpose and the configuration of the repository and its associated SP.

    targetedID provides the possibility of pseudonymous actions as well, of course. The phrasing suggests that a targetedID might change quite frequently (as it is mentioned in the same sentence as the six-month log retention). This is unlikely to happen unless the IdP is forced to make a blanket change (as happened recently to some when it became apparent that clashes were likely with some configurations), because the value is likely to be based on information which remains constant for at least as long as a user is affiliated to an institution, or longer (we use information which will never be reused for another user to generate our targetedIDs, for example). Part of the point of targetedIDs is, indeed, that they should rarely change. Obviously, a user moving to another IdP will produce a change, but at that point it is quite likely that all their entitlements to update a repository are going to change too.

    There is no particular reason why SPs have to be anonymous. If there is a need to obtain personal information, then it is possible to ask for it. It's better to do this by persuading the IdP management involved that it's needed, because then there can be associated procedures and management controls in place, but it would also be possible to ask for extra information from a user before they gain full access to the SP and repository. If we don't start thinking about using FAM for scenarios which do involve full user identification, then we will never really move to anything more complex than the anonymous-user-accessing-online-resources use case. So I think that action ii) is important here; it is something we wanted to do in the FAR project, but never really got to.

    5 years ago
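
The point made in comment 4, that a targetedID stays stable when it is generated from information that is never reused for another user, can be illustrated with a minimal sketch. This is not the algorithm of any particular IdP software; the function name, the `!` separator, the choice of HMAC-SHA256, and the example values are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac


def computed_targeted_id(permanent_user_key: str,
                         sp_entity_id: str,
                         secret: bytes) -> str:
    """Derive an opaque per-SP identifier from a never-reused local key.

    The same inputs always give the same output, so the value persists as
    long as the IdP keeps the key and the secret unchanged. Different SPs
    receive different values, so they cannot correlate the same user.
    """
    message = f"{sp_entity_id}!{permanent_user_key}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


secret = b"idp-private-secret"                           # made-up value
repo_a = "https://repository-a.example.org/shibboleth"   # made-up SP
repo_b = "https://repository-b.example.org/shibboleth"   # made-up SP

first = computed_targeted_id("staff-000123", repo_a, secret)
again = computed_targeted_id("staff-000123", repo_a, secret)
other_sp = computed_targeted_id("staff-000123", repo_b, secret)

print(first == again)     # stable across sessions for the same SP
print(first == other_sp)  # differs between SPs
```

If the local key were a recycled username rather than a permanent key, a new user could inherit an old identifier, which is the recycling problem raised in the scenario.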