Best Practices for Managing User Identifiers

Introduction

This document presents best practices for assigning unique identifiers to the users of computer systems in medium to large organizations, and for managing those identifiers over time. It begins with definitions and background information, then proceeds to explain scope, uniqueness, business processes, challenges and best practices.

Defining user identifiers

What is a user identifier, or ID for short?

Technical definition:

Multi-user computer systems often need to identify users, so that access to applications and data can be controlled, logged and attributed to people. Computers refer to people using unique numbers or strings of characters. These numbers or character strings are user identifiers.

User-centric definition:

Users have a variety of identifiers, which uniquely identify them in some context. Examples in the IT environment include operating system login IDs, e-mail addresses and employee numbers. Examples from day-to-day life include driver's license numbers, credit card numbers and passport numbers.

Different types of identifiers

In the context of a medium to large organization, users often have at least the following identifiers:

  1. An employee number.
  2. At least one network login ID.
  3. Possibly additional login IDs to a variety of applications.
  4. At least one e-mail address.

This document offers guidance to organizations regarding the management of these corporate user IDs.

Scope and uniqueness

An ID must uniquely identify a person within a defined scope.

For example, since no two users can have the same login ID on an application, the application can be thought of as an identification domain, within which each user has a unique ID.

Unique IDs commonly have a scope drawn from the following list of possibilities:

| Scope | Examples |
| --- | --- |
| Single system or application | Active Directory domain, RAC/F security database. |
| Single organization | Employee number, standardized cross-application login ID. |
| Sub-national | Driver's license, voter number. |
| National | Passport number, federal tax number. |
| Global | Fully qualified e-mail address. |

In general, the scope over which an ID is unique can be expanded by appending the context where it was defined. This can be illustrated with some additional examples:

| Original scope | Example | Append | New scope | Example |
| --- | --- | --- | --- | --- |
| Single system | JSMITH | Application name | Organization | JSMITH@App01 |
| Single organization | JSMITH | Organization name | Global | JSMITH@Acme.com |
| State/province | DL 1341135-013 | Jurisdiction | National | DL 1341135-013@NewYork |
| National | QC0318876 | Country code | Global | QC0318876 from Canada |
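
The idea of widening scope by appending context can be sketched in a few lines of Python. This is only an illustration: the function name `qualify`, the "@" separator and the second application name "App02" are assumptions made for the example, not part of any standard.

```python
def qualify(identifier: str, context: str) -> str:
    """Widen an identifier's scope by appending the context that issued it."""
    return f"{identifier}@{context}"

# Two applications each have a JSMITH, so the bare ID is ambiguous
# once the two user populations are combined.
app01_users = {"JSMITH", "MJONES"}
app02_users = {"JSMITH", "KLEE"}  # "App02" is a hypothetical second application

combined = {qualify(u, "App01") for u in app01_users} | {
    qualify(u, "App02") for u in app02_users
}
print(sorted(combined))
# ['JSMITH@App01', 'JSMITH@App02', 'KLEE@App02', 'MJONES@App01']
```

The qualified form is longer and less convenient to type, which is the usual price of a wider scope.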

 

When identifiers are assigned

When discussing how identifiers are assigned, it is helpful to consider when they are assigned. Here are some examples:

  1. At birth -- as happens in some jurisdictions for government IDs, social insurance numbers, etc.
  2. When joining an organization -- enrolling as a student, starting a new job, etc.
  3. When being granted a new login ID to a system or application.

Identifiers are sometimes changed as well -- for example following name changes, which in turn often follow marriage or divorce.

Machine-readable versus human-readable identifiers

People find it easier to remember and enter memorable strings of characters. On the other hand, computers are able to assign numeric identifiers which are guaranteed to be unique in some scope. This leads to two broad categories of identifiers:

  1. Human-friendly identifiers, such as e-mail addresses and login IDs.
  2. Computer-friendly identifiers, such as globally unique IDs (GUIDs), which are commonly written as strings of 32 hexadecimal digits.

Computer-friendly identifiers often have the benefits of being unique in a larger scope and of never changing during the lifecycle of a user. In contrast, human-friendly identifiers are typically unique only in a smaller scope and are more volatile, but are easier for people to manage.
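
As a rough illustration of the two categories, the following Python sketch generates a machine-friendly identifier with the standard `uuid` module and uses it as a stable key for a human-friendly login ID. The `directory` mapping and the login IDs are hypothetical.

```python
import uuid

machine_id = uuid.uuid4()   # random 128-bit identifier
key = machine_id.hex        # 32 hexadecimal digits, no hyphens
print(len(key))             # 32

# The machine-friendly key stays fixed for the life of the user record,
# while the human-friendly login ID attached to it is free to change.
directory = {key: "jsmith"}
directory[key] = "jdoe"     # e.g. after a name change
print(directory[key])       # jdoe
```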

Desirable attributes of identifiers

(1)

Following is a list of desirable characteristics of user IDs. When designing an algorithm to assign IDs to users or business processes for managing user IDs, it is helpful to consider each of these and to develop a process which satisfies as many of them as possible.

  • Identify a person, not a position:

    Identifiers should refer to people, not to positions. People often move from one position to another and changing their identifier when this happens is a nuisance and creates inconsistencies in audit logs.

  • User friendly:

    Identifiers should be reasonably easy to remember and short enough to enter quickly. Long and hard-to-remember IDs should be avoided unless they are only used by machines.

  • Easily recognizable:

    It is helpful for users to be able to recognize that a string of characters is a user ID on casual inspection. In other words, user IDs should be constructed in an easily recognizable format. This is helpful both for users, when reading text that contains IDs, and for automated processes, which can scan log files, scripts, network traffic or other data sets for user IDs.

  • Reusable:

    It makes sense to assign the smallest possible number of identifiers to a user and to reuse existing identifiers where possible. This is more user friendly, less troublesome to manage and easier to audit. In short, use an existing identifier if possible, rather than creating a new one. Standardize identifiers across as many systems as possible.

  • Compatible:

    Identifiers are often used on a variety of systems. For example, a user might type the same identifier to sign into Windows / Active Directory, into a mainframe using RAC/F and into an ERP running SAP. Each of these systems will have different constraints on the allowable length and characters that can comprise an identifier. In order to support reuse (previous objective), it makes sense to assign identifiers that are compatible with the largest possible number of systems.

  • Maximum scope:

    Different systems may have different, overlapping user populations. It makes sense to assign identifiers which are unique over the largest possible scope, so that they can be reused by the largest possible number of systems.

  • Unchanging: (2)

    Identifiers assigned to a user should be designed so that they never have to be changed. Changing identifiers is an administrative burden and leads to inconsistencies in audit logs.

    Changes in user IDs can create significant operational problems. For example, the ID may appear on multiple systems, making it costly to change. Changing the ID would create a discontinuity in audit logs, perhaps violating security policy. The ID may be embedded in programs or scripts, which would stop working after the change. The ID may be known to other users, who would have to be informed of the change.

  • Never reused:

    Identifiers should never be reused. For example, when a user leaves an organization, that (old) user's identifier should never be assigned again, to another (new) user. Doing so can have undesirable and unexpected consequences, such as the new user acquiring security access rights from the old user's profile. This means that a repository of every identifier that has ever been assigned must be maintained, rather than just a repository of currently-in-use identifiers.

  • Not offensive:

    People have an amazing ability to read meaning into meaningless strings of characters. This leads to situations which range from humorous to offensive, where identifiers are assigned to users, often by automatic processes, which users can read -- literally or with "poetic license" -- to have colorful or offensive meanings.

    This problem suggests that a human review process is often needed when new identifiers are assigned, so that they can be vetted and perhaps replaced if they are found to be offensive.

  • Cross-language: (3)

    Many organizations span countries, languages and cultures. In this context, a question of cultural, rather than just technical, compatibility arises. For example, would a unilingual English speaker be able to read, remember or type an identifier for a co-worker if that identifier is in Kanji (Japanese)?

    Since identifiers may have to be accessible by multiple users, it is important to consider the ability of users fluent in different languages to read and enter them.

  • Accessible only within an appropriate scope:

    In some cases, an organization may consider identifiers to be confidential. This is true in the legal sense with some identifiers, such as social security numbers. Confidentiality of identifiers may also be considered a secondary line of defense against security attacks such as automated password guessing.

    Since users often have to know, remember and enter their own identifiers, confidentiality means limiting the visibility of identifiers to just authorized users and not disclosing information about whether an identifier is valid to unauthorized or unauthenticated users.
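
The "Compatible" attribute above lends itself to an automated check. The sketch below tests a candidate ID against per-system length and character rules. The three rule sets are hypothetical placeholders chosen for illustration; real limits must be taken from each system's own documentation.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class SystemRule:
    name: str
    max_length: int
    pattern: str  # regular expression the whole ID must match

# Hypothetical constraints, for illustration only.
RULES = [
    SystemRule("directory", 20, r"[A-Za-z0-9._-]+"),
    SystemRule("mainframe", 8, r"[A-Z][A-Z0-9]*"),
    SystemRule("erp", 12, r"[A-Z0-9_]+"),
]

def incompatible_systems(candidate: str, rules=RULES) -> list[str]:
    """Return the names of systems whose rules reject the candidate ID."""
    return [
        r.name
        for r in rules
        if len(candidate) > r.max_length or not re.fullmatch(r.pattern, candidate)
    ]

print(incompatible_systems("U012311"))           # []
print(incompatible_systems("john.smith-jones"))  # ['mainframe', 'erp']
```

An ID format that returns an empty list for every system in scope can be reused everywhere, which supports the "Reusable" and "Maximum scope" attributes as well.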

Addressing challenges in identifier management

Some challenges arise in most organizations in the course of assigning new or managing existing identifiers. These are described below:

  • Collisions:

    If the algorithm used to assign unique IDs to users is based on users' names then users with identical or even similar names may be assigned the same identifier. This obviously needs to be rectified.

    For example, an organization may employ 10 people with the (common among English speakers) name Michael Smith. If IDs are assigned using the algorithm "last name plus first initial" then they would all be assigned the ID "smithm." Assigning the same ID to multiple users would defeat the purpose of IDs -- unique identification -- so the algorithm must be adjusted to eliminate these collisions. This may be done by appending one or two digits to the IDs above, for example.

  • Name changes:

    Where IDs are assigned using an algorithm based on the user's name, a user whose name changes (for example, due to marriage or divorce) may wish to change his ID to match his new name.

    Changes to user IDs are undesirable, as described in (2).

  • Short names:

    Where IDs are based on user names, the algorithm used to calculate IDs may produce unsatisfactory results for users with short names. For example, two common Chinese surnames are written (in English) as Wu and Li. An organization with many Chinese users and IDs based on surname might have many collisions and require two or more extra characters appended to IDs, to make them unique. These unique suffixes are hard to remember and tend to lead to confusion, such as e-mails intended for one user being sent to another.

  • Changes in user role or status:

    Where IDs are based on a user's role (e.g., which department he works in) or status (e.g., employee vs. contractor), changes in the user's role or status would trigger a change to the user's ID. For example, a contractor who is subsequently hired as an employee would be assigned a new ID.

    Changes to user IDs are undesirable, as described in (2).

  • Multiple character sets:

    As described in (3), users fluent in one language, or whose computer is configured for text input in one language, may be unable to read, remember or enter an ID in another language, especially when the two languages use different character sets.
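
The collision and short-name problems described above can be reproduced with a small sketch of the "last name plus first initial" algorithm. The function name and the choice to start numeric suffixes at 2 are assumptions made for this example.

```python
def name_based_id(first: str, last: str, assigned: set[str]) -> str:
    """Assign 'last name plus first initial', adding a numeric suffix on collision."""
    base = (last + first[:1]).lower()
    candidate, n = base, 1
    while candidate in assigned:
        n += 1
        candidate = f"{base}{n}"
    assigned.add(candidate)  # 'assigned' holds every ID ever issued
    return candidate

assigned: set[str] = set()
print([name_based_id("Michael", "Smith", assigned) for _ in range(3)])
# ['smithm', 'smithm2', 'smithm3']
print(name_based_id("Mei", "Li", assigned), name_based_id("Ming", "Li", assigned))
# lim lim2
```

The output shows why such suffixes cause confusion: "smithm2" and "smithm3" differ by one character, so mail or access intended for one user can easily reach the other.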

Common and recommended algorithms for assigning login IDs

(4)

Login IDs for internal users

(5)

The following process and algorithm can be used to satisfy each of the requirements set forth in (1):

| Requirement | Strategy |
| --- | --- |
| Identify a person | Assign IDs to people, not roles. |
| User friendly | IDs should be 7 characters, total. |
| Easily recognizable | Formulate IDs as "Unnnnnn" where n represents a digit. There are 1,000,000 possible IDs of this form. |
| Reusable | Use the same ID on every system and application. |
| Compatible | IDs starting with a letter and containing only one letter and 6 digits work on almost every conceivable system and application. |
| Maximum scope | Assign an ID to every user in the organization and use these IDs to sign users into applications. If possible, use the same ID as an employee number as well. |
| Unchanging | Since the IDs are numeric, changes in user names should not trigger a request for a new ID. Since they do not represent user role or status, changes in these attributes also do not trigger a request for a different ID. |
| Never reused | Create a database of every ID ever assigned. Only append to it and never reuse IDs. |
| Not offensive | Numbers are not generally offensive, though some numbers are considered "bad luck" in some cultures. Give users an opportunity to request a new ID (but not to specify what it will be) when they are first assigned an ID. |
| Cross-language | Roman letters (U) and digits are legible across cultures and languages. |
| Limited disclosure | Do not publish lists of IDs or the correlation between user names and IDs. |
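
A fixed format such as "Unnnnnn" is easy both to generate and to recognize. The following sketch, with hypothetical function names and a made-up log line, formats a sequence number as an ID and scans free text for ID-shaped tokens.

```python
import re

ID_PATTERN = re.compile(r"\bU\d{6}\b")  # 'U' followed by exactly six digits

def format_id(sequence: int) -> str:
    """Render a sequence number as a 7-character ID such as U000042."""
    if not 0 <= sequence <= 999_999:
        raise ValueError("sequence does not fit in six digits")
    return f"U{sequence:06d}"

def find_ids(text: str) -> list[str]:
    """Return every ID-shaped token found in free text such as a log line."""
    return ID_PATTERN.findall(text)

print(format_id(42))  # U000042
print(find_ids("login ok user=U012311 src=10.0.0.5; denied user=U000042"))
# ['U012311', 'U000042']
```

The word boundaries in the pattern keep longer digit runs from being mistaken for IDs, which matters when scanning logs that also contain ticket or serial numbers.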

 


Another reasonable process is as follows:

| Requirement | Strategy |
| --- | --- |
| Identify a person | Assign IDs to people, not roles. |
| User friendly | IDs should be 7 characters, total. |
| Easily recognizable | Formulate IDs as the first 3 characters (or fewer, for shorter names) of the user's surname as written in English, followed by a 4-digit number assigned sequentially for each prefix, starting at 0000. Example: the fourth "Mike Smith" could be assigned "SMI0003." |
| Reusable | Use the same ID on every system and application. |
| Compatible | IDs always start with a letter, only have letters and digits and contain no more than 7 characters. Almost every conceivable system and application supports this. |
| Maximum scope | Assign an ID to every user in the organization and use these IDs to sign users into applications. If possible, use the same ID as an employee number as well. |
| Unchanging | Since IDs do not represent user role or status, changes in these attributes do not trigger a request for a different ID. Changes in a user's name may prompt a request for a new ID, but because only a short prefix of the name is used, most users are likely to tolerate keeping their old ID. |
| Never reused | Create a database of every ID ever assigned. Only append to it and never reuse IDs. |
| Not offensive | Short strings of letters are not usually offensive and neither are numbers. Give users an opportunity to request a new ID, indicating the string they did not like, when they are first assigned an ID. |
| Cross-language | Roman letters and digits are legible across cultures and languages. |
| Limited disclosure | Do not publish lists of IDs or the correlation between user names and IDs. |
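
The surname-prefix scheme can be sketched as a small allocator that keeps one counter per prefix. The class name is hypothetical, and the counter is held in memory only for brevity; a real process would persist it in the append-only database of assigned IDs.

```python
from collections import defaultdict

class PrefixIdAllocator:
    """Hands out IDs made of up to 3 surname letters plus a 4-digit counter."""

    def __init__(self) -> None:
        self._next = defaultdict(int)  # prefix -> next unused sequence number

    def allocate(self, surname: str) -> str:
        # Keep only Roman letters, so "O'Brien" yields the prefix "OBR".
        letters = [c for c in surname.upper() if "A" <= c <= "Z"]
        prefix = "".join(letters[:3])
        if not prefix:
            raise ValueError("surname has no Roman letters")
        seq = self._next[prefix]
        if seq > 9999:
            raise RuntimeError(f"prefix {prefix} is exhausted")
        self._next[prefix] = seq + 1
        return f"{prefix}{seq:04d}"

alloc = PrefixIdAllocator()
print([alloc.allocate("Smith") for _ in range(4)])
# ['SMI0000', 'SMI0001', 'SMI0002', 'SMI0003']
print(alloc.allocate("Wu"))  # WU0000
```

Because the counter is per prefix rather than per name, a "Smirnov" hired next would receive SMI0004. Short surnames such as Wu produce IDs shorter than 7 characters, which remain within the stated length limit.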

 


Login IDs for external users

External users who sign into an organization's Internet-facing applications generally do so infrequently, so an ID assigned by the organization is easily forgotten. Since Internet users generally already have an e-mail address and since e-mail addresses are guaranteed to be globally unique, it makes sense to identify external users with their fully qualified e-mail address.

This approach measures up against the requirements as follows:

| Requirement | Strategy |
| --- | --- |
| Identify a person | Use fully qualified e-mail addresses. |
| User friendly | Users already know their own e-mail addresses. |
| Easily recognizable | E-mail addresses are easily recognized by people and programs. |
| Reusable | Users already use their e-mail address elsewhere, so by definition assigning this as an ID is reusing it. |
| Compatible | E-mail addresses are not compatible with all applications. They can be quite long (over 100 characters) and may contain symbols not supported by some applications (@, _, -, .). These limitations are not usually problematic with Internet-facing applications, but they can present difficulties for "back office" systems, such as mainframes. |
| Maximum scope | E-mail addresses can be used as IDs on every Internet-facing application. |
| Unchanging | Users do periodically change their e-mail address, so this requirement is, unfortunately, violated. |
| Never reused | Few if any e-mail systems assign the same ID, consecutively, to different users. This reduces the problem of ID reuse to a vanishingly small size. |
| Not offensive | Users presumably already address this problem when provisioning their e-mail account, so this problem is transferred to another organization. |
| Cross-language | SMTP e-mail addresses are, by definition, cross-cultural and global. |
| Limited disclosure | E-mail addresses are widely known, so this requirement cannot be met using this strategy. |

Assigning new E-mail addresses to internal users

(6)

| Requirement | Strategy |
| --- | --- |
| Identify a person | Assign a new and unique e-mail address to every new e-mail user. |
| User friendly | Assign firstName.lastName@organizationDomain and insert .uniqueID before the @ if required, where the uniqueID is two letters -- aa, ab, ac, etc. |
| Easily recognizable | E-mail addresses are easily recognized by people and programs. |
| Reusable | Users can use their e-mail address to sign into a variety of web-based applications. Since many legacy applications do not support long IDs or IDs containing punctuation marks, e-mail addresses cannot be reused everywhere, nor should they -- because they are long and so take longer to type than other, typically internal IDs. |
| Compatible | E-mail addresses have a standard format, compatible with all mail systems. Compatibility with other applications is not predictable. |
| Maximum scope | E-mail addresses can be used as IDs on many 3rd party Internet-facing applications. |
| Unchanging | Unfortunately, users will generally demand changes to their e-mail address when their name changes. This is unavoidable with this format. |
| Never reused | Create a repository of all current and previously assigned e-mail addresses. Even where a user with a given name leaves and a different person with the same name joins later, use the uniqueID field to give the new user a distinct address. |
| Not offensive | Users are not generally offended by their own names. |
| Cross-language | SMTP e-mail addresses are, by definition, cross-cultural and global. |
| Limited disclosure | E-mail addresses are widely known, so this requirement cannot be met. |
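
A minimal sketch of this e-mail naming scheme follows. It assumes names that are already plain ASCII without spaces, and it uses an in-memory set in place of the repository of all current and previously assigned addresses; the function name and the domain are hypothetical.

```python
from itertools import product
from string import ascii_lowercase

def assign_email(first: str, last: str, domain: str, ever_assigned: set[str]) -> str:
    """Return firstName.lastName@domain, inserting .aa, .ab, ... if already taken."""
    local = f"{first}.{last}".lower()
    # The empty suffix is tried first, then the two-letter uniqueIDs in order.
    suffixes = [""] + ["." + a + b for a, b in product(ascii_lowercase, repeat=2)]
    for suffix in suffixes:
        address = f"{local}{suffix}@{domain}"
        if address not in ever_assigned:
            ever_assigned.add(address)  # never removed, so never reused
            return address
    raise RuntimeError("all two-letter suffixes are in use")

history: set[str] = set()
print(assign_email("John", "Smith", "acme.com", history))  # john.smith@acme.com
print(assign_email("John", "Smith", "acme.com", history))  # john.smith.aa@acme.com
print(assign_email("John", "Smith", "acme.com", history))  # john.smith.ab@acme.com
```

Because addresses are only ever added to the set, a new John Smith who joins after an earlier one has left still receives a distinct address.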

 

Example business processes

Following are some typical examples that illustrate how the naming algorithms described in (4) above are used.

Employee / contractor onboarding

  1. For employees: HR creates a new employee record.
  2. For contractors: a manager submits a new-contractor request.
  3. In either case, the request includes the user's full name.
  4. Once the request is approved:
    1. A new login ID is assigned.
    2. Using the algorithm in (5):
      1. A database is referenced to find the highest-number, already-assigned ID.
      2. The next number is used.
      3. Database locking is used to ensure that two users, provisioned at nearly the same instant, do not get the same ID.
      4. The ID might be U012311.
    3. A new e-mail address is assigned.
    4. Using the algorithm in (6):
      1. "John Smith" might become "john.smith@acme.com"
      2. As with the previous example, a database lookup is required to check for duplicates.
      3. If a duplicate is found, the e-mail address might become "john.smith.aa@acme.com"
      4. The new e-mail address must be stored in the database, correlated to U012311.
      5. Also as before, record locking semantics must be used to avoid a case where two same-named users are assigned the same address if they are provisioned nearly simultaneously.
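
The login ID steps above (find the highest assigned number, take the next one, and lock so that simultaneous requests cannot collide) can be sketched with Python's built-in `sqlite3` module. The table and column names are hypothetical, and SQLite stands in for whatever database the organization actually uses.

```python
import sqlite3

def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None: transactions are controlled explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS login_ids ("
        " seq INTEGER PRIMARY KEY, login_id TEXT UNIQUE NOT NULL,"
        " full_name TEXT NOT NULL)"
    )
    return conn

def assign_login_id(conn: sqlite3.Connection, full_name: str) -> str:
    # BEGIN IMMEDIATE takes the write lock before the highest number is read,
    # so two concurrent requests cannot both read the same maximum.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (highest,) = conn.execute(
            "SELECT COALESCE(MAX(seq), 0) FROM login_ids"
        ).fetchone()
        seq = highest + 1
        login_id = f"U{seq:06d}"
        conn.execute(
            "INSERT INTO login_ids (seq, login_id, full_name) VALUES (?, ?, ?)",
            (seq, login_id, full_name),
        )
        conn.execute("COMMIT")
        return login_id
    except Exception:
        conn.execute("ROLLBACK")
        raise

conn = open_registry()
print(assign_login_id(conn, "John Smith"))  # U000001
print(assign_login_id(conn, "Jane Doe"))    # U000002
```

Rows are never deleted from the table, so the highest number always reflects every ID ever assigned and departed users' IDs are not handed out again. The sketch does not guard against running past six digits; a production version would need to.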

Customer onboarding (Internet-facing)

  1. A new customer fills in an access request form.
  2. The form should include a CAPTCHA to ensure that it is filled in by a person, rather than a (possibly malicious) script.
  3. The user should be required to enter his existing e-mail address.
  4. Form input should validate that the e-mail address is well formed.
  5. Account activation may involve e-mail validation:
    1. An activation URL is sent to the user's e-mail address.
    2. The URL includes a pseudo-random string.
    3. The user has to click through to the URL to activate the account.
    4. Activation strings and un-activated accounts should be scrubbed periodically -- for example when they are over 24 hours old.
  6. This method ensures that all users have a globally-unique, already-remembered ID.
  7. Password reset can be accomplished by sending an activation string to the user, just like account activation.
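
The e-mail validation steps can be sketched with Python's `secrets` module, which is intended for generating unpredictable tokens. The URL, the in-memory `pending` store and the function names are hypothetical, and actually sending the message is left out.

```python
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional

TOKEN_LIFETIME = timedelta(hours=24)
pending: dict[str, tuple[str, datetime]] = {}  # token -> (e-mail, issued at)

def issue_activation(email: str, now: datetime) -> str:
    """Create a pseudo-random token and return the URL to e-mail to the user."""
    token = secrets.token_urlsafe(32)
    pending[token] = (email, now)
    return f"https://example.com/activate?token={token}"  # hypothetical URL

def activate(token: str, now: datetime) -> Optional[str]:
    """Return the activated address, or None if the token is unknown or expired."""
    entry = pending.pop(token, None)  # tokens are single use
    if entry is None or now - entry[1] > TOKEN_LIFETIME:
        return None
    return entry[0]

def scrub(now: datetime) -> int:
    """Delete tokens older than the lifetime; return how many were removed."""
    stale = [t for t, (_, issued) in pending.items() if now - issued > TOKEN_LIFETIME]
    for t in stale:
        del pending[t]
    return len(stale)

t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
url = issue_activation("jsmith@example.org", t0)
token = url.split("token=")[1]
print(activate(token, t0 + timedelta(hours=1)))  # jsmith@example.org
```

The same token mechanism can serve password reset, as noted in the last step of the list.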

Renaming an existing employee login ID

  1. Users may ask for a new ID in the event that their old ID was based on their name, which has since changed.
  2. Organizational changes -- mergers, acquisitions, etc. -- may trigger renames to align naming standards.
  3. In general, so long as a user has the same ID on all systems, it is safer to leave that ID alone and provision any new accounts for the same user with the pre-existing ID. Renaming an ID is risky since scripts or programs may explicitly refer to the old ID.
  4. Where renaming a user is deemed essential, be careful to consider:
    1. Scripts or programs that refer to the old ID.
    2. Uniqueness of the new ID (should not be used by any other user on any system).
    3. Compatibility of the new ID with all systems, not just those which the user will access immediately.
  5. Before renaming a user, notify him of the change, both so that he can sign in after it happens and so that he can report problems that may have been caused by the change quickly.
