Design and Implementation of Administrator Session Monitoring


This document introduces the business case for implementing a session monitoring system to record login sessions to privileged accounts. It examines a series of technological design decisions that must be considered when developing a session monitoring system and offers guidance about how such a system might be best deployed and managed in practice.

Business drivers for recording login sessions

There are three main business drivers for recording the activity of users as they sign into privileged accounts:

  1. Forensic audits:

    In the event that an IT user is under suspicion or has been found to act unethically or illegally, it is helpful to be able to play back all of that user's activity, to see what inappropriate actions they may have taken. This data may be required as supporting evidence if the user must be terminated and may be needed in the course of legal proceedings thereafter. This data may also be needed to find and reverse any harmful changes the user has made to systems or data.

  2. Accountability:

    The knowledge that their actions are being recorded and that they may be held accountable for them may alter user behaviour for the better.

  3. Knowledge sharing:

    Recording user activity makes it possible to replay work. This can aid in knowledge sharing, under a number of scenarios:

    1. A user records the steps taken to complete a task and shares this recording with peers, in the context of training. This is intentional, planned knowledge sharing.
    2. One user accesses a recording of another's actions from some time in the past, to learn how a task was performed. This may be done with or without the original user's active participation. This is ad-hoc, after-the-fact knowledge sharing.

Which login sessions should be recorded?

When deploying a session recording system, the first question is which sessions to record. There are several possibilities:

  1. All sessions, by all users.
  2. All sessions to sensitive systems, by any user with access to those systems.
  3. All sessions by high-risk users (i.e., users whose actions could cause harm).

The cost and impact of session recording technology directly affect how this question is answered. If capturing more sessions is relatively inexpensive and does not noticeably slow down the work of the affected users, then it makes sense to record more sessions. Conversely, as the costs of capture, transmission and storage rise, the motivation to target more carefully what is and is not recorded increases.

In the context of session recording of system administrators, Hitachi ID Systems recommends that all logins to sensitive accounts should be recorded.

In the context of session recording of high-risk business users -- for example, HR staff, financial traders, etc. -- Hitachi ID Systems recommends that all logins by those users, to any system, should be recorded.

Over time, as the costs of storage and bandwidth continue to decline, it may make sense to record all login sessions by all users to all systems. Hitachi ID Systems does not recommend this approach as of the time this document was prepared (mid-2011).
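The two recommendations above can be expressed as a simple recording policy. The following is a minimal sketch only: the `Login` type, the `should_record` function and the example account and user names are all hypothetical, and a real deployment would load its inventory of sensitive accounts and high-risk users from a directory or privileged access management system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Login:
    user: str      # person signing in
    system: str    # target system
    account: str   # account used on that system


# Hypothetical inventory data, for illustration only.
SENSITIVE_ACCOUNTS = {("db01", "oracle"), ("web01", "root")}
HIGH_RISK_USERS = {"trader7", "hr-anna"}


def should_record(login: Login) -> bool:
    """Record every login to a sensitive account, by any user, and
    every login by a high-risk user, to any system."""
    if (login.system, login.account) in SENSITIVE_ACCOUNTS:
        return True
    return login.user in HIGH_RISK_USERS


print(should_record(Login("alice", "web01", "root")))       # sensitive account
print(should_record(Login("trader7", "mail01", "trader7")))  # high-risk user
print(should_record(Login("alice", "mail01", "alice")))      # neither
```

Recording all sessions by all users would reduce this function to `return True`, which is why cost, rather than policy logic, is the limiting factor.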

What data should be captured?

The data that can be recorded from a modern, graphical user interface is extensive. It includes:

  1. Screen captures -- i.e., image files of the contents of a single application or of a user's graphical desktop.
  2. Process information, such as the names of and arguments passed to running programs.
  3. User interface elements, such as window titles, labels and text from input fields.
  4. Keyboard events, such as key presses and releases.
  5. Pointer device (mouse) events, such as movement and button clicks.
  6. The contents of the operating system copy buffer.
  7. Filesystem events, such as mounting or detaching network drives or removable media.
  8. File transfers, such as copying files from one filesystem to another.
  9. Video or image streams from a video capture device such as a webcam.
  10. Network data transfers, such as e-mails or web pages.

At a minimum, when recording the login sessions of a user into an administrator-level account, it makes sense to capture what they typed and what the system displayed. This means video capture as well as capture of input from both the keyboard and copy buffer.

Regarding video capture, it may make sense to capture the user's entire desktop, so that in the event that the user downloaded a file with sensitive data to his computer, the recording will show what he then did with that file. For instance, if a sensitive file was briefly examined (as would be normal in the context of troubleshooting) and then deleted, the action can be taken to be innocuous. On the other hand, if a sensitive file was copied to a USB flash drive or sent to the user's personal GMail account, the action can be interpreted as malicious.

Regarding input capture, it makes sense to capture both keyboard events and copy buffer contents. This is because the user may have constructed commands in advance and pasted them into the login session, without generating any keyboard events.
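To illustrate why both input sources matter, the sketch below merges keyboard and copy buffer events into a single timeline and compares the result with what keyboard capture alone would show. The `InputEvent` type and both function names are assumptions made for this example, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputEvent:
    timestamp: float  # seconds since the start of the session
    source: str       # "key" for typed text, "paste" for copy buffer contents
    text: str


def reconstruct_input(events):
    """Return everything the session received, in time order."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    return "".join(e.text for e in ordered)


def keyboard_only(events):
    """Return what a recorder that ignores the copy buffer would see."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    return "".join(e.text for e in ordered if e.source == "key")


events = [
    InputEvent(0.1, "key", "s"),
    InputEvent(0.2, "key", "u"),
    InputEvent(0.3, "key", "d"),
    InputEvent(0.4, "key", "o"),
    InputEvent(0.5, "key", " "),
    InputEvent(1.0, "paste", "rm -rf /tmp/build\n"),
]

print(repr(reconstruct_input(events)))  # 'sudo rm -rf /tmp/build\n'
print(repr(keyboard_only(events)))      # 'sudo '
```

In this example the pasted command never appears in the keyboard-only view, which is exactly the gap that copy buffer capture closes.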

Finally, it may make sense to capture webcam video. This is useful in the event of serious misconduct leading to legal proceedings. When this happens, the user in question is likely to claim that the recorded actions were taken by someone else (e.g., "that wasn't me; someone must have stolen my password!"). With webcam capture, this argument is much harder to sustain, since images of the user who performed the actions in question will accompany screen captures and input events.

Data format and volume

There are two broad categories of data that may be captured by a session recording system:

  1. High volume, unstructured data, principally video capture from the screen and possibly web camera.
  2. Low volume, structured data, principally keyboard events, copy buffer contents, process IDs, UI elements, etc.

It makes sense to store the low volume data stream in a database, so that it can be manipulated and searched.

Modern databases do not cope well with large volume data such as video. It therefore makes sense to store only pointers to this data set in the database and store the actual raw data either on a filesystem or in a content archiving system.
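One way to picture this split is a small relational schema in which input events are stored as searchable rows while video is represented only by file paths. The sketch below uses SQLite purely for illustration; the table names, column names and file paths are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE session (
    id         INTEGER PRIMARY KEY,
    username   TEXT NOT NULL,
    target     TEXT NOT NULL,
    started_at TEXT NOT NULL
);
-- Low volume, structured data: stored in full and searchable.
CREATE TABLE input_event (
    session_id INTEGER NOT NULL REFERENCES session(id),
    offset_ms  INTEGER NOT NULL,
    kind       TEXT NOT NULL,      -- e.g. 'key' or 'paste'
    content    TEXT NOT NULL
);
-- High volume data: only a pointer to a file kept outside the database.
CREATE TABLE video_segment (
    session_id INTEGER NOT NULL REFERENCES session(id),
    offset_ms  INTEGER NOT NULL,
    path       TEXT NOT NULL
);
""")

conn.execute("INSERT INTO session VALUES (1, 'alice', 'db01', '2011-06-01T09:00:00')")
conn.execute("INSERT INTO input_event VALUES (1, 1500, 'key', 'cat /etc/passwd')")
conn.execute("INSERT INTO video_segment VALUES (1, 0, '/archive/1/000000.bin')")


def find_sessions(db, needle):
    """Search structured input, then return pointers to the matching video."""
    rows = db.execute(
        "SELECT DISTINCT s.id, s.username, v.path "
        "FROM input_event e "
        "JOIN session s ON s.id = e.session_id "
        "JOIN video_segment v ON v.session_id = s.id "
        "WHERE e.content LIKE ? "
        "ORDER BY s.id, v.path",
        (f"%{needle}%",),
    )
    return rows.fetchall()


print(find_sessions(conn, "passwd"))
```

The search touches only the small tables; the video files are opened only after a matching session has been identified.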

For data stored on a filesystem, the next question is how to encode it. For efficiency, it makes sense to capture differential data (i.e., what changed from one screen capture to the next) and to compress the data. For screen capture, lossless compression such as PNG makes sense, since the data is normally very uniform. For web cam capture, lossy compression makes more sense, since the input stream consists of more "natural" lighting and scenes. For this, it makes sense to store JPEG files.

In either case, when constructing videos for playback, it is important to use standard encoding and packaging, such as MPEG4 or AVI. This ensures that popular playback programs can be used.
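The combination of differential capture and lossless compression described above can be sketched as follows. This is an illustration under simplifying assumptions, not a real screen codec: frames are treated as flat byte strings of equal length, the block size is arbitrary, and `zlib` (the DEFLATE algorithm that PNG also uses) stands in for PNG encoding. The function names are hypothetical.

```python
import zlib

BLOCK = 4096  # bytes per comparison block; an arbitrary choice for this sketch


def encode_delta(prev: bytes, curr: bytes) -> bytes:
    """Keep only the blocks that changed since prev, then compress them."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same size")
    parts = []
    for off in range(0, len(curr), BLOCK):
        block = curr[off:off + BLOCK]
        if block != prev[off:off + BLOCK]:
            # Each record: 4-byte offset, 4-byte length, then the new bytes.
            parts.append(off.to_bytes(4, "big"))
            parts.append(len(block).to_bytes(4, "big"))
            parts.append(block)
    return zlib.compress(b"".join(parts))


def apply_delta(prev: bytes, delta: bytes) -> bytes:
    """Rebuild the current frame from the previous frame and a delta."""
    frame = bytearray(prev)
    data = zlib.decompress(delta)
    pos = 0
    while pos < len(data):
        off = int.from_bytes(data[pos:pos + 4], "big")
        size = int.from_bytes(data[pos + 4:pos + 8], "big")
        frame[off:off + size] = data[pos + 8:pos + 8 + size]
        pos += 8 + size
    return bytes(frame)


previous = bytes(100_000)              # a blank "screen"
current = bytearray(previous)
current[5000:5010] = b"x" * 10         # a small on-screen change
current = bytes(current)

delta = encode_delta(previous, current)
print(len(current), len(delta))        # the delta is far smaller than the frame
print(apply_delta(previous, delta) == current)
```

Because the encoding is lossless, playback can reconstruct every frame exactly by applying the deltas in order, starting from a full key frame.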

Where to insert instrumentation

When a user connects to a privileged account on a server, there are three basic places where the connection can be instrumented for recording:

  1. On the user's own PC.
  2. On the server to which the user connects.
  3. On the network in between these two endpoints.

Each of these approaches has its own pros and cons:

Monitor user PCs

  Pros:

  • Nothing to install on servers.
  • Works with every type of server -- operating systems, databases, network devices, applications, etc.

  Cons:

  • May require software to be installed on many computers.
  • User might be able to tamper with and disable monitoring.
  • Implies uniform types of endpoint devices, or at least a limited set of options (e.g., all Windows, or Windows+Mac, etc.).


Monitor managed systems

  Pros:

  • Nothing to install on user PCs.
  • More difficult to bypass.
  • Can monitor user sessions even if they are made directly to the console of a server, not via a privileged access management system at all.

  Cons:

  • Potentially destabilizing change control on sensitive servers, to install invasive surveillance code.
  • Only compatible with a few types of servers. For example, this approach is not likely to work with relatively closed systems such as network devices or with vertical market or custom applications.


Monitor network (proxy)

  Pros:

  • Nothing to install on either user endpoint devices or servers.
  • Compatible with multiple types of endpoint devices.
  • Difficult or impossible for users to disable monitoring.

  Cons:

  • Since a specific proxy is required for each type of application (e.g., SSH, RDP, etc.), only works for some types of servers.
  • Quite difficult to add support for new applications or even new versions of old applications (e.g., new version of RDP, SSH, SQL clients, etc.).
  • Difficult or impossible to capture everything that happens with SSH, since users can use that to proxy other connections, including other SSH connections.
  • Session playback is a difficult problem, especially for complex and multi-layered client/server protocols.
  • Creates a single point of failure for system administration (the proxy).
  • May introduce performance problems in administrator sessions.


Examples can help clarify this analysis:

  • Assume that a client-based approach is used, which instruments the Windows PCs used by administrators. If an administrator wishes to connect to a system from a Mac or Linux PC or using a smart phone or tablet, then monitoring of a direct connection will not be possible. Instead, assuming that monitoring is mandatory, the user will have to first launch a connection to a Terminal Services or Citrix server and from there, where monitoring is available, connect to the managed system.
  • Assuming that a proxy-based approach is used, users will no longer be able to sign into managed systems directly. Instead, they must connect to the proxy, which forwards their connection to the managed system and also records it. While any protocol can be recorded this way, conversion of a recorded data stream to a human-legible video requires a deep understanding of the protocol in question. This means that in practice only a handful of versions of a handful of the most popular protocols -- remote desktop (RDP), secure shell (SSH), Telnet/TN3270 and perhaps some SQL variant -- will be recorded. This strictly limits the ability of the system to monitor all administrator logins, to all systems. This approach also does not work if a user signs directly into the console of a managed system.
  • Assuming that a server-based approach is used, even console logins can be recorded. On the other hand, support for only the most common and most open types of managed systems is possible, most likely just Windows and Linux servers. Recording connections to network devices, applications and databases will simply not be possible.

In practice, multiple approaches can be combined. In particular, the client-based and server-based approaches work well together as they provide lightweight and protocol-neutral session monitoring in general plus a hard-to-bypass solution for the most sensitive servers, with the two mechanisms sharing the same playback technology.

Visible or stealth surveillance

With a proxy-based monitoring solution, the fact that sessions can be recorded is self-evident. This is not necessarily true of recording on the user's PC or on a system to which the user connects.

When a user's login session -- on the console of his own PC or connected via an application such as RDP, SSH or SQL Studio to another system -- is recorded, the recording process itself may be evident to the user or stealthed. Stealth recording means that there are no obvious user interface elements to indicate to the user that recording is happening. It should be noted that a sophisticated user will always be able to tell that monitoring is happening -- by inspecting the computer's process table, network traffic or simply by noting that the activity indicator lights up on his web cam.

In general it seems reasonable to inform users that recording is happening:

  1. This is less likely to violate privacy-protection legislation.
  2. Awareness of monitoring is likely to encourage business-appropriate behaviour on the part of users.

If stealthy monitoring is chosen, the process should be reviewed by an organization's legal counsel. Moreover, an explanation should be made ready for users who detect that their logins are being recorded despite stealth measures.

Tamper-proofing the recording process

When a proxy-based solution is used, there may simply be no network path that allows users to bypass the recording system.

When monitoring is launched from a user's PC, it should be linked to the login session, so that any interference with the monitoring process itself or with the recording data stream sent to the monitoring server causes (a) an alarm and (b) the login session to be automatically disconnected.

When monitoring is implemented directly on a managed system, it should likewise be configured to detect interference and automatically sign off any logged in users in the event that surveillance is interrupted.
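A common way to link a login session to its recorder is a heartbeat: the recorder reports regularly, and a watchdog treats silence as interference. The sketch below assumes that design. The `RecordingWatchdog` class, its timeout and its callbacks are hypothetical, and the clock is injectable so the behaviour can be demonstrated without waiting.

```python
import time


class RecordingWatchdog:
    """Raise an alarm and disconnect the session if the recorder goes quiet."""

    def __init__(self, timeout_s, on_alarm, on_disconnect, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.on_alarm = on_alarm
        self.on_disconnect = on_disconnect
        self.clock = clock
        self.last_heartbeat = clock()
        self.tripped = False

    def heartbeat(self):
        """Called by the recorder each time it delivers data successfully."""
        self.last_heartbeat = self.clock()

    def check(self):
        """Called periodically; returns True once the session has been cut."""
        silent_for = self.clock() - self.last_heartbeat
        if not self.tripped and silent_for > self.timeout_s:
            self.tripped = True
            self.on_alarm()        # (a) notify security staff
            self.on_disconnect()   # (b) end the login session
        return self.tripped


# Demonstration with a fake clock.
now = [0.0]
actions = []
watchdog = RecordingWatchdog(
    timeout_s=5,
    on_alarm=lambda: actions.append("alarm"),
    on_disconnect=lambda: actions.append("disconnect"),
    clock=lambda: now[0],
)

now[0] = 3.0
watchdog.heartbeat()
now[0] = 7.0
print(watchdog.check())  # False: only 4 seconds of silence
now[0] = 9.0
print(watchdog.check())  # True: 6 seconds of silence
print(actions)           # ['alarm', 'disconnect']
```

The same logic applies whether the watchdog runs on the user's PC, on the managed system or on the monitoring server; only the disconnect mechanism differs.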

In some cases, it may be desirable to allow users to establish and maintain login sessions even in the event that session recording is non-functional. This may be the case in the event of a network outage that interrupts connectivity to the session recording server, for instance. In these cases, at least a local cache of recorded data should be maintained. A business decision must be made about which is more important: the ability to sign into and manage systems even if session recording is not available, or the assurance that all administrative logins are recorded. It is likely that a different choice will be made on each system, depending on how highly available that system must be versus the sensitivity of data on that system.
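The per-system choice described above might be captured as a small policy table. In the sketch below, the system names, the `OnRecorderFailure` options and the `RecordingClient` class are all hypothetical, and systems with no explicit policy default to disconnecting, which is an assumption made for this example rather than a recommendation from this document.

```python
from collections import deque
from enum import Enum


class OnRecorderFailure(Enum):
    DISCONNECT = "disconnect"        # assurance of recording wins
    CACHE_LOCALLY = "cache_locally"  # availability of the system wins


# Hypothetical per-system policy.
POLICY = {
    "payroll-db": OnRecorderFailure.DISCONNECT,
    "core-router": OnRecorderFailure.CACHE_LOCALLY,
}


class RecordingClient:
    def __init__(self, system, send, policy=POLICY):
        self.mode = policy.get(system, OnRecorderFailure.DISCONNECT)
        self.send = send       # callable that raises ConnectionError on failure
        self.cache = deque()   # local cache of chunks not yet delivered

    def record(self, chunk) -> bool:
        """Queue a chunk and try to deliver everything pending, oldest first.

        Returns True if the login session may continue."""
        self.cache.append(chunk)
        try:
            while self.cache:
                self.send(self.cache[0])
                self.cache.popleft()
            return True
        except ConnectionError:
            return self.mode is OnRecorderFailure.CACHE_LOCALLY


# Demonstration: the recording server is unreachable, then recovers.
delivered = []
server_up = [False]


def send(chunk):
    if not server_up[0]:
        raise ConnectionError("recording server unreachable")
    delivered.append(chunk)


router = RecordingClient("core-router", send)
payroll = RecordingClient("payroll-db", send)
print(router.record("frame-1"))   # True: cached locally, session continues
print(payroll.record("frame-1"))  # False: session must be disconnected
server_up[0] = True
print(router.record("frame-2"))   # True: cache is flushed in order
print(delivered)                  # ['frame-1', 'frame-2']
```

A production cache would also need a size limit and tamper protection, since it sits on a machine the monitored user may control.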

Planning for network bandwidth and storage requirements

Session recordings can generate terabytes of data, mainly due to video capture. A single user whose PC is instrumented, generating one screen shot of his desktop per second, will generate about 10 kBytes/second of data per monitor. In comparison to this, structured data (keystroke events, etc.) adds a negligible amount of data.

This data stream must be transmitted from the user's PC to a central storage server and from there perhaps replicated to a backup location. Assuming that a user's entire work day is subject to surveillance, this amounts to:

  1. Bandwidth from user to server: 10kBytes/sec.
  2. Data on server: 290MByte/day (8 hours/day); 64GByte/year (220 work days/year).

Assuming that 100 users are being monitored this scales up to:

  1. Bandwidth from 100 users to a single server: 1MByte/sec, or about 8MBits/sec before protocol overhead (roughly 10MBits/sec in practice).
  2. Data on server: 29GByte/day (8 hours/day); 6.4TByte/year.

While these storage requirements are manageable with contemporary technology, if the data is replicated this means data transfer of 29GByte/day -- over a wide area network this is significant.

Assuming that 7 years of data are retained to support possible future forensic audits, this amounts to 45TB of storage.
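The figures above follow from straightforward multiplication, reproduced in the sketch below so that the inputs can be varied. The function name and parameters are hypothetical, and decimal units (1 kByte = 1000 bytes) are assumed. The exact results are slightly lower than the rounded figures quoted in the text.

```python
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12  # decimal units


def storage_plan(users, bytes_per_sec=10 * KB, hours_per_day=8,
                 work_days_per_year=220, retention_years=7):
    """Estimate bandwidth and storage for screen capture of `users` people."""
    bandwidth = users * bytes_per_sec
    per_day = bandwidth * hours_per_day * 3600
    per_year = per_day * work_days_per_year
    return {
        "bandwidth_bytes_per_sec": bandwidth,
        "bandwidth_bits_per_sec": bandwidth * 8,
        "bytes_per_day": per_day,
        "bytes_per_year": per_year,
        "bytes_retained": per_year * retention_years,
    }


one = storage_plan(1)
hundred = storage_plan(100)

print(one["bytes_per_day"] / MB)               # 288.0   (text: 290MByte)
print(one["bytes_per_year"] / GB)              # 63.36   (text: 64GByte)
print(hundred["bandwidth_bits_per_sec"] / MB)  # 8.0     (before overhead)
print(hundred["bytes_per_day"] / GB)           # 28.8    (text: 29GByte)
print(hundred["bytes_per_year"] / TB)          # 6.336   (text: 6.4TByte)
print(hundred["bytes_retained"] / TB)          # 44.352  (text: 45TB)
```

Doubling the capture rate, the number of monitors or the number of users doubles every figure, so these inputs are worth measuring in a pilot before sizing storage.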

What to do in the event of a termination

Session recording is very helpful in the event that a user with access to privileged accounts has to be terminated:

  1. At the point of termination, the active communication between the user's PC and managed systems can be leveraged to actively disconnect any currently logged-in sessions.
  2. Post termination, recent user activity can be reviewed to see if the user has taken any inappropriate actions.

Privacy protection and access controls

In most jurisdictions, it is reasonable for employers to monitor the activity of their employees and contractors, so long as:

  1. The surveillance takes place at the workplace.
  2. The surveillance takes place on employer-owned equipment.

In some jurisdictions, there may be a third requirement, which is to make sure that users are aware of the surveillance.

These criteria create some boundary cases which organizations should consider carefully and avoid wherever possible:

  1. Regarding video capture (i.e., web cam surveillance) -- it is important to only enable it on corporate computers and only when those computers are known to be physically in an employer-owned location / facility and perhaps even only in a location that is closed to outsiders.
  2. It should be understood that if a user who can legitimately be monitored uses a corporate computer for some private activity while at work -- even where this is perfectly legitimate and reasonable -- that activity will be captured. For example, if a system administrator takes a break and does some on-line banking, his bank account number and password may be captured along with his work.

To mitigate the risks of inappropriate compromise of employee or contractor privacy, it is essential to implement security measures to ensure that access to recordings is legitimate and authorized in all cases. Since the simple act of performing a search on the database of recordings may yield privacy-related information, it makes sense to authorize access to recordings in two steps: First, request and approve the right to perform a specific search. Second, request and approve the right to retrieve the recording from a specific session.
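The two-step authorization described above can be sketched as two separate approval sets, one for searches and one for retrievals. Everything in this example is hypothetical, including the class and method names; the rule that requesters cannot approve their own requests is an added assumption, not something this document specifies.

```python
class RecordingAccessControl:
    """Two-step authorization: approve a search, then approve a retrieval."""

    def __init__(self):
        self._approved_searches = set()    # (requester, query)
        self._approved_retrievals = set()  # (requester, session_id)

    @staticmethod
    def _check_approver(approver, requester):
        # Assumption for this sketch: no self-approval.
        if approver == requester:
            raise PermissionError("requesters cannot approve their own requests")

    def approve_search(self, approver, requester, query):
        self._check_approver(approver, requester)
        self._approved_searches.add((requester, query))

    def approve_retrieval(self, approver, requester, session_id):
        self._check_approver(approver, requester)
        self._approved_retrievals.add((requester, session_id))

    def search(self, requester, query, index):
        """Step 1: return only the IDs of matching sessions, not their content."""
        if (requester, query) not in self._approved_searches:
            raise PermissionError("search not approved")
        return sorted(sid for sid, text in index.items() if query in text)

    def retrieve(self, requester, session_id, store):
        """Step 2: return one specific recording."""
        if (requester, session_id) not in self._approved_retrievals:
            raise PermissionError("retrieval not approved")
        return store[session_id]


# Hypothetical data: a search index of input text and a store of recordings.
index = {101: "cat /etc/passwd", 102: "ls -l"}
store = {101: "<recording 101>", 102: "<recording 102>"}

acl = RecordingAccessControl()
acl.approve_search("ciso", "auditor", "passwd")
matches = acl.search("auditor", "passwd", index)
print(matches)  # [101]

acl.approve_retrieval("ciso", "auditor", 101)
print(acl.retrieve("auditor", 101, store))  # <recording 101>
```

Because approvals are keyed to a specific query and a specific session, an approved search does not grant access to any recording, and an approved retrieval does not grant access to other sessions.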


Session recording is a powerful technology that enables organizations to create accountability, generate forensic audit trails and support knowledge sharing. The advent of inexpensive broadband networks and large storage systems makes it feasible to record and archive large numbers of login sessions in detail.

When designing a session monitoring system, it is important to take into consideration compatibility with both end user devices and back-end systems. Other parameters are also important: tamper-proofing the system, making users aware of its operation, and protecting the privacy of users by controlling when recordings happen and who can retrieve them.

A session recording system can generate large volumes of data. Because of this, it is important to plan in advance for the network bandwidth and storage requirements of the system.
