Choosing a database product
The first and most obvious question when considering the database architecture for Hitachi ID Suite is which database software is supported.
The Hitachi ID Suite replicating data service can be configured to use the following SQL database engines as its physical data store:
- Microsoft SQL Server 2016/2014/2012, Standard Edition.
- Microsoft SQL Server 2016/2014/2012, Express Edition, with Advanced Services (free download from https://www.microsoft.com/en-ca/) -- suitable for development, test and very small production environments.
These options are all technically compatible, but which is most appropriate for a given deployment? That depends on how Hitachi ID Suite will be used, as follows:
Demonstration or proof of concept
If Hitachi ID Suite is being installed in a demonstration setting, for example to evaluate its suitability for a given business problem, then cost is likely a dominant consideration. In this case, Microsoft SQL Server Express is appropriate, as it is offered at no cost.
Production deployment
In a production deployment of Hitachi ID Suite, the no-cost SQL Express is not appropriate:
- It has strict limitations on data volume (10GB).
- There are severe performance governors -- a single CPU core and 1GB of RAM.
- There is no technical support.
This means that the "Standard" or "Enterprise" edition of SQL Server should be used in production, rather than the "Express" edition.
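The selection rule described above can be sketched as a small function. This is only an illustration: the name `recommend_edition` and its parameters are hypothetical and not part of Hitachi ID Suite, and the only numeric limit encoded is the 10GB Express data volume cap cited above.

```python
# Illustrative sketch only: the names below are assumptions, not a product API.
# The 10 GB figure is the Express data volume limit cited in this section.
EXPRESS_MAX_DATA_GB = 10


def recommend_edition(is_production: bool, expected_data_gb: float) -> str:
    """Return the SQL Server edition suggested by the guidance in this section."""
    if is_production:
        # Express is not appropriate in production: data cap,
        # performance governors and no technical support.
        return "Standard or Enterprise"
    if expected_data_gb > EXPRESS_MAX_DATA_GB:
        # Even a demonstration cannot exceed the Express data volume limit.
        return "Standard or Enterprise"
    return "Express"


print(recommend_edition(is_production=False, expected_data_gb=2))
print(recommend_edition(is_production=True, expected_data_gb=2))
```

A real sizing exercise would also weigh the CPU and memory governors listed above; they are left out here to keep the sketch short.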
Shared database infrastructure vs. dedicated database server
In many organizations, there is a standard, shared database infrastructure which many applications can leverage. Where a shared database infrastructure is available, organizations may consider whether to use it or whether it is better to deploy a stand-alone database solely to be used by Hitachi ID Suite.
Motivation for using a shared database
Reasons to use a shared database infrastructure include:
- The database software has already been licensed, which may eliminate
a few thousand dollars from the expense of deploying Hitachi ID Suite.
It should be noted that the total cost of ownership of the database instance used by Hitachi ID Suite will not be zero and may not actually be lower, because database administrator (DBA) time, physical capacity and possibly more database CPU licenses may have to be purchased.
- The existing, shared database infrastructure is presumably well
understood by DBAs and should consequently be well managed.
- An existing investment in database server hardware, software,
monitoring infrastructure, administration processes, etc. can
be leveraged.
- The IT group which owns the existing database infrastructure may
insist that it be used, as a matter of policy.
- The team deploying Hitachi ID Suite may have no database installation, configuration or management skills, so may be reluctant to install and deploy their own database instance.
Problems with using a shared database
While there are advantages to using a shared database infrastructure, there are also drawbacks to this approach:
- Initial configuration of the database instance that will be used
by Hitachi ID Suite may be complicated, if the DBA cannot or will not
allow the setup program to run with sufficient privileges to
set up its schema, create indexes, upload stored procedures, etc.
- In many organizations, there are simply not enough DBAs. This
leads to project delays as the Hitachi ID Suite deployment team waits
for a DBA to make a database instance available, to configure
its schema, etc.
- In the event that there is a performance problem with Hitachi ID Suite,
a DBA's involvement is usually required to troubleshoot it. Normally
only DBAs have adequate access to internal database logs, analytical
facilities, etc. to work out where a performance bottleneck is and
what might be causing it.
- In a shared database server farm, allocation of CPU, disk, memory
and I/O bandwidth to a given instance is up to the DBAs.
An advantage of this is that database capacity is not usually
over-provisioned, so total hardware cost is lower. The drawback
is that sometimes database capacity is over-committed, leading
to performance problems for Hitachi ID Suite when unrelated workloads
are busy.
- Higher network packet latency between the Hitachi ID Suite application
server and the database server farm, as compared to effectively
zero latency between Hitachi ID Suite and a database hosted either on
the same server or nearby, can introduce substantial delays to both
batch processes and the interactive user interface on Hitachi ID Suite.
- When using a shared database, the communication data between the
IDM server and its database back-end must be encrypted for security
purposes. Although both supported SQL database engines provide a
mechanism to encrypt client/server communication, enabling this
feature may have an impact on performance and scalability.
- Some of the data housed in Hitachi ID Suite can be quite sensitive --
privileged passwords to target systems, security questions used
to authenticate users, etc. While this data is always encrypted,
there is a security argument to be made for limiting the number
of people with access to the database that houses it:
- In a dedicated database scenario, very few (possibly just 1 or 2) people can access this encrypted data.
- In a shared database infrastructure, every DBA and OS system
administrator can, at least in principle, access this encrypted
data.
- Database-level encryption can be used to mitigate this, but enabling encryption of everything in a database at the database server level triggers a significant performance penalty. Note that Hitachi ID Suite's built-in encryption is only applied to sensitive data, such as passwords and answers to security questions, not to the full database.
- A shared database server can also be equipped with a hardware cryptographic module, which should eliminate the performance penalty for full-database encryption, but introduces a significant hardware cost instead.
- Some configuration choices made by the administrator of a shared
database infrastructure may cause performance problems and can
be quite difficult to troubleshoot. Two examples illustrate this:
- One Hitachi ID Systems customer configured their shared database server farm with synchronous data replication between two physical locations. This allowed them to have instant recovery in the event that one of the server clusters was knocked out of service. Unfortunately, this also made database commits extremely slow -- creating a hard-to-diagnose performance problem.
- The database infrastructure at another Hitachi ID customer was configured to calculate a SQL statement execution plan whenever a stored procedure was installed. When Hitachi ID Suite was installed, since the database was empty, all stored procedures got execution plans that did not take advantage of indexes. The resulting run-time performance once data was loaded was very poor, again in a difficult-to-diagnose manner.
Motivation for using a stand-alone database
There are a number of advantages to deploying a dedicated database for use by Hitachi ID Suite:
- A stand-alone database would normally be managed by the same person
who installs and configures Hitachi ID Suite. Consequently, changes
to the database configuration -- to add a schema for Hitachi ID Suite,
manage security rights for the Hitachi ID Suite user, etc. -- should
not require their own change control process. Eliminating a
change control step can shorten the project by weeks or months,
depending on how busy the team of database administrators is.
- In many organizations, use of a shared database infrastructure
triggers project charges. This may be inter-departmental
(example: the database group in IT charges the application team
that needs database access). While inter-departmental charges
may be perceived as "funny money", the charges could be quite
real, for example where a vendor is used to manage the database
infrastructure and charges per activity or per database instance.
Use of a shared database infrastructure may also include charges for disk capacity, CPU capacity, network bandwidth, software licenses, etc. In other words, shared infrastructure does not imply that the infrastructure is already paid for and there is no incremental cost.
In some cases, the cost of adding a database instance to the shared infrastructure, once all charges are added in, may be higher than the cost of licensing and installing database server software on each Hitachi ID Suite server.
- In a shared database infrastructure setting, DBAs can, should
and do limit the administrative rights granted to application
owners. This is a reasonable precaution taken to prevent
mistakes made by less experienced admins from adversely impacting
other applications.
This sort of restriction has undesirable side-effects and in particular can make it much more difficult to troubleshoot application problems, including performance issues.
With a stand-alone database instance, application owners can have full access to their database without impacting any other systems. This allows application owners to troubleshoot database issues, including inspecting execution plans, analyzing SQL performance and trying out changes such as new indexes without waiting for a change control or risking harm to other applications.
- The network packet latency between an application server, such
as Hitachi ID Suite, and a shared database server cluster is almost
always significantly higher than the latency between two processes
on the same server (Hitachi ID Suite software and local database) or
on adjacent servers (Hitachi ID Suite software on one server, database
software on an adjacent server).
Network latency has a negligible effect on performance when issuing just a few SQL transactions but can significantly impair performance in cases where the application has to issue hundreds of SQL transactions to complete a task, such as displaying a complex UI screen.
It is helpful to consider example data to understand this impact. Assume that the network latency between Hitachi ID Suite and its database server is just 20ms. If Hitachi ID Suite needs to call 300 stored procedures to render a UI form, and if each call requires just 5 packets to be transmitted, the total delay added to the UI by the latency is 20ms x 5 packets x 300 transactions = 30 seconds. This is a huge delay!
Conversely, with an adjacent database server the latency is generally well under 1ms, and with a database server installed on the same computer as Hitachi ID Suite it is effectively 0. The result is that a local database will render the same UI form used in the example above at least 25 seconds faster. Continuing with the example, total time (including processing, not just network latency delays) may be 32s with the shared database vs. 2s with a local database -- 16 times faster for the user.
- Where a local database is used, and where it is installed on the
same physical server as Hitachi ID Suite, there is no incremental
hardware cost. This means that the incremental cost of adding a
local database is mainly just the license fees for the database
software -- typically under $10,000 for the entire project.
This may compare favorably to the cost for a new application
to access a shared database infrastructure and to manage several
tens of gigabytes on that system.
- Where a local database is used, the database software itself is
normally left with default settings. Hitachi ID tests software releases
with default configurations of Microsoft SQL Server
and consequently the system should have good performance.
This means that a DBA's services are not required to extract good
performance from Hitachi ID Suite using a default-configuration database.
In contrast, many shared database clusters have been tuned to meet the specific needs of other applications, to provide fault tolerance, to support data warehousing, etc. These configuration changes can adversely impact Hitachi ID Suite performance in ways that are sometimes difficult to troubleshoot.
- Using a stand-alone database installed locally on the Hitachi ID Suite
server prevents sensitive data from being captured by a packet
sniffer. It also allows for secure client/server communication
without requiring the overhead of encryption.
- Where Hitachi ID Suite and the database are installed on the same
server, a snapshot can be made of the entire physical or virtual
server. This means that if or when a restore is needed, the entire
system can be recovered.
In contrast, where the database is on one server and the application on another, backups are made at different times and a restore can be more complicated to orchestrate, since two restores are required and some configuration changes may be required to bring them back into synchronization. The business impact of this is that -- where a shared database infrastructure is used -- recovering from a disaster can be much more time consuming, leading to extended down-time.
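Returning to the network latency example given earlier in this section (20ms latency, 5 packets per transaction, 300 transactions), the arithmetic can be reproduced with a few lines of Python. This is a minimal sketch: the function name `latency_delay_seconds` is hypothetical, and the model, in which every packet waits for the full latency with no pipelining, is the same simplification used in that example.

```python
def latency_delay_seconds(latency_ms: float, packets_per_transaction: int,
                          transactions: int) -> float:
    """Total delay added by network latency, assuming each packet waits the full latency."""
    return latency_ms * packets_per_transaction * transactions / 1000.0


# Figures from the example: 20ms latency, 5 packets, 300 stored procedure calls.
shared = latency_delay_seconds(20, 5, 300)
# An adjacent server at 1ms, used here as an upper bound on "well under 1ms".
adjacent = latency_delay_seconds(1, 5, 300)

print(f"shared: {shared:.1f}s, adjacent: {adjacent:.1f}s")
```

With these inputs the sketch gives 30 seconds for the shared database and at most 1.5 seconds for an adjacent server, which is consistent with the estimate that a local database renders the form at least 25 seconds faster.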