The attributes you search for are indexed in the directory, so the directory server can retrieve them more quickly. Attribute values do not have to be strings. Some attribute values, like certificates and photos, are binary.
Barbara Jensen's entry is located under an entry with DN ou=People,dc=example,dc=com, an organizational unit and parent entry for the people at Example.com. The ou=People entry is located under the entry with DN dc=example,dc=com, the base entry for Example.com. DC stands for domain component. The directory has other base entries, such as cn=config, under which the configuration is accessible through LDAP. A directory can serve multiple organizations, too. You might find dc=example,dc=com, dc=mycompany,dc=com, and o=myOrganization in the same LDAP directory. Therefore, when you look up entries, you specify the base DN to look under, in the same way that you need to know whether to look in the New York, Paris, or Tokyo phone book to find a telephone number. The root entry for the directory, technically the entry with DN "" (the empty string), is called the root DSE. It contains information about what the server supports, including the other base DNs it serves.
As mentioned earlier in this chapter, directories have indexes for multiple attributes. By default, DS does not let normal users perform unindexed searches, because such searches force DS servers to scan an entire directory database when looking for matches.
As directory administrator, part of your responsibility is making sure directory data is properly indexed. DS software provides tools for building and rebuilding indexes, for verifying indexes, and for evaluating how well indexes are working.
The installation and upgrade tools, setup and upgrade, are located in the parent directory of the other tools, because they are not used for everyday administration. For example, if the path to most tools is /path/to/opendj/bin, you can find these tools in /path/to/opendj. For instructions on how to use the installation and upgrade tools, see the Installation Guide.
In a backend database, the id2entry index holds LDIF representations of directory entries. For a database that is not encrypted, the corresponding low-level database shows the cleartext strings, as is evident in the following example:
This only happens if the timestamp for the indexed attribute matches to the nearest millisecond on more than 4000 entries (for default settings). At more than 4000 updates per millisecond, this corresponds to four million timestamp updates per second, which would be very difficult to reproduce in a real directory service.
It would generally be a waste of resources to have the directory server check all entries to see whether they have a CN of Babs Jensen. Instead, directory servers maintain indexes to expedite checking whether a search filter matches.
LDAP directory servers like DS directory servers even go so far as to disallow searches that cannot be handled expediently using indexes. Maintaining appropriate indexes is a key aspect of directory administration.
The role of an index is to answer the question, "Which entries have an attribute with this corresponding value?" Each index is therefore specific to an attribute. Each index is also specific to the comparison implied in the search filter. For example, a directory server maintains distinct indexes for exact (equality) matching and for substring matching. The types of indexes are explained in "Index Types and Their Functions". Furthermore, indexes are configured in specific directory backends.
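The difference between equality and substring indexes can be illustrated with a minimal sketch. This is a toy model, not the DS implementation: the function names, the normalization, and the choice of a key length of 6 are assumptions made for illustration only.

```python
def normalize(value):
    """Lowercase and collapse whitespace; a stand-in for a real matching rule."""
    return " ".join(value.lower().split())


def equality_keys(value):
    """An equality index stores one key per attribute value."""
    return {normalize(value)}


def substring_keys(value, length=6):
    """A substring index stores many keys per value.

    Assumption: one key of at most `length` characters per starting
    position in the normalized value.
    """
    norm = normalize(value)
    return {norm[i:i + length] for i in range(len(norm))}
```

In this sketch, a single value such as Babs Jensen yields one equality key but many substring keys, which is one reason substring indexes tend to be larger than equality indexes.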
This is how DS directory servers use indexes. When the search filter is (cn=Babs Jensen), the directory server retrieves the IDs for entries with a CN matching Babs Jensen from the equality index of the CN attribute. (For a complex filter, the directory server might optimize the search by changing the order in which it uses the indexes.) A successful result is zero or more entry IDs. These are the candidate result entries.
For each candidate, the DS directory server retrieves the entry by ID from a special system index called id2entry, which, as its name suggests, returns an entry for an entry ID. If there is a match, and the client application has the right to access the data, the directory server returns the search result. It continues this process until no candidates are left.
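The two-stage lookup described here, from an equality index to candidate entry IDs and then from id2entry to entries, can be sketched as follows. This is a simplified model under stated assumptions, not DS internals: the data structures, the sample entries, and the can_read callback standing in for access control are all hypothetical.

```python
# Hypothetical id2entry table: entry ID -> entry.
ID2ENTRY = {
    1: {"dn": "uid=bjensen,ou=People,dc=example,dc=com",
        "cn": ["Barbara Jensen", "Babs Jensen"]},
    2: {"dn": "uid=scarter,ou=People,dc=example,dc=com",
        "cn": ["Sam Carter"]},
}


def build_equality_index(entries, attribute):
    """Map each normalized attribute value to the set of entry IDs holding it."""
    index = {}
    for entry_id, entry in entries.items():
        for value in entry.get(attribute, []):
            index.setdefault(value.lower(), set()).add(entry_id)
    return index


def search_equal(index, entries, attribute, value, can_read=lambda entry: True):
    """Resolve (attribute=value): candidate IDs first, then entries by ID."""
    wanted = value.lower()
    candidates = index.get(wanted, set())  # zero or more entry IDs
    results = []
    for entry_id in sorted(candidates):
        entry = entries[entry_id]  # id2entry lookup
        matches = any(v.lower() == wanted for v in entry.get(attribute, []))
        if matches and can_read(entry):  # access check before returning
            results.append(entry["dn"])
    return results


CN_EQUALITY = build_equality_index(ID2ENTRY, "cn")
```

Calling search_equal(CN_EQUALITY, ID2ENTRY, "cn", "Babs Jensen") touches only the one candidate entry rather than every entry in the table.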
If there are no indexes that correspond to a search request, then the DS directory server must potentially check for a match against every entry in the scope of the search. Evaluating every entry for a match is referred to as an unindexed search. An unindexed search is an expensive operation, particularly for large directories. For this reason, a directory server refuses unindexed searches unless the user making the request has specific permission to make such requests. Permission to perform an unindexed search is granted with the unindexed-search privilege. This privilege is reserved for the directory root user by default, and should not be granted lightly.
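The refusal logic can be sketched in a few lines. This is an illustrative model only: the search function, the SearchRefused exception, and the sample data are hypothetical, while the unindexed-search privilege name and result code 50 are taken from this text.

```python
UNINDEXED_SEARCH = "unindexed-search"  # privilege name used in this text


class SearchRefused(Exception):
    """Raised when an unindexed search is attempted without the privilege."""
    result_code = 50  # result code this text reports for refused unindexed searches


def search(entries, indexes, attribute, value, privileges=frozenset()):
    """Use an index when one exists; otherwise scan only if privileged."""
    wanted = value.lower()
    index = indexes.get(attribute)
    if index is not None:
        candidate_ids = index.get(wanted, set())  # indexed: few candidates
    elif UNINDEXED_SEARCH in privileges:
        candidate_ids = entries.keys()  # unindexed: every entry is a candidate
    else:
        raise SearchRefused(f"unindexed search on {attribute!r} refused")
    return sorted(
        entries[i]["dn"]
        for i in candidate_ids
        if wanted in (v.lower() for v in entries[i].get(attribute, []))
    )


# Hypothetical sample data: cn is indexed, mail is not.
ENTRIES = {
    1: {"dn": "uid=bjensen", "cn": ["Babs Jensen"], "mail": ["bjensen@example.com"]},
    2: {"dn": "uid=scarter", "cn": ["Sam Carter"], "mail": ["scarter@example.com"]},
}
INDEXES = {"cn": {"babs jensen": {1}, "sam carter": {2}}}
```

In this model, a search on mail fails for an ordinary caller and succeeds, at the cost of a full scan, for a caller holding the privilege.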
DS directory servers maintain generally useful indexes for data imported into the default backend. When you create a new backend, the directory server only maintains the necessary system indexes unless you configure additional indexes. For details, see "Default Indexes".
Index maintenance has its costs. Every time an indexed attribute is updated, the DS directory server must update each affected index to reflect the change, which is wasteful if the index is hardly used. Indexes, especially substring indexes, can take up more memory and disk space than the corresponding data.
Aim to maintain only those indexes that speed up appropriate searches, and that allow the DS directory server to operate properly. The latter indexes include non-configurable internal indexes, and generally are handled by the directory server without intervention. The former, indexes for appropriate searches, require thought and investigation. Whether a search is appropriate depends on the circumstances.
Begin by reviewing the attributes of your directory data. Which attributes would you expect to see in a search filter? If an attribute is going to show up frequently in reasonable search filters, then it ought to be indexed.
Compare your guesses with what you see actually happening in the directory. One way of doing this is to review the access log for search results that are marked with additional items including unindexed:
Read the full messages in the access log on your server, as they also specify the search filter and scope. Understand the search that led to each unindexed search. If the filter is appropriate and frequently used, add an index to facilitate the search. You can either consume the access logs to determine how often a search filter is used, or monitor what is happening in the directory by using the index analysis feature.
One inappropriate search filter that led to an unindexed search, (mail=*.com), had no matches for this reason: "The filter value exceeded the index entry limit for the /dc=com,dc=example/mail.caseIgnoreIA5SubstringsMatch:6 index." It appears that some client application is trying to list all entries with an email address ending in .com. There are so many such entries that, although an index exists for the mail attribute, the server has given up maintaining the list of entries with email addresses ending in .com. In a large directory, there might be many thousands of matching entries. If you take action to allow this expensive search, the requests could consume a large share of directory resources, or even cause a denial of service to other requests.
Directory users might complain to you that their searches are refused because they are unindexed. Ask for the result code, additional information, and search filter. DS directory servers respond to LDAP client applications that attempt unindexed searches with a result code of 50 and additional information about the unindexed search. The following example attempts, anonymously, to get the entries for all users whose email address ends in .com:
Perhaps they do have a legitimate reason to get the full list of all entries in one operation, such as regularly rebuilding some database that depends on the directory. If so, their application can perform the search as a user who has the unindexed-search privilege. To assign the unindexed-search privilege, see "Configuring Privileges".
Sometimes it is not obvious by inspection how a directory server handles a given search request internally. The directory root user can inspect how the DS directory server resolves the search request by performing the same search with the debugsearchindex attribute.
A default global access control setting prevents users from reading the debugsearchindex attribute. To allow an administrator to read the attribute, add a global access control setting as in the following example for a directory server using ACIs:
When you index a JSON attribute defined in this way, the default directory server behavior is to maintain index keys for each JSON field. Large or numerous JSON objects can result in large indexes, which is wasteful. If you know which fields are used in search filters, you can choose to index only those fields.
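The effect of restricting the indexed fields can be sketched as follows. This is a toy model, not how DS stores JSON index keys: the flatten and json_index_keys helpers, the path notation, and the sample JSON document are assumptions for illustration.

```python
import json


def flatten(obj, prefix=""):
    """Yield (path, value) pairs for every scalar in a JSON structure."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from flatten(value, f"{prefix}/{key}")
    elif isinstance(obj, list):
        for item in obj:
            yield from flatten(item, prefix)
    else:
        yield prefix, obj


def json_index_keys(document, indexed_fields=None):
    """Return index keys for a JSON value.

    With indexed_fields=None, every field produces a key (the default
    behavior described above). Otherwise only the listed paths do.
    """
    return {
        (path, value)
        for path, value in flatten(json.loads(document))
        if indexed_fields is None or path in indexed_fields
    }


# Hypothetical JSON attribute value.
SAMPLE = '{"access_token": "123", "expires": 1700000000, "scopes": ["read", "write"]}'
```

Here json_index_keys(SAMPLE) produces a key for every field and array element, whereas json_index_keys(SAMPLE, {"/access_token"}) produces a single key.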
A special VLV index can enable the server to sort results for a search that is technically unindexed. For example, this feature facilitates paging through an entire directory database in a UI, where the user does not necessarily filter the data before knowing what is available.
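The idea behind paging through pre-sorted results can be sketched as follows. This is a conceptual model only, not the DS VLV implementation: the SortedViewIndex class and its methods are hypothetical names, and the sketch assumes a single-valued sort attribute.

```python
import bisect


class SortedViewIndex:
    """Keeps entries pre-sorted on one attribute so pages need no full sort."""

    def __init__(self, sort_attribute):
        self.sort_attribute = sort_attribute
        self._keys = []  # sorted list of (normalized sort value, dn)

    def add(self, entry):
        """Insert the entry's key at its sorted position."""
        key = (entry[self.sort_attribute].lower(), entry["dn"])
        bisect.insort(self._keys, key)

    def page(self, offset, size):
        """Return one page of DNs in sort order, starting at offset."""
        return [dn for _, dn in self._keys[offset:offset + size]]


# Hypothetical sample data.
view = SortedViewIndex("cn")
for cn, uid in [("Sam Carter", "scarter"), ("Babs Jensen", "bjensen"), ("Kirsten Vaughan", "kvaughan")]:
    view.add({"dn": f"uid={uid}", "cn": cn})
```

Because the sort cost is paid when entries are added, a UI can request view.page(0, 2), then view.page(2, 2), and so on, without the server sorting the whole result set for each request.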
In this special case, you can safely use the rebuild-index --clearDegradedState command to avoid having to scan the entire directory backend before rebuilding the new, unused index. In this example, an index has just been created for newUnusedAttribute.
As the number of entries in the directory grows, the list of entry IDs for some keys can become very large. For example, every entry in the directory has the value top for the objectClass attribute. If the directory maintains a substring index for mail, the number of entries ending in .com could be huge.
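One way to picture how a server might cope with such keys is an index that stops tracking entry IDs for a key once the list passes a limit. The following is a minimal sketch under stated assumptions, not the DS implementation: the LimitedIndex class is hypothetical, and the default limit of 4000 is the figure cited elsewhere in this text for default settings.

```python
class LimitedIndex:
    """Index that gives up on a key once its ID list exceeds entry_limit."""

    def __init__(self, entry_limit=4000):
        self.entry_limit = entry_limit
        self._ids = {}  # key -> set of entry IDs, or None once over the limit

    def add(self, key, entry_id):
        ids = self._ids.setdefault(key, set())
        if ids is None:
            return  # already over the limit: the list is no longer maintained
        ids.add(entry_id)
        if len(ids) > self.entry_limit:
            self._ids[key] = None  # drop the list, remember it overflowed

    def candidates(self, key):
        """Return a set of IDs, or None when the index cannot help for this key."""
        return self._ids.get(key, set())
```

In this model, a None result means the index offers no candidate list for that key, so a search relying on it alone would have to be treated as unindexed.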