pmap - Hybrid Container Extensions to the C++ STL

Greg Hood, Pittsburgh Supercomputing Center (ghood@psc.edu)
October 2005

Copyright (c) 2003,2005,2006,2014  Pittsburgh Supercomputing Center



Although the C++ STL (Standard Template Library) provides containers appropriate for a wide variety of applications, there are situations where it is extremely useful to have containers that combine characteristics of the basic ones found in the STL (e.g., vector, priority_queue, map, unordered_map).  These hybrid containers are not trivial to construct by simply layering an implementation on top of the existing containers, especially if one is concerned about performance.  I describe here four such hybrid containers that I created a while ago and have found useful in multiple contexts, and include source code so that others may use them.  These are distributed as the pmap package (pmap is short for "priority map").

These templates have been tested to some extent, but there is always the possibility that bugs still exist, so use at your own risk!  They were written without my having a great familiarity with the intricacies of the various STL implementations, so I'm sure there are improvements that could be made to bring them more in line with existing STL style.  If you spot a bug, or have style suggestions or proposed code revisions, feel free to pass them along to me in e-mail.

If you would like to be informed of any changes to this code, please e-mail me and I will put you on a mailing list (which will only be used for making announcements about this code).



priority_map / priority_multimap

A priority_map is a container that combines the functionality of a priority_queue and an unordered_map.   The central idea is that one can both insert an entry (i.e., a Key/Value pair) into the priority queue, and can also address that entry via its Key, and possibly delete it prior to its being popped from the queue.  A priority_multimap allows more than one entry with the same Key, and thus corresponds to the combination of a priority_queue and an unordered_multimap.

One use of priority_map is in the construction of data structures for scheduling, where a task is entered into the priority_map when it is generated, a task is popped from the priority_map when it is actually executed, and where it is possible to dynamically modify the priority of the task by deleting it and re-inserting it with the new priority.
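
As a rough illustration of the delete-and-re-insert pattern described above, the following sketch emulates the behavior with two standard containers.  It is not pmap's implementation and does not use pmap.h; the names TaskQueue, schedule, and cancel are hypothetical, and popping the largest priority first is a choice made for this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <unordered_map>
#include <utility>

// Illustrative stand-in only: an ordered set gives the priority ordering,
// and a hash map gives access by key.  pmap itself is built differently.
class TaskQueue {
public:
    // Insert a task, or change its priority if it is already queued.
    void schedule(const std::string& task, int priority) {
        cancel(task);                                         // delete any old entry
        by_priority_.insert(std::make_pair(priority, task));  // re-insert with new priority
        priority_of_[task] = priority;
    }

    // Remove a task by key before it is popped; returns false if absent.
    bool cancel(const std::string& task) {
        auto it = priority_of_.find(task);
        if (it == priority_of_.end()) return false;
        by_priority_.erase(std::make_pair(it->second, task));
        priority_of_.erase(it);
        return true;
    }

    // Remove and return the task with the largest priority.
    std::string pop() {
        assert(!by_priority_.empty());
        auto last = std::prev(by_priority_.end());
        std::string task = last->second;
        priority_of_.erase(task);
        by_priority_.erase(last);
        return task;
    }

    std::size_t size() const { return by_priority_.size(); }

private:
    std::set<std::pair<int, std::string>> by_priority_;  // ordered by (priority, task)
    std::unordered_map<std::string, int> priority_of_;   // task -> current priority
};
```

Calling schedule twice with the same task name changes that task's priority rather than queuing it twice, which is the single-entry-per-Key behavior of a priority_map as opposed to a priority_multimap.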

The internal data structure used here to implement the priority_map is a splay tree [Sleator & Tarjan 1985].  This data structure guarantees an amortized access cost of O(log(n)), although individual operations may take as long as O(n) time.

Both priority_map and priority_multimap are defined within the pmap namespace, and one can use them by putting the following lines in an application:

    #include "pmap.h"

    using pmap::priority_map;
    using pmap::priority_multimap;


Both priority_map and priority_multimap containers are defined as template<class Key, class Value, class H, class L, class A>, where the template parameters are as follows:

    Key  - the type used to refer to entries within the container
    Value - the type used to hold the values of the entries; these determine the sorting order of the entries
    H - the hash functor type that hashes the Key into a size_t (defaults to hash<Key>)
    L - the "less than" functor type that compares two Value's and returns a bool (defaults to std::less<Value>)
    A - the allocator to be used for allocating memory associated with the container (defaults to std::allocator<pair<Key,Value> >)
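
For readers unfamiliar with functor types, here is a minimal sketch of what a hash functor and a "less than" functor look like.  The types JobId, JobIdHash, Deadline, and EarlierDeadline are hypothetical, and the sketch exercises them with standard containers only; that pmap accepts them in the same way for its H and L parameters is an assumption based on the parameter descriptions above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <unordered_map>

// Hypothetical Key type.  Hashed containers also need operator== on the key.
struct JobId {
    std::string queue;
    int number;
    bool operator==(const JobId& other) const {
        return queue == other.queue && number == other.number;
    }
};

// Hash functor: callable on a Key, returns a size_t.
struct JobIdHash {
    std::size_t operator()(const JobId& id) const {
        std::size_t h1 = std::hash<std::string>()(id.queue);
        std::size_t h2 = std::hash<int>()(id.number);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));  // simple combine
    }
};

// Hypothetical Value type.
struct Deadline {
    int day;
    int urgency;
};

// "Less than" functor: compares two Values and returns a bool.
struct EarlierDeadline {
    bool operator()(const Deadline& a, const Deadline& b) const {
        if (a.day != b.day) return a.day < b.day;
        return a.urgency > b.urgency;  // same day: more urgent sorts first
    }
};

// Exercise both functors with standard containers.
inline void functor_example() {
    std::unordered_map<JobId, int, JobIdHash> jobs;
    jobs[JobId{"gpu", 7}] = 42;
    assert(jobs.count(JobId{"gpu", 7}) == 1);

    std::set<Deadline, EarlierDeadline> deadlines;
    deadlines.insert(Deadline{5, 1});
    deadlines.insert(Deadline{2, 9});
    assert(deadlines.begin()->day == 2);
}
```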

The following types are defined for priority_map and priority_multimap:

    typedef pair<const Key, const Value> value_type;
    class iterator;
    class reverse_iterator;
    typedef iterator const_iterator;
    typedef reverse_iterator const_reverse_iterator;
    typedef size_t size_type;
    typedef size_t difference_type;
    typedef typename A::pointer pointer;
    typedef typename A::const_pointer const_pointer;
    typedef typename A::reference reference;
    typedef typename A::const_reference const_reference;

Note that iterator and const_iterator are identical types.  This is because the iterators for these containers cannot be used to modify the items in the container.  Allowing direct modification of the stored items in this way would potentially violate the consistency of the container's internal data structures.

The following methods are supported by a priority_map:

    priority_map ();                                          // default constructor
    priority_map (const priority_map& x);                     // copy constructor
    ~priority_map ();                                         // destructor
    priority_map operator= (const priority_map& x);           // assignment operator
    void swap (priority_map& x);
 

    iterator begin () const;
    iterator end () const;
    reverse_iterator rbegin() const;
    reverse_iterator rend() const;
    iterator front () const;
    iterator back () const;
    size_type size() const;
    size_type max_size() const;
    bool empty() const;
    void clear();
    void check () const;                // checks all data structures for correctness

    iterator find (const Key&) const;
    pair<iterator,bool> insert (const Key&, const Value&);
    pair<iterator,bool> insert (const pair<Key,Value>&);
    void push (const Key&, const Value&);
    void push (const pair<Key,Value>&);
    void erase (const Key&);
    void erase (const_iterator&);
    void erase (const_iterator&,const_iterator&);
    size_type count (const Key& k) const;

    iterator top () const;
    void pop ();
    iterator lower_bound (const Value& v);
    iterator upper_bound (const Value& v);

In the HTML version of this page, the methods listed above are color-coded to indicate the amortized computational complexity of each.  Blue methods run in constant time, i.e., O(1), green methods run in O(log(n)) time, and red methods run in O(n) time, where n is the number of items in the container.  Orange methods run in O(m*log(n)) time, where m is the number of elements in the specified iterator range.

An iterator may be incremented (++), decremented (--), assigned, and compared.  If x is an iterator, dereferencing it produces a const pair<Key,Value>.  Thus, one can access the Key with x->first, and the Value with x->second. Formally, we have the following methods for an iterator:

    iterator& operator= (const iterator& x);
    const pair<Key,Value>& operator* () const;
    const pair<Key,Value>* operator-> () const;
    iterator& operator++ ();
    iterator operator++ (int);
    iterator& operator-- ();
    iterator operator-- (int);
    bool operator== (const iterator& x) const;
    bool operator!= (const iterator& x) const;
    bool operator< (const iterator& x) const;
    bool operator> (const iterator& x) const;
    bool operator<= (const iterator& x) const;
    bool operator>= (const iterator& x) const;

The methods of a priority_multimap are nearly identical to those of a priority_map.  The only substantive difference is that the return type of the insert methods is simply an iterator, since insertions always succeed:

    priority_multimap ();                                     // default constructor
    priority_multimap (const priority_multimap& x);           // copy constructor
    ~priority_multimap ();                                    // destructor
    priority_multimap operator= (const priority_multimap& x); // assignment operator
    void swap (priority_multimap& x);

    iterator insert (const Key&, const Value&);
    iterator insert (const pair<Key,Value>&);




indexed_priority_map / indexed_priority_multimap

An indexed_priority_map is similar to a priority_map in that it has a Key and Value, and supports all of the operations of a priority_map.  However, one can also directly access the nth entry in the container, sorted by Value.  This indexing operation is generalized so that instead of indexing with an int, we can index with an arbitrary Index type.  Each item inserted into the container may include a delta Index, which specifies how much the Index changes in going from that item to the next item in the container, sorted by Value.  This delta Index defaults to 1 if not specified, thereby yielding the default index sequence 0, 1, 2, 3...  One can retrieve the Index for any item in the container in O(log(n)) time.  Accessing an iterator by specifying an Index also happens in amortized O(log(n)) time.  An indexed_priority_multimap is similar to an indexed_priority_map except that it allows more than one entry with the same Key.
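
To make the relationship between delta Index and Index concrete, the following sketch computes the Index of each entry from the deltas, as I read the description above: the first entry has Index(0), and each later entry's Index is the previous Index plus the previous entry's delta.  The function name indices_from_deltas is hypothetical, and this linear pass only illustrates the definition; it is not how pmap computes indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given the delta Index of each entry (entries assumed already sorted by
// Value), return the Index of each entry: the sum of all earlier deltas.
template <class Index>
std::vector<Index> indices_from_deltas(const std::vector<Index>& deltas) {
    std::vector<Index> indices;
    indices.reserve(deltas.size());
    Index running = Index(0);
    for (std::size_t i = 0; i < deltas.size(); ++i) {
        indices.push_back(running);      // Index of entry i
        running = running + deltas[i];   // step to the next entry
    }
    return indices;
}

// With the default delta of 1, the indices are 0, 1, 2, 3.
inline void default_delta_example() {
    std::vector<int> indices = indices_from_deltas(std::vector<int>(4, 1));
    assert(indices.size() == 4);
    assert(indices[0] == 0 && indices[1] == 1);
    assert(indices[2] == 2 && indices[3] == 3);
}
```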

Since the Index may be any type which supports the operations +, -, =, ==, and which has an initializer of the form Index(0), the Index can represent various cumulative ranking operations.  An example of where an indexed_priority_multimap would be useful is in the memory management of large objects where an entry for each object is kept in the container.  Each entry might have a priority Value that corresponds to how important it is to keep that object in primary memory.  The delta Index for each object's entry could be the size of that object.  Then, it is possible to use the container's index_lower_bound method to find in amortized O(log(n)) time the highest-priority subset of objects which fits into, say, 128 megabytes.
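
The memory-budget query can be pictured with a sorted array and cumulative sizes.  The sketch below is a stand-in that uses only the standard library: it sorts on every call, whereas the container keeps entries ordered so that the lookup itself is logarithmic.  The names ObjectEntry and count_that_fit are hypothetical, and treating a larger priority as more important is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct ObjectEntry {      // hypothetical record
    int priority;         // plays the role of Value
    std::size_t bytes;    // plays the role of the delta Index
};

// Returns how many of the highest-priority objects fit within `budget` bytes.
std::size_t count_that_fit(std::vector<ObjectEntry> objects, std::size_t budget) {
    std::sort(objects.begin(), objects.end(),
              [](const ObjectEntry& a, const ObjectEntry& b) {
                  return a.priority > b.priority;  // most important first
              });
    std::vector<std::size_t> cumulative;  // cumulative[i] = bytes of objects 0..i
    std::size_t total = 0;
    for (const ObjectEntry& o : objects) {
        total += o.bytes;
        cumulative.push_back(total);
    }
    // Binary search for the first position whose cumulative size exceeds the budget.
    return static_cast<std::size_t>(
        std::upper_bound(cumulative.begin(), cumulative.end(), budget) -
        cumulative.begin());
}

inline void budget_example() {
    std::vector<ObjectEntry> objects = {{10, 50}, {5, 60}, {8, 40}, {1, 10}};
    // Sorted by priority the sizes are 50, 40, 60, 10 (cumulative 50, 90, 150, 160).
    assert(count_that_fit(objects, 128) == 2);
}
```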

[Usage tip: If one needs two or more pieces of index information (e.g., the cumulative object count and the cumulative object size) to be easily obtained for each item in the container, then it is possible to define an Index of type struct that includes all those pieces of information, with the +, -, =, and == operators appropriately defined for that struct.]
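
A struct of the kind the usage tip describes might look like the sketch below.  The name CountAndBytes and its members are hypothetical; the constructor taking an int is there so that Index(0) works and so that an int can be converted to the Index type, and mapping that int to the count alone is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical composite Index: cumulative object count and cumulative size.
struct CountAndBytes {
    std::size_t count;
    std::size_t bytes;

    // Supports Index(0) and conversion from int.
    CountAndBytes(int n = 0) : count(static_cast<std::size_t>(n)), bytes(0) {}
    CountAndBytes(std::size_t c, std::size_t b) : count(c), bytes(b) {}
    // operator= is the compiler-generated memberwise assignment.
};

inline CountAndBytes operator+(const CountAndBytes& a, const CountAndBytes& b) {
    return CountAndBytes(a.count + b.count, a.bytes + b.bytes);
}

inline CountAndBytes operator-(const CountAndBytes& a, const CountAndBytes& b) {
    assert(a.count >= b.count && a.bytes >= b.bytes);  // avoid unsigned wrap-around
    return CountAndBytes(a.count - b.count, a.bytes - b.bytes);
}

inline bool operator==(const CountAndBytes& a, const CountAndBytes& b) {
    return a.count == b.count && a.bytes == b.bytes;
}
```

With such a type, each entry would be inserted with a delta such as CountAndBytes(1, object_size), so that the accumulated Index carries both the number of preceding objects and their total size.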

Both indexed_priority_map and indexed_priority_multimap are defined within the pmap namespace, and one can use them by putting the following lines in an application:

    #include "pmap.h"

    using pmap::indexed_priority_map;
    using pmap::indexed_priority_multimap;


Since these classes hold items which are composed of Key, Value, and Index, pmap defines a triple template which is analogous to the pair template of the STL:

    template <class T1, class T2, class T3>
      struct triple
      {
      triple ();
      triple (const T1& x, const T2& y, const T3& z);
      template <class U1, class U2, class U3>
        triple (const triple<U1, U2, U3>& x);
      };

pmap also defines the following operations involving triples:

    bool operator== (const triple<T1, T2, T3>& x,
                     const triple<T1, T2, T3>& y);
    bool operator< (const triple<T1, T2, T3>& x,
                    const triple<T1, T2, T3>& y);
    bool operator!= (const triple<T1, T2, T3>& x,
                     const triple<T1, T2, T3>& y);
    bool operator> (const triple<T1, T2, T3>& x,
                    const triple<T1, T2, T3>& y);
    bool operator<= (const triple<T1, T2, T3>& x,
                     const triple<T1, T2, T3>& y);
    bool operator>= (const triple<T1, T2, T3>& x,
                     const triple<T1, T2, T3>& y);
    triple<T1, T2, T3> make_triple(const T1& x,
                                   const T2& y,
                                   const T3& z);
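
For orientation, a template of this shape can be written by analogy with std::pair, as in the sketch below.  It lives in its own sketch namespace and is not pmap's source: the member names first, second, and third and the lexicographic ordering of operator< are assumptions made for illustration, since the listing above does not show them.

```cpp
#include <cassert>
#include <string>

namespace sketch {

template <class T1, class T2, class T3>
struct triple {
    T1 first;   // assumed member names, by analogy with std::pair
    T2 second;
    T3 third;

    triple() : first(), second(), third() {}
    triple(const T1& x, const T2& y, const T3& z) : first(x), second(y), third(z) {}
    template <class U1, class U2, class U3>
    triple(const triple<U1, U2, U3>& x)
        : first(x.first), second(x.second), third(x.third) {}
};

template <class T1, class T2, class T3>
bool operator==(const triple<T1, T2, T3>& x, const triple<T1, T2, T3>& y) {
    return x.first == y.first && x.second == y.second && x.third == y.third;
}

// Lexicographic comparison (an assumption): first, then second, then third.
template <class T1, class T2, class T3>
bool operator<(const triple<T1, T2, T3>& x, const triple<T1, T2, T3>& y) {
    if (x.first < y.first) return true;
    if (y.first < x.first) return false;
    if (x.second < y.second) return true;
    if (y.second < x.second) return false;
    return x.third < y.third;
}

template <class T1, class T2, class T3>
triple<T1, T2, T3> make_triple(const T1& x, const T2& y, const T3& z) {
    return triple<T1, T2, T3>(x, y, z);
}

inline void triple_example() {
    triple<std::string, double, int> t = make_triple(std::string("job"), 2.5, 1);
    assert(t.first == "job" && t.third == 1);
    assert(make_triple(1, 2, 3) < make_triple(1, 2, 4));
}

}  // namespace sketch
```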

An indexed_priority_map template is defined as template<class Key, class Value, class Index, class H, class L, class A> where the template parameters are as follows:

    Key  - the type used to refer to entries within the container
    Value - the type used to hold the values of the entries; these determine the sorting order of the entries
    Index - the type used for randomly indexing into the container (defaults to size_t)
    H - the hash functor type that hashes the Key into a size_t (defaults to hash<Key>)
    L - the "less than" functor type that compares two Value's and returns a bool (defaults to std::less<Value>)
    A - the allocator to be used for allocating memory associated with the container (defaults to std::allocator<pmap::triple<Key,Value,Index> >)


An indexed_priority_map supports all of the operations of a priority_map, plus the following additional ones:

     pair<iterator,bool> insert (const Key&, const Value&, const Index&);
     pair<iterator,bool> insert (const triple<Key,Value,Index>&);
     void push (const Key&, const Value&, const Index&);
     void push (const triple<Key,Value,Index>&);
     Index index (const Key&) const;
     Index index (const iterator&) const;
     iterator index_find (const Index&) const;
     iterator index_lower_bound (const Index&) const;
     iterator index_upper_bound (const Index&) const;

When methods such as insert or push are called without specifying an Index parameter, a default Index of 1 is used (and conversion from int to the Index type must be supported).  Additionally, the iterator for an indexed_priority_map supports the following operations:

     iterator& operator+ (const Index x);
     iterator& operator+= (const Index x);
     iterator& operator- (const Index x);
     iterator& operator-= (const Index x);
     iterator& operator[] (const Index x);

The methods of an indexed_priority_multimap are identical to those of an indexed_priority_map, except that the return type of the insert methods differs since insertions always succeed:

     iterator insert (const Key&, const Value&, const Index&);
     iterator insert (const triple<Key,Value,Index>&);




Download


The current version of this webpage is available at http://www.psc.edu/~ghood/pmap.html.

The version of the software corresponding to the present webpage is available at:  pmap-1.0.0.tar.gz

See the LICENSE file for license information (it is the GNU GPL).




Installation


  1. For most sites, to use the pmap templates you simply include pmap.h along with other includes at the beginning of your program.  If you want to install pmap for multiple users, you can copy pmap.h, pmap.tcc, pmap_common.h, and pmap_common.tcc to whatever include file directory you choose (e.g., /usr/local/include).
  2. See the SITE-SPECIFIC DEFINES section of pmap.h.  These preprocessor definitions may be modified in pmap.h, in the including program prior to including pmap.h, or on the compiler command line.  If pmap.h is to be shared by multiple users, it is recommended that pmap.h be left as is, and that the latter two methods be used for customizing it.
  3. To test the code, modify the Makefile in this directory to suit your platform.  Then type "make test" in this directory, which will compile the four example programs and then run them.  If they all complete successfully, then you're ready to go.  If they produce uncaught exceptions, then please submit a bug report including hardware platform, OS version, compiler version, and the error output to ghood@psc.edu.


Revision History

1.0.0  (20 Mar 2014) - Converted to use unordered_map instead of ext/hash_map.
0.0.2  (29 Jan 2007) - Fixed compilation problem in erase; eliminated unnecessary #includes.
0.0.1  (3 May 2006) - Fixed various documentation flaws; no changes to code.
0.0.0  (25 Apr 2006) - Initial release.



References


Sleator, D., and Tarjan, R., "Self-adjusting Binary Search Trees", Journal of the ACM, 32(3):652-686, 1985.