Scippy

SoPlex

Sequential object-oriented simPlex

datahashtable.h
Go to the documentation of this file.
1 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
2 /* */
3 /* This file is part of the class library */
4 /* SoPlex --- the Sequential object-oriented simPlex. */
5 /* */
6 /* Copyright (C) 1996-2016 Konrad-Zuse-Zentrum */
7 /* fuer Informationstechnik Berlin */
8 /* */
9 /* SoPlex is distributed under the terms of the ZIB Academic Licence. */
10 /* */
11 /* You should have received a copy of the ZIB Academic License */
12 /* along with SoPlex; see the file COPYING. If not email to soplex@zib.de. */
13 /* */
14 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
15 
16 /**@file datahashtable.h
17  * @brief Generic hash table for data objects.
18  */
19 #ifndef _DATAHASHTABLE_H_
20 #define _DATAHASHTABLE_H_
21 
22 #include <iostream>
23 #include <assert.h>
24 #include <limits.h>
25 
26 #include "spxdefines.h"
27 
28 #define HASHTABLE_FILLFACTOR 0.7
29 
30 namespace soplex
31 {
32 /**@brief Generic hash table for data objects.
33  @ingroup Elementary
34 
35  Class DataHashTable provides a generic hash table for
36  \ref DataObjects "Data Objects",
37  i.e., a map that maps arguments called \a HashItems to values called \a Infos.
38  HashItem and Info types are passed as template arguments. HashItems
39  must provide a comparison operator==(). Furthermore, both the HashItem and
40  Info must be data objects in the sense that the assignment operator is
41  equivalent to a <tt>memcpy()</tt> of the structure and no destructor is
42  required.
43 
44  The construction of a DataHashTable requires a \em hash \em function that
45  assigns an integer value to every HashItem. Provided this, pairs of a
46  HashItem and a Info can be added to the DataHashTable. No more
47  than one Info can be assigned to the same HashItem at a time. The Info
48  to a HashItem can be accessed through the subscript operator[]() with
49  the Info object as a subscript.
50 
51  The maximum number of elemens a DataHashTable can hold can be
52  specified upon construction and may be reset with reMax() later on.
53  Further, a value hash size value is required. This value must be less then
54  the maximum number of elements and must not have a common dominator with
55  the maximum number of elements. If not specified explicitely, it
56  is set automatically to a reasonable value.
57 
58  The implementation relies on an array of DataHashTable::Element%s, from
59  now on referred to as elements. Upon construction, all elements are
60  marked as \c FREE. When an entry is added
61  to the DataHashTable, the hash value is computed by calling #m_hashfun
62  for its HashItem. If this array element is unused, it is
63  taken right away. Otherwise, the array index is incremented by
64  the hash size (modulo the element array size()) until an unused element
65  is found.
66 
67  Removing elements is simply done by marking it as \c RELEASED. Hence,
68  when searching for an element, the search loop may not stop, when a
69  \c RELEASED element is encountered. However, such an element may be
70  reused when adding a new element to the DataHashTable.
71 
72  Further, memory management with resizing of the element array is
73  straight forward.
74 */
75 template < class HashItem, class Info >
77 {
78 private:
79 
80  //-----------------------------------
81  /**@name Types */
82  //@{
83  /// template class for elements stored in the hash table
84  template < class ElemHashItem, class ElemInfo >
85  class Element
86  {
87  public:
88  ///
89  ElemHashItem item;
90  ///
91  ElemInfo info;
92  /// States of an element
93  enum states
94  {
95  FREE, ///< element has never been used
96  RELEASED, ///< element had been used, but released
97  USED ///< element is in use
98  } stat;
99  };
100  typedef Element< HashItem, Info > Elem;
101  //@}
102 
103  //-----------------------------------
104  /**@name Data */
105  //@{
106  /// stores all elements of the hash table
108  /// increment added to hash index, if allready used
110  /// current number of entries in the hash table
111  int m_used;
112  /// pointer to hash function (mapping: \a HashItem -> int)
113  int (*m_hashfun) (const HashItem*);
114  /// memory is \ref soplex::DataHashTable::reMax() "reMax()"ed by this factor if a new element does't fit
116  /// some prime numbers for fast access
117  int primes[50];
118  /// number of stored prime numbers
119  int nprimes;
120 
121  //@}
122 
123 public:
124 
125  //-----------------------------------
126  /**@name Access / modification */
127  //@{
128  /// Is item \p h present in DataHashTable?
129  bool has(const HashItem& h) const
130  {
131  return index(h) >= 0;
132  }
133 
134  /// returns const pointer to \a Info of \a HashItem \p h or 0,
135  /// if item is not found.
136  /** Returns a pointer to \a Info component of hash element \p h or a zero
137  * pointer if element \p h is not in the table.
138  */
139  const Info* get(const HashItem& h) const
140  {
141  int i = index(h);
142 
143  return (i >= 0) ? &m_elem[i].info : 0;
144  }
145  /// references \a Info of \a HashItem \p h.
146  /** Index operator for accessing the \a Info associated to
147  * \a HashItem \p h. It is required that \p h belongs to the
148  * DataHashTable, otherwise it core dumps. Methods has() or
149  * get() can be used for inquiring wheater \p h belongs to the
150  * DataHashTable or not.
151  */
152  const Info& operator[](const HashItem& h) const
153  {
154  assert(has(h));
155 
156  return m_elem[index(h)].info;
157  }
158  /// adds a new entry to the hash table.
159  /** Adds a new entry consisting of \a HashItem \p h and \a Info \p info to the
160  * DataHashTable. No entry with HashItem \p h must be in the
161  * DataHashTable yet. After completion, \p info may be accessed via get() or
162  * operator[]() with \p h as parameter. The DataHashTable is #reMax()%ed
163  * if it becomes neccessary.
164  */
165  void add(const HashItem& h, const Info& info)
166  {
167  assert(!has(h));
168 
169  if (m_used >= m_elem.size() * HASHTABLE_FILLFACTOR)
170  reMax(int(m_memfactor * m_used) + 1);
171 
172  assert(m_used < m_elem.size());
173 
174  int i;
175 
176  for(i = (*m_hashfun)(&h) % m_elem.size();
177  m_elem[i].stat == Elem::USED;
178  i = (i + m_hashsize) % m_elem.size())
179  ;
180 
181  assert(m_elem[i].stat != Elem::USED);
182 
183  m_elem[i].stat = Elem::USED;
184  m_elem[i].info = info;
185  m_elem[i].item = h;
186 
187  m_used++;
188 
189  assert(has(h));
190  }
191 
192  /// remove \a HashItem \p h from the DataHashTable.
193  void remove(const HashItem& h)
194  {
195  assert(has(h));
196  m_elem[index(h)].stat = Elem::RELEASED;
197  m_used--;
198  assert(!has(h));
199  }
200 
201  /// remove all entries from DataHashTable.
202  void clear()
203  {
204  for(int i = 0; i < m_elem.size(); i++)
205  m_elem[i].stat = Elem::FREE;
206  m_used = 0;
207  }
208  /// reset size of the DataHashTable.
209  /** Reset the maximum number of elements of a DataHashTable to \p newSize.
210  * However, if \p newSize < #m_used, it is resized to #m_used only.
211  * If \p newHashSize < 1, a new hash size is computed automatically.
212  * Otherwise, the specified value will be taken.
213  */
214  void reMax (int newSize = -1, int newHashSize = 0)
215  {
216  DataArray< Elem > save(m_elem);
217 
218  m_elem.reSize(newSize < m_used ? m_used : newSize);
219 
220  clear();
221 
222  m_hashsize = (newHashSize < 1) ? autoHashSize() : newHashSize;
223 
224  for(int i = 0; i < save.size(); i++)
225  if (save[i].stat == Elem::USED)
226  add(save[i].item, save[i].info);
227  }
228  //@}
229 
230  //-----------------------------------
231  /**@name Debugging */
232  //@{
233  /// checks whether DataHashTable is consistent
234  bool isConsistent() const
235  {
236 #ifdef ENABLE_CONSISTENCY_CHECKS
237  int total = 0;
238 
239  for(int i = 0; i < m_elem.size(); i++)
240  {
241  if (m_elem[i].stat == Elem::USED)
242  {
243  total++;
244  if (!has(m_elem[i].item))
245  return MSGinconsistent("DataHashTable");
246  }
247  }
248  if (total != m_used)
249  return MSGinconsistent("DataHashTable");
250 
251  return m_elem.isConsistent();
252 #else
253  return true;
254 #endif
255  }
256  //@}
257 
258  //-----------------------------------
259  /**@name Construction / destruction */
260  //@{
261  /// default constructor.
262  /** Allocates a DataHashTable for \p maxsize entries using \p hashfun
263  * as hash function. If \p hashsize > 0, #m_hashsize is set to the
264  * specified value, otherwise a suitable hash size is computed
265  * automatically. Parameter \p factor is used for memory management:
266  * If more than \p maxsize entries are added to the DataHashTable, it
267  * will automatically be #reMax()%ed by a factor of \p factor.
268  *
269  * @param hashfun pointer to hash function.
270  * @param maxsize maximum number of hash elements.
271  * @param hashsize hash size.
272  * @param factor factor for increasing data block.
273  */
274  explicit DataHashTable(
275  int (*hashfun)(const HashItem*),
276  int maxsize = 265,
277  int hashsize = 0,
278  Real factor = 2.0)
279  : m_elem(maxsize)
280  , m_hashfun(hashfun)
281  , m_memfactor(factor)
282  {
283  clear();
284 
285  primes[0] = 1523;
286  primes[1] = 3547;
287  primes[2] = 8011;
288  primes[3] = 17707;
289  primes[4] = 38723;
290  primes[5] = 83833;
291  primes[6] = 180317;
292  primes[7] = 385897;
293  primes[8] = 821411;
294  primes[9] = 1742369;
295  primes[10] = 3680893;
296  primes[11] = 5693959;
297  primes[12] = 7753849;
298  primes[13] = 9849703;
299  primes[14] = 11973277;
300  primes[15] = 14121853;
301  primes[16] = 17643961;
302  primes[17] = 24273817;
303  primes[18] = 32452843;
304  primes[19] = 49979687;
305  primes[20] = 67867967;
306  primes[21] = 86028121;
307  primes[22] = 104395301;
308  primes[23] = 122949823;
309  primes[24] = 141650939;
310  primes[25] = 160481183;
311  primes[26] = 179424673;
312  primes[27] = 198491317;
313  primes[28] = 217645177;
314  primes[29] = 256203161;
315  primes[30] = 314606869;
316  primes[31] = 373587883;
317  primes[32] = 433024223;
318  primes[33] = 492876847;
319  primes[34] = 553105243;
320  primes[35] = 613651349;
321  primes[36] = 694847533;
322  primes[37] = 756065159;
323  primes[38] = 817504243;
324  primes[39] = 879190747;
325  primes[40] = 941083981;
326  primes[41] = 982451653;
327  primes[42] = INT_MAX;
328  nprimes = 43;
329 
330  m_hashsize = (hashsize < 1) ? autoHashSize() : hashsize;
331 
332  assert(m_memfactor > 1.0);
333  assert(isConsistent());
334  }
335 
336  /// assignment operator.
338  {
339  m_elem = base.m_elem;
340  m_hashfun = base.m_hashfun;
341  m_memfactor = base.m_memfactor;
342  m_used = base.m_used;
343  m_hashsize = base.m_hashsize;
344  primes = base.primes;
345  nprimes = base.nprimes;
346 
347  assert(m_memfactor > 1.0);
348  assert(isConsistent());
349  return *this;
350  }
351 
352  /// copy constructor.
354  : m_elem(base.m_elem)
355  , m_hashfun(base.m_hashfun)
356  , m_memfactor(base.m_memfactor)
357  , m_used(base.m_used)
358  , m_hashsize(base.m_hashsize)
359  , primes(base.primes)
360  , nprimes(base.nprimes)
361  {
362  assert(m_memfactor > 1.0);
363  assert(isConsistent());
364  }
365  //@}
366 
367 private:
368 
369  //-----------------------------------
370  /**@name Helpers */
371  //@{
372  /// determine a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
373  /** Determine next larger prime number for new #m_hashsize
374  * @return good value for #m_hashsize
375  */
376  int autoHashSize() const
377  {
378  int oldsize = m_elem.size();
379 
380  int left = 0;
381  int right = nprimes - 1;
382  int middle;
383 
384  while( left <= right)
385  {
386  middle = (left + right) / 2;
387 
388  if( oldsize < primes[middle])
389  {
390  right = middle - 1;
391  }
392  else if( oldsize > primes[middle])
393  {
394  left = middle + 1;
395  }
396  else
397  {
398  assert(oldsize == primes[middle]);
399  return primes[middle + 1];
400  }
401  }
402 
403  assert(left == right + 1);
404  return primes[left];
405  }
406 
407  /// automatically computes a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
408  /** Computes a good #m_hashsize as the product of all prime numbers
409  * not divisors of the number of elements that are <=
410  * the maximum divisor of the number of elemens.
411  * @return good value for #m_hashsize
412  */
413  int autoHashSizeold() const
414  {
415  DataArray< bool > prime(m_elem.size());
416  int hashsize = 1;
417  int maxsize = m_elem.size();
418  int i;
419 
420  for (i = 2; i < maxsize; i++)
421  prime[i] = true;
422 
423  for (i = 2; i < maxsize; ++i)
424  {
425  if (prime[i])
426  {
427  for (int j = i; j < maxsize; j += i)
428  prime[j] = false;
429 
430  if (m_elem.size() % i != 0)
431  {
432  hashsize *= i;
433 
434  if (hashsize > maxsize)
435  {
436  hashsize /= i;
437  break;
438  }
439  }
440  }
441  }
442  return hashsize;
443  }
444 
445  /// returns hash index of \a HashItem \p h or -1, if \p h is not present.
446  /** Using the hash function #m_hashfun, the hash value of \p h
447  * is calculated.
448  * Starting with this hash index, every #m_hashsize%-th element is
449  * compared with \p h until \p h is found or all elements have been checked.
450  *
451  * @param h \a HashItem, for which the hash index should be calculated
452  * @return hash index of \p h or -1,
453  * if \p h is not a member of the hash table
454  */
455  int index(const HashItem& h) const
456  {
457  if (m_used == 0)
458  return -1;
459 
460  assert(m_elem.size() > 0);
461 
462  int i = (*m_hashfun)(&h) % m_elem.size();
463  int j = i;
464 
465  while(m_elem[i].stat != Elem::FREE)
466  {
467  if ( (m_elem[i].stat == Elem::USED)
468  && (m_elem[i].item == h))
469  return i;
470 
471  i = (i + m_hashsize) % m_elem.size();
472 
473  if (i == j)
474  break;
475  }
476  return -1;
477  }
478  //@}
479 
480 };
481 
482 } // namespace soplex
483 #endif // _DATAHASHTABLE_H_
Safe arrays of data objects.Class DataArray provides safe arrays of Data Objects. For general C++ obj...
Definition: dataarray.h:63
element has never been used
Definition: datahashtable.h:95
DataHashTable(int(*hashfun)(const HashItem *), int maxsize=265, int hashsize=0, Real factor=2.0)
default constructor.
DataHashTable(const DataHashTable &base)
copy constructor.
Element< HashItem, Info > Elem
void clear()
remove all entries from DataHashTable.
#define HASHTABLE_FILLFACTOR
Definition: datahashtable.h:28
bool isConsistent() const
consistency check
Definition: dataarray.h:288
double Real
SOPLEX_DEBUG.
Definition: spxdefines.h:200
element had been used, but released
Definition: datahashtable.h:96
template class for elements stored in the hash table
Definition: datahashtable.h:85
int index(const HashItem &h) const
returns hash index of HashItem h or -1, if h is not present.
int primes[50]
some prime numbers for fast access
DataArray< Elem > m_elem
stores all elements of the hash table
int size() const
return nr. of elements.
Definition: dataarray.h:211
int m_used
current number of entries in the hash table
DataHashTable & operator=(const DataHashTable &base)
assignment operator.
Generic hash table for data objects.Class DataHashTable provides a generic hash table for Data Object...
Definition: datahashtable.h:76
int autoHashSize() const
determine a good m_hashsize.
Debugging, floating point type and parameter definitions.
states
States of an element.
Definition: datahashtable.h:93
bool has(const HashItem &h) const
Is item h present in DataHashTable?
Real m_memfactor
memory is reMax()ed by this factor if a new element does&#39;t fit
Everything should be within this namespace.
void add(const HashItem &h, const Info &info)
adds a new entry to the hash table.
int(* m_hashfun)(const HashItem *)
pointer to hash function (mapping: HashItem -> int)
const Info & operator[](const HashItem &h) const
references Info of HashItem h.
bool isConsistent() const
checks whether DataHashTable is consistent
int autoHashSizeold() const
automatically computes a good m_hashsize.
int nprimes
number of stored prime numbers
int m_hashsize
increment added to hash index, if allready used
#define MSGinconsistent(name)
Definition: spxdefines.h:121
void reSize(int newsize)
reset size to newsize.
Definition: dataarray.h:223
void reMax(int newSize=-1, int newHashSize=0)
reset size of the DataHashTable.
enum soplex::DataHashTable::Element::states stat