Scippy

SoPlex

Sequential object-oriented simPlex

datahashtable.h
Go to the documentation of this file.
1 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
2 /* */
3 /* This file is part of the class library */
4 /* SoPlex --- the Sequential object-oriented simPlex. */
5 /* */
6 /* Copyright (C) 1996-2019 Konrad-Zuse-Zentrum */
7 /* fuer Informationstechnik Berlin */
8 /* */
9 /* SoPlex is distributed under the terms of the ZIB Academic Licence. */
10 /* */
11 /* You should have received a copy of the ZIB Academic License */
12 /* along with SoPlex; see the file COPYING. If not email to soplex@zib.de. */
13 /* */
14 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
15 
16 /**@file datahashtable.h
17  * @brief Generic hash table for data objects.
18  */
19 #ifndef _DATAHASHTABLE_H_
20 #define _DATAHASHTABLE_H_
21 
22 #include <iostream>
23 #include <assert.h>
24 #include <limits.h>
25 
26 #include "soplex/spxdefines.h"
27 
28 #define HASHTABLE_FILLFACTOR 0.7
29 
30 namespace soplex
31 {
32 /**@brief Generic hash table for data objects.
33  @ingroup Elementary
34 
35  Class DataHashTable provides a generic hash table for
36  \ref DataObjects "Data Objects",
37  i.e., a map that maps arguments called \a HashItems to values called \a Infos.
38  HashItem and Info types are passed as template arguments. HashItems
39  must provide a comparison operator==(). Furthermore, both the HashItem and
40  Info must be data objects in the sense that the assignment operator is
41  equivalent to a <tt>memcpy()</tt> of the structure and no destructor is
42  required.
43 
44  The construction of a DataHashTable requires a \em hash \em function that
45  assigns an integer value to every HashItem. Provided this, pairs of a
46  HashItem and a Info can be added to the DataHashTable. No more
47  than one Info can be assigned to the same HashItem at a time. The Info
48  to a HashItem can be accessed through the subscript operator[]() with
49  the Info object as a subscript.
50 
51  The maximum number of elemens a DataHashTable can hold can be
52  specified upon construction and may be reset with reMax() later on.
53  Further, a value hash size value is required. This value must be less then
54  the maximum number of elements and must not have a common dominator with
55  the maximum number of elements. If not specified explicitely, it
56  is set automatically to a reasonable value.
57 
58  The implementation relies on an array of DataHashTable::Element%s, from
59  now on referred to as elements. Upon construction, all elements are
60  marked as \c FREE. When an entry is added
61  to the DataHashTable, the hash value is computed by calling #m_hashfun
62  for its HashItem. If this array element is unused, it is
63  taken right away. Otherwise, the array index is incremented by
64  the hash size (modulo the element array size()) until an unused element
65  is found.
66 
67  Removing elements is simply done by marking it as \c RELEASED. Hence,
68  when searching for an element, the search loop may not stop, when a
69  \c RELEASED element is encountered. However, such an element may be
70  reused when adding a new element to the DataHashTable.
71 
72  Further, memory management with resizing of the element array is
73  straight forward.
74 */
75 template < class HashItem, class Info >
77 {
78 private:
79 
80  //-----------------------------------
81  /**@name Types */
82  //@{
83  /// template class for elements stored in the hash table
84  template < class ElemHashItem, class ElemInfo >
85  class Element
86  {
87  public:
88  ///
89  ElemHashItem item;
90  ///
91  ElemInfo info;
92  /// States of an element
93  enum states
94  {
95  FREE, ///< element has never been used
96  RELEASED, ///< element had been used, but released
97  USED ///< element is in use
98  } stat;
99  };
100  typedef Element< HashItem, Info > Elem;
101  //@}
102 
103  //-----------------------------------
104  /**@name Data */
105  //@{
106  /// stores all elements of the hash table
108  /// increment added to hash index, if allready used
110  /// current number of entries in the hash table
111  int m_used;
112  /// pointer to hash function (mapping: \a HashItem -> int)
113  int (*m_hashfun)(const HashItem*);
114  /// memory is \ref soplex::DataHashTable::reMax() "reMax()"ed by this factor if a new element does't fit
116  /// some prime numbers for fast access
117  int primes[50];
118  /// number of stored prime numbers
119  int nprimes;
120 
121  //@}
122 
123 public:
124 
125  //-----------------------------------
126  /**@name Access / modification */
127  //@{
128  /// Is item \p h present in DataHashTable?
129  bool has(const HashItem& h) const
130  {
131  return index(h) >= 0;
132  }
133 
134  /// returns const pointer to \a Info of \a HashItem \p h or 0,
135  /// if item is not found.
136  /** Returns a pointer to \a Info component of hash element \p h or a zero
137  * pointer if element \p h is not in the table.
138  */
139  const Info* get(const HashItem& h) const
140  {
141  int i = index(h);
142 
143  return (i >= 0) ? &m_elem[i].info : 0;
144  }
145  /// references \a Info of \a HashItem \p h.
146  /** Index operator for accessing the \a Info associated to
147  * \a HashItem \p h. It is required that \p h belongs to the
148  * DataHashTable, otherwise it core dumps. Methods has() or
149  * get() can be used for inquiring wheater \p h belongs to the
150  * DataHashTable or not.
151  */
152  const Info& operator[](const HashItem& h) const
153  {
154  assert(has(h));
155 
156  return m_elem[index(h)].info;
157  }
158  /// adds a new entry to the hash table.
159  /** Adds a new entry consisting of \a HashItem \p h and \a Info \p info to the
160  * DataHashTable. No entry with HashItem \p h must be in the
161  * DataHashTable yet. After completion, \p info may be accessed via get() or
162  * operator[]() with \p h as parameter. The DataHashTable is #reMax()%ed
163  * if it becomes neccessary.
164  */
165  void add(const HashItem& h, const Info& info)
166  {
167  assert(!has(h));
168 
169  if(m_used >= m_elem.size() * HASHTABLE_FILLFACTOR)
170  reMax(int(m_memfactor * m_used) + 1);
171 
172  assert(m_used < m_elem.size());
173 
174  int i;
175 
176  for(i = (*m_hashfun)(&h) % m_elem.size();
177  m_elem[i].stat == Elem::USED;
178  i = (i + m_hashsize) % m_elem.size())
179  ;
180 
181  assert(m_elem[i].stat != Elem::USED);
182 
183  m_elem[i].stat = Elem::USED;
184  m_elem[i].info = info;
185  m_elem[i].item = h;
186 
187  m_used++;
188 
189  assert(has(h));
190  }
191 
192  /// remove \a HashItem \p h from the DataHashTable.
193  void remove(const HashItem& h)
194  {
195  assert(has(h));
196  m_elem[index(h)].stat = Elem::RELEASED;
197  m_used--;
198  assert(!has(h));
199  }
200 
201  /// remove all entries from DataHashTable.
202  void clear()
203  {
204  for(int i = 0; i < m_elem.size(); i++)
205  m_elem[i].stat = Elem::FREE;
206 
207  m_used = 0;
208  }
209  /// reset size of the DataHashTable.
210  /** Reset the maximum number of elements of a DataHashTable to \p newSize.
211  * However, if \p newSize < #m_used, it is resized to #m_used only.
212  * If \p newHashSize < 1, a new hash size is computed automatically.
213  * Otherwise, the specified value will be taken.
214  */
215  void reMax(int newSize = -1, int newHashSize = 0)
216  {
217  DataArray< Elem > save(m_elem);
218 
219  m_elem.reSize(newSize < m_used ? m_used : newSize);
220 
221  clear();
222 
223  m_hashsize = (newHashSize < 1) ? autoHashSize() : newHashSize;
224 
225  for(int i = 0; i < save.size(); i++)
226  if(save[i].stat == Elem::USED)
227  add(save[i].item, save[i].info);
228  }
229  //@}
230 
231  //-----------------------------------
232  /**@name Debugging */
233  //@{
234  /// checks whether DataHashTable is consistent
235  bool isConsistent() const
236  {
237 #ifdef ENABLE_CONSISTENCY_CHECKS
238  int total = 0;
239 
240  for(int i = 0; i < m_elem.size(); i++)
241  {
242  if(m_elem[i].stat == Elem::USED)
243  {
244  total++;
245 
246  if(!has(m_elem[i].item))
247  return MSGinconsistent("DataHashTable");
248  }
249  }
250 
251  if(total != m_used)
252  return MSGinconsistent("DataHashTable");
253 
254  return m_elem.isConsistent();
255 #else
256  return true;
257 #endif
258  }
259  //@}
260 
261  //-----------------------------------
262  /**@name Construction / destruction */
263  //@{
264  /// default constructor.
265  /** Allocates a DataHashTable for \p maxsize entries using \p hashfun
266  * as hash function. If \p hashsize > 0, #m_hashsize is set to the
267  * specified value, otherwise a suitable hash size is computed
268  * automatically. Parameter \p factor is used for memory management:
269  * If more than \p maxsize entries are added to the DataHashTable, it
270  * will automatically be #reMax()%ed by a factor of \p factor.
271  *
272  * @param hashfun pointer to hash function.
273  * @param maxsize maximum number of hash elements.
274  * @param hashsize hash size.
275  * @param factor factor for increasing data block.
276  */
277  explicit DataHashTable(
278  int (*hashfun)(const HashItem*),
279  int maxsize = 265,
280  int hashsize = 0,
281  Real factor = 2.0)
282  : m_elem(maxsize)
283  , m_hashfun(hashfun)
284  , m_memfactor(factor)
285  {
286  clear();
287 
288  primes[0] = 1523;
289  primes[1] = 3547;
290  primes[2] = 8011;
291  primes[3] = 17707;
292  primes[4] = 38723;
293  primes[5] = 83833;
294  primes[6] = 180317;
295  primes[7] = 385897;
296  primes[8] = 821411;
297  primes[9] = 1742369;
298  primes[10] = 3680893;
299  primes[11] = 5693959;
300  primes[12] = 7753849;
301  primes[13] = 9849703;
302  primes[14] = 11973277;
303  primes[15] = 14121853;
304  primes[16] = 17643961;
305  primes[17] = 24273817;
306  primes[18] = 32452843;
307  primes[19] = 49979687;
308  primes[20] = 67867967;
309  primes[21] = 86028121;
310  primes[22] = 104395301;
311  primes[23] = 122949823;
312  primes[24] = 141650939;
313  primes[25] = 160481183;
314  primes[26] = 179424673;
315  primes[27] = 198491317;
316  primes[28] = 217645177;
317  primes[29] = 256203161;
318  primes[30] = 314606869;
319  primes[31] = 373587883;
320  primes[32] = 433024223;
321  primes[33] = 492876847;
322  primes[34] = 553105243;
323  primes[35] = 613651349;
324  primes[36] = 694847533;
325  primes[37] = 756065159;
326  primes[38] = 817504243;
327  primes[39] = 879190747;
328  primes[40] = 941083981;
329  primes[41] = 982451653;
330  primes[42] = INT_MAX;
331  nprimes = 43;
332 
333  m_hashsize = (hashsize < 1) ? autoHashSize() : hashsize;
334 
335  assert(m_memfactor > 1.0);
336  assert(isConsistent());
337  }
338 
339  /// assignment operator.
341  {
342  m_elem = base.m_elem;
343  m_hashfun = base.m_hashfun;
344  m_memfactor = base.m_memfactor;
345  m_used = base.m_used;
346  m_hashsize = base.m_hashsize;
347  primes = base.primes;
348  nprimes = base.nprimes;
349 
350  assert(m_memfactor > 1.0);
351  assert(isConsistent());
352  return *this;
353  }
354 
355  /// copy constructor.
357  : m_elem(base.m_elem)
358  , m_hashfun(base.m_hashfun)
359  , m_memfactor(base.m_memfactor)
360  , m_used(base.m_used)
361  , m_hashsize(base.m_hashsize)
362  , primes(base.primes)
363  , nprimes(base.nprimes)
364  {
365  assert(m_memfactor > 1.0);
366  assert(isConsistent());
367  }
368  //@}
369 
370 private:
371 
372  //-----------------------------------
373  /**@name Helpers */
374  //@{
375  /// determine a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
376  /** Determine next larger prime number for new #m_hashsize
377  * @return good value for #m_hashsize
378  */
379  int autoHashSize() const
380  {
381  int oldsize = m_elem.size();
382 
383  int left = 0;
384  int right = nprimes - 1;
385  int middle;
386 
387  while(left <= right)
388  {
389  middle = (left + right) / 2;
390 
391  if(oldsize < primes[middle])
392  {
393  right = middle - 1;
394  }
395  else if(oldsize > primes[middle])
396  {
397  left = middle + 1;
398  }
399  else
400  {
401  assert(oldsize == primes[middle]);
402  return primes[middle + 1];
403  }
404  }
405 
406  assert(left == right + 1);
407  return primes[left];
408  }
409 
410  /// automatically computes a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
411  /** Computes a good #m_hashsize as the product of all prime numbers
412  * not divisors of the number of elements that are <=
413  * the maximum divisor of the number of elemens.
414  * @return good value for #m_hashsize
415  */
416  int autoHashSizeold() const
417  {
418  DataArray< bool > prime(m_elem.size());
419  int hashsize = 1;
420  int maxsize = m_elem.size();
421  int i;
422 
423  for(i = 2; i < maxsize; i++)
424  prime[i] = true;
425 
426  for(i = 2; i < maxsize; ++i)
427  {
428  if(prime[i])
429  {
430  for(int j = i; j < maxsize; j += i)
431  prime[j] = false;
432 
433  if(m_elem.size() % i != 0)
434  {
435  hashsize *= i;
436 
437  if(hashsize > maxsize)
438  {
439  hashsize /= i;
440  break;
441  }
442  }
443  }
444  }
445 
446  return hashsize;
447  }
448 
449  /// returns hash index of \a HashItem \p h or -1, if \p h is not present.
450  /** Using the hash function #m_hashfun, the hash value of \p h
451  * is calculated.
452  * Starting with this hash index, every #m_hashsize%-th element is
453  * compared with \p h until \p h is found or all elements have been checked.
454  *
455  * @param h \a HashItem, for which the hash index should be calculated
456  * @return hash index of \p h or -1,
457  * if \p h is not a member of the hash table
458  */
459  int index(const HashItem& h) const
460  {
461  if(m_used == 0)
462  return -1;
463 
464  assert(m_elem.size() > 0);
465 
466  int i = (*m_hashfun)(&h) % m_elem.size();
467  int j = i;
468 
469  while(m_elem[i].stat != Elem::FREE)
470  {
471  if((m_elem[i].stat == Elem::USED)
472  && (m_elem[i].item == h))
473  return i;
474 
475  i = (i + m_hashsize) % m_elem.size();
476 
477  if(i == j)
478  break;
479  }
480 
481  return -1;
482  }
483  //@}
484 
485 };
486 
487 } // namespace soplex
488 #endif // _DATAHASHTABLE_H_
Safe arrays of data objects.Class DataArray provides safe arrays of Data Objects. For general C++ obj...
Definition: dataarray.h:63
bool isConsistent() const
consistency check
Definition: dataarray.h:298
element has never been used
Definition: datahashtable.h:95
DataHashTable(int(*hashfun)(const HashItem *), int maxsize=265, int hashsize=0, Real factor=2.0)
default constructor.
DataHashTable(const DataHashTable &base)
copy constructor.
Element< HashItem, Info > Elem
bool isConsistent() const
checks whether DataHashTable is consistent
void clear()
remove all entries from DataHashTable.
#define HASHTABLE_FILLFACTOR
Definition: datahashtable.h:28
int index(const HashItem &h) const
returns hash index of HashItem h or -1, if h is not present.
double Real
Definition: spxdefines.h:218
element had been used, but released
Definition: datahashtable.h:96
template class for elements stored in the hash table
Definition: datahashtable.h:85
int primes[50]
some prime numbers for fast access
DataArray< Elem > m_elem
stores all elements of the hash table
int autoHashSizeold() const
automatically computes a good m_hashsize.
int autoHashSize() const
determine a good m_hashsize.
int m_used
current number of entries in the hash table
DataHashTable & operator=(const DataHashTable &base)
assignment operator.
Generic hash table for data objects.Class DataHashTable provides a generic hash table for Data Object...
Definition: datahashtable.h:76
const Info & operator[](const HashItem &h) const
references Info of HashItem h.
Debugging, floating point type and parameter definitions.
states
States of an element.
Definition: datahashtable.h:93
Real m_memfactor
memory is reMax()ed by this factor if a new element does&#39;t fit
Everything should be within this namespace.
void add(const HashItem &h, const Info &info)
adds a new entry to the hash table.
int(* m_hashfun)(const HashItem *)
pointer to hash function (mapping: HashItem -> int)
int size() const
return nr. of elements.
Definition: dataarray.h:214
int nprimes
number of stored prime numbers
int m_hashsize
increment added to hash index, if allready used
bool has(const HashItem &h) const
Is item h present in DataHashTable?
#define MSGinconsistent(name)
Definition: spxdefines.h:126
void reSize(int newsize)
reset size to newsize.
Definition: dataarray.h:226
void reMax(int newSize=-1, int newHashSize=0)
reset size of the DataHashTable.
enum soplex::DataHashTable::Element::states stat