Scippy

SoPlex

Sequential object-oriented simPlex

datahashtable.h
Go to the documentation of this file.
1 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
2 /* */
3 /* This file is part of the class library */
4 /* SoPlex --- the Sequential object-oriented simPlex. */
5 /* */
6 /* Copyright (C) 1996-2022 Konrad-Zuse-Zentrum */
7 /* fuer Informationstechnik Berlin */
8 /* */
9 /* SoPlex is distributed under the terms of the ZIB Academic Licence. */
10 /* */
11 /* You should have received a copy of the ZIB Academic License */
12 /* along with SoPlex; see the file COPYING. If not email to soplex@zib.de. */
13 /* */
14 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
15 
16 /**@file datahashtable.h
17  * @brief Generic hash table for data objects.
18  */
19 #ifndef _DATAHASHTABLE_H_
20 #define _DATAHASHTABLE_H_
21 
22 #include <iostream>
23 #include <assert.h>
24 #include <limits.h>
25 
26 #include "soplex/spxdefines.h"
27 #include "soplex/array.h"
28 
29 #define HASHTABLE_FILLFACTOR 0.7
30 
31 namespace soplex
32 {
33 /**@brief Generic hash table for data objects.
34  @ingroup Elementary
35 
36  Class DataHashTable provides a generic hash table for
37  \ref DataObjects "Data Objects",
38  i.e., a map that maps arguments called \a HashItems to values called \a Infos.
39  HashItem and Info types are passed as template arguments. HashItems
40  must provide a comparison operator==(). Furthermore, both the HashItem and
41  Info must be data objects in the sense that the assignment operator is
42  equivalent to a <tt>memcpy()</tt> of the structure and no destructor is
43  required.
44 
45  The construction of a DataHashTable requires a \em hash \em function that
46  assigns an integer value to every HashItem. Provided this, pairs of a
47  HashItem and a Info can be added to the DataHashTable. No more
48  than one Info can be assigned to the same HashItem at a time. The Info
49  to a HashItem can be accessed through the subscript operator[]() with
50  the Info object as a subscript.
51 
52  The maximum number of elemens a DataHashTable can hold can be
53  specified upon construction and may be reset with reMax() later on.
54  Further, a value hash size value is required. This value must be less then
55  the maximum number of elements and must not have a common dominator with
56  the maximum number of elements. If not specified explicitely, it
57  is set automatically to a reasonable value.
58 
59  The implementation relies on an array of DataHashTable::Element%s, from
60  now on referred to as elements. Upon construction, all elements are
61  marked as \c FREE. When an entry is added
62  to the DataHashTable, the hash value is computed by calling #m_hashfun
63  for its HashItem. If this array element is unused, it is
64  taken right away. Otherwise, the array index is incremented by
65  the hash size (modulo the element array size()) until an unused element
66  is found.
67 
68  Removing elements is simply done by marking it as \c RELEASED. Hence,
69  when searching for an element, the search loop may not stop, when a
70  \c RELEASED element is encountered. However, such an element may be
71  reused when adding a new element to the DataHashTable.
72 
73  Further, memory management with resizing of the element array is
74  straight forward.
75 */
76 template < class HashItem, class Info >
78 {
79 private:
80 
81  //-----------------------------------
82  /**@name Types */
83  ///@{
84  /// template class for elements stored in the hash table
85  template < class ElemHashItem, class ElemInfo >
86  class Element
87  {
88  public:
89  ///
90  ElemHashItem item;
91  ///
92  ElemInfo info;
93  /// States of an element
94  enum states
95  {
96  FREE, ///< element has never been used
97  RELEASED, ///< element had been used, but released
98  USED ///< element is in use
99  } stat;
100  };
101  typedef Element< HashItem, Info > Elem;
102  ///@}
103 
104  //-----------------------------------
105  /**@name Data */
106  ///@{
107  /// stores all elements of the hash table
109  /// increment added to hash index, if allready used
111  /// current number of entries in the hash table
112  int m_used;
113  /// pointer to hash function (mapping: \a HashItem -> int)
114  int (*m_hashfun)(const HashItem*);
115  /// memory is \ref soplex::DataHashTable::reMax() "reMax()"ed by this factor if a new element does't fit
117  /// some prime numbers for fast access
118  int primes[50];
119  /// number of stored prime numbers
120  int nprimes;
121 
122  ///@}
123 
124 public:
125 
126  //-----------------------------------
127  /**@name Access / modification */
128  ///@{
129  /// Is item \p h present in DataHashTable?
130  bool has(const HashItem& h) const
131  {
132  return index(h) >= 0;
133  }
134 
135  /// returns const pointer to \a Info of \a HashItem \p h or 0,
136  /// if item is not found.
137  /** Returns a pointer to \a Info component of hash element \p h or a zero
138  * pointer if element \p h is not in the table.
139  */
140  const Info* get(const HashItem& h) const
141  {
142  int i = index(h);
143 
144  return (i >= 0) ? &m_elem[i].info : nullptr;
145  }
146  /// references \a Info of \a HashItem \p h.
147  /** Index operator for accessing the \a Info associated to
148  * \a HashItem \p h. It is required that \p h belongs to the
149  * DataHashTable, otherwise it core dumps. Methods has() or
150  * get() can be used for inquiring wheater \p h belongs to the
151  * DataHashTable or not.
152  */
153  const Info& operator[](const HashItem& h) const
154  {
155  assert(has(h));
156 
157  return m_elem[index(h)].info;
158  }
159  /// adds a new entry to the hash table.
160  /** Adds a new entry consisting of \a HashItem \p h and \a Info \p info to the
161  * DataHashTable. No entry with HashItem \p h must be in the
162  * DataHashTable yet. After completion, \p info may be accessed via get() or
163  * operator[]() with \p h as parameter. The DataHashTable is #reMax()%ed
164  * if it becomes neccessary.
165  */
166  void add(const HashItem& h, const Info& info)
167  {
168  assert(!has(h));
169 
170  if(m_used >= m_elem.size() * HASHTABLE_FILLFACTOR)
171  reMax(int(m_memfactor * m_used) + 1);
172 
173  assert(m_used < m_elem.size());
174 
175  decltype(m_elem.size()) i;
176 
177  for(i = (*m_hashfun)(&h) % m_elem.size();
178  m_elem[i].stat == Elem::USED;
179  i = (i + m_hashsize) % m_elem.size())
180  ;
181 
182  assert(m_elem[i].stat != Elem::USED);
183 
184  m_elem[i].stat = Elem::USED;
185  m_elem[i].info = info;
186  m_elem[i].item = h;
187 
188  m_used++;
189 
190  assert(has(h));
191  }
192 
193  /// remove \a HashItem \p h from the DataHashTable.
194  void remove(const HashItem& h)
195  {
196  int i = index(h);
197 
198  if(i < 0)
199  return;
200 
201  m_elem[i].stat = Elem::RELEASED;
202  m_used--;
203  assert(!has(h));
204  }
205 
206  /// remove all entries from DataHashTable.
207  void clear()
208  {
209  for(int i = 0; i < m_elem.size(); i++)
210  m_elem[i].stat = Elem::FREE;
211 
212  m_used = 0;
213  }
214  /// reset size of the DataHashTable.
215  /** Reset the maximum number of elements of a DataHashTable to \p newSize.
216  * However, if \p newSize < #m_used, it is resized to #m_used only.
217  * If \p newHashSize < 1, a new hash size is computed automatically.
218  * Otherwise, the specified value will be taken.
219  */
220  void reMax(int newSize = -1, int newHashSize = 0)
221  {
222  Array< Elem > save(m_elem);
223 
224  m_elem.reSize(newSize < m_used ? m_used : newSize);
225 
226  clear();
227 
228  m_hashsize = (newHashSize < 1) ? autoHashSize() : newHashSize;
229 
230  for(int i = 0; i < save.size(); i++)
231  if(save[i].stat == Elem::USED)
232  add(save[i].item, save[i].info);
233  }
234  ///@}
235 
236  //-----------------------------------
237  /**@name Debugging */
238  ///@{
239  /// checks whether DataHashTable is consistent
240  bool isConsistent() const
241  {
242 #ifdef ENABLE_CONSISTENCY_CHECKS
243  int total = 0;
244 
245  for(int i = 0; i < m_elem.size(); i++)
246  {
247  if(m_elem[i].stat == Elem::USED)
248  {
249  total++;
250 
251  if(!has(m_elem[i].item))
252  return MSGinconsistent("DataHashTable");
253  }
254  }
255 
256  if(total != m_used)
257  return MSGinconsistent("DataHashTable");
258 
259  return m_elem.isConsistent();
260 #else
261  return true;
262 #endif
263  }
264  ///@}
265 
266  //-----------------------------------
267  /**@name Construction / destruction */
268  ///@{
269  /// default constructor.
270  /** Allocates a DataHashTable for \p maxsize entries using \p hashfun
271  * as hash function. If \p hashsize > 0, #m_hashsize is set to the
272  * specified value, otherwise a suitable hash size is computed
273  * automatically. Parameter \p factor is used for memory management:
274  * If more than \p maxsize entries are added to the DataHashTable, it
275  * will automatically be #reMax()%ed by a factor of \p factor.
276  *
277  * @param hashfun pointer to hash function.
278  * @param maxsize maximum number of hash elements.
279  * @param hashsize hash size.
280  * @param factor factor for increasing data block.
281  */
282  explicit DataHashTable(
283  int (*hashfun)(const HashItem*),
284  int maxsize = 265,
285  int hashsize = 0,
286  Real factor = 2.0)
287  : m_elem(maxsize)
288  , m_hashfun(hashfun)
289  , m_memfactor(factor)
290  {
291  clear();
292 
293  primes[0] = 1523;
294  primes[1] = 3547;
295  primes[2] = 8011;
296  primes[3] = 17707;
297  primes[4] = 38723;
298  primes[5] = 83833;
299  primes[6] = 180317;
300  primes[7] = 385897;
301  primes[8] = 821411;
302  primes[9] = 1742369;
303  primes[10] = 3680893;
304  primes[11] = 5693959;
305  primes[12] = 7753849;
306  primes[13] = 9849703;
307  primes[14] = 11973277;
308  primes[15] = 14121853;
309  primes[16] = 17643961;
310  primes[17] = 24273817;
311  primes[18] = 32452843;
312  primes[19] = 49979687;
313  primes[20] = 67867967;
314  primes[21] = 86028121;
315  primes[22] = 104395301;
316  primes[23] = 122949823;
317  primes[24] = 141650939;
318  primes[25] = 160481183;
319  primes[26] = 179424673;
320  primes[27] = 198491317;
321  primes[28] = 217645177;
322  primes[29] = 256203161;
323  primes[30] = 314606869;
324  primes[31] = 373587883;
325  primes[32] = 433024223;
326  primes[33] = 492876847;
327  primes[34] = 553105243;
328  primes[35] = 613651349;
329  primes[36] = 694847533;
330  primes[37] = 756065159;
331  primes[38] = 817504243;
332  primes[39] = 879190747;
333  primes[40] = 941083981;
334  primes[41] = 982451653;
335  primes[42] = INT_MAX;
336  nprimes = 43;
337 
338  m_hashsize = (hashsize < 1) ? autoHashSize() : hashsize;
339 
340  assert(m_memfactor > 1.0);
341  assert(isConsistent());
342  }
343 
344  /// assignment operator.
346  {
347  m_elem = base.m_elem;
348  m_hashfun = base.m_hashfun;
349  m_memfactor = base.m_memfactor;
350  m_used = base.m_used;
351  m_hashsize = base.m_hashsize;
352  primes = base.primes;
353  nprimes = base.nprimes;
354 
355  assert(m_memfactor > 1.0);
356  assert(isConsistent());
357  return *this;
358  }
359 
360  /// copy constructor.
362  : m_elem(base.m_elem)
363  , m_hashfun(base.m_hashfun)
364  , m_memfactor(base.m_memfactor)
365  , m_used(base.m_used)
366  , m_hashsize(base.m_hashsize)
367  , primes(base.primes)
368  , nprimes(base.nprimes)
369  {
370  assert(m_memfactor > 1.0);
371  assert(isConsistent());
372  }
373  ///@}
374 
375 private:
376 
377  //-----------------------------------
378  /**@name Helpers */
379  ///@{
380  /// determine a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
381  /** Determine next larger prime number for new #m_hashsize
382  * @return good value for #m_hashsize
383  */
384  int autoHashSize() const
385  {
386  auto oldsize = m_elem.size();
387 
388  int left = 0;
389  int right = nprimes - 1;
390  int middle;
391 
392  while(left <= right)
393  {
394  middle = (left + right) / 2;
395 
396  if(oldsize < primes[middle])
397  {
398  right = middle - 1;
399  }
400  else if(oldsize > primes[middle])
401  {
402  left = middle + 1;
403  }
404  else
405  {
406  assert(oldsize == primes[middle]);
407  return primes[middle + 1];
408  }
409  }
410 
411  assert(left == right + 1);
412  return primes[left];
413  }
414 
415  /// automatically computes a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
416  /** Computes a good #m_hashsize as the product of all prime numbers
417  * not divisors of the number of elements that are <=
418  * the maximum divisor of the number of elemens.
419  * @return good value for #m_hashsize
420  */
421  int autoHashSizeold() const
422  {
423  DataArray< bool > prime(m_elem.size());
424  int hashsize = 1;
425  int maxsize = m_elem.size();
426  int i;
427 
428  for(i = 2; i < maxsize; i++)
429  prime[i] = true;
430 
431  for(i = 2; i < maxsize; ++i)
432  {
433  if(prime[i])
434  {
435  for(int j = i; j < maxsize; j += i)
436  prime[j] = false;
437 
438  if(m_elem.size() % i != 0)
439  {
440  hashsize *= i;
441 
442  if(hashsize > maxsize)
443  {
444  hashsize /= i;
445  break;
446  }
447  }
448  }
449  }
450 
451  return hashsize;
452  }
453 
454  /// returns hash index of \a HashItem \p h or -1, if \p h is not present.
455  /** Using the hash function #m_hashfun, the hash value of \p h
456  * is calculated.
457  * Starting with this hash index, every #m_hashsize%-th element is
458  * compared with \p h until \p h is found or all elements have been checked.
459  *
460  * @param h \a HashItem, for which the hash index should be calculated
461  * @return hash index of \p h or -1,
462  * if \p h is not a member of the hash table
463  */
464  int index(const HashItem& h) const
465  {
466  if(m_used == 0)
467  return -1;
468 
469  assert(m_elem.size() > 0);
470 
471  auto i = (*m_hashfun)(&h) % m_elem.size();
472  auto j = i;
473 
474  while(m_elem[i].stat != Elem::FREE)
475  {
476  if((m_elem[i].stat == Elem::USED)
477  && (m_elem[i].item == h))
478  return i;
479 
480  i = (i + m_hashsize) % m_elem.size();
481 
482  if(i == j)
483  break;
484  }
485 
486  return -1;
487  }
488  ///@}
489 
490 };
491 
492 } // namespace soplex
493 #endif // _DATAHASHTABLE_H_
int size() const
return the number of elements.
Definition: array.h:191
bool isConsistent() const
Consistency check.
Definition: array.h:258
element has never been used
Definition: datahashtable.h:96
DataHashTable(int(*hashfun)(const HashItem *), int maxsize=265, int hashsize=0, Real factor=2.0)
default constructor.
DataHashTable(const DataHashTable &base)
copy constructor.
Safe arrays of arbitrary types.Class Array provides safe arrays of arbitrary type. Array elements are accessed just like ordinary C++ array elements by means of the index operator[](). Safety is provided by.
Definition: array.h:63
Element< HashItem, Info > Elem
bool isConsistent() const
checks whether DataHashTable is consistent
Array< Elem > m_elem
stores all elements of the hash table
void clear()
remove all entries from DataHashTable.
#define HASHTABLE_FILLFACTOR
Definition: datahashtable.h:29
int index(const HashItem &h) const
returns hash index of HashItem h or -1, if h is not present.
double Real
Definition: spxdefines.h:256
element had been used, but released
Definition: datahashtable.h:97
template class for elements stored in the hash table
Definition: datahashtable.h:86
int primes[50]
some prime numbers for fast access
int autoHashSizeold() const
automatically computes a good m_hashsize.
int autoHashSize() const
determine a good m_hashsize.
int m_used
current number of entries in the hash table
DataHashTable & operator=(const DataHashTable &base)
assignment operator.
Generic hash table for data objects.Class DataHashTable provides a generic hash table for Data Object...
Definition: datahashtable.h:77
const Info & operator[](const HashItem &h) const
references Info of HashItem h.
Debugging, floating point type and parameter definitions.
states
States of an element.
Definition: datahashtable.h:94
Real m_memfactor
memory is reMax()ed by this factor if a new element does&#39;t fit
Everything should be within this namespace.
void add(const HashItem &h, const Info &info)
adds a new entry to the hash table.
void reSize(int newsize)
reset the number of elements.
Definition: array.h:197
int(* m_hashfun)(const HashItem *)
pointer to hash function (mapping: HashItem -> int)
Save arrays of arbitrary types.
int nprimes
number of stored prime numbers
int m_hashsize
increment added to hash index, if allready used
bool has(const HashItem &h) const
Is item h present in DataHashTable?
#define MSGinconsistent(name)
Definition: spxdefines.h:164
void reMax(int newSize=-1, int newHashSize=0)
reset size of the DataHashTable.
enum soplex::DataHashTable::Element::states stat