Scippy

SoPlex

Sequential object-oriented simPlex

datahashtable.h
Go to the documentation of this file.
1 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
2 /* */
3 /* This file is part of the class library */
4 /* SoPlex --- the Sequential object-oriented simPlex. */
5 /* */
6 /* Copyright (C) 1996-2020 Konrad-Zuse-Zentrum */
7 /* fuer Informationstechnik Berlin */
8 /* */
9 /* SoPlex is distributed under the terms of the ZIB Academic Licence. */
10 /* */
11 /* You should have received a copy of the ZIB Academic License */
12 /* along with SoPlex; see the file COPYING. If not email to soplex@zib.de. */
13 /* */
14 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
15 
16 /**@file datahashtable.h
17  * @brief Generic hash table for data objects.
18  */
19 #ifndef _DATAHASHTABLE_H_
20 #define _DATAHASHTABLE_H_
21 
22 #include <iostream>
23 #include <assert.h>
24 #include <limits.h>
25 
26 #include "soplex/spxdefines.h"
27 #include "soplex/array.h"
28 
29 #define HASHTABLE_FILLFACTOR 0.7
30 
31 namespace soplex
32 {
33 /**@brief Generic hash table for data objects.
34  @ingroup Elementary
35 
36  Class DataHashTable provides a generic hash table for
37  \ref DataObjects "Data Objects",
38  i.e., a map that maps arguments called \a HashItems to values called \a Infos.
39  HashItem and Info types are passed as template arguments. HashItems
40  must provide a comparison operator==(). Furthermore, both the HashItem and
41  Info must be data objects in the sense that the assignment operator is
42  equivalent to a <tt>memcpy()</tt> of the structure and no destructor is
43  required.
44 
45  The construction of a DataHashTable requires a \em hash \em function that
46  assigns an integer value to every HashItem. Provided this, pairs of a
47  HashItem and a Info can be added to the DataHashTable. No more
48  than one Info can be assigned to the same HashItem at a time. The Info
49  to a HashItem can be accessed through the subscript operator[]() with
50  the Info object as a subscript.
51 
52  The maximum number of elemens a DataHashTable can hold can be
53  specified upon construction and may be reset with reMax() later on.
54  Further, a value hash size value is required. This value must be less then
55  the maximum number of elements and must not have a common dominator with
56  the maximum number of elements. If not specified explicitely, it
57  is set automatically to a reasonable value.
58 
59  The implementation relies on an array of DataHashTable::Element%s, from
60  now on referred to as elements. Upon construction, all elements are
61  marked as \c FREE. When an entry is added
62  to the DataHashTable, the hash value is computed by calling #m_hashfun
63  for its HashItem. If this array element is unused, it is
64  taken right away. Otherwise, the array index is incremented by
65  the hash size (modulo the element array size()) until an unused element
66  is found.
67 
68  Removing elements is simply done by marking it as \c RELEASED. Hence,
69  when searching for an element, the search loop may not stop, when a
70  \c RELEASED element is encountered. However, such an element may be
71  reused when adding a new element to the DataHashTable.
72 
73  Further, memory management with resizing of the element array is
74  straight forward.
75 */
76 template < class HashItem, class Info >
78 {
79 private:
80 
81  //-----------------------------------
82  /**@name Types */
83  ///@{
84  /// template class for elements stored in the hash table
85  template < class ElemHashItem, class ElemInfo >
86  class Element
87  {
88  public:
89  ///
90  ElemHashItem item;
91  ///
92  ElemInfo info;
93  /// States of an element
94  enum states
95  {
96  FREE, ///< element has never been used
97  RELEASED, ///< element had been used, but released
98  USED ///< element is in use
99  } stat;
100  };
101  typedef Element< HashItem, Info > Elem;
102  ///@}
103 
104  //-----------------------------------
105  /**@name Data */
106  ///@{
107  /// stores all elements of the hash table
109  /// increment added to hash index, if allready used
111  /// current number of entries in the hash table
112  int m_used;
113  /// pointer to hash function (mapping: \a HashItem -> int)
114  int (*m_hashfun)(const HashItem*);
115  /// memory is \ref soplex::DataHashTable::reMax() "reMax()"ed by this factor if a new element does't fit
117  /// some prime numbers for fast access
118  int primes[50];
119  /// number of stored prime numbers
120  int nprimes;
121 
122  ///@}
123 
124 public:
125 
126  //-----------------------------------
127  /**@name Access / modification */
128  ///@{
129  /// Is item \p h present in DataHashTable?
130  bool has(const HashItem& h) const
131  {
132  return index(h) >= 0;
133  }
134 
135  /// returns const pointer to \a Info of \a HashItem \p h or 0,
136  /// if item is not found.
137  /** Returns a pointer to \a Info component of hash element \p h or a zero
138  * pointer if element \p h is not in the table.
139  */
140  const Info* get(const HashItem& h) const
141  {
142  int i = index(h);
143 
144  return (i >= 0) ? &m_elem[i].info : nullptr;
145  }
146  /// references \a Info of \a HashItem \p h.
147  /** Index operator for accessing the \a Info associated to
148  * \a HashItem \p h. It is required that \p h belongs to the
149  * DataHashTable, otherwise it core dumps. Methods has() or
150  * get() can be used for inquiring wheater \p h belongs to the
151  * DataHashTable or not.
152  */
153  const Info& operator[](const HashItem& h) const
154  {
155  assert(has(h));
156 
157  return m_elem[index(h)].info;
158  }
159  /// adds a new entry to the hash table.
160  /** Adds a new entry consisting of \a HashItem \p h and \a Info \p info to the
161  * DataHashTable. No entry with HashItem \p h must be in the
162  * DataHashTable yet. After completion, \p info may be accessed via get() or
163  * operator[]() with \p h as parameter. The DataHashTable is #reMax()%ed
164  * if it becomes neccessary.
165  */
166  void add(const HashItem& h, const Info& info)
167  {
168  assert(!has(h));
169 
170  if(m_used >= m_elem.size() * HASHTABLE_FILLFACTOR)
171  reMax(int(m_memfactor * m_used) + 1);
172 
173  assert(m_used < m_elem.size());
174 
175  decltype(m_elem.size()) i;
176 
177  for(i = (*m_hashfun)(&h) % m_elem.size();
178  m_elem[i].stat == Elem::USED;
179  i = (i + m_hashsize) % m_elem.size())
180  ;
181 
182  assert(m_elem[i].stat != Elem::USED);
183 
184  m_elem[i].stat = Elem::USED;
185  m_elem[i].info = info;
186  m_elem[i].item = h;
187 
188  m_used++;
189 
190  assert(has(h));
191  }
192 
193  /// remove \a HashItem \p h from the DataHashTable.
194  void remove(const HashItem& h)
195  {
196  assert(has(h));
197  m_elem[index(h)].stat = Elem::RELEASED;
198  m_used--;
199  assert(!has(h));
200  }
201 
202  /// remove all entries from DataHashTable.
203  void clear()
204  {
205  for(int i = 0; i < m_elem.size(); i++)
206  m_elem[i].stat = Elem::FREE;
207 
208  m_used = 0;
209  }
210  /// reset size of the DataHashTable.
211  /** Reset the maximum number of elements of a DataHashTable to \p newSize.
212  * However, if \p newSize < #m_used, it is resized to #m_used only.
213  * If \p newHashSize < 1, a new hash size is computed automatically.
214  * Otherwise, the specified value will be taken.
215  */
216  void reMax(int newSize = -1, int newHashSize = 0)
217  {
218  Array< Elem > save(m_elem);
219 
220  m_elem.reSize(newSize < m_used ? m_used : newSize);
221 
222  clear();
223 
224  m_hashsize = (newHashSize < 1) ? autoHashSize() : newHashSize;
225 
226  for(int i = 0; i < save.size(); i++)
227  if(save[i].stat == Elem::USED)
228  add(save[i].item, save[i].info);
229  }
230  ///@}
231 
232  //-----------------------------------
233  /**@name Debugging */
234  ///@{
235  /// checks whether DataHashTable is consistent
236  bool isConsistent() const
237  {
238 #ifdef ENABLE_CONSISTENCY_CHECKS
239  int total = 0;
240 
241  for(int i = 0; i < m_elem.size(); i++)
242  {
243  if(m_elem[i].stat == Elem::USED)
244  {
245  total++;
246 
247  if(!has(m_elem[i].item))
248  return MSGinconsistent("DataHashTable");
249  }
250  }
251 
252  if(total != m_used)
253  return MSGinconsistent("DataHashTable");
254 
255  return m_elem.isConsistent();
256 #else
257  return true;
258 #endif
259  }
260  ///@}
261 
262  //-----------------------------------
263  /**@name Construction / destruction */
264  ///@{
265  /// default constructor.
266  /** Allocates a DataHashTable for \p maxsize entries using \p hashfun
267  * as hash function. If \p hashsize > 0, #m_hashsize is set to the
268  * specified value, otherwise a suitable hash size is computed
269  * automatically. Parameter \p factor is used for memory management:
270  * If more than \p maxsize entries are added to the DataHashTable, it
271  * will automatically be #reMax()%ed by a factor of \p factor.
272  *
273  * @param hashfun pointer to hash function.
274  * @param maxsize maximum number of hash elements.
275  * @param hashsize hash size.
276  * @param factor factor for increasing data block.
277  */
278  explicit DataHashTable(
279  int (*hashfun)(const HashItem*),
280  int maxsize = 265,
281  int hashsize = 0,
282  Real factor = 2.0)
283  : m_elem(maxsize)
284  , m_hashfun(hashfun)
285  , m_memfactor(factor)
286  {
287  clear();
288 
289  primes[0] = 1523;
290  primes[1] = 3547;
291  primes[2] = 8011;
292  primes[3] = 17707;
293  primes[4] = 38723;
294  primes[5] = 83833;
295  primes[6] = 180317;
296  primes[7] = 385897;
297  primes[8] = 821411;
298  primes[9] = 1742369;
299  primes[10] = 3680893;
300  primes[11] = 5693959;
301  primes[12] = 7753849;
302  primes[13] = 9849703;
303  primes[14] = 11973277;
304  primes[15] = 14121853;
305  primes[16] = 17643961;
306  primes[17] = 24273817;
307  primes[18] = 32452843;
308  primes[19] = 49979687;
309  primes[20] = 67867967;
310  primes[21] = 86028121;
311  primes[22] = 104395301;
312  primes[23] = 122949823;
313  primes[24] = 141650939;
314  primes[25] = 160481183;
315  primes[26] = 179424673;
316  primes[27] = 198491317;
317  primes[28] = 217645177;
318  primes[29] = 256203161;
319  primes[30] = 314606869;
320  primes[31] = 373587883;
321  primes[32] = 433024223;
322  primes[33] = 492876847;
323  primes[34] = 553105243;
324  primes[35] = 613651349;
325  primes[36] = 694847533;
326  primes[37] = 756065159;
327  primes[38] = 817504243;
328  primes[39] = 879190747;
329  primes[40] = 941083981;
330  primes[41] = 982451653;
331  primes[42] = INT_MAX;
332  nprimes = 43;
333 
334  m_hashsize = (hashsize < 1) ? autoHashSize() : hashsize;
335 
336  assert(m_memfactor > 1.0);
337  assert(isConsistent());
338  }
339 
340  /// assignment operator.
342  {
343  m_elem = base.m_elem;
344  m_hashfun = base.m_hashfun;
345  m_memfactor = base.m_memfactor;
346  m_used = base.m_used;
347  m_hashsize = base.m_hashsize;
348  primes = base.primes;
349  nprimes = base.nprimes;
350 
351  assert(m_memfactor > 1.0);
352  assert(isConsistent());
353  return *this;
354  }
355 
356  /// copy constructor.
358  : m_elem(base.m_elem)
359  , m_hashfun(base.m_hashfun)
360  , m_memfactor(base.m_memfactor)
361  , m_used(base.m_used)
362  , m_hashsize(base.m_hashsize)
363  , primes(base.primes)
364  , nprimes(base.nprimes)
365  {
366  assert(m_memfactor > 1.0);
367  assert(isConsistent());
368  }
369  ///@}
370 
371 private:
372 
373  //-----------------------------------
374  /**@name Helpers */
375  ///@{
376  /// determine a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
377  /** Determine next larger prime number for new #m_hashsize
378  * @return good value for #m_hashsize
379  */
380  int autoHashSize() const
381  {
382  auto oldsize = m_elem.size();
383 
384  int left = 0;
385  int right = nprimes - 1;
386  int middle;
387 
388  while(left <= right)
389  {
390  middle = (left + right) / 2;
391 
392  if(oldsize < primes[middle])
393  {
394  right = middle - 1;
395  }
396  else if(oldsize > primes[middle])
397  {
398  left = middle + 1;
399  }
400  else
401  {
402  assert(oldsize == primes[middle]);
403  return primes[middle + 1];
404  }
405  }
406 
407  assert(left == right + 1);
408  return primes[left];
409  }
410 
411  /// automatically computes a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
412  /** Computes a good #m_hashsize as the product of all prime numbers
413  * not divisors of the number of elements that are <=
414  * the maximum divisor of the number of elemens.
415  * @return good value for #m_hashsize
416  */
417  int autoHashSizeold() const
418  {
419  DataArray< bool > prime(m_elem.size());
420  int hashsize = 1;
421  int maxsize = m_elem.size();
422  int i;
423 
424  for(i = 2; i < maxsize; i++)
425  prime[i] = true;
426 
427  for(i = 2; i < maxsize; ++i)
428  {
429  if(prime[i])
430  {
431  for(int j = i; j < maxsize; j += i)
432  prime[j] = false;
433 
434  if(m_elem.size() % i != 0)
435  {
436  hashsize *= i;
437 
438  if(hashsize > maxsize)
439  {
440  hashsize /= i;
441  break;
442  }
443  }
444  }
445  }
446 
447  return hashsize;
448  }
449 
450  /// returns hash index of \a HashItem \p h or -1, if \p h is not present.
451  /** Using the hash function #m_hashfun, the hash value of \p h
452  * is calculated.
453  * Starting with this hash index, every #m_hashsize%-th element is
454  * compared with \p h until \p h is found or all elements have been checked.
455  *
456  * @param h \a HashItem, for which the hash index should be calculated
457  * @return hash index of \p h or -1,
458  * if \p h is not a member of the hash table
459  */
460  int index(const HashItem& h) const
461  {
462  if(m_used == 0)
463  return -1;
464 
465  assert(m_elem.size() > 0);
466 
467  auto i = (*m_hashfun)(&h) % m_elem.size();
468  auto j = i;
469 
470  while(m_elem[i].stat != Elem::FREE)
471  {
472  if((m_elem[i].stat == Elem::USED)
473  && (m_elem[i].item == h))
474  return i;
475 
476  i = (i + m_hashsize) % m_elem.size();
477 
478  if(i == j)
479  break;
480  }
481 
482  return -1;
483  }
484  ///@}
485 
486 };
487 
488 } // namespace soplex
489 #endif // _DATAHASHTABLE_H_
int size() const
return the number of elements.
Definition: array.h:190
bool isConsistent() const
Consistency check.
Definition: array.h:257
element has never been used
Definition: datahashtable.h:96
DataHashTable(int(*hashfun)(const HashItem *), int maxsize=265, int hashsize=0, Real factor=2.0)
default constructor.
DataHashTable(const DataHashTable &base)
copy constructor.
Safe arrays of arbitrary types.Class Array provides safe arrays of arbitrary type. Array elements are accessed just like ordinary C++ array elements by means of the index operator[](). Safety is provided by.
Definition: array.h:63
Element< HashItem, Info > Elem
bool isConsistent() const
checks whether DataHashTable is consistent
Array< Elem > m_elem
stores all elements of the hash table
void clear()
remove all entries from DataHashTable.
#define HASHTABLE_FILLFACTOR
Definition: datahashtable.h:29
int index(const HashItem &h) const
returns hash index of HashItem h or -1, if h is not present.
double Real
Definition: spxdefines.h:227
element had been used, but released
Definition: datahashtable.h:97
template class for elements stored in the hash table
Definition: datahashtable.h:86
int primes[50]
some prime numbers for fast access
int autoHashSizeold() const
automatically computes a good m_hashsize.
int autoHashSize() const
determine a good m_hashsize.
int m_used
current number of entries in the hash table
DataHashTable & operator=(const DataHashTable &base)
assignment operator.
Generic hash table for data objects.Class DataHashTable provides a generic hash table for Data Object...
Definition: datahashtable.h:77
const Info & operator[](const HashItem &h) const
references Info of HashItem h.
Debugging, floating point type and parameter definitions.
states
States of an element.
Definition: datahashtable.h:94
Real m_memfactor
memory is reMax()ed by this factor if a new element does&#39;t fit
Everything should be within this namespace.
void add(const HashItem &h, const Info &info)
adds a new entry to the hash table.
void reSize(int newsize)
reset the number of elements.
Definition: array.h:196
int(* m_hashfun)(const HashItem *)
pointer to hash function (mapping: HashItem -> int)
Save arrays of arbitrary types.
int nprimes
number of stored prime numbers
int m_hashsize
increment added to hash index, if allready used
bool has(const HashItem &h) const
Is item h present in DataHashTable?
#define MSGinconsistent(name)
Definition: spxdefines.h:135
void reMax(int newSize=-1, int newHashSize=0)
reset size of the DataHashTable.
enum soplex::DataHashTable::Element::states stat