Scippy

SoPlex

Sequential object-oriented simPlex

datahashtable.h
Go to the documentation of this file.
1 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
2 /* */
3 /* This file is part of the class library */
4 /* SoPlex --- the Sequential object-oriented simPlex. */
5 /* */
6 /* Copyright (c) 1996-2023 Zuse Institute Berlin (ZIB) */
7 /* */
8 /* Licensed under the Apache License, Version 2.0 (the "License"); */
9 /* you may not use this file except in compliance with the License. */
10 /* You may obtain a copy of the License at */
11 /* */
12 /* http://www.apache.org/licenses/LICENSE-2.0 */
13 /* */
14 /* Unless required by applicable law or agreed to in writing, software */
15 /* distributed under the License is distributed on an "AS IS" BASIS, */
16 /* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. */
17 /* See the License for the specific language governing permissions and */
18 /* limitations under the License. */
19 /* */
20 /* You should have received a copy of the Apache-2.0 license */
21 /* along with SoPlex; see the file LICENSE. If not email to soplex@zib.de. */
22 /* */
23 /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
24 
25 /**@file datahashtable.h
26  * @brief Generic hash table for data objects.
27  */
28 #ifndef _DATAHASHTABLE_H_
29 #define _DATAHASHTABLE_H_
30 
31 #include <algorithm>
32 #include <iostream>
33 #include <iterator>
34 #include <assert.h>
35 #include <limits.h>
36 
37 #include "soplex/spxdefines.h"
38 #include "soplex/array.h"
39 
40 #define SOPLEX_HASHTABLE_FILLFACTOR 0.7
41 
42 namespace soplex
43 {
44 /**@brief Generic hash table for data objects.
45  @ingroup Elementary
46 
47  Class DataHashTable provides a generic hash table for
48  \ref DataObjects "Data Objects",
49  i.e., a map that maps arguments called \a HashItems to values called \a Infos.
50  HashItem and Info types are passed as template arguments. HashItems
51  must provide a comparison operator==(). Furthermore, both the HashItem and
52  Info must be data objects in the sense that the assignment operator is
53  equivalent to a <tt>memcpy()</tt> of the structure and no destructor is
54  required.
55 
56  The construction of a DataHashTable requires a \em hash \em function that
57  assigns an integer value to every HashItem. Provided this, pairs of a
58  HashItem and a Info can be added to the DataHashTable. No more
59  than one Info can be assigned to the same HashItem at a time. The Info
60  to a HashItem can be accessed through the subscript operator[]() with
61  the Info object as a subscript.
62 
63  The maximum number of elemens a DataHashTable can hold can be
64  specified upon construction and may be reset with reMax() later on.
65  Further, a value hash size value is required. This value must be less then
66  the maximum number of elements and must not have a common dominator with
67  the maximum number of elements. If not specified explicitely, it
68  is set automatically to a reasonable value.
69 
70  The implementation relies on an array of DataHashTable::Element%s, from
71  now on referred to as elements. Upon construction, all elements are
72  marked as \c FREE. When an entry is added
73  to the DataHashTable, the hash value is computed by calling #m_hashfun
74  for its HashItem. If this array element is unused, it is
75  taken right away. Otherwise, the array index is incremented by
76  the hash size (modulo the element array size()) until an unused element
77  is found.
78 
79  Removing elements is simply done by marking it as \c RELEASED. Hence,
80  when searching for an element, the search loop may not stop, when a
81  \c RELEASED element is encountered. However, such an element may be
82  reused when adding a new element to the DataHashTable.
83 
84  Further, memory management with resizing of the element array is
85  straight forward.
86 */
87 template < class HashItem, class Info >
89 {
90 private:
91 
92  //-----------------------------------
93  /**@name Types */
94  ///@{
95  /// template class for elements stored in the hash table
96  template < class ElemHashItem, class ElemInfo >
97  class Element
98  {
99  public:
100  ///
101  ElemHashItem item;
102  ///
103  ElemInfo info;
104  /// States of an element
105  enum states
106  {
107  FREE, ///< element has never been used
108  RELEASED, ///< element had been used, but released
109  USED ///< element is in use
110  } stat;
111  };
112  typedef Element< HashItem, Info > Elem;
113  ///@}
114 
115  //-----------------------------------
116  /**@name Data */
117  ///@{
118  /// stores all elements of the hash table
120  /// increment added to hash index, if allready used
122  /// current number of entries in the hash table
123  int m_used;
124  /// pointer to hash function (mapping: \a HashItem -> int)
125  int (*m_hashfun)(const HashItem*);
126  /// memory is \ref soplex::DataHashTable::reMax() "reMax()"ed by this factor if a new element does't fit
128  /// some prime numbers for fast access
129  int primes[50];
130  /// number of stored prime numbers
131  int nprimes;
132 
133  ///@}
134 
135 public:
136 
137  //-----------------------------------
138  /**@name Access / modification */
139  ///@{
140  /// Is item \p h present in DataHashTable?
141  bool has(const HashItem& h) const
142  {
143  return index(h) >= 0;
144  }
145 
146  /// returns const pointer to \a Info of \a HashItem \p h or 0,
147  /// if item is not found.
148  /** Returns a pointer to \a Info component of hash element \p h or a zero
149  * pointer if element \p h is not in the table.
150  */
151  const Info* get(const HashItem& h) const
152  {
153  int i = index(h);
154 
155  return (i >= 0) ? &m_elem[i].info : nullptr;
156  }
157  /// references \a Info of \a HashItem \p h.
158  /** Index operator for accessing the \a Info associated to
159  * \a HashItem \p h. It is required that \p h belongs to the
160  * DataHashTable, otherwise it core dumps. Methods has() or
161  * get() can be used for inquiring wheater \p h belongs to the
162  * DataHashTable or not.
163  */
164  const Info& operator[](const HashItem& h) const
165  {
166  assert(has(h));
167 
168  return m_elem[index(h)].info;
169  }
170  /// adds a new entry to the hash table.
171  /** Adds a new entry consisting of \a HashItem \p h and \a Info \p info to the
172  * DataHashTable. No entry with HashItem \p h must be in the
173  * DataHashTable yet. After completion, \p info may be accessed via get() or
174  * operator[]() with \p h as parameter. The DataHashTable is #reMax()%ed
175  * if it becomes neccessary.
176  */
177  void add(const HashItem& h, const Info& info)
178  {
179  assert(!has(h));
180 
181  if(m_used >= m_elem.size() * SOPLEX_HASHTABLE_FILLFACTOR)
182  reMax(int(m_memfactor * m_used) + 1);
183 
184  assert(m_used < m_elem.size());
185 
186  decltype(m_elem.size()) i;
187 
188  for(i = (*m_hashfun)(&h) % m_elem.size();
189  m_elem[i].stat == Elem::USED;
190  i = (i + m_hashsize) % m_elem.size())
191  ;
192 
193  assert(m_elem[i].stat != Elem::USED);
194 
195  m_elem[i].stat = Elem::USED;
196  m_elem[i].info = info;
197  m_elem[i].item = h;
198 
199  m_used++;
200 
201  assert(has(h));
202  }
203 
204  /// remove \a HashItem \p h from the DataHashTable.
205  void remove(const HashItem& h)
206  {
207  int i = index(h);
208 
209  if(i < 0)
210  return;
211 
212  m_elem[i].stat = Elem::RELEASED;
213  m_used--;
214  assert(!has(h));
215  }
216 
217  /// remove all entries from DataHashTable.
218  void clear()
219  {
220  for(int i = 0; i < m_elem.size(); i++)
221  m_elem[i].stat = Elem::FREE;
222 
223  m_used = 0;
224  }
225  /// reset size of the DataHashTable.
226  /** Reset the maximum number of elements of a DataHashTable to \p newSize.
227  * However, if \p newSize < #m_used, it is resized to #m_used only.
228  * If \p newHashSize < 1, a new hash size is computed automatically.
229  * Otherwise, the specified value will be taken.
230  */
231  void reMax(int newSize = -1, int newHashSize = 0)
232  {
233  Array< Elem > save(m_elem);
234 
235  m_elem.reSize(newSize < m_used ? m_used : newSize);
236 
237  clear();
238 
239  m_hashsize = (newHashSize < 1) ? autoHashSize() : newHashSize;
240 
241  for(int i = 0; i < save.size(); i++)
242  if(save[i].stat == Elem::USED)
243  add(save[i].item, save[i].info);
244  }
245  ///@}
246 
247  //-----------------------------------
248  /**@name Debugging */
249  ///@{
250  /// checks whether DataHashTable is consistent
251  bool isConsistent() const
252  {
253 #ifdef ENABLE_CONSISTENCY_CHECKS
254  int total = 0;
255 
256  for(int i = 0; i < m_elem.size(); i++)
257  {
258  if(m_elem[i].stat == Elem::USED)
259  {
260  total++;
261 
262  if(!has(m_elem[i].item))
263  return SPX_MSG_INCONSISTENT("DataHashTable");
264  }
265  }
266 
267  if(total != m_used)
268  return SPX_MSG_INCONSISTENT("DataHashTable");
269 
270  return m_elem.isConsistent();
271 #else
272  return true;
273 #endif
274  }
275  ///@}
276 
277  //-----------------------------------
278  /**@name Construction / destruction */
279  ///@{
280  /// default constructor.
281  /** Allocates a DataHashTable for \p maxsize entries using \p hashfun
282  * as hash function. If \p hashsize > 0, #m_hashsize is set to the
283  * specified value, otherwise a suitable hash size is computed
284  * automatically. Parameter \p factor is used for memory management:
285  * If more than \p maxsize entries are added to the DataHashTable, it
286  * will automatically be #reMax()%ed by a factor of \p factor.
287  *
288  * @param hashfun pointer to hash function.
289  * @param maxsize maximum number of hash elements.
290  * @param hashsize hash size.
291  * @param factor factor for increasing data block.
292  */
293  explicit DataHashTable(
294  int (*hashfun)(const HashItem*),
295  int maxsize = 265,
296  int hashsize = 0,
297  Real factor = 2.0)
298  : m_elem(maxsize)
299  , m_hashfun(hashfun)
300  , m_memfactor(factor)
301  {
302  clear();
303 
304  primes[0] = 1523;
305  primes[1] = 3547;
306  primes[2] = 8011;
307  primes[3] = 17707;
308  primes[4] = 38723;
309  primes[5] = 83833;
310  primes[6] = 180317;
311  primes[7] = 385897;
312  primes[8] = 821411;
313  primes[9] = 1742369;
314  primes[10] = 3680893;
315  primes[11] = 5693959;
316  primes[12] = 7753849;
317  primes[13] = 9849703;
318  primes[14] = 11973277;
319  primes[15] = 14121853;
320  primes[16] = 17643961;
321  primes[17] = 24273817;
322  primes[18] = 32452843;
323  primes[19] = 49979687;
324  primes[20] = 67867967;
325  primes[21] = 86028121;
326  primes[22] = 104395301;
327  primes[23] = 122949823;
328  primes[24] = 141650939;
329  primes[25] = 160481183;
330  primes[26] = 179424673;
331  primes[27] = 198491317;
332  primes[28] = 217645177;
333  primes[29] = 256203161;
334  primes[30] = 314606869;
335  primes[31] = 373587883;
336  primes[32] = 433024223;
337  primes[33] = 492876847;
338  primes[34] = 553105243;
339  primes[35] = 613651349;
340  primes[36] = 694847533;
341  primes[37] = 756065159;
342  primes[38] = 817504243;
343  primes[39] = 879190747;
344  primes[40] = 941083981;
345  primes[41] = 982451653;
346  primes[42] = INT_MAX;
347  nprimes = 43;
348 
349  m_hashsize = (hashsize < 1) ? autoHashSize() : hashsize;
350 
351  assert(m_memfactor > 1.0);
352  assert(isConsistent());
353  }
354 
355  /// assignment operator.
357  {
358  m_elem = base.m_elem;
359  m_hashfun = base.m_hashfun;
360  m_memfactor = base.m_memfactor;
361  m_used = base.m_used;
362  m_hashsize = base.m_hashsize;
363  std::copy(std::begin(base.primes), std::end(base.primes), std::begin(primes));
364  nprimes = base.nprimes;
365 
366  assert(m_memfactor > 1.0);
367  assert(isConsistent());
368  return *this;
369  }
370 
371  /// copy constructor.
373  : m_elem(base.m_elem)
374  , m_hashfun(base.m_hashfun)
375  , m_memfactor(base.m_memfactor)
376  , m_used(base.m_used)
377  , m_hashsize(base.m_hashsize)
378  , nprimes(base.nprimes)
379  {
380  std::copy(std::begin(base.primes), std::end(base.primes), std::begin(primes));
381  assert(m_memfactor > 1.0);
382  assert(isConsistent());
383  }
384  ///@}
385 
386 private:
387 
388  //-----------------------------------
389  /**@name Helpers */
390  ///@{
391  /// determine a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
392  /** Determine next larger prime number for new #m_hashsize
393  * @return good value for #m_hashsize
394  */
395  int autoHashSize() const
396  {
397  auto oldsize = m_elem.size();
398 
399  int left = 0;
400  int right = nprimes - 1;
401  int middle;
402 
403  while(left <= right)
404  {
405  middle = (left + right) / 2;
406 
407  if(oldsize < primes[middle])
408  {
409  right = middle - 1;
410  }
411  else if(oldsize > primes[middle])
412  {
413  left = middle + 1;
414  }
415  else
416  {
417  assert(oldsize == primes[middle]);
418  return primes[middle + 1];
419  }
420  }
421 
422  assert(left == right + 1);
423  return primes[left];
424  }
425 
426  /// automatically computes a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
427  /** Computes a good #m_hashsize as the product of all prime numbers
428  * not divisors of the number of elements that are <=
429  * the maximum divisor of the number of elemens.
430  * @return good value for #m_hashsize
431  */
432  int autoHashSizeold() const
433  {
434  DataArray< bool > prime(m_elem.size());
435  int hashsize = 1;
436  int maxsize = m_elem.size();
437  int i;
438 
439  for(i = 2; i < maxsize; i++)
440  prime[i] = true;
441 
442  for(i = 2; i < maxsize; ++i)
443  {
444  if(prime[i])
445  {
446  for(int j = i; j < maxsize; j += i)
447  prime[j] = false;
448 
449  if(m_elem.size() % i != 0)
450  {
451  hashsize *= i;
452 
453  if(hashsize > maxsize)
454  {
455  hashsize /= i;
456  break;
457  }
458  }
459  }
460  }
461 
462  return hashsize;
463  }
464 
465  /// returns hash index of \a HashItem \p h or -1, if \p h is not present.
466  /** Using the hash function #m_hashfun, the hash value of \p h
467  * is calculated.
468  * Starting with this hash index, every #m_hashsize%-th element is
469  * compared with \p h until \p h is found or all elements have been checked.
470  *
471  * @param h \a HashItem, for which the hash index should be calculated
472  * @return hash index of \p h or -1,
473  * if \p h is not a member of the hash table
474  */
475  int index(const HashItem& h) const
476  {
477  if(m_used == 0)
478  return -1;
479 
480  assert(m_elem.size() > 0);
481 
482  auto i = (*m_hashfun)(&h) % m_elem.size();
483  auto j = i;
484 
485  while(m_elem[i].stat != Elem::FREE)
486  {
487  if((m_elem[i].stat == Elem::USED)
488  && (m_elem[i].item == h))
489  return i;
490 
491  i = (i + m_hashsize) % m_elem.size();
492 
493  if(i == j)
494  break;
495  }
496 
497  return -1;
498  }
499  ///@}
500 
501 };
502 
503 } // namespace soplex
504 #endif // _DATAHASHTABLE_H_
#define SOPLEX_HASHTABLE_FILLFACTOR
Definition: datahashtable.h:40
int size() const
return the number of elements.
Definition: array.h:200
bool isConsistent() const
Consistency check.
Definition: array.h:267
element has never been used
#define SPX_MSG_INCONSISTENT(name)
Definition: spxdefines.h:175
DataHashTable(int(*hashfun)(const HashItem *), int maxsize=265, int hashsize=0, Real factor=2.0)
default constructor.
DataHashTable(const DataHashTable &base)
copy constructor.
Safe arrays of arbitrary types.Class Array provides safe arrays of arbitrary type. Array elements are accessed just like ordinary C++ array elements by means of the index operator[](). Safety is provided by.
Definition: array.h:72
Element< HashItem, Info > Elem
bool isConsistent() const
checks whether DataHashTable is consistent
Array< Elem > m_elem
stores all elements of the hash table
void clear()
remove all entries from DataHashTable.
int index(const HashItem &h) const
returns hash index of HashItem h or -1, if h is not present.
double Real
Definition: spxdefines.h:269
element had been used, but released
template class for elements stored in the hash table
Definition: datahashtable.h:97
int primes[50]
some prime numbers for fast access
int autoHashSizeold() const
automatically computes a good m_hashsize.
int autoHashSize() const
determine a good m_hashsize.
int m_used
current number of entries in the hash table
OutputIterator copy(const RangeT &range, OutputIterator out)
Definition: ranges.h:58
DataHashTable & operator=(const DataHashTable &base)
assignment operator.
Generic hash table for data objects.Class DataHashTable provides a generic hash table for Data Object...
Definition: datahashtable.h:88
const Info & operator[](const HashItem &h) const
references Info of HashItem h.
Debugging, floating point type and parameter definitions.
states
States of an element.
Real m_memfactor
memory is reMax()ed by this factor if a new element does&#39;t fit
Everything should be within this namespace.
void add(const HashItem &h, const Info &info)
adds a new entry to the hash table.
void reSize(int newsize)
reset the number of elements.
Definition: array.h:206
int(* m_hashfun)(const HashItem *)
pointer to hash function (mapping: HashItem -> int)
Save arrays of arbitrary types.
int nprimes
number of stored prime numbers
int m_hashsize
increment added to hash index, if allready used
bool has(const HashItem &h) const
Is item h present in DataHashTable?
void reMax(int newSize=-1, int newHashSize=0)
reset size of the DataHashTable.
enum soplex::DataHashTable::Element::states stat