Scippy

SoPlex

Sequential object-oriented simPlex

datahashtable.h
Go to the documentation of this file.
1/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
2/* */
3/* This file is part of the class library */
4/* SoPlex --- the Sequential object-oriented simPlex. */
5/* */
6/* Copyright (c) 1996-2024 Zuse Institute Berlin (ZIB) */
7/* */
8/* Licensed under the Apache License, Version 2.0 (the "License"); */
9/* you may not use this file except in compliance with the License. */
10/* You may obtain a copy of the License at */
11/* */
12/* http://www.apache.org/licenses/LICENSE-2.0 */
13/* */
14/* Unless required by applicable law or agreed to in writing, software */
15/* distributed under the License is distributed on an "AS IS" BASIS, */
16/* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. */
17/* See the License for the specific language governing permissions and */
18/* limitations under the License. */
19/* */
20/* You should have received a copy of the Apache-2.0 license */
21/* along with SoPlex; see the file LICENSE. If not email to soplex@zib.de. */
22/* */
23/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
24
25/**@file datahashtable.h
26 * @brief Generic hash table for data objects.
27 */
28#ifndef _DATAHASHTABLE_H_
29#define _DATAHASHTABLE_H_
30
31#include <algorithm>
32#include <iostream>
33#include <iterator>
34#include <assert.h>
35#include <limits.h>
36
37#include "soplex/spxdefines.h"
38#include "soplex/array.h"
39
40#define SOPLEX_HASHTABLE_FILLFACTOR 0.7
41
42namespace soplex
43{
44/**@brief Generic hash table for data objects.
45 @ingroup Elementary
46
47 Class DataHashTable provides a generic hash table for
48 \ref DataObjects "Data Objects",
49 i.e., a map that maps arguments called \a HashItems to values called \a Infos.
50 HashItem and Info types are passed as template arguments. HashItems
51 must provide a comparison operator==(). Furthermore, both the HashItem and
52 Info must be data objects in the sense that the assignment operator is
53 equivalent to a <tt>memcpy()</tt> of the structure and no destructor is
54 required.
55
56 The construction of a DataHashTable requires a \em hash \em function that
57 assigns an integer value to every HashItem. Provided this, pairs of a
58 HashItem and a Info can be added to the DataHashTable. No more
59 than one Info can be assigned to the same HashItem at a time. The Info
60 to a HashItem can be accessed through the subscript operator[]() with
61 the Info object as a subscript.
62
63 The maximum number of elemens a DataHashTable can hold can be
64 specified upon construction and may be reset with reMax() later on.
65 Further, a value hash size value is required. This value must be less then
66 the maximum number of elements and must not have a common dominator with
67 the maximum number of elements. If not specified explicitely, it
68 is set automatically to a reasonable value.
69
70 The implementation relies on an array of DataHashTable::Element%s, from
71 now on referred to as elements. Upon construction, all elements are
72 marked as \c FREE. When an entry is added
73 to the DataHashTable, the hash value is computed by calling #m_hashfun
74 for its HashItem. If this array element is unused, it is
75 taken right away. Otherwise, the array index is incremented by
76 the hash size (modulo the element array size()) until an unused element
77 is found.
78
79 Removing elements is simply done by marking it as \c RELEASED. Hence,
80 when searching for an element, the search loop may not stop, when a
81 \c RELEASED element is encountered. However, such an element may be
82 reused when adding a new element to the DataHashTable.
83
84 Further, memory management with resizing of the element array is
85 straight forward.
86*/
87template < class HashItem, class Info >
89{
90private:
91
92 //-----------------------------------
93 /**@name Types */
94 ///@{
95 /// template class for elements stored in the hash table
96 template < class ElemHashItem, class ElemInfo >
97 class Element
98 {
99 public:
100 ///
101 ElemHashItem item;
102 ///
103 ElemInfo info;
104 /// States of an element
106 {
107 FREE, ///< element has never been used
108 RELEASED, ///< element had been used, but released
109 USED ///< element is in use
111 };
113 ///@}
114
115 //-----------------------------------
116 /**@name Data */
117 ///@{
118 /// stores all elements of the hash table
120 /// increment added to hash index, if allready used
122 /// current number of entries in the hash table
124 /// pointer to hash function (mapping: \a HashItem -> int)
125 int (*m_hashfun)(const HashItem*);
126 /// memory is \ref soplex::DataHashTable::reMax() "reMax()"ed by this factor if a new element does't fit
128 /// some prime numbers for fast access
129 int primes[50];
130 /// number of stored prime numbers
132
133 ///@}
134
135public:
136
137 //-----------------------------------
138 /**@name Access / modification */
139 ///@{
140 /// Is item \p h present in DataHashTable?
141 bool has(const HashItem& h) const
142 {
143 return index(h) >= 0;
144 }
145
146 /// returns const pointer to \a Info of \a HashItem \p h or 0,
147 /// if item is not found.
148 /** Returns a pointer to \a Info component of hash element \p h or a zero
149 * pointer if element \p h is not in the table.
150 */
151 const Info* get(const HashItem& h) const
152 {
153 int i = index(h);
154
155 return (i >= 0) ? &m_elem[i].info : nullptr;
156 }
157 /// references \a Info of \a HashItem \p h.
158 /** Index operator for accessing the \a Info associated to
159 * \a HashItem \p h. It is required that \p h belongs to the
160 * DataHashTable, otherwise it core dumps. Methods has() or
161 * get() can be used for inquiring wheater \p h belongs to the
162 * DataHashTable or not.
163 */
164 const Info& operator[](const HashItem& h) const
165 {
166 assert(has(h));
167
168 return m_elem[index(h)].info;
169 }
170 /// adds a new entry to the hash table.
171 /** Adds a new entry consisting of \a HashItem \p h and \a Info \p info to the
172 * DataHashTable. No entry with HashItem \p h must be in the
173 * DataHashTable yet. After completion, \p info may be accessed via get() or
174 * operator[]() with \p h as parameter. The DataHashTable is #reMax()%ed
175 * if it becomes neccessary.
176 */
177 void add(const HashItem& h, const Info& info)
178 {
179 assert(!has(h));
180
182 reMax(int(m_memfactor * m_used) + 1);
183
184 assert(m_used < m_elem.size());
185
186 decltype(m_elem.size()) i;
187
188 for(i = (*m_hashfun)(&h) % m_elem.size();
189 m_elem[i].stat == Elem::USED;
190 i = (i + m_hashsize) % m_elem.size())
191 ;
192
193 assert(m_elem[i].stat != Elem::USED);
194
195 m_elem[i].stat = Elem::USED;
196 m_elem[i].info = info;
197 m_elem[i].item = h;
198
199 m_used++;
200
201 assert(has(h));
202 }
203
204 /// remove \a HashItem \p h from the DataHashTable.
205 void remove(const HashItem& h)
206 {
207 int i = index(h);
208
209 if(i < 0)
210 return;
211
212 m_elem[i].stat = Elem::RELEASED;
213 m_used--;
214 assert(!has(h));
215 }
216
217 /// remove all entries from DataHashTable.
218 void clear()
219 {
220 for(int i = 0; i < m_elem.size(); i++)
221 m_elem[i].stat = Elem::FREE;
222
223 m_used = 0;
224 }
225 /// reset size of the DataHashTable.
226 /** Reset the maximum number of elements of a DataHashTable to \p newSize.
227 * However, if \p newSize < #m_used, it is resized to #m_used only.
228 * If \p newHashSize < 1, a new hash size is computed automatically.
229 * Otherwise, the specified value will be taken.
230 */
231 void reMax(int newSize = -1, int newHashSize = 0)
232 {
233 Array< Elem > save(m_elem);
234
235 m_elem.reSize(newSize < m_used ? m_used : newSize);
236
237 clear();
238
239 m_hashsize = (newHashSize < 1) ? autoHashSize() : newHashSize;
240
241 for(int i = 0; i < save.size(); i++)
242 if(save[i].stat == Elem::USED)
243 add(save[i].item, save[i].info);
244 }
245 ///@}
246
247 //-----------------------------------
248 /**@name Debugging */
249 ///@{
250 /// checks whether DataHashTable is consistent
251 bool isConsistent() const
252 {
253#ifdef ENABLE_CONSISTENCY_CHECKS
254 int total = 0;
255
256 for(int i = 0; i < m_elem.size(); i++)
257 {
258 if(m_elem[i].stat == Elem::USED)
259 {
260 total++;
261
262 if(!has(m_elem[i].item))
263 return SPX_MSG_INCONSISTENT("DataHashTable");
264 }
265 }
266
267 if(total != m_used)
268 return SPX_MSG_INCONSISTENT("DataHashTable");
269
270 return m_elem.isConsistent();
271#else
272 return true;
273#endif
274 }
275 ///@}
276
277 //-----------------------------------
278 /**@name Construction / destruction */
279 ///@{
280 /// default constructor.
281 /** Allocates a DataHashTable for \p maxsize entries using \p hashfun
282 * as hash function. If \p hashsize > 0, #m_hashsize is set to the
283 * specified value, otherwise a suitable hash size is computed
284 * automatically. Parameter \p factor is used for memory management:
285 * If more than \p maxsize entries are added to the DataHashTable, it
286 * will automatically be #reMax()%ed by a factor of \p factor.
287 *
288 * @param hashfun pointer to hash function.
289 * @param maxsize maximum number of hash elements.
290 * @param hashsize hash size.
291 * @param factor factor for increasing data block.
292 */
294 int (*hashfun)(const HashItem*),
295 int maxsize = 265,
296 int hashsize = 0,
297 Real factor = 2.0)
298 : m_elem(maxsize)
299 , m_hashfun(hashfun)
300 , m_memfactor(factor)
301 {
302 clear();
303
304 primes[0] = 1523;
305 primes[1] = 3547;
306 primes[2] = 8011;
307 primes[3] = 17707;
308 primes[4] = 38723;
309 primes[5] = 83833;
310 primes[6] = 180317;
311 primes[7] = 385897;
312 primes[8] = 821411;
313 primes[9] = 1742369;
314 primes[10] = 3680893;
315 primes[11] = 5693959;
316 primes[12] = 7753849;
317 primes[13] = 9849703;
318 primes[14] = 11973277;
319 primes[15] = 14121853;
320 primes[16] = 17643961;
321 primes[17] = 24273817;
322 primes[18] = 32452843;
323 primes[19] = 49979687;
324 primes[20] = 67867967;
325 primes[21] = 86028121;
326 primes[22] = 104395301;
327 primes[23] = 122949823;
328 primes[24] = 141650939;
329 primes[25] = 160481183;
330 primes[26] = 179424673;
331 primes[27] = 198491317;
332 primes[28] = 217645177;
333 primes[29] = 256203161;
334 primes[30] = 314606869;
335 primes[31] = 373587883;
336 primes[32] = 433024223;
337 primes[33] = 492876847;
338 primes[34] = 553105243;
339 primes[35] = 613651349;
340 primes[36] = 694847533;
341 primes[37] = 756065159;
342 primes[38] = 817504243;
343 primes[39] = 879190747;
344 primes[40] = 941083981;
345 primes[41] = 982451653;
346 primes[42] = INT_MAX;
347 nprimes = 43;
348
349 m_hashsize = (hashsize < 1) ? autoHashSize() : hashsize;
350
351 assert(m_memfactor > 1.0);
352 assert(isConsistent());
353 }
354
355 /// assignment operator.
357 {
358 m_elem = base.m_elem;
359 m_hashfun = base.m_hashfun;
361 m_used = base.m_used;
362 m_hashsize = base.m_hashsize;
363 std::copy(std::begin(base.primes), std::end(base.primes), std::begin(primes));
364 nprimes = base.nprimes;
365
366 assert(m_memfactor > 1.0);
367 assert(isConsistent());
368 return *this;
369 }
370
371 /// copy constructor.
373 : m_elem(base.m_elem)
374 , m_hashfun(base.m_hashfun)
376 , m_used(base.m_used)
377 , m_hashsize(base.m_hashsize)
378 , nprimes(base.nprimes)
379 {
380 std::copy(std::begin(base.primes), std::end(base.primes), std::begin(primes));
381 assert(m_memfactor > 1.0);
382 assert(isConsistent());
383 }
384 ///@}
385
386private:
387
388 //-----------------------------------
389 /**@name Helpers */
390 ///@{
391 /// determine a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
392 /** Determine next larger prime number for new #m_hashsize
393 * @return good value for #m_hashsize
394 */
395 int autoHashSize() const
396 {
397 auto oldsize = m_elem.size();
398
399 int left = 0;
400 int right = nprimes - 1;
401 int middle;
402
403 while(left <= right)
404 {
405 middle = (left + right) / 2;
406
407 if(oldsize < primes[middle])
408 {
409 right = middle - 1;
410 }
411 else if(oldsize > primes[middle])
412 {
413 left = middle + 1;
414 }
415 else
416 {
417 assert(oldsize == primes[middle]);
418 return primes[middle + 1];
419 }
420 }
421
422 assert(left == right + 1);
423 return primes[left];
424 }
425
426 /// automatically computes a good \ref soplex::DataHashTable::m_hashsize "m_hashsize".
427 /** Computes a good #m_hashsize as the product of all prime numbers
428 * not divisors of the number of elements that are <=
429 * the maximum divisor of the number of elemens.
430 * @return good value for #m_hashsize
431 */
432 int autoHashSizeold() const
433 {
434 DataArray< bool > prime(m_elem.size());
435 int hashsize = 1;
436 int maxsize = m_elem.size();
437 int i;
438
439 for(i = 2; i < maxsize; i++)
440 prime[i] = true;
441
442 for(i = 2; i < maxsize; ++i)
443 {
444 if(prime[i])
445 {
446 for(int j = i; j < maxsize; j += i)
447 prime[j] = false;
448
449 if(m_elem.size() % i != 0)
450 {
451 hashsize *= i;
452
453 if(hashsize > maxsize)
454 {
455 hashsize /= i;
456 break;
457 }
458 }
459 }
460 }
461
462 return hashsize;
463 }
464
465 /// returns hash index of \a HashItem \p h or -1, if \p h is not present.
466 /** Using the hash function #m_hashfun, the hash value of \p h
467 * is calculated.
468 * Starting with this hash index, every #m_hashsize%-th element is
469 * compared with \p h until \p h is found or all elements have been checked.
470 *
471 * @param h \a HashItem, for which the hash index should be calculated
472 * @return hash index of \p h or -1,
473 * if \p h is not a member of the hash table
474 */
475 int index(const HashItem& h) const
476 {
477 if(m_used == 0)
478 return -1;
479
480 assert(m_elem.size() > 0);
481
482 auto i = (*m_hashfun)(&h) % m_elem.size();
483 auto j = i;
484
485 while(m_elem[i].stat != Elem::FREE)
486 {
487 if((m_elem[i].stat == Elem::USED)
488 && (m_elem[i].item == h))
489 return i;
490
491 i = (i + m_hashsize) % m_elem.size();
492
493 if(i == j)
494 break;
495 }
496
497 return -1;
498 }
499 ///@}
500
501};
502
503} // namespace soplex
504#endif // _DATAHASHTABLE_H_
Save arrays of arbitrary types.
Safe arrays of arbitrary types.
Definition: array.h:73
int size() const
return the number of elements.
Definition: array.h:200
template class for elements stored in the hash table
Definition: datahashtable.h:98
enum soplex::DataHashTable::Element::states stat
states
States of an element.
@ RELEASED
element had been used, but released
@ FREE
element has never been used
Generic hash table for data objects.
Definition: datahashtable.h:89
bool has(const HashItem &h) const
Is item h present in DataHashTable?
Array< Elem > m_elem
stores all elements of the hash table
const Info * get(const HashItem &h) const
returns const pointer to Info of HashItem h or 0, if item is not found.
Real m_memfactor
memory is reMax()ed by this factor if a new element does't fit
bool isConsistent() const
checks whether DataHashTable is consistent
int(* m_hashfun)(const HashItem *)
pointer to hash function (mapping: HashItem -> int)
void remove(const HashItem &h)
remove HashItem h from the DataHashTable.
DataHashTable & operator=(const DataHashTable &base)
assignment operator.
int nprimes
number of stored prime numbers
Element< HashItem, Info > Elem
int m_used
current number of entries in the hash table
void reMax(int newSize=-1, int newHashSize=0)
reset size of the DataHashTable.
int m_hashsize
increment added to hash index, if allready used
int autoHashSize() const
determine a good m_hashsize.
const Info & operator[](const HashItem &h) const
references Info of HashItem h.
int primes[50]
some prime numbers for fast access
DataHashTable(int(*hashfun)(const HashItem *), int maxsize=265, int hashsize=0, Real factor=2.0)
default constructor.
void add(const HashItem &h, const Info &info)
adds a new entry to the hash table.
void clear()
remove all entries from DataHashTable.
DataHashTable(const DataHashTable &base)
copy constructor.
int index(const HashItem &h) const
returns hash index of HashItem h or -1, if h is not present.
int autoHashSizeold() const
automatically computes a good m_hashsize.
#define SOPLEX_HASHTABLE_FILLFACTOR
Definition: datahashtable.h:40
Everything should be within this namespace.
double Real
Definition: spxdefines.h:269
Debugging, floating point type and parameter definitions.
#define SPX_MSG_INCONSISTENT(name)
Definition: spxdefines.h:175