| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" |
| "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
| |
| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
| <head> |
| <meta name="generator" content="HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" /> |
| <title>Hash Skewed Distribution Memory Use Test</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=us-ascii" /> |
| </head> |
| <body> |
| <div id="page"> |
| <h1>Hash-Based Skewed-Distribution Random-Integer <tt>find</tt> |
| Find Timing Test</h1> |
| <h2><a name="description" id="description">Description</a></h2> |
| <p>This test inserts a number of values with a markedly |
| non-uniform i.i.d. integer keys into a container, then performs |
| a series of finds using <tt>find</tt> . It measures the average |
| time for <tt>find</tt> as a function of the number of values in |
| the containers. The keys are generated as follows. First, a |
| uniform integer is created; it is then shifted left 8 bits.</p> |
| <p>(The test was executed with <a href="../../../../testsuite/performance/ext/pb_ds/hash_zlob_random_int_find_timing.cc"><tt>hash_zlob_random_int_find_timing_test</tt></a> |
| 200 200 2100)</p> |
| <h2><a name="purpose" id="purpose">Purpose</a></h2> |
| <p>The test checks the effect of different range-hashing |
| functions and trigger policies (see <a href="hash_based_containers.html#hash_policies">Design::Associative |
| Containers::Hash-Based Containers::Hash Policies</a> and |
| <a href="hash_based_containers.html#resize_policies">Design::Associative |
| Containers::Hash-Based Containers::Resize Policies</a>).</p> |
| <h2><a name="results" id="results">Results</a></h2> |
| <p>Figures <a href="#NHG">NHG</a>, <a href="#NHM">NHM</a>, and |
| <a href="#NHL">NHL</a> show the results for various hash-based |
| associative-containers in <a href="assoc_performance_tests.html#gcc"><u>g++</u></a>, <a href="assoc_performance_tests.html#msvc"><u>MSVC++</u></a>, and |
| <a href="assoc_performance_tests.html#local"><u>local</u></a>, |
| respectively.</p> |
| <div id="NHG_res_div"> |
| <div id="NHG_gcc"> |
| <div id="NHG_hash_zlob_random_int_find_timing_test"> |
| <div id="NHG_assoc"> |
| <div id="NHG_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHG" id="NHG"><img src="hash_zlob_random_int_find_timing_test_gcc.png" alt="no image" /></a></h6>NHG: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#gcc">g++</a><p>In the above figure, the names in the legends have the following meaning:</p> |
| <ol> |
| <li> |
| cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map- |
| <a href="cc_hash_table.html"><tt>cc_hash_table</tt></a> |
| with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a> |
| , and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> |
| with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a> |
| , and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> |
| with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li> |
| <li> |
| gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map- |
| <a href="gp_hash_table.html"><tt>gp_hash_table</tt></a> |
| with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a> |
| , <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> |
| with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a> |
| , and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> |
| with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a> |
| </li> |
| <li> |
| n_hash_map_ncah- |
| <tt>std::tr1::unordered_map</tt> with <tt>cache_hash_code</tt> = <tt><b>false</b></tt></li> |
| <li> |
| cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map- |
| <a href="cc_hash_table.html"><tt>cc_hash_table</tt></a> |
| with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a> |
| , and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> |
| with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a> |
| , and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> |
| with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li> |
| </ol> |
| </div><div style="width: 100%; height: 20px"></div></div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div id="NHM_res_div"> |
| <div id="NHM_msvc"> |
| <div id="NHM_hash_zlob_random_int_find_timing_test"> |
| <div id="NHM_assoc"> |
| <div id="NHM_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHM" id="NHM"><img src="hash_zlob_random_int_find_timing_test_msvc.png" alt="no image" /></a></h6>NHM: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#msvc">msvc++</a><p>In the above figure, the names in the legends have the following meaning:</p> |
| <ol> |
| <li> |
| n_hash_map_ncah- |
| <tt>stdext::hash_map</tt></li> |
| <li> |
| cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map- |
| <a href="cc_hash_table.html"><tt>cc_hash_table</tt></a> |
| with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a> |
| , and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> |
| with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a> |
| , and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> |
| with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li> |
| <li> |
| gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map- |
| <a href="gp_hash_table.html"><tt>gp_hash_table</tt></a> |
| with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a> |
| , <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> |
| with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a> |
| , and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> |
| with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a> |
| </li> |
| <li> |
| cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map- |
| <a href="cc_hash_table.html"><tt>cc_hash_table</tt></a> |
| with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a> |
| , and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> |
| with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a> |
| , and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> |
| with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li> |
| </ol> |
| </div><div style="width: 100%; height: 20px"></div></div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div id="NHL_res_div"> |
| <div id="NHL_local"> |
| <div id="NHL_hash_zlob_random_int_find_timing_test"> |
| <div id="NHL_assoc"> |
| <div id="NHL_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style = "border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHL" id= "NHL"><img src="hash_zlob_random_int_find_timing_test_local.png" alt="no image" /></a></h6>NHL: Skewed-distribution random int find timing test using <tt>find</tt> - <a href = "assoc_performance_tests.html#local">local</a></div><div style = "width: 100%; height: 20px"></div></div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <h2><a name="observations" id="observations">Observations</a></h2> |
| <p>In this setting, the keys' distribution is so skewed that |
| the unerlying hash-table type affects performance marginally. |
| (This is in contrast with <a href="hash_text_find_find_timing_test.html">Hash-Based Text |
| <tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_find_find_timing_test.html">Hash-Based |
| Random-Integer <tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_subscript_find_timing_test.html">Hash-Based |
| Random-Integer Subscript Find Timing Test</a> and <a href="hash_random_int_subscript_insert_timing_test.html">Hash-Based |
| Random-Integer Subscript Insert Timing Test</a> .)</p> |
| <p>The range-hashing scheme affects performance dramatically. A |
| mask-based range-hashing scheme effectively maps all values |
| into the same bucket. Access degenerates into a search within |
| an unordered linked-list. In Figures <a href="#NHG">NHG</a> and |
| <a href="#NHM">NHM</a> , it should be noted that |
| <tt>std::tr1::unordered_map</tt> and <tt>stdext::hash_map</tt> |
| are hard-wired currently to mod-based and mask-based schemes, |
| respectively.</p> |
| <p>When observing the settings of this test, it is apparent |
| that the keys' distribution is far from natural. One might ask |
| if the test is not contrived to show that, in some cases, |
| mod-based range hashing does better than mask-based range |
| hashing. This is, in fact just the case. We did not encounter a |
| more natural case in which mod-based range hashing is better. |
| In our opnion, real-life key distributions are handled better |
| with an appropriate hash function and a mask-based |
| range-hashing function. (<a href="../../../../testsuite/ext/pb_ds/example/hash_shift_mask.cc"><tt>shift_mask.cc</tt></a> |
| shows an example of handling this a-priori known skewed |
| distribution with a mask-based range-hashing function). If hash |
| performance is bad, a <i>Χ<sup>2</sup></i> test can be used |
| to check how to transform it into a more uniform |
| distribution.</p> |
| <p>For this reason, <tt>pb_ds</tt>'s default range-hashing |
| function is mask-based.</p> |
| <p><a href="assoc_performance_tests.html#hash_based_types">Observations::Hash-Based |
| Container Types</a> summarizes some observations on hash-based |
| containers; <a href="assoc_performance_tests.html#hash_based_policies">Observations::Hash-Based |
| Containers' Policies</a> summarizes some observations on |
| hash-based containers' policies.</p> |
| </div> |
| </body> |
| </html> |