实验二 哈希表的设计
哈希表简介:
![This is an test image](/2022/11/29/%E9%9A%8F%E6%9C%BA%E7%AE%97%E6%B3%95%E5%AE%9E%E9%AA%8C%E4%BA%8C-%E5%93%88%E5%B8%8C%E8%A1%A8%E7%9A%84%E8%AE%BE%E8%AE%A1/1.jpg)
实验内容:
1.编写属于2-通用簇的某个哈希函数代码(30 分)
•输入:键值
•输出:哈希值
2.给定一个较大的m值,例如10^9,和一个较小的n值,例如1000,通过实验观察多次插入操作后链表的平均长度(键值随机采样),并与理论结果进行对比分析(40 分)
3.对于(2)的结果,使用 2-通用哈希函数簇中不同的哈希函数,观察并分析结果的差异(30 分)
第一问:
一下是我编写的两个哈希函数:
1 2 3 4
| long long get_hash1(long long x){ assert(mod>0); return ((((long long)x<<1)%mod*(long long)x%mod)+x)%mod; }
|
1 2 3 4
| long long get_hash2(long long x){ assert(mod>0); return x%mod; }
|
第二问:
![This is an test image](/2022/11/29/%E9%9A%8F%E6%9C%BA%E7%AE%97%E6%B3%95%E5%AE%9E%E9%AA%8C%E4%BA%8C-%E5%93%88%E5%B8%8C%E8%A1%A8%E7%9A%84%E8%AE%BE%E8%AE%A1/2.jpg)
经过实验发现在m较大,n较小的情况下,平均长度趋于1.0,与理论结果相近
在m,n较为接近的情况下,平均长度因哈希函数而异。
第三问:
如图是哈希函数1的实验结果:
![This is an test image](/2022/11/29/%E9%9A%8F%E6%9C%BA%E7%AE%97%E6%B3%95%E5%AE%9E%E9%AA%8C%E4%BA%8C-%E5%93%88%E5%B8%8C%E8%A1%A8%E7%9A%84%E8%AE%BE%E8%AE%A1/4.jpg)
如图是哈希函数2的实验结果:
![This is an test image](/2022/11/29/%E9%9A%8F%E6%9C%BA%E7%AE%97%E6%B3%95%E5%AE%9E%E9%AA%8C%E4%BA%8C-%E5%93%88%E5%B8%8C%E8%A1%A8%E7%9A%84%E8%AE%BE%E8%AE%A1/3.jpg)
经过对比可以发现不论是大模数小数据抑或是大模数大数据哈希函数2的表现要优于哈希函数1.
Code
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85
| #include<unordered_set> #include<unordered_map> #include<functional> #include<algorithm> #include<string.h> #include<iostream> #include<iterator> #include<cstring> #include<numeric> #include<assert.h> #include<random> #include<cstdio> #include<vector> #include<bitset> #include<queue> #include<stack> #include<cmath> #include<ctime> #include<set> #include<map>
const long long N = 200010; std::mt19937 rnd(time(0)); namespace MyHash{ long long mod,idx,hash_number; struct Node{ Node(long long _key,long long _val,long long _next):key(_key),val(_val),next(_next){}; long long key,val,next; }; std::vector<Node> node; std::vector<long long> head; void init(long long _mod){ assert(_mod!=0); hash_number=idx=0,mod=_mod; node.clear(); head.resize(mod, -1); } long long get_hash1(long long x){ assert(mod>0); return ((((long long)x<<1)%mod*(long long)x%mod)+x)%mod; } long long get_hash2(long long x){ assert(mod>0); return x%mod; } long long get_node(long long key,long long val){ Node now=Node(key,val,-1); node.push_back(now); return idx++; } void insert(long long key, long long val){ long long _hash = get_hash1(key); long long now = get_node(key, val); if(head[_hash] == -1) hash_number ++; node[now].next = head[_hash]; head[_hash]=now; } double average_len(){ if(!hash_number) return 0; assert(hash_number); return (double)idx/hash_number; } }
void solve(){ long long n,m; std::cout << "请输入n(哈希实验个数) m(模数):" << std::endl;; std::cin >> n >> m; MyHash::init(m); for(long long i=1;i<=n;++i){ auto key = rnd(),val = rnd(); MyHash::insert(key, val); } auto ans = MyHash::average_len(); if(ans<1e-9) std :: cout << "哈希表为空" << std :: endl; else std::cout << ans << std::endl; }
signed main(){ solve(); return 0; }
|