Counting
Task description
As input to your program, there is a binary file containing 1 billion 32-bit integer numbers, stored as uint32_t. The size of the file is 4,000,000,000 bytes.
а) count the unique numbers. For example:
given input -> output
0x100 0x100 0xfff 0xfff -> 2 unique numbers
0x100 0x100 0x100 0x100 -> 1 unique number
0x100 0x100 0x800 0xfff -> 3 unique numbers
b) count how many numbers are seen ONLY once. For example:
given input -> output
0x100 0x100 0xfff 0xfff -> 0 numbers seen only once
0x100 0x100 0x100 0x100 -> 0 numbers seen only once
0x100 0x100 0x800 0xfff -> 2 numbers seen only once
Implementation
The program reads the contents of a binary file passed as an argument --input_file
and stores the number of occuarance of each 12 bit value into a frequency table numbers
.
The numbers
array is evaluated after the entire file has been read.
Note
If the file is smaller than 12 bits (2 bytes) it is discarted due to insuficient data. Leftover bits from files with size is not multiple of 12 bits are not processed and considered insucient data since they are not enough to form a valid 12 bit value.
Tests
The test counting/test/counting.py generates 5 predefined patterns repeated n
number of times and compares the ouput from the binary counting
with the expected values
for the input patterns.
Test output
Test pattern: 0x123, 0x456, 0x789, 0xABC, 0xDEF, repeated: 1
C binary /home/iliya/Work/StorPool/StorPool/build/counting/counting output:
5 unique numbers
5 numbers seen only once
Test pattern: 0x123, 0x456, 0x789, 0xABC, 0xDEF, repeated: 2
C binary /home/iliya/Work/StorPool/StorPool/build/counting/counting output:
5 unique numbers
0 numbers seen only once
Test pattern: 0x123, 0x456, 0x789, 0xABC, 0xDEF, repeated: 1000
C binary /home/iliya/Work/StorPool/StorPool/build/counting/counting output:
5 unique numbers
0 numbers seen only once
Test pattern: 0x123, 0x123, 0xABC, 0xABC, 0xDEF, repeated: 1
C binary /home/iliya/Work/StorPool/StorPool/build/counting/counting output:
3 unique numbers
1 numbers seen only once
Test pattern: , repeated: 1
C binary /home/iliya/Work/StorPool/StorPool/build/counting/counting output:
0 unique numbers
0 numbers seen only once
Ok