Here is the LeetCode problem. In case you don't want to read it, in essence you are given an array of numbers
and a number k, and you want to count how many subsets are possible under one restriction: numbers
exactly k apart cannot be in the same subset. For a simple example
with k = 10 and array = {1, 2}: 1 and 2 are not exactly k apart, so they can appear together in the
subset [1, 2]. That subset also contains two other subsets, [1] and [2], making 3 in total. If k
were 1 instead of 10, the subset [1, 2] would be invalid, leaving only 2 subsets. There is an
assumption I made while reading the problem description that messed me up (I'll be getting to that
soon).
When I first came across this problem I knew it was supposed to be a DP problem,
the hard part being how to set up the memoization properly. As I took my time reading it over,
I couldn't help but think it was solvable with combinatorics. For example, if
the provided array contains no elements that are exactly k apart, the answer is
relatively straightforward: every element is independently in or out of a subset, and dropping the empty one leaves 2^n - 1.
Counting the subsets of an array with conflicting elements is a lot
more difficult. The first step I took was to count the subsets produced by a chain
of conflicting elements (e.g. k = 1 and array = {1, 2, 3, 4}). I made some assumptions here:
only the length of the chain matters; the order of the array does not, and neither does
the exact value of k. Using this as a foundation, all I needed to do was write a function that, given
the length of a chain, produced its subset count.
int chain_subsets(int n) {
    // f(1) = 1, f(2) = 2, f(n) = f(n - 1) + f(n - 2) + 1:
    // a subset of an n-element chain either skips the last element
    // (f(n - 1) ways) or takes it, alone or together with a valid subset
    // of the first n - 2 elements (f(n - 2) + 1 ways).
    int a = 1;
    int b = 2;
    for (int i = 1; i < n; i++) {
        int tmp = a + b + 1;
        a = b;
        b = tmp;
    }
    return a;
}

Chains don't interact with each other, so a running total of t subsets absorbs a chain contributing c subsets as t + c + t*c: a subset can come from just the new chain, just what was counted so far, or one of each. Accumulating over chains:

total += chain_length + total * chain_length;

where chain_length holds the value chain_subsets returned for that chain. Now just to submit it!
Oh wait, it doesn't work.
When I saw multiple repeated numbers in the input I felt snubbed of a cool victory. The
problem never said the array would contain only unique elements, but my interpretation of set
and subset was one with no repeated elements, so I was left high and dry. Adapting the
solution to repeated elements was tough, since the internally linked hashmap no longer
worked: every value would need to link to the correct chain bounds, and my chain function was useless. So
I came up with a three-pass approach. First, populate the hashmap with the repeat counts; this solves
the bounds issue later on, since you won't get interior accesses into a chain. Then connect
elements together into chains (while calculating subsets at the same time). Lastly, compute the
total as before. This solution actually works; I submitted it and it sits among the fastest
solutions, with some variance: faster than 100% of submissions on one run, 93% on another.
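Here is a minimal Rust sketch of that shape of solution, since the rest of this post works in Rust anyway. The names, the grouping by residue mod k, and the take/skip bookkeeping are my reconstruction rather than the exact submitted code; the load-bearing ideas are that a value appearing m times contributes 2^m - 1 ways whenever it is picked, that independent chains combine multiplicatively, and that the final minus one drops the empty subset:

use std::collections::HashMap;

fn count_subsets(nums: &[i64], k: i64) -> i128 {
    // pass 1: multiplicity of every value
    let mut counts: HashMap<i64, u32> = HashMap::new();
    for &x in nums {
        *counts.entry(x).or_insert(0) += 1;
    }

    // group distinct values by residue mod k so that values exactly k apart
    // land next to each other once the group is sorted
    let mut groups: HashMap<i64, Vec<i64>> = HashMap::new();
    for &v in counts.keys() {
        groups.entry(v.rem_euclid(k)).or_default().push(v);
    }

    let mut total: i128 = 1; // product over chains of (subsets + 1)
    for (_, mut chain) in groups {
        chain.sort_unstable();
        // skip = histories in which the previous value was not taken,
        // take = histories in which it was (both include the "empty so far" case)
        let (mut skip, mut take) = (1i128, 0i128);
        let mut prev: Option<i64> = None;
        for &v in &chain {
            let picked = (1i128 << counts[&v]) - 1; // 2^m - 1 non-empty picks of v
            let conflict = prev.map_or(false, |p| v - p == k);
            let new_take = if conflict {
                skip * picked // v may only join histories where v - k was skipped
            } else {
                (skip + take) * picked
            };
            let new_skip = skip + take; // not taking v keeps every history
            skip = new_skip;
            take = new_take;
            prev = Some(v);
        }
        total *= skip + take;
    }
    total - 1 // remove the globally empty subset
}

fn main() {
    assert_eq!(count_subsets(&[1, 2], 10), 3);
    assert_eq!(count_subsets(&[1, 2], 1), 2);
    assert_eq!(count_subsets(&[2, 4, 6], 2), 4);
    println!("ok");
}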
I was curious whether the best solution LeetCode surfaced (written by a user called Shivam Aggarwal)
was faster. I suspected that, given the problem's small n and the fact that Shivam's algorithm
is O(n log n), it would be. Well, I wanted to actually analyze things and see if I could get my solution
to be faster, but to make the comparison "fair" I knew n had to be larger than the maximum of 18 specified by LeetCode.
Doing some mental math, I was indexing into a hashmap about 5n times, plus extra pointer dereferences
to random memory locations inside the map. An n of at least 64, where log n is roughly equal to my per-element map
indexing, would make for a more level playing field. So I converted both algorithms to Rust, used i128, and
capped n at 126 (to avoid any integer overflow issues). Now, time to compare and improve. I wrote some tests,
set up Criterion for finer benchmarking, and ran the code.
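The harness was nothing fancy; roughly this (a sketch with placeholder crate and function names, not my actual benchmark file):

// benches/subsets.rs
use criterion::{criterion_group, criterion_main, Criterion};
use std::hint::black_box;

// placeholder import: assume count_subsets from above is exported by the library crate
use subsets::count_subsets;

fn bench_subsets(c: &mut Criterion) {
    // worst case for my approach with k = 1: one long conflicting chain, n capped at 126
    let nums: Vec<i64> = (1..=126).collect();
    c.bench_function("count_subsets_chain_126", |b| {
        b.iter(|| count_subsets(black_box(&nums), black_box(1)))
    });
}

criterion_group!(benches, bench_subsets);
criterion_main!(benches);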
My algorithm was significantly slower, around half the speed. So, first things first, the easy
pickings. When you ask Rust's HashMap for its keys it iterates over every allocated bucket in the
table, so I stored the unique keys in a plain array and looped over that instead of the map.
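Sketched, with illustrative names:

use std::collections::HashMap;

// Record each distinct key once while counting, then iterate the Vec in the
// hot loops instead of asking the HashMap for its keys (which walks every
// allocated bucket, occupied or not).
fn count_with_keys(nums: &[i64]) -> (Vec<i64>, HashMap<i64, u32>) {
    let mut keys: Vec<i64> = Vec::with_capacity(nums.len());
    let mut counts: HashMap<i64, u32> = HashMap::new();
    for &x in nums {
        let slot = counts.entry(x).or_insert(0);
        if *slot == 0 {
            keys.push(x); // first time we see x: remember it in the side array
        }
        *slot += 1;
    }
    (keys, counts)
}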
My next thought was to minimize branches, but in my connect function they were there to prevent
null pointer dereferences, so they felt hard to remove. Instead I reached for a zeroed-out struct,
and in a parallel effort moved more of the accesses into an array in an attempt to get better cache
behaviour. With that, my implementation started to approach Shivam's: faster in certain situations
and worse in others (depending on chain length).
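The zeroed-struct trick looks something like this (illustrative only; the actual connect step carries more state):

use std::collections::HashMap;

// A Default (all-zero) node stands in for "no such neighbour", so lookups in
// the connect step fold into plain arithmetic instead of branching on whether
// value - k exists in the map.
#[derive(Clone, Copy, Default)]
struct Node {
    count: u32,    // multiplicity of the value
    subsets: i128, // running subset count carried along the chain
}

fn neighbour_below(map: &HashMap<i64, Node>, v: i64, k: i64) -> Node {
    // missing neighbour -> zeroed Node, no if/else at the call site
    map.get(&(v - k)).copied().unwrap_or_default()
}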
At this point it was hard to think of any more major improvements. Perhaps
I could reorder some math expressions and simplify a few parts, but my gut feeling was that the gain
would be minimal. Just to be sure I ran a flamegraph, and most of the time was spent on hash calculations.
Rust's default hasher is built with some focus on HashDoS resistance, which slows it down, so swapping
it out brought my solution to twice the speed of Shivam's.
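The swap itself is a one-line change of map type; for instance with the rustc-hash crate (shown as an example of a non-SipHash hasher, not necessarily the exact crate the final version used):

use rustc_hash::FxHashMap;

// Drop-in replacement for std's HashMap: same API, but FxHash instead of the
// DoS-resistant SipHash default that the flamegraph showed burning most of the time.
fn count(nums: &[i64]) -> FxHashMap<i64, u32> {
    let mut counts: FxHashMap<i64, u32> = FxHashMap::default();
    for &x in nums {
        *counts.entry(x).or_insert(0) += 1;
    }
    counts
}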