This is a weighted variation of interval scheduling; it's solvable in O(N log N)
with dynamic programming.
Let an interval be g(start, stop, score)
, and let them be sorted by stop
. For simplicity, let's assume for now that all stop
is unique.
Let best[i]
be the best score we can get when we're allowed to use g[1], ..., g[i]
. We don't have to use them all, of course, and generally we can't because the subset of intervals we use must be non-overlapping.
- Clearly
best[0] = 0
. That is, since we can't use any interval, the best score we can get is 0.
- For any
1 <= k <= N
, we have:
best[k] = max( best[k-1], best[j] + g[k].score )
, where
j
is the largest index such that g[j].stop < g[k].start
(j
may be zero)
That is, given that we're allowed to use g[1], ... g[k]
, the best we can do is the better scoring of these two options:
- We do not include
g[k]
. Thus, the score of this option is best[k-1]
.
- ... because that's the best we can do with
g[1], ... g[k-1]
- We include
g[k]
, and to its left we do the best we can with all the genes that don't overlap with g[k]
, i.e. all g[1], ..., g[j]
, where g[j].stop < g[k].start
and j
is as large as possible. Thus, the score of this option is best[j] + g[k].score
.
(Note the optimal substructure and overlapping subproblems components of dynamic programming embodied in the above equation).
The overall answer to the question is best[N]
, i.e. the best score we can get when we're allowed to use all the genes. Oops, did I say genes? I mean intervals.
This is O(N log N)
because:
- Sorting all the intervals takes
O(N log N)
- Finding
j
for each k
is O(log N)
using binary search
If several genes can have the same stop
values, then nothing changed: you still have to search for the rightmost j
. In e.g. Python this is easy with bisect_right
. In Java where the standard library binary search doesn't guarantee which index is returned in case of ties, you can (among many options) follow it with a linear search (for O(N)
worst-case performance), or another series of binary searches to find the right most index.
Oops did I say genes again? I mean intervals.
Related questions
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…