[Algo] Range Query

Range query is a type of task to calculate a value based on a subarray of an array.

Static array queries

  • Sum-type queries: when calculating values, range cannot be overlapping
    • calculate the prefix sum array.
    • One dimension: sum(a,b) = sum(0,b)-sum(0,a) = prefix_sum(b)-prefix_sum(a)
    • Two dimension: sum(rectangle_abcd) = sum(a)-sum(b)-sum(c)+sum(d)
partial_sum(arr.begin(), arr.end(), prefix_sum.begin());
  • Min-type queries: when calculating values, range can overlap
    • calculate the sparse table: use binary lifting method to cache values for sub queries
// Cache the log2 value for sizes of a range
std::vector<int> lg2(n+1);
for (int i=2;i<=n;++i) lg2[i] = lg2[i/2]+1;

// Sparse table
int max_pw2=lg2[n];
std::vector<std::vector<int>> st(max_pw2+1, std::vector<int>(n, INT_MAX));

// Initialization: range=2^0
st[0] = input_arr;

// Calculate values for range=2^pw2
for (int pw2=1;pw2<=max_pw2;++pw2) {
  for (int i=0;i+(1<<pw2)<=n;++i) {
    int prev_pw2=pw2-1;
    st[pw2][i] = std::min(st[prev_pw2][i], st[prev_pw2][i+(1<<prev_pw2)]);
  }
}

// Query function
auto query_fn = [&](int l, int r){
  int pw2=lg2[r-l+1];
  return std::min(st[pw2][l], st[pw2][r-(1<<pw2)+1]);
};

Binary Indexed Tree (BIT)

  • Binary Index Tree is a dynamic variant of a prefix sum array.
  • Update array value in O(log n) time.
  • Get range query value in O(log n) time.
  • The prefix sum range length at position idx is
// Binary Index Tree starts from index 1
// bit[0] is not used, or you can handle it manually
std::vector<int> bit(n+1);

// For a sum bit, update a val means add val to the original array[idx]
void update(int idx, int val) {
  while (idx<=n) {
    bit[idx] += val;
    idx += idx&-idx;
  }
}

// Query the sum for range [1, idx]
int query(int idx) {
  int ans = 0;
  while (idx>=1) {
    ans += bit[idx];
    idx -= idx&-idx;
  }
}

// Query the sum for range [l, r]
int range(int l, int r) {
  return query[r]-query[l-1];
}

Segment Tree

  • Segment Tree is a complete tree, the left and right child of node is 2*node+1 and 2*node+2.
  • Segment Tree support value update and range query in O(log n) time too.
  • Compare to Binary Indexed Tree, it can support more flexible range query, e.g. range minimum query (RMQ), Greatest Common Divisor (GCD) etc. But it use more memory than BIT.
// The original array
std::vector<int> arr;
int n=arr.size();

// Segment tree for RMQ, index starts from 0
// Required size = 2*pow(2, ceil(log2(n)) ~ 4*n
std::vector<int> st(4*n, INT_MAX);

// (l, r, node): the current node in the segment tree
// and its corresponding range [l, r] in the array
void build(int l, int r, int node) {
  // When l==r, the node is a leaf node
  // Set leaf node's value as in the original array
  if (l==r) {
    st[node] = arr[l];
    return;
  }

  // When l!=r, we set children nodes first,
  // then the current node
  int mid = (l+r)/2;
  int lnode = 2*node+1;
  int rnode = 2*node+2;
  build(l, mid, lnode);
  build(mid+1, r, rnode);
  st[node] = std::min(st[lnode], st[rnode]);
}

// (l, r, node): the current node in the segment tree
// (idx, val): set the original array with val at idx
void update(int l, int r, int node, int idx, int val) {
  // If idx is not in current node's range, return
  if (idx<l || idx>r) return;
  
  // Set the leaf node for idx with val
  if (l==r && l==idx) {
    st[node] = val;
    return;
  }

  // For non-leaf nodes, update with children's values
  int min=(l+r)/2;
  int lnode = 2*node+1;
  int rnode = 2*node+2;
  update(l, mid, lnode, idx, val);
  update(mid+1, r, rnode, idx, val);
  st[node] = std::min(st[lnode], st[rnode]);
}

// (l, r, node): the current node in the segment tree
// [lbound, rbound]: the queried range 
int query(int l, int r, int node, int lbound, int rbound) {
  // If current node's range is not in the bound, return
  if (l>rbound || r<lbound) return INT_MAX;

  // If current node's range is within the bound,
  // return val in tree.
  if (l>=lbound && r<=rbound) return st[node];

  // If current node's range overlaps with the bound,
  // split current node's range.
  int mid=(l+r)/2;
  int lnode = 2*node+1;
  int rnode = 2*node+2;
  int lval = query(l, mid, lnode, lbound, rbound);
  int rval = query(mid+1, r, rnode, lbound, rbound);
  return std::min(lval, rval);  
}

Others

  • Range update: We can use difference array to do range update. The prefix sum array of a difference array is the original array.
adjacent_difference(arr.begin(), arr.end(), diff.begin());

Leave a Reply

Your email address will not be published. Required fields are marked *