AJ Shankar

aj at cs dot berkeley dot edu

Presentation Slides

Incremental computation is an optimization technique that takes advantage of the reusability of successive, iterative computations that vary only slightly from one another. For instance, if you wanted to compute the sum of each m-element window in an n-element array (with m < n), you might use the following code:

```c
for (int i = 0 ; i < n-m ; i++) {
    for (int j = i ; j < i+m ; j++) {
        sum[i] += ary[j];
    }
}
```

which is an O(nm) algorithm. But each window shares all but two values with the one before it, so we can instead write:

```c
for (int j = 0 ; j < m ; j++) {
    sum[0] += ary[j];
}
for (int i = 1 ; i < n-m ; i++) {
    sum[i] = sum[i-1] - ary[i-1] + ary[i+m-1];
}
```

taking advantage of the fact that we've already computed the previous window, which differs from the current one in only two values. The new algorithm is O(n).

I'd like to explore optimization algorithms that identify opportunities for incremental computation, as well as some of its more interesting applications.

**Two notes
**There is an entire area of study devoted to creating incremental algorithms by hand for specific problems; that work is not covered here.

Also, considerable research has been done in developing an incremental model of computation; code is rendered incremental at run-time via function caching (storing the results of previous calls to the same function) and other methods. This approach relies on a runtime mechanism and as such never explicitly constructs incremental code that can be run by conventional means.

**Papers**

**J. Earley, "High Level Iterators and a Method of Automatically Designing
Data Structure representation," ERL-M416, Computer Science Division, University
of California, Berkeley, February 1974.
**The original paper that introduces the idea; couldn't find it online.

**Robert Paige, J. T. Schwartz, Expression continuity
and the formal differentiation of algorithms, Proceedings of the 4th ACM
SIGACT-SIGPLAN symposium on Principles of programming languages, p.58-71, January
17-19, 1977, Los Angeles, California
**(14 pages) This paper explores the idea of incremental computation at an
extremely high level, with respect to sets. It introduces some preliminary heuristics
to determine if an expression can be optimized. It also discusses the feasibility
of a system that would automatically carry out such transformations.

**Robert Paige, Shaye Koenig, Finite Differencing
of Computable Expressions, ACM Transactions on Programming Languages and
Systems (TOPLAS), v.4 n.3, p.402-454, July 1982
**(53 pages) A more comprehensive look at the ideas presented in the previous
paper. It goes through several step-by-step examples of pseudo-mechanical transformation
to incremental code; however, the paper still only deals with a very high-level
language.

**Yanhong Liu, Tim Teitelbaum, Deriving Incremental
Programs, Technical Report TR 93-1384, Department of Computer Science, Cornell
University, Ithaca, New York, September 1993.
**(41 pages) The first systematic approach to incrementalizing programs written
in common functional languages. This paper defines many of the terms used in
incremental computation, formalizes the notion of partial and complete incremental
functions, and provides a step-by-step automated approach to deriving incremental
functions. It also introduces a time model by which one can determine when to
use a cached result, and when it might be faster to recompute the whole function.
Finally, it provides a proof of correctness for the derivations it suggests,
as well as some potential future improvements.

**Yanhong Liu, Tim Teitelbaum, Systematic derivation
of incremental programs, Science of Computer Programming, v.24 n.1, p.1-39,
February 1995
**(30 pages) The published version of the above technical report. More formal,
less readable =). I skimmed it, so I might be missing something...

**Yanhong Liu, Tim Teitelbaum, Caching Intermediate
Results for Program Improvement, Proceedings of the Symposium of Partial
Evaluation and Semantics-Based Program Manipulation, June 21-23, 1995, La Jolla,
California**

(12 pages) Extends the basic "cached-result" framework with a second
phase, one which exploits intermediate results computed in the original computation
of f(x). Describes the "cache-and-prune" algorithm for identifying
which intermediate results are useful, and employing them in the incrementalized
code.

**Yanhong Liu, Scott Stoller, Tim Teitelbaum, Discovering
Auxiliary Information for Incremental Computation, Proceedings of POPL'96,
January 21-24, 1996, St. Petersburg Beach, Florida
**(14 pages) This paper describes phase three of the overall incremental
computation problem: discovering, computing, maintaining, and exploiting auxiliary
information about the input value. (Note that this information is by definition
not computed by f(x) and therefore is not considered an intermediate result.)
The usual step-by-step analysis follows.

**Yanhong Liu, Efficient Computation via Incremental
Computation, Pacific-Asia Conference on Knowledge Discovery and Data Mining,
December 1999**

(24 pages) This summary paper provides an excellent overview of the basic ideas
behind incremental computation, and also covers some of the additional research
done to date, including the discovery and use of intermediate results and auxiliary
information. It also steps through several simple examples and describes the
author's first go of an actual implementation, CACHET. This paper is a good
place to start.

**Yanhong Liu, Scott Stoller, From
recursion to iteration: what are the optimizations?, Proceedings of PEPM'00,
January 22-23, 2000, Boston, Massachusetts
**(19 pages) Liu again expands the scope of incremental computation to tackle
recursive functions. Describes algorithm to convert recursion to iteration that
handles multiple base cases and multiple recursive cases. Also provides some
interesting statistics on how tail recursion can actually result in slower code than the derived iteration.

**Yanhong Liu, Scott Stoller, Ning Li, Tom Rothamel, Optimizing
Aggregate Array Computations in Loops, Technical Report TR 02-xxxx, Computer
Science Department, State University of New York at Stony Brook, Stony Brook,
New York, July 2002
**(34 pages) I only had time to skim this paper. It tackles the specific task
of applying incremental computation techniques to aggregate array computations
in loops. Unlike previous papers, it includes comprehensive performance measurements
(running time, cache misses, etc.) taken from a variety of real-world algorithms.