Experimental Data for "On Complexity Bounds and Confluence of Parallel Term Rewriting"

We revisit parallel-innermost term rewriting as a model of parallel computation on inductive data structures and provide a corresponding notion of runtime complexity parametric in the size of the start term. We propose automatic techniques to derive both upper and lower bounds on parallel complexity of rewriting that enable a direct reuse of existing techniques for sequential complexity. Our approach to find lower bounds requires confluence of the parallel-innermost rewrite relation, thus we also provide effective sufficient criteria for proving confluence. The applicability and the precision of the method are demonstrated by the relatively light effort in extending the program analysis tool AProVE and by experiments on numerous benchmarks from the literature.

On this web page, we provide the experimental data supporting our claims.


Literature

Here, we provide a link to our paper which describes our approach.

Benchmarks

Here we provide the benchmark collections that we used for our experiments. The source of our benchmarks is the Termination Problem Data Base (TPDB), version 11.3.

As initial benchmark set, we considered the 663 benchmarks for (sequential) runtime complexity of innermost rewriting in the subdirectory Runtime_Complexity_Innermost_Rewriting. Of these 663 TRSs, 294 have potential for parallelism, indicated by a Maximal Structural Dependency Chain Set of size at least two for the right-hand side of at least one rewrite rule; this corresponds to function calls at parallel positions. Where this is not the case, the parallel-innermost rewrite relation and the (sequential) innermost rewrite relation coincide from the considered start terms (i.e., basic terms). Details of the corresponding analysis are available here. We therefore focus on this subset of 294 parallelisable TRSs, with the data taken from the same experiments as above. Our benchmark files are available here:

benchmarks_tpdb_parallelisable.zip - the 294 TRSs from the first set with potential for parallelism

These files are in the (human-readable) format used in the Termination Competition in earlier years.

The line

(GOAL COMPLEXITY)

denotes that complexity bounds are supposed to be analysed (rather than checking termination of the input).

The line

(STARTTERM CONSTRUCTOR-BASED)

denotes that only basic start terms are supposed to be considered (runtime complexity).

The line

(STRATEGY INNERMOST)

denotes that only (sequential) innermost rewrite strategies are considered.

To indicate that runtime complexity of parallel-innermost rewrite strategies is supposed to be analysed, replace this line by:

(STRATEGY PARALLELINNERMOST)
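
Putting these annotations together, a complete input file for analysing runtime complexity of parallel-innermost rewriting might look as follows. This is only an illustrative sketch: the tree-size TRS shown here is our own example, not a file from the benchmark set.

  (GOAL COMPLEXITY)
  (STARTTERM CONSTRUCTOR-BASED)
  (STRATEGY PARALLELINNERMOST)
  (VAR x y)
  (RULES
    size(leaf) -> 0
    size(node(x, y)) -> s(plus(size(x), size(y)))
    plus(0, y) -> y
    plus(s(x), y) -> s(plus(x, y))
  )

The second size rule contains the calls size(x) and size(y) at parallel positions, so this TRS would belong to the parallelisable subset described above: a parallel-innermost step rewrites both calls simultaneously.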

For our experiments on confluence, in addition to the above benchmarks, we used the 577 unsorted unconditional TRSs of COPS (version of 29 May 2023), the benchmark collection used in the Confluence Competition. The file format is the same as above, except that the above annotations are absent. The TRSs in these files are normally analysed for confluence of full rewriting. In contrast, we analyse them for confluence of parallel-innermost rewriting.

Our benchmark files from COPS are available here: benchmarks_cops.zip


Experiments and Tool Configuration

We conducted our experiments on the StarExec platform, a cross-community computing platform for conducting experiments with automatic deduction tools. In particular, this is also the platform on which the annual Termination and Complexity Competition is run to assess the state of the art in automated termination and complexity analysis of programs and term rewrite systems.

We used the all.q queue and a timeout of 300 seconds per example.

As far as we are aware, prior to our work there have not been any automated tools able to analyse runtime complexity of parallel-innermost rewriting. However, an upper bound on runtime complexity of innermost rewriting is always also an upper bound on runtime complexity of parallel-innermost rewriting: pirc_R(n) ≤ irc_R(n) holds for all TRSs R.

Thus, as the state-of-the-art baseline for upper bounds on runtime complexity of parallel-innermost rewriting against which to compare our contributions, we used the complexity analysis tools AProVE and TcT from the Termination and Complexity Competitions 2021 and 2022 to analyse runtime complexity of innermost rewriting. For TcT, we used the "tct_rci" configuration from StarExec. The corresponding tool configurations are labelled "AProVE irc" and "TcT irc" in our tables.

We assessed these configurations of AProVE and TcT as the state of the art in the search for upper bounds on parallel-innermost complexity. The resulting data from these experiments, as downloaded from StarExec, can be found here:

Note that the lower bounds found in the above experiments are not valid for runtime complexity of parallel-innermost rewriting!

We implemented our contributions in the complexity analysis tool AProVE. The configuration "AProVE pirc_R Section 3" includes the techniques from Section 3, and the configuration "AProVE pirc_R Sections 3 & 4" (in two variants for different confluence criteria, to assess the impact of the criteria for complexity analysis) includes the techniques from both Section 3 and Section 4. We used these configurations to obtain upper and lower bounds on the runtime complexity of parallel-innermost rewriting for the TRSs from our benchmark suite. You can find the data from these experiments here:

We also extracted each of the relative TRSs PDT(R)/R obtained by the transformation presented in Section 4 and used the files as inputs for the analysis of irc of PDT(R)/R. Upper bounds on irc of PDT(R)/R are always also legitimate upper bounds on pirc_R. Lower bounds on irc of PDT(R)/R are lower bounds on pirc_R as well if R is confluent. This allows us to use existing tools for the analysis of irc to find non-trivial upper bounds and, for confluent TRSs, also lower bounds on pirc_R.
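
In the human-readable format, such relative TRSs distinguish the two rule sets by the rewrite arrow: rules whose steps are counted (those of PDT(R)) use ->, whereas the relative rules of R, whose steps are free, use ->=. The following is only a schematic sketch with made-up rule shapes; the actual form of PDT(R) is defined in Section 4 of the paper.

  (VAR x y)
  (RULES
    sizeDT(node(x, y)) -> pair(sizeDT(x), sizeDT(y))
    size(node(x, y)) ->= s(plus(size(x), size(y)))
  )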

Thus, we also used the above configurations of AProVE and TcT from the Termination Competitions 2021 and 2022 for irc_R to analyse PDT(R)/R. You can find the data from these experiments here:


Result summary: complexity

  1. Upper bounds for runtime complexity of (parallel-)innermost rewriting

    The first table gives an overview of our experimental results for upper bounds. For each configuration, we state the number of examples for which the corresponding asymptotic complexity bound could be inferred. A column "≤ O(n^k)" means that the corresponding tool proved a bound of at most O(n^k) (e.g., in Table 1 right below, the configuration "AProVE irc" proved constant or linear upper bounds in 50 cases).

    Tool                            O(1)   ≤ O(n)   ≤ O(n^2)   ≤ O(n^3)   ≤ O(n^≥4)
    TcT irc                            4       32         51         62          67
    AProVE irc                         5       50        111        123         127
    AProVE pirc Section 3              5       70        125        139         141
    AProVE pirc Sections 3 and 4       5       70        125        140         142
    TcT pirc Section 4                 4       46         66         79          80
    AProVE pirc Section 4              5       64         99        108         108

    Here we use upper bounds on irc_R as a correct but imprecise over-approximation for pirc_R. We observe that upper complexity bounds improve in a noticeable number of cases, e.g., linear bounds on pirc can now be inferred for 70 TRSs rather than for 50 TRSs, an improvement by 40%.

    Improvements from irc to pirc can be drastic: for example, for the TRS TCT_12/recursion_10, the bounds found by AProVE change from an upper bound of O(n^10) for sequential complexity to a (tight) upper bound of O(n) for parallel complexity. (This TRS models a specific recursion structure, with rules { f_0(x) → a } ∪ { f_i(x) → g_i(x, x), g_i(s(x), y) → b(f_{i-1}(y), g_i(x, y)) | 1 ≤ i ≤ 10 }, written out below for i = 1, 2, and is highly amenable to parallelisation.) We observe that adding the techniques from Section 4 to the techniques from Section 3 leads to only a few examples for which better upper bounds can be found.
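
    For concreteness, here is the rule schema instantiated for i = 1 and i = 2, in the input format described above (our own instantiation of the schema; the actual benchmark continues the pattern up to f10 and g10):

      (VAR x y)
      (RULES
        f0(x) -> a
        f1(x) -> g1(x, x)
        g1(s(x), y) -> b(f0(y), g1(x, y))
        f2(x) -> g2(x, x)
        g2(s(x), y) -> b(f1(y), g2(x, y))
      )

    In each g_i rule, the calls f_{i-1}(y) and g_i(x, y) occur at parallel positions, so a parallel-innermost step rewrites both simultaneously; this is the source of the speedup from O(n^10) sequentially to O(n) in parallel.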

  2. Lower bounds for runtime complexity of parallel-innermost rewriting

    The second table shows our results for lower bounds on pirc. Here we could not directly apply existing tools for irc. The reason is that a lower bound on irc is not necessarily also a lower bound on pirc (the whole point of performing innermost rewriting in parallel is to reduce the asymptotic complexity!), so using results from tools that compute lower bounds on irc for comparison would not make sense.
    Tool                                                         Confluent   ≥ Ω(n)   ≥ Ω(n^2)   ≥ Ω(n^3)   ≥ Ω(n^≥4)
    AProVE pirc Sections 3 and 4, confluence by Corollary 5.10         165      116         22          5           1
    AProVE pirc Sections 3 and 4, confluence by Theorem 5.18           190      133         22          5           1
    TcT pirc Section 4, confluence by Corollary 5.10                   165      112          0          0           0
    TcT pirc Section 4, confluence by Theorem 5.18                     190      131          0          0           0
    AProVE pirc Section 4, confluence by Corollary 5.10                165      140         21          5           1
    AProVE pirc Section 4, confluence by Theorem 5.18                  190      159         21          5           1

    We observe that non-trivial lower bounds can be inferred for 140 of the 165 examples proved confluent via Corollary 5.10, and for 159 of the 190 examples proved confluent via Theorem 5.18. This shows that our transformation from Section 4 has practical value, since it produces relative TRSs that are generally amenable to analysis by existing program analysis tools. (A standard concern with transformations for program analysis is that, although correct, their output might be hard to analyse for existing tools. Our experimental results alleviate this concern for the contributions of this paper, both for upper and for lower bounds on pirc.) It also shows that the more powerful confluence analysis via Theorem 5.18 improves the inference of lower complexity bounds.

  3. Tight bounds for runtime complexity of parallel-innermost rewriting

    Tool                                                         Θ(1)   Θ(n)   Θ(n^2)   Θ(n^3)   Total
    AProVE pirc Sections 3 and 4, confluence by Corollary 5.10      5     27        0        3      35
    AProVE pirc Sections 3 and 4, confluence by Theorem 5.18        5     33        0        3      41
    TcT pirc Section 4, confluence by Corollary 5.10                4     19        0        0      23
    TcT pirc Section 4, confluence by Theorem 5.18                  4     22        0        0      26
    AProVE pirc Section 4, confluence by Corollary 5.10             5     33        0        3      41
    AProVE pirc Section 4, confluence by Theorem 5.18               5     41        0        3      49

    Finally, the third table shows that for 49 of the 294 TRSs overall, the bounds that were found are asymptotically precise.


Result summary: confluence

The criteria from
  1. Corollary 5.10 and
  2. Theorem 5.18
for confluence of parallel-innermost rewriting were implemented in the program analysis tool AProVE.

In our first experiment, using the 294 TRSs obtained from the TPDB, confluence of 165 of these TRSs (about 56.1% of the benchmark set) was proved by Corollary 5.10, and confluence of 190 TRSs (about 64.6% of the benchmark set; a superset of the former) was proved by Theorem 5.18. This indicates that the search for more powerful criteria for proving confluence of parallel-innermost term rewriting is worthwhile. Details of our experiments on the TRSs from the TPDB are available here: confluence_analysis_tpdb.txt

Thus, with about 64.6%, we have achieved a good "baseline" for confluence analysis of parallel-innermost rewriting. At the same time, these results also indicate that there is still work to be done to develop more powerful techniques for proving (and disproving!) confluence of parallel-innermost rewriting.

However, the nature of the benchmark set also plays a significant role for assessing the applicability of criteria for proving confluence. In our second experiment using the 577 TRSs obtained from COPS, only 60 out of the 577 TRSs (about 10.4%) were proved confluent by Corollary 5.10. In contrast, using Theorem 5.18, confluence proofs for 274 TRSs (about 47.5%) were found, a much higher number. Details of our experiments on the TRSs from COPS are available here: confluence_analysis_cops.txt

The significantly lower success rate for this benchmark set is not surprising: COPS collects TRSs that provide a challenge to confluence analysis tools, whereas the analysed subset of the TPDB contains TRSs which are interesting specifically for runtime complexity analysis. Towards a possible explanation for the substantially higher success rate of Theorem 5.18 over Corollary 5.10, consider that the benchmark set is specifically about confluence of the full rewrite relation, whereas we assess the parallel-innermost rewrite relation. Both Corollary 5.10 and Theorem 5.18 take the more restricted choice of (innermost!) redexes for rewrite steps into account, but Theorem 5.18 does so to a much higher degree.