-
Notifications
You must be signed in to change notification settings - Fork 281
Open
Description
There are a bunch of functions that compute a vector containing divtab. The code looks roughly like this:
for (i = 2; i <= N; i++)
{
if (divtab[2 * i] == 1)
u[i] = nmod_pow_ui(i, n, mod);
else
u[i] = nmod_mul(u[divtab[2 * i]], u[divtab[2 * i + 1]], mod);
}
A better way to do this might be to partition divtab in advance into a list of primes and a list of composites, then do all the powers in one batch followed by all the multiplies in one batch. This not only avoids a branch; the multiplications could then also be better vectorized, and one could have a vector version of nmod_pow_ui that does several bases simultaneously with the same exponent n to get some speedup there as well.
Metadata
Metadata
Assignees
Labels
No labels