Validation & benchmarks

cuPeriod ships a reproducible validation + benchmark suite (benchmarks/ in the repository). This page summarizes the headline results; the full report, with figures, lives in the repo and regenerates from bundled data with no network access.

Validation data — 72 real ASAS-SN g-band light curves across six variability classes (eclipsing binaries, RR Lyrae, Cepheids, δ Scuti, long-period and rotational variables), each with an established VSX literature period. TLS is validated on confirmed Kepler KOIs. The curves and their literature periods ship with the suite, so §1–2 and §4 are fully reproducible offline.

1. Numerical validation

Each method is run on the same trial grid through cuPeriod’s CPU and GPU backends and through an independent reference implementation. This gives two checks: CPU↔GPU parity (the two backends must agree to round-off) and cuPeriod↔reference (cuPeriod must match an established implementation).

Method	N	CPU↔GPU parity	same P	reference	ref. agreement
GLS	72	2.1e-06	100%	astropy LS	3.4e-10
BLS	72	1.2e-11	100%	astropy BLS	1.1e-09
PDM	72	2.1e-11	100%	PyAstronomy	r ≥ 0.948
CE	72	1.7e-15	100%	Graham 2013	0.0e+00
String-Length	72	1.4e-02	100%	Dworetsky 1983	5.2e-11
MHAOV	72	3.6e-07	100%	Sch.-Czerny	8.0e-05
TLS	22	8.1e-10	100%	—	—

parity is the worst-case relative difference between the CPU and GPU statistic over all stars; same P is the fraction of stars for which both backends recover the same period. The GLS and MHAOV GPU paths are single precision, hence parity of ~1e-6 and ~1e-7; the rest are double. String-Length’s looser parity comes from isolated trial frequencies where near-equal phases sort in a different order on the GPU; the recovered period is unaffected (same P = 100%).
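The parity metric described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the symmetric normalization by the larger magnitude, and the toy numbers are hypothetical and are not taken from the benchmark suite.

```python
def worst_case_relative_difference(cpu, gpu, floor=1e-300):
    """Largest |cpu - gpu| / max(|cpu|, |gpu|) over paired statistic values.

    Assumption: the suite's exact normalization is not stated on this page;
    a symmetric relative difference is used here for illustration.
    """
    if len(cpu) != len(gpu):
        raise ValueError("CPU and GPU outputs must have the same length")
    worst = 0.0
    for a, b in zip(cpu, gpu):
        scale = max(abs(a), abs(b), floor)  # floor avoids division by zero
        worst = max(worst, abs(a - b) / scale)
    return worst


def best_index(values, minimize=False):
    """Index of the best trial: the maximum for power-like statistics,
    the minimum for dispersion-like ones (e.g. PDM, String-Length)."""
    pick = min if minimize else max
    return pick(range(len(values)), key=values.__getitem__)


# Toy statistic on three trial frequencies (made-up numbers, not suite data).
cpu_power = [0.10, 0.80, 0.30]
gpu_power = [0.10, 0.8000008, 0.30]

parity = worst_case_relative_difference(cpu_power, gpu_power)
same_p = best_index(cpu_power) == best_index(gpu_power)
print(f"parity = {parity:.1e}, same P = {same_p}")
```

A difference of this size is what a single-precision GPU path would be expected to produce, and it leaves the location of the best trial unchanged, which is the situation the same P column reports.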

2. Period recovery on real light curves

Method	harmonic-aware	exact (≤ 2%)
GLS	99%	75%
BLS	100%	96%
PDM	99%	81%
CE	99%	76%
String-Length	99%	60%
MHAOV	99%	75%
TLS	100%	95%

harmonic-aware accepts the method-appropriate fold ambiguity (e.g. Fourier methods recover P/2 for contact binaries); exact requires the VSX literature period itself within 2%.
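The two scoring rules can be illustrated with a small classifier. This is a sketch under assumptions: the function names are hypothetical, and the accepted ratios (P/2 and 2P here) stand in for the "method-appropriate" set, which this page does not enumerate.

```python
def within(recovered, target, tol=0.02):
    """True when `recovered` lies within a fractional tolerance of `target`."""
    return abs(recovered - target) <= tol * target


def classify(recovered, literature, ratios=(0.5, 2.0), tol=0.02):
    """Label a recovered period as 'exact', 'harmonic', or 'miss'.

    'exact' requires the literature period itself within `tol`;
    'harmonic' accepts a listed multiple of it (assumed set: P/2 and 2P).
    """
    if within(recovered, literature, tol):
        return "exact"
    for ratio in ratios:
        if within(recovered, ratio * literature, tol):
            return "harmonic"
    return "miss"


# Illustrative periods in days (made-up values, not from the suite).
print(classify(0.3712, 0.3700))  # close to the literature period
print(classify(0.1851, 0.3700))  # close to half the literature period
print(classify(0.2500, 0.3700))  # neither
```

Under this scheme the harmonic-aware score counts both "exact" and "harmonic" labels as successes, while the exact score counts only the first.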

3. Performance

Single light curve (~900 points), NVIDIA RTX 5070 Ti vs the CPU backends:

Method	CPU backend	CPU time	GPU time	CPU vs reference	GPU speed-up
GLS	finufft	0.019 s	0.007 s	3× astropy	~3×
BLS	numba	0.187 s	0.091 s	20× astropy	~2×
PDM	numpy	1.79 s	0.010 s	4× PyAstronomy	178×
CE	numpy	0.53 s	0.012 s	—	45×
String-Length	numpy	0.92 s	0.023 s	—	39×
MHAOV	numpy	3.56 s	0.111 s	—	32×
TLS	numpy	4.49 s	0.042 s	—	106×

Two takeaways:

cuPeriod’s CPU path already beats every reference tool it was timed against (astropy LS, astropy BLS, PyAstronomy PDM). The largest margin is BLS, where the multicore numba box search is 20× faster than astropy’s compiled BoxLeastSquares while agreeing with it to about 1e-9.
The GPU delivers 30–180× on the methods whose CPU path is plain numpy (PDM, CE, String-Length, MHAOV, TLS), and a smaller single-curve margin (about 2–3×) on GLS/BLS, whose CPU backends are already specialized. The GPU’s decisive win for GLS/BLS is at catalog scale.
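As a quick arithmetic check, the speed-up column can be recomputed from the rounded CPU and GPU timings in the table. The sketch below uses only those published values; because the timings are rounded to a few digits, the ratios can differ by a unit or so from the published column, which was presumably computed from unrounded measurements.

```python
# (CPU seconds, GPU seconds) per method, copied from the table above.
timings = {
    "GLS": (0.019, 0.007),
    "BLS": (0.187, 0.091),
    "PDM": (1.79, 0.010),
    "CE": (0.53, 0.012),
    "String-Length": (0.92, 0.023),
    "MHAOV": (3.56, 0.111),
    "TLS": (4.49, 0.042),
}

# Speed-up is simply CPU time divided by GPU time.
speedup = {method: cpu / gpu for method, (cpu, gpu) in timings.items()}

# Methods whose CPU backend is plain numpy, per the table.
numpy_methods = ["PDM", "CE", "String-Length", "MHAOV", "TLS"]
numpy_range = (
    min(speedup[m] for m in numpy_methods),
    max(speedup[m] for m in numpy_methods),
)

for method, factor in speedup.items():
    print(f"{method:14s} {factor:6.1f}x")
print(f"numpy-backed range: {numpy_range[0]:.0f}x to {numpy_range[1]:.0f}x")
```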

4. Batch throughput & transits

Batch: ~620 light curves/second on one GPU for GLS on short survey curves — over 2.2 million light curves/hour.
Kepler transits (TLS): on 12 confirmed KOIs with a blind 0.5–12 d search, cuPeriod recovers 11/12 within 2%, as does the transitleastsquares reference; both miss only the shallowest, where a blind search aliases. This is a shared failure mode of the method, not a backend defect. On the CPU-timed subset the GPU is a median 109× faster, with CPU↔GPU agreement ≤ 2e-14.

See the full report for the figures, per-KOI detail, and reproduction commands.