SymbolicRegression.jl is a state-of-the-art symbolic regression library written from scratch in Julia using a custom evolutionary algorithm. The software emphasizes high-performance distributed computing, and can find arbitrary symbolic expressions that optimize a user-defined objective, which makes it a highly interpretable form of machine learning. SymbolicRegression.jl and its Python frontend PySR have been used for model discovery in over 30 research papers, from astrophysics to economics.
SymbolicRegression.jl is an open-source library for practical symbolic regression, a type of machine learning that discovers human-interpretable symbolic models. SymbolicRegression.jl was developed to democratize and popularize symbolic regression for the sciences, and is built on a high-performance distributed backend, a flexible search algorithm, and interfaces with several deep learning packages. The hand-rolled internal search algorithm is a mixed evolutionary algorithm built around a unique evolve-simplify-optimize loop, which is designed to optimize the unknown real-valued constants in newly discovered empirical expressions. The backend is highly optimized: it can fuse user-defined operators into SIMD kernels at runtime with LoopVectorization.jl, perform automatic differentiation with Zygote.jl, and distribute populations of expressions to thousands of cores across a cluster using ClusterManagers.jl. In describing this software, I will also share a new benchmark, “EmpiricalBench,” to quantify the applicability of symbolic regression algorithms in science. This benchmark measures recovery of historical empirical equations from original and synthetic datasets.
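To make the evolve-simplify-optimize idea concrete, here is a minimal, dependency-free sketch in Python. It is an illustration under stated assumptions, not SymbolicRegression.jl's implementation: the tuple-based expression encoding, the two-operator set, the coordinate-search constant optimizer, the keep-the-fittest selection, and every function name (`mutate`, `simplify`, `optimize_constants`, `search`) are hypothetical choices made for brevity.

```python
import random

# Hypothetical encoding: ("x",), ("const", c), or (op, left, right).
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def evaluate(expr, x):
    if expr[0] == "x":
        return x
    if expr[0] == "const":
        return expr[1]
    return OPS[expr[0]](evaluate(expr[1], x), evaluate(expr[2], x))

def loss(expr, xs, ys):
    # Mean squared error over the dataset.
    return sum((evaluate(expr, x) - y) * (evaluate(expr, x) - y)
               for x, y in zip(xs, ys)) / len(xs)

def size(expr):
    return 1 if expr[0] not in OPS else 1 + size(expr[1]) + size(expr[2])

def random_leaf(rng):
    return ("x",) if rng.random() < 0.5 else ("const", rng.uniform(-2.0, 2.0))

def mutate(expr, rng):
    """Evolve: mutate a child, replace this node with a leaf, or wrap it in an operator."""
    if expr[0] in OPS and rng.random() < 0.6:
        if rng.random() < 0.5:
            return (expr[0], mutate(expr[1], rng), expr[2])
        return (expr[0], expr[1], mutate(expr[2], rng))
    if rng.random() < 0.5:
        return random_leaf(rng)
    return (rng.choice(list(OPS)), expr, random_leaf(rng))

def simplify(expr):
    """Simplify: fold operators whose operands are both constants."""
    if expr[0] not in OPS:
        return expr
    left, right = simplify(expr[1]), simplify(expr[2])
    if left[0] == "const" and right[0] == "const":
        return ("const", OPS[expr[0]](left[1], right[1]))
    return (expr[0], left, right)

def get_constants(expr):
    if expr[0] == "const":
        return [expr[1]]
    if expr[0] == "x":
        return []
    return get_constants(expr[1]) + get_constants(expr[2])

def set_constants(expr, values):
    """Rebuild expr, taking new constants from the iterator in left-to-right order."""
    if expr[0] == "const":
        return ("const", next(values))
    if expr[0] == "x":
        return expr
    left = set_constants(expr[1], values)
    right = set_constants(expr[2], values)
    return (expr[0], left, right)

def optimize_constants(expr, xs, ys, iters=40):
    """Optimize: coordinate search on the constants, halving the step when stuck."""
    consts, best, step = get_constants(expr), loss(expr, xs, ys), 1.0
    for _ in range(iters):
        improved = False
        for i in range(len(consts)):
            for delta in (step, -step):
                trial = consts[:i] + [consts[i] + delta] + consts[i + 1:]
                trial_loss = loss(set_constants(expr, iter(trial)), xs, ys)
                if trial_loss < best:
                    consts, best, improved = trial, trial_loss, True
        if not improved:
            step *= 0.5
    return set_constants(expr, iter(consts))

def search(xs, ys, generations=30, pop_size=20, max_size=7, seed=0):
    rng = random.Random(seed)
    population = [("x",)] + [random_leaf(rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        children = []
        for parent in population:
            child = simplify(mutate(parent, rng))        # evolve, then simplify
            if size(child) <= max_size:                  # crude complexity cap
                children.append(optimize_constants(child, xs, ys))  # optimize
        # Parents stay in the pool, so the best loss never gets worse.
        population = sorted(population + children,
                            key=lambda e: loss(e, xs, ys))[:pop_size]
    return population[0]
```

Calling `search(xs, ys)` on a small dataset returns the lowest-loss expression found. The point of the ordering is that simplification folds redundant constants before the optimizer runs, so the optimizer tunes fewer parameters. A practical search would also need a complexity penalty, multiple populations, and gradient-based constant optimization, all of which this sketch omits.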
In this talk, I will describe the nuts and bolts of the search algorithm, its efficient evaluation scheme (DynamicExpressions.jl), and how SymbolicRegression.jl may be used in scientific workflows. I will review existing applications of the software (https://astroautomata.com/PySR/papers/). I will also discuss interfaces with other Julia libraries, including SymbolicUtils.jl, as well as SymbolicRegression.jl's PyJulia-enabled link to the ScikitLearn ecosystem in Python.