cl-ana aims to provide the basic tools necessary to analyze medium to large datasets.

The wiki serves as the manual/tutorial for cl-ana.

cl-ana is highly modular and is intended to be an extensible collection of utilities rather than a monolithic application.

Supported so far:

  • Tabular data: Provides a unified interface for working with HDF5, ntuples (like PAW from CERN), and CSVs (using cl-csv); new formats can be added by specializing generic functions.
  • Histogramming/Binned data analysis: Contiguous, sparse, and categorical histograms are provided, along with integration/projection along any axes of the histogram, slicing, arithmetic, and functional access (map, filter, reduce).
  • Plotting: gnuplot is used for plotting/visualization; supports plotting Lisp functions, formulae as strings, alists, and histograms, and can be extended.
  • Fitting: Nonlinear least squares fitting is handled via a frontend to gsll; allows Lisp functions to be fitted against alists and histograms, and can be extended to allow other data sources.
  • Generic mathematics: Common Lisp's built-in math functions/operations are not extensible, so generic versions are provided. On top of this is built e.g. the tensor sublibrary, which treats nested sequences of arbitrary depth as tensors and automatically defines element-wise versions of every generic math function you create (think MATLAB). Also included are utilities for error propagation and quantities (numbers with units).
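
The generic mathematics idea can be sketched in plain Common Lisp. The following is a minimal illustration only, not cl-ana's actual API: the names GADD and MEASUREMENT are hypothetical, and the error propagation assumes uncorrelated uncertainties added in quadrature. It shows how a generic addition can be extended to nested sequences and to a user-defined type.

```lisp
;; Hypothetical sketch: GADD and MEASUREMENT are illustrative names,
;; not functions or types provided by cl-ana.

(defgeneric gadd (x y)
  (:documentation "Extensible addition; add methods for new types."))

;; Base case: ordinary numbers use the built-in +.
(defmethod gadd ((x number) (y number))
  (+ x y))

;; Element-wise addition over lists and vectors. Because the elements are
;; themselves added with GADD, nesting of arbitrary depth works.
(defmethod gadd ((x list) (y list))
  (mapcar #'gadd x y))

(defmethod gadd ((x vector) (y vector))
  (map 'vector #'gadd x y))

;; Broadcasting a scalar over a (possibly nested) list.
(defmethod gadd ((x list) (y number))
  (mapcar (lambda (e) (gadd e y)) x))

;; A user-defined type: a value with an uncertainty.
;; Assumption: uncertainties are uncorrelated, so they add in quadrature.
(defstruct measurement value sigma)

(defmethod gadd ((x measurement) (y measurement))
  (make-measurement
   :value (+ (measurement-value x) (measurement-value y))
   :sigma (sqrt (+ (expt (measurement-sigma x) 2)
                   (expt (measurement-sigma y) 2)))))

(gadd '(1 2 3) '(10 20 30))                     ; => (11 22 33)
(gadd '((1 2) (3 4)) '((10 20) (30 40)))        ; => ((11 22) (33 44))
(gadd '((1 2) (3 4)) 1)                         ; => ((2 3) (4 5))
```

Once the MEASUREMENT method exists, lists of measurements can be added element-wise with no further code, since the list method simply delegates to GADD for each pair of elements.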
