2013 Sang Han

A collection of notes and useful articles on numerical algorithms.

ml

Definitions

Markov Chains: A special type of stochastic process in which the next state depends only on the current state. The standard definition of a stochastic process is an ordered collection of random variables {X_t}, indexed by time t.

Self-Organized Criticality (SOC): “Self-Organized” means that from any initial condition, the system tends to move toward a critical state, and stay there, without external control. A system is “critical” if it is in transition between two phases.

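Kernel Density Estimation: A nonparametric way to estimate a probability density by summing a smooth kernel centered on each data point, which avoids the bin-width and bin-edge artifacts of histograms.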
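As a rough illustration of kernel density estimation, here is a minimal sketch that assumes a Gaussian kernel and a hand-picked bandwidth. The function name gaussian_kde and the sample values are made up for this example, and real libraries choose the bandwidth more carefully.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels on each sample."""
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per sample, then normalize so the area is 1.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Illustrative data only.
density = gaussian_kde([-1.0, 0.0, 0.5, 2.0], bandwidth=0.5)
for x in (-1.0, 0.25, 5.0):
    print(x, round(density(x), 4))
```

A smaller bandwidth follows the individual samples more closely, while a larger one gives a smoother estimate.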

Regression:

  • Least Squares
  • Ridge Regression
  • Least Angle Regression (LARS)
  • Elastic Net
  • Kernel Ridge Regression
  • Support Vector Machines (SVR)
  • Partial Least Squares (PLS)

Classification:

  • Linear Discriminant Analysis (LDA)
  • Basic Perceptron
  • Elastic Net
  • Logistic Regression (Kernel)
  • Support Vector Machines (SVM)
  • Diagonal Linear Discriminant Analysis (DLDA)
  • Golub Classifier
  • Fisher Discriminant Classifier
  • k-Nearest-Neighbor
  • Classification Tree
  • Maximum Likelihood Classifier

Clustering:

  • Hierarchical Clustering
  • Memory-saving Hierarchical Clustering
  • K-means

Dimensionality Reduction:

  • Fisher Discriminant (FDA)
  • Spectral Regression Discriminant Analysis (SRDA)
  • Principal Component Analysis (PCA)

Critical systems demonstrate common behaviors

  1. Long-tailed distributions of some physical quantities.
  2. Fractal geometries.
  3. Variations in time that exhibit pink noise or a time series with many frequency components.
    • In white noise, all of the components have equal power.
    • In “pink” noise, low-frequency components have more power than high-frequency components.

Models

Reductionist:

  • A reductionist model describes a system by describing its parts and their interactions.
  • When a reductionist model is used as an explanation, it depends on an analogy between the components of the model and the components of the system.

Holistic:

  • Holistic models are more focused on similarities between systems and less interested in analogous parts.
  • A holistic modeler usually fits the simplest model that demonstrates the behavior of interest.

Agent-Based Models

  • Agents model intelligent behavior, usually with a simple set of rules.
  • Agent-based models are useful for modeling the dynamics of systems that are not in equilibrium.
  • Particularly useful for understanding relationships between individual decisions and system behavior.

Stochastic Volatility Model:

  • Treats volatility as a latent variable that itself follows a stochastic process.

Analysis

Detection of peaks in data

A naive but simple optimization algorithm, with a performance analysis.
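To make the idea concrete, a naive peak detector can be sketched as a scan for samples that are larger than both neighbors. This is only an illustration under that assumption, not the linked article's implementation; find_peaks and min_height are hypothetical names.

```python
def find_peaks(values, min_height=None):
    """Return indices of strict local maxima, optionally at or above min_height."""
    peaks = []
    for i in range(1, len(values) - 1):
        is_peak = values[i - 1] < values[i] > values[i + 1]
        if is_peak and (min_height is None or values[i] >= min_height):
            peaks.append(i)
    return peaks


signal = [0, 2, 1, 3, 7, 3, 1, 4, 0]
print(find_peaks(signal))                # [1, 4, 7]
print(find_peaks(signal, min_height=3))  # [4, 7]
```

The scan is linear in the length of the data, but it treats every small wiggle in noisy data as a peak unless a height threshold is supplied.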

Detection of onset in data

Introduces change detection, using electromyography (EMG) as an example: a threshold potential is used to signal a change in the environment.
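A threshold-based onset detector might look like the sketch below. It assumes the signal has already been rectified and smoothed, and it requires several consecutive samples above the threshold to reduce false alarms. The names detect_onset and n_above and the sample values are illustrative assumptions.

```python
def detect_onset(values, threshold, n_above=3):
    """Return the first index where values stay at or above threshold
    for n_above consecutive samples, or None if that never happens."""
    run = 0
    for i, value in enumerate(values):
        run = run + 1 if value >= threshold else 0
        if run == n_above:
            return i - n_above + 1  # index where the run started
    return None


# Made-up amplitudes: one isolated spike, then a sustained rise.
emg = [0.1, 0.2, 0.9, 0.1, 0.8, 0.9, 1.1, 1.0, 0.2]
print(detect_onset(emg, threshold=0.5))  # 4
```

The isolated spike at index 2 is ignored because it does not last for three samples.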

Detection of changes using the Cumulative Sum (CUSUM)

Introduces the cumulative sum (CUSUM) algorithm.
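One common formulation of CUSUM accumulates sample-to-sample changes in both directions, subtracts a drift term, and raises an alarm when either sum exceeds a threshold. The sketch below assumes that formulation; the function name, parameters, and data are illustrative and may differ from the linked article.

```python
def cusum(values, threshold, drift=0.0):
    """Return indices where the positive or negative cumulative sum of
    changes exceeds threshold. Both sums reset after each alarm."""
    alarms = []
    g_pos = g_neg = 0.0
    for i in range(1, len(values)):
        change = values[i] - values[i - 1]
        # The drift term keeps slow trends and noise from accumulating.
        g_pos = max(0.0, g_pos + change - drift)
        g_neg = max(0.0, g_neg - change - drift)
        if g_pos > threshold or g_neg > threshold:
            alarms.append(i)
            g_pos = g_neg = 0.0
    return alarms


# Made-up data with a step from about 0 to about 2.
data = [0.0, 0.1, 0.0, 0.1, 1.0, 2.0, 2.1, 2.0, 2.1]
print(cusum(data, threshold=1.0, drift=0.2))  # [5]
```

Raising the threshold reduces false alarms at the cost of later detection, and the drift should be set below the smallest change worth detecting.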

Time Normalization

Uses one-dimensional linear interpolation for the temporal alignment of cyclic data.
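As a sketch of the idea, each cycle can be resampled onto a fixed number of equally spaced points (for example 0 to 100% of the cycle) so that cycles of different durations become directly comparable. The function name time_normalize and the default of 101 points are assumptions made for this example.

```python
def time_normalize(values, n_points=101):
    """Resample values onto n_points equally spaced positions spanning
    the whole cycle, using linear interpolation."""
    if len(values) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    last = len(values) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)  # fractional index into values
        i = min(int(pos), last - 1)      # left neighbor, clamped at the end
        frac = pos - i
        out.append(values[i] * (1 - frac) + values[i + 1] * frac)
    return out


cycle = [0.0, 10.0, 20.0]
print(time_normalize(cycle, n_points=5))  # [0.0, 5.0, 10.0, 15.0, 20.0]
```

After normalization, cycles can be averaged point by point regardless of how many samples each one originally had.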

Algorithms

K Nearest Neighbor Dynamic Time Warping

The DTW algorithm finds the optimum alignment between two sequences of observations by warping the time dimension with certain constraints.

DTW is good for classifying sequences that have different frequencies or that are out of phase.

When it comes to time series classification, 1-nearest-neighbor (K=1) combined with dynamic time warping is very difficult to beat.
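The combination can be sketched with the classic dynamic-programming form of DTW and a one-nearest-neighbor lookup. This minimal version assumes an absolute-difference local cost and applies no warping-window constraint; the function names and toy sequences are made up for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def classify_1nn(query, training):
    """Return the label of the training sequence closest to query under DTW.
    training is a list of (sequence, label) pairs."""
    return min(training, key=lambda item: dtw_distance(query, item[0]))[1]


training = [([0, 1, 2, 1, 0], "bump"), ([0, 0, 0, 0, 0], "flat")]
print(dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0]))  # 0.0
print(classify_1nn([0, 0, 1, 2, 2, 1, 0], training))      # bump
```

The table costs O(n*m) time and memory per pair, which is why practical implementations usually restrict the warping path to a band around the diagonal.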

Gaussian Probabilities

Introduces the use of Gaussians to represent beliefs in the Bayesian sense. Gaussians allow the algorithms used in the Discrete Bayes Filter to work in continuous domains.

One Dimensional Kalman Filters

Implements a Kalman filter by modifying the Discrete Bayes Filter to use Gaussians. This is a full-featured Kalman filter, albeit only useful for 1D problems.
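A one-dimensional Kalman filter can be sketched as two small functions operating on a Gaussian belief (mean, variance). This is a minimal illustration under the usual textbook assumptions (additive motion, Gaussian noise); the function names and noise values are chosen for the example, not taken from the linked notebook.

```python
def predict(mean, var, movement, movement_var):
    """Shift the belief by the expected movement and add process noise."""
    return mean + movement, var + movement_var


def update(mean, var, measurement, measurement_var):
    """Blend the belief with a measurement, weighted by the Kalman gain."""
    gain = var / (var + measurement_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


# Start very uncertain, then track an object moving one unit per step.
mean, var = 0.0, 1000.0
for z in [1.0, 2.0, 3.0]:
    mean, var = predict(mean, var, movement=1.0, movement_var=0.1)
    mean, var = update(mean, var, z, measurement_var=1.0)
    print(round(mean, 3), round(var, 3))
```

The variance shrinks with every update because each measurement adds information, while each predict step grows it again by the process noise.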

Ensemble Kalman Filters

Discusses the ensemble Kalman Filter, which uses a Monte Carlo approach to deal with very large Kalman filter states in nonlinear systems.

Markov Chain Monte Carlo

A thorough explanation of how Markov Chain Monte Carlo (MCMC) operates, along with its diagnostic tools.

Introduces the PyMCMC library, which implements the MCMC exploring algorithm and extends a public Model class.

The Model class has very simple internals: just a list of unobserved variables and a list of factors which go into computing the posterior density.
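To illustrate the shape of such a design, here is a toy sketch of a model holding unobserved variables and log-density factors, sampled with a random-walk Metropolis step. The Model class below is a hypothetical stand-in written for this example, not the library's actual class or API, and the prior, likelihood, and data are made up.

```python
import math
import random


class Model:
    """Hypothetical stand-in: unobserved variables plus log-density factors."""

    def __init__(self, variables, factors):
        self.variables = dict(variables)  # variable name -> starting value
        self.factors = list(factors)      # callables: values -> log-density term

    def log_posterior(self, values):
        # Unnormalized log posterior: the sum of all factor terms.
        return sum(factor(values) for factor in self.factors)


def metropolis(model, n_samples, step=0.5, seed=0):
    """Random-walk Metropolis over all unobserved variables at once."""
    rng = random.Random(seed)
    current = dict(model.variables)
    current_lp = model.log_posterior(current)
    samples = []
    for _ in range(n_samples):
        proposal = {k: v + rng.gauss(0.0, step) for k, v in current.items()}
        proposal_lp = model.log_posterior(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(dict(current))
    return samples


data = [4.8, 5.1, 5.3, 4.9]  # illustrative observations, unit noise assumed


def prior(values):
    return -0.5 * (values["mu"] / 10.0) ** 2  # wide Normal(0, 10) prior


def likelihood(values):
    return sum(-0.5 * (x - values["mu"]) ** 2 for x in data)


samples = metropolis(Model({"mu": 0.0}, [prior, likelihood]), n_samples=5000)
kept = [s["mu"] for s in samples[1000:]]  # discard burn-in
estimate = sum(kept) / len(kept)
print(round(estimate, 2))
```

With this data the posterior mean of mu is close to the sample mean of about 5, so the printed estimate should land near that value.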

Digital Signal Processing

DSP Primer: The State of Determinism

A brief description of the basic properties of signals. Introduces the Nyquist-Shannon sampling theorem and discusses linear systems, issues with quantization, and RNGs.

Fourier Series

The original Fourier series implemented in Python.
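As a small worked example, the Fourier series of a unit square wave is (4/pi) * sum over odd k of sin(k t) / k. The sketch below evaluates partial sums of that series; the function name is made up for this illustration and the linked notebook may use a different signal.

```python
import math


def square_wave_partial_sum(t, n_terms):
    """Partial sum of the Fourier series of a unit square wave with period 2*pi."""
    total = 0.0
    for i in range(n_terms):
        k = 2 * i + 1  # only odd harmonics contribute
        total += math.sin(k * t) / k
    return 4.0 / math.pi * total


# The partial sums approach 1 at t = pi/2 as more terms are added.
for n_terms in (1, 10, 200):
    print(n_terms, round(square_wave_partial_sum(math.pi / 2, n_terms), 4))
```

Convergence is slow near the jumps of the square wave, where the partial sums overshoot (the Gibbs phenomenon).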

Fast Fourier Transform and Power Spectral Density

The FFT and its application to PSD estimation. Also implements the Short-Time Fourier Transform for when you just want an estimate.
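The link between the FFT and the PSD can be sketched with a textbook radix-2 FFT and a plain periodogram. This is a minimal illustration that assumes a power-of-two length and no windowing or averaging; the function names are hypothetical, and production code would normally rely on a numerical library instead.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def periodogram(x, fs):
    """One-sided PSD estimate of a real signal sampled at fs Hz."""
    n = len(x)
    spectrum = fft(x)
    freqs = [k * fs / n for k in range(n // 2 + 1)]
    psd = [abs(spectrum[k]) ** 2 / (fs * n) for k in range(n // 2 + 1)]
    # Fold in the negative frequencies (every bin except DC and Nyquist).
    for k in range(1, n // 2):
        psd[k] *= 2.0
    return freqs, psd


fs = 64
signal = [math.sin(2 * math.pi * 8 * t / fs) for t in range(64)]
freqs, psd = periodogram(signal, fs)
peak = freqs[max(range(len(psd)), key=lambda k: psd[k])]
print(peak)  # 8.0
```

The spectrum of the 8 Hz sine concentrates in the 8 Hz bin because the tone completes a whole number of cycles within the 64 samples.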

Data filtering in signal processing

A thorough introduction to data filtering and the most basic filters typically used in signal processing.
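One of the most basic filters is the moving average, which acts as a simple low-pass filter. The sketch below is an illustrative assumption about what "basic filter" covers, with a made-up function name, and it returns only the samples where the window fits entirely inside the data.

```python
def moving_average(values, window):
    """Return the running mean over each full window of the given size."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the new sample, drop the oldest one.
        running += values[i] - values[i - window]
        out.append(running / window)
    return out


print(moving_average([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```

A wider window smooths more aggressively but also blurs sharp features and shortens the output.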

Reproducible academic publications

The probability of improvement in Fisher’s geometric model: a probabilistic approach

Stress-induced mutagenesis and complex adaptation

Automatic segmentation of odor maps in the mouse olfactory bulb using regularized non-negative matrix factorization

Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss, by A. Gross et al. (Nature Genetics 2014)

powerlaw: a Python package for analysis of heavy-tailed distributions, by J. Alstott et al.

Collaborative cloud-enabled tools allow rapid, reproducible biological insights, by B. Ragan-Kelley et al.

A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data, by C.T. Brown et al.

The kinematics of the Local Group in a cosmological context, by J.E. Forero-Romero et al.

Warming Ocean Threatens Sea Life

Extrapolating Weak Selection in Evolutionary Games

Using neural networks to estimate redshift distributions

Mechanisms for stable, robust, and adaptive development of orientation maps in the primary visual cortex

Accelerated Randomized Benchmarking

Dynamics and associations of microbial community types across the human body

Variations in submarine channel sinuosity as a function of latitude and slope

Frontoparietal representations of task context support the flexible control of goal directed cognition

pyparty: Intuitive Particle Processing in Python

Indication of family-specific DNA methylation patterns in developing oysters

