The Anatomy of a Recommendation Engine

A recommendation engine is a software system that analyzes large amounts of transactional data and distills personal profiles in order to present its users with relevant products, information, or content.

We see them in a wide variety of domains and applications, and they help us navigate the overwhelming choice that we face every day.

This tutorial will formally introduce the concepts and definitions of the recommendation systems literature and will then quickly move on to an iterative process for building a minimal recommendation engine.
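
To make the idea of turning transactional data into suggestions concrete, here is a minimal sketch of one simple approach: counting which items appear together in the same transaction. It is not necessarily the method the full tutorial builds; the function names `build_cooccurrence` and `recommend` and the sample baskets are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(baskets):
    """Count how often each ordered pair of items shares a basket."""
    co = defaultdict(Counter)
    for basket in baskets:
        # set() guards against an item being listed twice in one basket
        for a, b in permutations(set(basket), 2):
            co[a][b] += 1
    return co


def recommend(co, owned, k=3):
    """Rank items the user does not own by co-occurrence with owned items."""
    scores = Counter()
    for item in owned:
        for other, count in co.get(item, {}).items():
            if other not in owned:
                scores[other] += count
    return [item for item, _ in scores.most_common(k)]


# Hypothetical transaction log: each inner list is one purchase
baskets = [
    ["book", "lamp"],
    ["book", "lamp", "pen"],
    ["book", "lamp"],
    ["lamp", "desk"],
]
co = build_cooccurrence(baskets)
print(recommend(co, owned={"book"}, k=2))
```

With this sample data, a user who owns only "book" is shown "lamp" first, because the two were bought together most often.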

Recommendation Graph

Recommenders have been around since at least 1992. Today we see different flavours of recommenders, deployed across different verticals:

  • Amazon
  • Netflix ...

Useful Python Tricks

Originally inspired by this blog post, though many of the tricks have been customized and rewritten as an IPython Notebook.

Note: Written for Python 3


a, b, c = 1, 2, 3
a, b, c
(1, 2, 3)

a, b, c = [1, 2, 3]
a, b, c
(1, 2, 3)

a, b, c = (2 * i + 1 for i in range(3))
a, b, c
(1, 3, 5)

a, (b, c), d = [1, (2, 3), 4]
a, b, c, d
(1, 2, 3, 4)

Iterating over list index and value pairs (enumerate)

a = ['Hello', 'world', '!']

for i, x in enumerate(a):
    print('{}: {}'.format(i ...
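
The loop in this excerpt is cut off, so as a separate illustration (not a reconstruction of the original cell), here is a small self-contained sketch of `enumerate`, including its optional `start` argument. The list `words` and the variable `labelled` are made up for the example.

```python
words = ['alpha', 'beta', 'gamma']

# enumerate yields (index, value) pairs; start=1 begins counting at 1
labelled = ['{}: {}'.format(i, w) for i, w in enumerate(words, start=1)]

for line in labelled:
    print(line)
```

Using `enumerate` avoids the more error-prone `range(len(words))` pattern when both the position and the value are needed.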

Prime Sieves and Python Data Structures

Recently, I’ve been working through Project Euler in order to improve my core programming skills.

One recurring problem there requires efficiently calculating and testing for prime numbers. The first algorithm that comes to mind is the Sieve of Eratosthenes. The Sieve is one of many prime sieves, and it is a simple yet time-efficient algorithm for finding all the primes below a certain limit.

The Algorithm

  1. Make a table with one entry for every number \(2 \leq n \leq limit\)
  2. Starting at 2, cross out all multiples of 2, not counting 2 itself.
  3. Move up to the next number ...
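
The steps listed here are truncated in this excerpt, so the following is a minimal sketch of the standard Sieve of Eratosthenes rather than the post's own implementation. The function name `primes_up_to` is an assumption, and the table is a plain list of booleans. Starting the crossing-out at n * n and stopping once n * n exceeds the limit are common optimizations that the visible steps do not mention.

```python
def primes_up_to(limit):
    """Return every prime p with 2 <= p <= limit."""
    if limit < 2:
        return []
    # Table with one flag per number; indices 0 and 1 are not prime
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Cross out multiples of n, not counting n itself.
            # Multiples below n * n were already crossed out by smaller primes.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1  # move up to the next number
    return [i for i, flag in enumerate(is_prime) if flag]


print(primes_up_to(30))
```

A list of booleans keeps the sketch simple; other table representations trade memory for speed differently.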

Pelican: A Blogging Engine Written in Python

There are many static site generators out there, the best known being Jekyll and Octopress.

Jekyll is used by GitHub Pages as the default generator, while Octopress is a framework for Jekyll geared specifically toward blogging. However, I decided to use Pelican over both because it is written in Python: it therefore supports reStructuredText markup by default, and Python is a language I feel comfortable in should I need to get down and fix the engine. One of my primary focuses in blogging is so that I can also use it as a ...
