Subset checking in PostgreSQL with SQLAlchemy

Let’s assume we have a system where events are stored for multiple services and tenants. Let’s also assume that our fictional system has a means of updating many of these events at a time, for instance to mark them as unimportant. And for the sake of keeping things relevant, let’s assume that this service is available via some authenticated public API.

Given all of the above, and the knowledge that we can’t just trust anyone to limit themselves to events that are theirs to edit, we’ll have to verify that all of the events selected for editing are within the scope of editing for the user.

The simplest way to do this would be to load every item from the database and check whether it’s eligible for modification. However, this is something that scales terribly past a few dozen records, so let’s not even consider that.

Set theory to the rescue

If the phrase “verify a set of given IDs are all part of a set of valid IDs” makes you think of sets and subset checking, you already know what this is about. Python has a set type that provides a bunch of useful operations that allow us to check whether a given set A ({1, 4}) has all of its values present in set B ({2, 4, 6}; it does not). We can use this to solve our problem:

user_selected_ids = {1, 4}
permissible_ids_q = session.query(
permissible_event_ids = { for row in permissible_ids_q}
assert user_selected_ids <= permissible_event_ids
more ...

Storing and verifying passwords with SQLAlchemy

I really enjoy abstractions.

Abstractions are the lifeblood of programming. They take complex operations and make them easy to work with through accessible interfaces. This article will be about doing that with the way we store and verify passwords (or rather their cryptographic hashes) in (web-)applications based on SQLAlchemy. And in such a way that we can upgrade the security of passwords as old encryption schemes are broken or proven insufficient and new ones get introduced.

The ways of verifying passwords

There are many different ways to deal with password verification, but they can be classified in three rough categories:

Ad-hoc verification

This approach is often used for one-off scripts or trivial applications that have a single verification and have very little to gain from reusable components. The following shows a very basic User model and password verification code:

import bcrypt
from sqlalchemy import Column, Integer, Text

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(Text)
    password = Column(Text)

pwhash = bcrypt.hashpw(login_data['password'], user.password)
if user.password == pwhash:
    print 'Access granted'

The snippet above uses the Python bcrypt package for key derivation, but you may well use another key derivation method, like PBKDF2 or scrypt. To be sure: It should not have a static salt, and it should definitely not be a single round of optimized-for-speed SHA1 or SHA2 functions. [1] In short, use proven and existing methods.

Assuming that the hashing algorithm is secure, this code is perfectly fine for the stated goal. However, most applications are not trivial and most one-off scripts tend to find their way into repeated use. And over time you find yourself confronted with more and more repeats of the same code to verify a user’s password. Something needs to be done, it’s time to …

more ...

Creating a mostly-random color generator

Grid of randomly colored hexagons

Fully randomized colors

A coloring like this may be desired for some situations, but for most it’s too random.

In the previous two posts we’ve explored how to draw and tile hexagons, creating images that can be seamlessly repeated. We also briefly covered coloring them using a random color generator. The coloring process itself worked fine, but many times the created tiling turned out not very visually appealing, the colors of too harsh a contrast in tone and brightness.

In this post we’ll cover the creation of a mostly-random color generator. One that creates random colors within a certain fraction of the available colorspace.

Picking a color representation

There are a few different ways to describe colors in RGB colorspace, providing us with different approaches on how to restrict the available portion to select colors randomly from:

  • RGB: Red, Green and Blue values — A direct representation of the intensity of each of the color components.
  • HSV: Hue, Saturation, Value (brightness) — A cylindrical color mapping, common in color wheels in various graphics programs
  • HSL: Hue, Saturation and Lightness — Another cylindrical color mapping, similar to HSV but with a few different behaviors which we’ll discuss in a moment

Any of the above could be used, and all three have potentially interesting behavior when certain values are kept constant, or only allowed to vary by a small amount. If we would like a behavior where we can restrict the hue of the color but have full variation in lightness and color intensity, the direct RGB mode is ruled out.

The choice between HSV and HSL mostly comes down to a matter of taste. Intuitively, when I read a color where all three channels are maxed, I would expect that to be a fully saturated and maximally intense color. HSV …

more ...

Hexagon tilings with Python - Part 2

In part one, we covered the drawing of smooth, anti-aliased hexagons in a grid-like fashion. In this post, we will extend that beginning into a program to draw hexagon fills that can be used for a tiled background. The key improvements that were identified in the previous post:

  1. Canvas sizing and shape wrapping
  2. Color wrapping around the edges
  3. Configurable, mostly random coloring

Canvas sizing and shape wrapping

To be able to use the generated image as a tiled background, the shapes must wrap from left to right. That is, if a row contains a hexagon that is cut off on the left edge, the portion that was cut off should be the end of that row on the right edge, and vice-versa.

This is a complex way of saying that the width of the image must be an exact multiple of the pattern size. The pattern in this case is the smallest shape from which a repeating pattern can be constructed. For the hexagons generator we have, this pattern is one column wide and two rows tall.

Because the pattern size is very closely coupled to the dimensions of the hexagon, we make this information available as a property of class HexagonGenerator. We then modify the existing script to draw only outlines so we can see how things are constructed:

class HexagonGenerator(object):
  # Rest of class as previously defined
  def pattern_size(self):
    return self.col_width, self.row_height * 2

def main():
  hexagon_generator = HexagonGenerator(40)
  width, height = hexagon_generator.pattern_size
  image ='RGB', (int(width * 2), int(height * 3)), 'white')
  draw = Draw(image)
  draw.rectangle((0, 0, width, height), Brush('black', opacity=64))
  colors = (('red', 'blue'), ('green', 'purple'))
  for row in range(5):
    for column in range(3):
      hexagon = hexagon_generator(row, column)
      color = colors[row % 2][column % 2]
      draw.polygon(list(hexagon), Pen(color, width=3))
more ...

Hexagon tilings with Python

A few days ago I ended up on a website (I forget where) which featured a very nice and subtle square tiling on the background of the page. Now, this is in itself is not astonishing, plenty of folks and businesses out there will present you with all sorts of tessellated backgrounds. However, it got me thinking about doing some tiling background myself. Not because I particularly need one (it certainly wouldn’t look right on this blog), but just to experiment. To make things slightly more interesting I opted for another shape: the hexagon.

A hexagon is a polygon with six edges and six vertices. The regular hexagon is equilateral and all internal angles are 120°. It is also one of three polygons with which you can create a regular tiling. That is, using only hexagons you can fill a plane without any gaps or overlaps. The other two regular shapes with which this can be done are the square and the equilateral triangle.

Drawing a hexagon with PIL

The first step towards tiling hexagons is to create a single hexagon. The first stop when it comes to images in Python is the Python Imaging Library (better known as PIL). Using this library and knowledge of basic math, the following code will generate a single hexagon:

import math
from PIL import Image, ImageDraw

def hexagon_generator(edge_length, offset):
  """Generator for coordinates in a hexagon."""
  x, y = offset
  for angle in range(0, 360, 60):
    x += math.cos(math.radians(angle)) * edge_length
    y += math.sin(math.radians(angle)) * edge_length
    yield x, y

def main():
  image ='RGB', (100, 100), 'white')
  draw = ImageDraw.Draw(image)
  hexagon = hexagon_generator(40, offset=(30, 15))
  draw.polygon(list(hexagon), outline='black', fill='red')
more ...

Mordac the (query) preventer

A small update, a much delayed update, but still an update.

A few days ago, cwillu asked the following question in #sqlalchemy [1]:

<cwillu_> is there a way to entirely disable autocommit?
<cwillu_> i.e., any insert outside of an explicit transaction
          will either error out, or just immediately rollback?

What eventually worked for this case was simply disabling autocommit on the connection level. Explicit rollbacks were issued at the end of each test and all worked fine. But given that SQLAlchemy features a pretty nifty event system, I was pretty sure there was a better solution available. Also, I was waiting for an excuse to experiment with that part of SQLAlchemy, and this offered exactly that.

As it turns out, preventing queries from happening is about as straightforward as it gets. A quick look at the list of SQLAlchemy Core events identifies two likely candidates: 'before_cursor_execute' and 'before_execute'. Both events are triggered when queries are executed, but the latter is documented to work on a higher level: it’s signaled before the query is compiled for the appropriate dialect. Given that we want to entirely prevent queries outside of transactions, stopping them before doing any unnecessary work seems best, so we’ll use that one.

When the event triggers, our function is called and the caller provides us with the connection and a few other arguments, which we’ll collect in *args. The connection has a method in_transaction(), which indicates whether a transaction has explicitly been started for it. That is, even with autocommit turned off, it will return False after the first execute() call. This is exactly what we need to know and the last thing to do is raise an appropriate error. And so Mordac the Preventer [2] is born:

more ...