Datatyping with PL/Perl

PostgreSQL supports custom datatypes through either "full datatypes" or "DOMAIN"s. Doing full datatypes requires C-language coding that is loaded into the backend, whereas DOMAINs are SQL level. This makes DOMAINs a lot easier.

Next topic, procedural languages. PostgreSQL supports multiple procedural languages in the same installations - pl/pgsql which is "almost oracle pl/sql", pl/tcl, pl/python and... pl/perl. I like pl/perl mainly because it gives me the power of perl regexps :-)

Final topic before summary, Swedish SSNs (that's social security numbers, not nuclear attack submarines - we don't have those). All citizens in Sweden are given a "personnummer" (which is about the same as a SSN in the US) at birth, on the form YYMMDD-PPKX. Year, month, day should be self-explaining. PPK is a serial number, in which PP represents which area of the country the person is born, and K is even for females and odd for males. Finally, X is a checksum. You need a personnummer for almost everything. And of course, we need to store this in databases all the time.

Which brings me to the point of this post. Using a combination of DOMAIN and custom functions, you can create a simple datatype for personnummer that will automatically validate that the entered number is correct. Sure, if you have access you can validate it against a table of all valid personnummer, but that table is very restricted in availability. So we'll settle for validating the checksum. Here's how to do it:

CREATE OR REPLACE FUNCTION validate_pnr( text) RETURNS bool AS
$BODY$

# Basic validation
return 'f' unless ($_[0] =~ /^(\d{2})([01]\d)([0123]\d)-(\d|T)(\d|F)\d{2}$/);

return 'f' if ($2>12);

return 'f' if ($3>31);

# No checksum check for temporary numbers

return 't' if ($4 eq 'T');

# Checksum

my $s = substr($_[0],0,6) . substr($_[0],7,3);
my $m = 2;
my $c = 0;

while (length($s)) {

  my $d = chop($s) * $m;
  $c += ($d%2510)+int($d/10);
  $m = ($m>1)?1:2;

}

$c = 10-($c %25 10);



return ($c == int(chop($_[0])))?'t':'f';

$BODY$

LANGUAGE 'plperl' IMMUTABLE STRICT;



CREATE DOMAIN persnr

  AS varchar(11)
  NOT NULL
    CONSTRAINT persnr_check CHECK validate_pnr((VALUE)::text);

Simple as that. You will notice there is an exception for personnummer with the final four numbers as TFxx. These are used for temporary numbers for people who are not Swedish citizen but need to be registered in official databases that require a personnummer. Now we can use this like:

CREATE TABLE people (personnr persnr PRIMARY KEY, name varchar(128));

Or whatever. And the database will prevent us from putting in an invalid personnummer.

One could of course use the CHECK constraint on the table, but then we'd have to write that one once for each table. Using a DOMAINs we can re-use this information in multiple tables.

I'm sure this can be written in more efficient perl. If anybody wants to do it for me, please let me know and I'll happily update my installations /mha/templates/default/img/emoticons/smile.png alt=":-)" style="display: inline; vertical-align: bottom;" class="emoticon" /

(Sidenote to anybody blogging on worldpress (possibly only worldpress-pg? I'm not really sure): The thing eats backslashes in the editor. If you save multiple times, one level of backslashes is lost every time.. Also, the preview box is broken with this.)


Conferences

I speak at and organize conferences around Open Source in general and PostgreSQL in particular.

Upcoming

Stockholm PUG 2016/3
Jun 16, 2016
Stockholm, Sweden
PGDay UK 2016
Jul 05, 2016
London, UK
PG Day'16 Russia
Jul 6-8, 2016
St Petersburg, Russia
Stockholm PUG 2016/4
Aug 31, 2016
Stockholm, Sweden
Postgres Open
Sep 13-16, 2016
Dallas, USA
PGConf.EU 2016
Nov 1-Jan 4, 2016
Tallinn, Estonia

Past

PGCon
May 17-21, 2016
Ottawa, Canada
Pycon Sweden
May 9-10, 2016
Stockholm, Sweden
Stockholm PUG 2016/2
Apr 26, 2016
Stockholm, Sweden
PGConf.US
Apr 18-20, 2016
New York, USA
pgDay Paris
Mar 31, 2016
Paris, France
More past conferences