SQL: Storing and comparing colors with PostGIS
Given the challenge of comparing and sorting colours by how similar they are to a reference colour, we decided on PostGIS as a decent approach as it allows for a pure SQL solution - rather than having to load all the data into PHP.
Creating the TABLE
For the following examples we're going to use this TABLE structure:
# CREATE TABLE colours (
id serial,
name character varying UNIQUE NOT NULL,
colour geometry,
PRIMARY KEY (id),
CONSTRAINT enforce_dims_point CHECK (st_ndims(colour) = 3)
);
ERROR: type "geometry" does not exist
LINE 4: colour geometry,
Whoops, we need PostGIS installed and enabled before we can use geometry.
Installing PostGIS
For it to work with PostgreSQL the package we need to install is postgresql-9.6-postgis-scripts (assuming you're running 9.6), but be aware that this comes with a lot of baggage:
# apt-get -u install postgresql-9.6-postgis-scripts
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
i965-va-driver libaacs0 libaec0 libarmadillo7 libarpack2 libasound2
libasound2-data libass5 libasyncns0 libatomic1 libaudio2 libavc1394-0
libavcodec57 libavdevice57 libavfilter6 libavformat57 libavresample3
libavutil55 libbdplus0 libblas-common libblas3 libbluray1
libboost-chrono1.62.0 libboost-date-time1.62.0 libboost-filesystem1.62.0
libboost-program-options1.62.0 libboost-serialization1.62.0
libboost-system1.62.0 libboost-test1.62.0 libboost-thread1.62.0
libboost-timer1.62.0 libbs2b0 libcaca0 libcdio-cdda1 libcdio-paranoia1
libcdio13 libcgal12 libchromaprint1 libcoin80v5 libcroco3 libcrystalhd3
libdap23 libdapclient6v5 libdapserver7v5 libdc1394-22 libebur128-1
libegl1-mesa libepsilon1 libfaad2 libflac8 libflite1 libfreexl1 libfribidi0
libgbm1 libgdal20 libgeos-3.5.1 libgeos-c1v5 libgeotiff2 libgfortran3
libgif7 libgme0 libgraphicsmagick-q16-3 libgsm1 libhdf4-0-alt libhdf5-100
libiec61883-0 libiso9660-8 libjack-jackd2-0 libjson-c3 libkmlbase1
libkmlconvenience1 libkmldom1 libkmlengine1 libkmlregionator1 libkmlxsd1
liblapack3 libldb1 liblwgeom-2.3-0 libmad0 libminizip1 libmng1 libmodplug1
libmp3lame0 libmpcdec6 libmpg123-0 libnetcdf11 libnuma1 libodbc1 libogdi3.2
libogg0 libopenal-data libopenal1 libopencv-core2.4v5 libopencv-imgproc2.4v5
libopenmpt0 libopenscenegraph100v5 libopenthreads20 libopus0 libpgm-5.2-0
libpoppler-glib8 libpostproc54 libproj12 libpulse0 libqhull7 libqt4-dbus
libqt4-opengl libqt4-xml libqtcore4 libqtdbus4 libqtgui4 libquadmath0
libraw1394-11 librsvg2-2 librsvg2-common librubberband2 libsamplerate0
libsdl2-2.0-0 libsfcgal1 libshine3 libsmbclient libsnappy1v5 libsndfile1
libsndio6.1 libsodium18 libsoxr0 libspatialite7 libspeex1 libssh-gcrypt-4
libsuperlu5 libswresample2 libswscale4 libsz2 libtalloc2 libtbb2 libtdb1
libtevent0 libtheora0 libtwolame0 liburiparser1 libusb-1.0-0 libv4l-0
libv4lconvert0 libva-drm1 libva-x11-1 libva1 libvcdinfo0 libvdpau-va-gl1
libvdpau1 libvorbis0a libvorbisenc2 libvorbisfile3 libvpx4 libwavpack1
libwayland-client0 libwayland-cursor0 libwayland-egl1-mesa
libwayland-server0 libwbclient0 libwebpmux2 libx264-148 libx265-95
libxcb-xfixes0 libxerces-c3.1 libxine2 libxine2-bin libxine2-doc
libxine2-ffmpeg libxine2-misc-plugins libxine2-plugins libxkbcommon0
libxvidcore4 libzmq5 libzvbi-common libzvbi0 mesa-va-drivers
mesa-vdpau-drivers odbcinst odbcinst1debian2 postgresql-9.6-postgis-2.3
postgresql-9.6-postgis-2.3-scripts postgresql-9.6-postgis-scripts proj-bin
proj-data python-talloc qdbus qt-at-spi qtchooser qtcore4-l10n samba-libs
va-driver-all vdpau-driver-all
0 upgraded, 191 newly installed, 0 to remove and 0 not upgraded.
Need to get 88.0 MB/88.7 MB of archives.
After this operation, 389 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Once it's installed, you need to activate the extension in your database. Only then we can create our TABLE from earlier:
# CREATE EXTENSION postgis;
CREATE EXTENSION
# CREATE TABLE colours (
id serial,
name varchar UNIQUE NOT NULL,
colour geometry,
PRIMARY KEY (id),
CONSTRAINT enforce_dims_point CHECK (st_ndims(colour) = 3)
);
CREATE TABLE
The CONSTRAINT here is optional. It just ensures that all values in the geometry field have three dimensions, which we're going to use for red, green and blue color values.
It has been pointed out that using HSV or CIELAB colours might be preferable to RGB, which is true, but this doesn't change the SQL table or query syntax, unless we need to apply weightings.
'geometry' is a fundamental postgis spatial data type used to represent a feature in the Euclidean coordinate system.
Storing colours as geometry
You can refer to our earlier article for instructions on extracting colour values from an image. Our requirement in this case was to extract colours from an uploaded 'swatch' and then to find and display the nearest matches.
To cut a long story short, once you have your colour values you can convert them to a geometry object in SQL using:
ST_GeomFromText('POINT($red $green $blue)')
So your query becomes:
# INSERT INTO colours (name, colour) VALUES ('test', ST_GeomFromText('POINT(182 167 135)'));
INSERT 0 1
What is actually recorded in the database is something different:
SELECT * FROM colours;
id | name | colour
----+------+------------------------------------------------------------
1 | test | 01010000800000000000C066400000000000E064400000000000E06040
(1 row)
To convert this back to something readable we use ST_AsText:
# SELECT name, ST_AsText(colour) from colours;
name | st_astext
------+-----------------------
test | POINT Z (182 167 135)
Similar options apply for storing other spatial data types.
Querying by colour match
The exciting part of the project is being able to compare colours. What we're actually doing is measuring the distance between points in 3D space, but with the axes defined as red, green and blue.
The following query will return all colours in our TABLE ordered by how close they are to a particular reference colour:
SELECT name, colour, ST_3DDistance(colour, '0101000080241CC7711C476C4047388EE338CE684035AAAAAAAACA6240') AS distance
FROM colours
ORDER BY ST_3DDistance(colour, '0101000080241CC7711C476C4047388EE338CE684035AAAAAAAACA6240');
The output - and you may be able to guess which industry was involved - would be:
name | distance
----------------------+------------------
Creamy Beige | 2.05179836807052
Sand Beige | 2.36225462505863
Sand | 3.66834978204199
Rose Beige | 4.98020774023042
Soft Beige | 5.81505071472323
... | ...
Obviously we've only scratched the surface of PostGIS in this example. There are any number of possible data types and functions to be explored - even other extensions.
Uninstalling PostGIS
You can disable the extension in the database using:
DROP EXTENSION postgis;
ERROR: cannot drop extension postgis because other objects depend on it
DETAIL: table colours column colour depends on type geometry
HINT: Use DROP ... CASCADE to drop the dependent objects too.
As you can see, this throws a warning if there are objects in the database such as our geomety field that rely on PostGIS.
# ALTER TABLE colours ADD COLUMN saved_colour varchar;
ALTER TABLE
# UPDATE colours SET saved_colour = ST_AsText(colour);
UPDATE 100
# DROP EXTENSION postgis CASCADE;
NOTICE: drop cascades to table product_colours column colour
DROP EXTENSION
This leaves our colours TABLE intact except for the colour column which has been dropped, but our new saved_colour column contains a backup of the values in a human-readable format - only we lose our ability to query by distance.