Skip to content
This repository has been archived by the owner on May 3, 2023. It is now read-only.

RhodiumToad/ip4r-historical

Repository files navigation

IP4R  - IPv4/v6 and IPv4/v6 range index type for PostgreSQL
===========================================================

CHANGES from version 1.x:
=========================

 * Support for PostgreSQL versions older than 8.1 is REMOVED

 * New types for ip6, ip6r, ipaddress, iprange

 * ip4 input no longer accepts spurious leading whitespace


RATIONALE
=========

While PostgreSQL already has builtin types 'inet' and 'cidr', the
authors of this module found that they had a number of requirements
that were not addressed by the builtin type.

Firstly and most importantly, the builtin types have no support for
index lookups of the form (column >>= parameter), i.e. where you have
a table of IP address ranges and wish to find which ones include a
given IP address.  This requires an rtree or gist index to do
efficiently, and also requires a way to represent IP address ranges
that do not fall precisely on CIDR boundaries.

Secondly, the builtin inet/cidr are somewhat overloaded with
semantics, with inet combining two distinct concepts (a netblock, and
a specific IP within that netblock). Furthermore, they are variable
length types (to support ipv6) with non-trivial overheads, and the
authors (whose applications mainly deal in large volumes of single
IPv4 addresses) wanted a more lightweight representation.

IP4R therefore supports six distinct data types:

  ip4   - a single IPv4 address
  ip4r  - an arbitrary range of IPv4 addresses
  ip6   - a single IPv6 address
  ip6r  - an arbitrary range of IPv6 addresses
  ipaddress  - a single IPv4 or IPv6 address
  iprange    - an arbitrary range of IPv4 or IPv6 addresses

Simple usage examples:

CREATE TABLE ipranges (range ip4r primary key, description text not null);
CREATE INDEX ipranges_range_idx ON ipranges USING gist (range);
INSERT INTO ipranges VALUES ('10.0.0.0/8','rfc1918 block 1');
INSERT INTO ipranges VALUES ('172.16.0.0/12','rfc1918 block 2');
INSERT INTO ipranges VALUES ('192.168.0.0/16','rfc1918 block 3');
INSERT INTO ipranges VALUES ('0.0.0.0/1','classical class A space');
INSERT INTO ipranges VALUES ('10.0.1.10-10.0.1.20','my internal network');
INSERT INTO ipranges VALUES ('127.0.0.1','localhost');

CREATE TABLE access_log (id serial primary key, ip ip4 not null);
CREATE INDEX access_log_ip_idx ON access_log (ip);
INSERT INTO access_log(ip) VALUES ('10.0.1.15');
INSERT INTO access_log(ip) VALUES ('24.1.2.3');
INSERT INTO access_log(ip) VALUES ('192.168.10.20');
INSERT INTO access_log(ip) VALUES ('127.0.0.1');

-- find all accesses from 10.0.0.0/8
SELECT * FROM access_log WHERE ip BETWEEN '10.0.0.0' AND '10.255.255.255';

-- find all applicable descriptions for all entry in the access log
-- returns multiple rows for each entry if there are overlapping ranges
SELECT id,ip,range,description FROM access_log, ipranges WHERE ip <<= range;

-- find only the most specific description for all IPs in the access log
SELECT DISTINCT ON (ip) ip,range,description
  FROM access_log, ipranges
 WHERE ip <<= range
 ORDER BY ip, @ range;


INSTALLATION
============

ip4r can be installed via the pgxs mechanism (which is now the default).
Unpack the distribution and do:

make
make install

(as with PostgreSQL itself, this requires GNU Make. The second command
will usually need to be run as root.)

The 9.1+ "create extension" mechanism is not supported yet, for
compatibility with previous pg versions.

If for some reason you can't install via pgxs, then the module will
still build in the old-school way under contrib/. First unpack the
PostgreSQL distribution itself, and run ./configure with the same
parameters you used when installing PostgreSQL originally. If you
installed from a binary package, you will have to discover what
options the packager used.  Then create a subdirectory contrib/ip4r
and copy the contents of the ip4r distribution there, and run make
install.

Installing via either of the above methods creates the ip4r.so module,
and installs an SQL script ip4r.sql in the usual place for contrib
modules ($prefix/share/contrib by default,
/usr/local/share/postgresql/contrib on FreeBSD, etc.). In order to use
the types in a given database, the ip4r.sql script must be run in that
database, while connected as a database superuser.  For example:

psql -U pgsql -f /usr/local/share/postgresql/contrib/ip4r.sql mydatabase

(on systems other than FreeBSD, the username is usually "postgres"
rather than "pgsql")

This installs the type and its support functions in the catalog of the
specified database. If you install it in template1, then newly-created
databases will have the type pre-installed.

The type, functions, etc. are by default installed in the "public"
schema.  This can be changed by editing the ip4r.sql file.


USAGE
=====

Types "ip4", "ip6", "ipaddress"
-------------------------------

"ip4" accepts input in the form 'nnn.nnn.nnn.nnn' in decimal base only
(no hex, octal, etc.).  An ip4 value is a single IP address, and is
stored as a 32-bit unsigned integer.

"ip6" accepts input in the standard hexadecimal representation for
IPv6 addresses, e.g. '2001:1234:aa55::2323'. "Mixed" format input
(using an IPv4 dotted-decimal for the last two words) is accepted. An
ip6 value is a single IP address, and is stored as two 64-bit values
for convenience. Output is represented according to the specification
in RFC 5952 (including output in mixed format for v4-mapped addresses).

"ipaddress" accepts any input which is valid for either ip4 or ip6. An
ipaddress value is a single IP address, either v4 or v6. The v4 and v6
ranges are treated as disjoint - all v4 addresses are considered lower
than all v6 addresses, and '1.2.3.4' and '::ffff:1.2.3.4' are not equal.

"ipX" will be used below to represent any of the above three types.

The following type conversions are supported:

  Source type   | Dest type  |  Form
----------------|------------|------------------------------------------------
  ipX           |  text      |  text(ipX)  or  ipX::text  (explicit)
  text          |  ipX       |  ipX(text)  or  text::ipX  (explicit)
  ipX           |  cidr      |  cidr(ipX)  or  ipX::cidr  (assignment)
  inet          |  ipX       |  ipX(inet)  or  inet::ipX  (assignment)
  ipX           |  numeric   |  to_numeric(ipX) or  ipX::numeric (explicit)
  numeric       |  ipX       |  ipX(numeric)    or  bigint::ipX  (explicit)
  ip4           |  bigint    |  to_bigint(ip4)  or  ip4::bigint  (explicit)
  bigint        |  ip4       |  ip4(bigint)     or  bigint::ip4  (explicit)
  ip4           |  float8    |  to_double(ip4)  or  ip4::float8  (explicit)
  float8        |  ip4       |  ip4(float8)     or  float8::ip4  (explicit)
  ipX           |  ipXr      |  ipXr(ipX)  or  ipX::ipXr  (implicit)
  ip4           |  ipaddress |  ipaddress(ip4)  or  ip4::ipaddress (implicit)
  ip6           |  ipaddress |  ipaddress(ip6)  or  ip6::ipaddress (implicit)
  ipaddress     |  ip4       |  ip4(ipaddress)  or  ipaddress::ip4 (explicit)
  ipaddress     |  ip6       |  ip6(ipaddress)  or  ipaddress::ip6 (explicit)

The conversions from bigint and float8 are available only for ip4, and
accept values which are exact integers in the range 0 .. 2^32-1, which
are converted to IPs in the range 0.0.0.0 - 255.255.255.255 in the
obvious way. This is useful for conversions from applications which
store IPs in numeric form, as is often done for performance in certain
other databases.

Conversions to and from the 'numeric' type are available for all
formats with the obvious behaviour.

The conversion to cidr always results in a /32 (for v4) or /128 (for v6).
The conversion from inet ignores any prefix length and just takes the
specific IP address.

An ipX value implicitly converts to either the corresponding range
type (ip4 -> ip4r, ip6 -> ip6r), or to the iprange type, producing a
range containing only the single IP address.

ipX supports the following operators with the conventional meanings:
=, <>, <, >, <=, >=, and supports ORDER BY and btree indexes in the
obvious fashion. However, the planner does not understand how to
transform a query of the form

  WHERE ipcolumn <<= value

into a btree range scan (it does this transformation for the builtin
inet type using a function which is not extensible by plugins). As a
workaround, use the following form instead:

  WHERE ipcolumn BETWEEN lower(value) AND upper(value)

which will use a btree range scan.

ipX supports the following additional operators and functions:

 family(ipX) returns integer
 | returns the value 4 or 6 depending on address family

 ip4_netmask(integer) returns ip4
 | returns an ip4 value that represents a netmask for a prefix length

 ip6_netmask(integer) returns ip6
 | returns an ip6 value that represents a netmask for a prefix length

 ipX_net_lower(ipX, integer) returns ipX
 | returns the lowest address in the cidr block of the specified prefix
 | length, containing the specified IP
 | equivalent to: network(set_masklen(cidr(ipX),integer))

 ipX_net_upper(ipX, integer) returns ipX
 | returns the highest address in the cidr block of the specified prefix
 | length, containing the specified IP
 | equivalent to: broadcast(set_masklen(cidr(ip4),integer))

  Operator        | Description
------------------|--------------------------------------------------------
 ipX + integer    | add the given integer to the IP 
 ipX - integer    | subtract the given integer from the IP 
 ipX + bigint     | add the given integer to the IP 
 ipX - bigint     | subtract the given integer from the IP 
 ipX + numeric    | add the given integer to the IP 
 ipX - numeric    | subtract the given integer from the IP 
 ipX - ipX        | (returns bigint or numeric) difference between two IPs
 ipX & ipX        | bitwise-AND the two values
 ipX | ipX        | bitwise-OR the two values
 ipX # ipX        | bitwise-XOR the two values
 ~ ipX            | bitwise-NOT the value

Arithmetic on ip4 values does not wrap below 0.0.0.0 or above
255.255.255.255 - attempting to go beyond these limits raises an
error.

More complex arithmetic on IP addresses can be performed by converting
the IPs to numeric first; the above are only intended to cover the
common cases without requiring casts.


Types "ip4r", "ip6r", "iprange"
-------------------------------

An "ip4r" value denotes a single range of one or more IPv4 addresses,
for example '192.0.2.100-192.0.2.200'. Arbitrary ranges are allowed,
though input can also be in the form of CIDR netblocks, e.g.
'192.0.2.0/24' is equivalent to '192.0.2.0-192.0.2.255'. A single
value such as '192.0.2.25' represents a range containing only that
value.

An "ip6r" value denotes a single range of one or more IPv6 addresses,
for example '2001::1234-2001::2000:0000'. Arbitrary ranges are
allowed, though input can also be in the form of CIDR netblocks, e.g.
'2001::/112' is equivalent to '2001::-2001::ffff'. A single value such
as '2001::1234' represents a range containing only that value. Output
formatting is as specified in RFC 5952.

An "iprange" value denotes either an IPv4 range or an IPv6 range, or
the special value '-' which includes all of both IPv4 and IPv6 space.
Mixing of address families is not otherwise supported.

For all of the above types, values are displayed in CIDR form if they
represent a CIDR range, otherwise in range form.

Currently, abbreviated CIDR forms for IPv4 are not accepted at all,
i.e. all octets must be supplied. For IPv6, words may only be omitted
from the address as permitted by the zero-compression rules of RFC 5952.

"ipXr" will be used below to represent any one of the above three types.

An ipXr value can be constructed from two IPs explicitly using the
function ipXr(ipX,ipX). The ends of the range can be specified in
either order.

An ipXr value can be constructed from an IP and a prefix length
using the / operator (see below). For backward compatibility, the
function names ipXr_net_prefix and ipXr_net_mask are still accepted
for this operator.

ipXr supports the following type conversions:

  Source type   | Dest type |  Form
----------------|-----------|----------------------------------------------
  ipX           |  ipXr     |  ipXr(ipX)  or  ipX::ipXr  (implicit)
  ipXr          |  text     |  text(ipXr) or  ipXr::text (explicit)
  text          |  ipXr     |  ipXr(text) or  text::ipXr (explicit)
  ipXr          |  cidr     |  cidr(ipXr) or  ipXr::cidr (explicit)
  cidr          |  ipXr     |  ipXr(cidr) or  cidr::ipXr (assignment)

The conversion cidr(ipXr) returns NULL if the ipXr value does not
represent a valid CIDR range.

In addition, type conversions between ip4r, ip6r and iprange are permitted
in all valid combinations.

ipXr supports the following functions:

  family(ipXr) returns integer
  |  returns 4 or 6 according to address family, or NULL for '-'::iprange

  is_cidr(ipXr) returns boolean
  |  returns TRUE if the ipXr value is a valid CIDR range

  lower(ipXr) returns ipX
  |  returns the lower end of the ipXr range, as an ipX value

  upper(ipXr) returns ipX
  |  returns the upper end of the ipXr range, as an ipX value

ipXr supports the following operators:

  Operator        | Description
------------------|--------------------------------------------------------
  a = b           | exact equality
  a <> b          | exact inequality
  a < b           | note [1]
  a <= b          | note [1]
  a > b           | note [1]
  a >= b          | note [1]
  a >>= b         | a contains b or is equal to b
  a >> b          | a strictly contains b
  a <<= b         | a is contained in b or is equal to b
  a << b          | a is strictly contained in b
  a && b          | a and b overlap
  @ a             | approximate size of a (returns double)
  @@ a            | exact size of a (returns numeric)
  a / n           | construct CIDR range from address a length n
  a / b           | construct CIDR range from address a netmask b

[1]: the operators <, <=, >, >= implement an ordering for the purposes of
btree indexes, DISTINCT and ORDER BY; the ordering is not necessarily
useful for applications. The ordering used is a lexicographic ordering
of (lower,upper).

For testing whether an ipXr range contains a specified single ip, use the
>>= operator, i.e.  ipXr >>= ipX.  The implicit conversion from ipX to ipXr
handles this case.


ipXr Indexes
------------

ipXr values can be indexed in several ways.

A conventional btree index on ipXr values will work for the purposes of
unique/primary key constraints, ordering, and equality lookups (i.e.
WHERE column = value). Btree indexes are created in the usual way and
are the default index type.

However, ipXr's utility comes from its ability to use gist indexes to
support the following lookup types:

  WHERE column >>= value      (or >>)
  WHERE column <<= value      (or <<)
  WHERE column && value

These lookups require a GiST index. This can be created as follows:

CREATE INDEX indexname ON tablename USING gist (column);

It is also possible to create a functional ip4r index over a column of
'cidr' type as follows:

CREATE INDEX indexname ON tablename USING gist (iprange(cidrcolumn));

(ip4r(column) or ip6r(column) can also be used if the column is constrained
to contain only values of the specified address family)

This can then be used for queries of the form:

  WHERE iprange(cidrcolumn) >>= value    (or >>, <<=, && etc)

One advantage of this method is that the ip4r type can be dropped and
recreated without losing data. This is useful for accelerating queries
on an existing table designed without ip4r in mind.

Another idiom sometimes seen for representation of ranges of IP
addresses is for applications to create two integer columns, and do
range queries of the form:

  WHERE value BETWEEN column1 and column2

This is an attempt to get some use out of a btree index, but it performs
poorly in most cases. This can also be converted to use a functional ip4r
index as follows:

CREATE INDEX indexname ON tablename
   USING gist (ip4r(column1::ip4,column2::ip4));

and then doing queries of the form:

  WHERE ip4r(column1::ip4,column2::ip4) >>= value

This method is not usually practical for IPv6.

A common requirement is to get the longest-prefix (most specific)
match to an IP address from a table of ranges or CIDR prefixes.
This can usually be best achieved using ORDER BY @ column,
for example:

SELECT * FROM tablename
 WHERE column >>= value
 ORDER BY @ column
 LIMIT 1

The use of @ column (approximate size) is sufficient if the values are
IPv4 ranges or are always CIDR prefixes. If arbitrary IPv6 ranges are
present, then overlapping ranges with small size differences might
compare equal; in this case use ORDER BY @@ column.

When looking up multiple IPs, one can do queries of the following
form:

SELECT DISTINCT ON (ips.ip) ips.ip, ranges.range
  FROM ips, ranges
 WHERE ranges.range >>= ips.ip
 ORDER BY ips.ip, @ ranges.range



AUTHORS
=======

this code by andrew@tao11.riddles.org.uk Oct 2004 - Nov 2011
derived from 'ipr' by Steve Atkins <steve@blighty.com> August 2003
derived from the 'seg' type distributed with PostgreSQL.

Distributed under the same terms as PostgreSQL itself.

Currently maintained at:
  http://pgfoundry.org/projects/ip4r/

About

Historical ip4r versions up to and including the non-extension-packaged 2.0 release. Migrated from CVS.

Resources

Stars

Watchers

Forks

Packages

No packages published