Does any one know how to create crosstab queries in PostgreSQL?
For example I have the following table:

Section    Status    Count
A          Active    1
A          Inactive  2
B          Active    4
B          Inactive  5

I would like the query to return the following crosstab:

Section    Active    Inactive
A          1         2
B          4         5

Is this possible?

5 Answers 11

Sorry this isn't complete because I can't test it here, but it may get you off in the right direction. I'm translating from something I use that makes a similar query:

select mt.section, mt1.count as Active, mt2.count as Inactive
from mytable mt
left join (select section, count from mytable where status='Active')mt1
on mt.section = mt1.section
left join (select section, count from mytable where status='Inactive')mt2
on mt.section = mt2.section
group by mt.section,
         mt1.count,
         mt2.count
order by mt.section asc;

The code I'm working from is:

select m.typeID, m1.highBid, m2.lowAsk, m1.highBid - m2.lowAsk as diff, 100*(m1.highBid - m2.lowAsk)/m2.lowAsk as diffPercent
from mktTrades m
   left join (select typeID,MAX(price) as highBid from mktTrades where bid=1 group by typeID)m1
   on m.typeID = m1.typeID
   left join (select typeID,MIN(price) as lowAsk  from mktTrades where bid=0 group by typeID)m2
   on m1.typeID = m2.typeID
group by m.typeID, 
         m1.highBid, 
         m2.lowAsk
order by diffPercent desc;

which will return a typeID, the highest price bid and the lowest price asked and the difference between the two (a positive difference would mean something could be bought for less than it can be sold).

1 upvote
  flag
You're missing a from clause, otherwise this is correct. The explain plans are wildly different on my system - the crosstab function has a cost of 22.5 while the LEFT JOIN approach is about 4 times as expensive with a cost of 91.38. It also produces about twice as many physical reads and performs hash joins - which can be quite expensive compared to other join types. – Jeremiah Peschka
upvote
  flag
Thanks Jeremiah, that's good to know. I've upvoted the other answer, but your comment is worth keeping so I won't delete this one. – LanceH

You can use the crosstab() function of the additional module tablefunc - which you have to install once per database. Since PostgreSQL 9.1 you can use CREATE EXTENSION for that:

CREATE EXTENSION tablefunc;

In your case, I believe it would look something like this:

CREATE TABLE t (Section CHAR(1), Status VARCHAR(10), Count integer);

INSERT INTO t VALUES ('A', 'Active',   1);
INSERT INTO t VALUES ('A', 'Inactive', 2);
INSERT INTO t VALUES ('B', 'Active',   4);
INSERT INTO t VALUES ('B', 'Inactive', 5);

SELECT row_name AS Section,
       category_1::integer AS Active,
       category_2::integer AS Inactive
FROM crosstab('select section::text, status, count::text from t',2)
            AS ct (row_name text, category_1 text, category_2 text);
upvote
  flag
In case you use a parameter in the crosstab query, you have to escape it properly. Example: (from above) say you want only the active ones: SELECT ... FROM crosstab('select section::text, status, count::text from t where status=''active''', 2) AS ... (notice the double quotes). In case the parameter is passed at runtime by the user (as a function parameter for example) you can say: SELECT ... FROM crosstab('select section::text, status, count::text from t where status=''' || par_active || '''', 2) AS ... (triple quotes here!). In BIRT this also works with the ? placeholder. – Wim Verhavert
SELECT section,
       SUM(CASE status WHEN 'Active' THEN count ELSE 0 END) AS active,
       SUM(CASE status WHEN 'Inactive' THEN count ELSE 0 END) AS inactive
FROM t
GROUP BY section
1 upvote
  flag
Can someone explain what the crosstab function in the tablefunc module adds to this answer, which both does the job at hand, and to my mind is easier to understand? – John Powell aka Barça
3 upvote
  flag
@JohnBarça: A simple case like this can easily be solved with CASE statements. However, this gets unwieldy very quickly with more attributes and / or other data types than just integers. As an aside: this form uses the aggregate function sum(), it would be better to use min() or max() and no ELSE which works for text also. But this has subtly different effects than corosstab(), which only uses the "first" value per attribute. Doesn't matter as long as there can only be one. Finally, performance is relevant, too. crosstab() is written in C and optimized for the task. – Erwin Brandstetter
upvote
  flag
@ErwinBrandstetter, thanks for the explanation. – John Powell aka Barça
upvote
  flag
This does not work for me, for postgresql. I get the error ERROR: 42803: aggregate function calls may not be nested – Audrey
1 upvote
  flag
@Audrey you aren't running the same SQL then? – user533832
upvote
  flag
Consider adding explaination vs just a block of code – Daniel L. VanDenBosch
up vote 220 down vote accepted

Install the additional module tablefunc once per database, which provides the function crosstab(). Since Postgres 9.1 you can use CREATE EXTENSION for that:

CREATE EXTENSION tablefunc;

Improved test case

CREATE TEMP TABLE t (
  section   text
, status    text
, ct        integer  -- don't use "count" as column name.
);

INSERT INTO t VALUES 
  ('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
                    , ('C', 'Inactive', 7);  -- 'C' with 'Active' is missing
  • count is a reserved word in standard SQL. Postgres allows it but I'd rather avoid those as identifiers.

Simple form - not fit for missing attributes

crosstab(text) with 1 input parameter:

SELECT *
FROM   crosstab(
   'SELECT section, status, ct
    FROM   t
    ORDER  BY 1,2'  -- needs to be "ORDER BY 1,2" here
   ) AS ct ("Section" text, "Active" int, "Inactive" int);

Returns:

 Section | Active | Inactive
---------+--------+----------
 A       |      1 |        2
 B       |      4 |        5
 C       |      7 |           -- !!
  • No need for casting and renaming.
  • Note the incorrect result for C: the value 7 is filled in for the first column. Sometimes, this behavior is desirable, but not for this use case.
  • The simple form is also limited to exactly three columns in the provided input query: row_name, category, value. There is no room for extra columns like in the 2-parameter alternative below.

Safe form

crosstab(text, text) with 2 input parameters:

SELECT * FROM crosstab(
       'SELECT section, status, ct
        FROM   t
        ORDER  BY 1,2'  -- could also just be "ORDER BY 1" here

      ,$$VALUES ('Active'::text), ('Inactive')$$)
AS ct ("Section" text, "Active" int, "Inactive" int);

Returns:

 Section | Active | Inactive
---------+--------+----------
 A       |      1 |        2
 B       |      4 |        5
 C       |        |        7  -- !!
  • Note the correct result for C.

  • The second parameter can be any query that returns one row per attribute matching the order of the column definition at the end. Often you will want to query distinct attributes from the underlying table like this:

    'SELECT DISTINCT attribute FROM tbl ORDER BY 1'
    

    That's in the manual.

    Since you have to spell out all columns in a column definition list anyway (except for pre-defined crosstabN() variants), it is regularly more efficient to provide a short list in a VALUES expression like I demonstrate:

    $$VALUES ('Active'::text), ('Inactive')$$)
    

    Or (not in the manual):

    $$SELECT unnest('{Active,Inactive}'::text[])$$ -- shorter for long lists
    
  • I used dollar quoting to make quoting easier.

  • You can even output columns with different data types with crosstab(text, text) - as long as the text representation of the value column is valid input for the target type. This way you might have attributes of different kind and output text, date, numeric etc. for respective attributes. There is a code example at the end of the chapter crosstab(text, text) in the manual.

Advanced examples


\crosstabview in psql

Postgres 9.6 added this meta-command to its default interactive terminal psql. You can run the query you would use as first crosstab() parameter and feed it to \crosstabview (immediately or in the next step). Like:

db=> SELECT section, status, ct FROM t \crosstabview

Similar result as above, but it's a representation feature on the client side exclusively. Input rows are treated slightly differently, hence ORDER BY is not required. Details for \crosstabview in the manual. There are more code examples at the bottom of that page.

Related answer on dba.SE by Daniel Vérité (the author of the psql feature):



The previously accepted answer is outdated.

  • The variant of the function crosstab(text, integer) is outdated. The second integer parameter is ignored. I quote the current manual:

    crosstab(text sql, int N) ...

    Obsolete version of crosstab(text). The parameter N is now ignored, since the number of value columns is always determined by the calling query

  • Needless casting and renaming.

  • It fails if a row does not have all attributes. See safe variant with two input parameters above to handle missing attributes properly.

  • ORDER BY is required in the one-parameter form of crosstab(). The manual:

    In practice the SQL query should always specify ORDER BY 1,2 to ensure that the input rows are properly ordered

1 upvote
  flag
Thanks for providing an updated answer. I wasn't aware of the changes to the crosstab function. – Jeremiah Peschka
2 upvote
  flag
+1, good writeup, thanks for noticing In practice the SQL query should always specify ORDER BY 1,2 to ensure that the input rows are properly ordered – ChristopheD
upvote
  flag
I've some problems using $$VALUES .. $$. I've used instead 'VALUES (''<attr>'':: <type>), .. ' – Marco Fantasia
upvote
  flag
Can we specify parameter binding in crosstab query ? I am getting this error => could not determine data type of parameter $2 – Ashish
1 upvote
  flag
Is it possible to set default value for column in crosstab query ? – Ashish
2 upvote
  flag
@Ashish: Please start a new question. Comments are not the place. You can always link to this one for context. – Erwin Brandstetter

Solution with JSON aggregation:

CREATE TEMP TABLE t (
  section   text
, status    text
, ct        integer  -- don't use "count" as column name.
);

INSERT INTO t VALUES 
  ('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
                   , ('C', 'Inactive', 7); 


SELECT section,
       (obj ->> 'Active')::int AS active,
       (obj ->> 'Inactive')::int AS inactive
FROM (SELECT section, json_object_agg(status,ct) AS obj
      FROM t
      GROUP BY section
     )X

Not the answer you're looking for? Browse other questions tagged or ask your own question.