I have this table for documents (simplified version here):

+------+-------+--------------------------------------+
| id   | rev   | content                              |
+------+-------+--------------------------------------+
| 1    | 1     | ...                                  |
| 2    | 1     | ...                                  |
| 1    | 2     | ...                                  |
| 1    | 3     | ...                                  |
+------+-------+--------------------------------------+

How do I select one row per id and only the greatest rev?
With the above data, the result should contain two rows: [1, 3, ...] and [2, 1, ..]. I'm using MySQL.

Currently I use checks in the while loop to detect and over-write old revs from the resultset. But is this the only method to achieve the result? Isn't there a SQL solution?

Update
As the answers suggest, there is a SQL solution, and here a sqlfiddle demo.

Update 2
I noticed after adding the above sqlfiddle, the rate at which the question is upvoted has surpassed the upvote rate of the answers. That has not been the intention! The fiddle is based on the answers, especially the accepted answer.

upvote
  flag
Do you need the corresponding content field for the row? – Mark Byers
upvote
  flag
Yes, and that would pose no problem, I have cut out many columns which I'd be adding back. – Majid Fouladpour
1 upvote
  flag
@MarkByers I have edited my answer to comply with OP needs. Since I was at it, I decided to write a more comprehensive answer on the greatest-n-per-group topic. – Adrian Carneiro
upvote
  flag
This is common greatest-n-per-group problem, which has well tested and optimized solutions. I prefer the left join solution by Bill Karwin (the original post). Note that bunch of solutions to this common problem can surprisingly be found in the one of most official sources, MySQL manual! See Examples of Common Queries :: The Rows Holding the Group-wise Maximum of a Certain Column. – TMS
1 upvote
  flag
upvote
  flag

28 Answers 11

up vote 1223 down vote accepted

At first glance...

All you need is a GROUP BY clause with the MAX aggregate function:

SELECT id, MAX(rev)
FROM YourTable
GROUP BY id

It's never that simple, is it?

I just noticed you need the content column as well.

This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I heard that a lot during my career. Actually, it was one the questions I answered in my current job's technical interview.

It is, actually, so common that StackOverflow community has created a single tag just to deal with questions like that: .

Basically, you have two approaches to solve that problem:

Joining with simple group-identifier, max-value-in-group Sub-query

In this approach, you first find the group-identifier, max-value-in-group (already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier and max-value-in-group:

SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
    SELECT id, MAX(rev) rev
    FROM YourTable
    GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev

Left Joining with self, tweaking join conditions and filters

In this approach, you left join the table with itself. Equality, of course, goes in the group-identifier. Then, 2 smart moves:

  1. The second join condition is having left side value less than right value
  2. When you do step 1, the row(s) that actually have the max value will have NULL in the right side (it's a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.

So you end up with:

SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
    ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;

Conclusion

Both approaches bring the exact same result.

If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches.

Both approaches are SQL ANSI compatible, thus, will work with your favorite RDBMS, regardless of its "flavor".

Both approaches are also performance friendly, however your mileage may vary (RDBMS, DB Structure, Indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one which make most of sense to you.

1 upvote
  flag
The first version was much simpler, wouldn't it yeald the result by adding more columns? I also should cater for a where: select id, max(rev), content, etc., etc., from the_table where proj_id = $pid group by id – Majid Fouladpour
5 upvote
  flag
I know that MySQL allows you to add non aggregate fields to a "grouped by" query, but I find that kinda pointless. Try running this select id, max(rev), rev from YourTable group by id and you see what I mean. Take your time and try to understand it – Adrian Carneiro
upvote
  flag
Thanks for the explanation. – Majid Fouladpour
upvote
  flag
watch out in the subselect not to include any other column names, since they arent gonna be the ones from the max row. mysql will allow u this although other db-s like oracle will give u an exception. For most of the aggregate function it doesn't make sense although for max function would seem logical. So be careful not to trip on this ;) Cheers! – despot
1 upvote
  flag
First, thanks for the awesome answer. Something to add, make sure you join on what you would have grouped by for the max or min. Generally you only join by the id, but if you were grouping by name and age, then your join will be with name = name, and age = age. It took me a few minutes to realize that was my issue. – Jason McCarrell
2 upvote
  flag
@JasonMcCarrell I'm glad this answer helped you! I get your point, this is why I called it group_identifier, which could be one or more columns. In your case, group_identifier is the combination of name and age – Adrian Carneiro
upvote
  flag
and how do I do when I don't have a unique identifier field per row? – Totty.js
upvote
  flag
Totty.. you can't.. all database tables require a primary key if you want to perform intelligent queries on them – whiteatom
upvote
  flag
Your second solution is a really interesting approach. On my data set the it takes 4 mins to run, vs 0.23 of a second for the subquery. Is there any indexing that you could suggest that would help this along? – whiteatom
upvote
  flag
@whiteatom Absolutely: indexes over id and rev (adjust names accordingly). It might be the case that an index over both should also be helpful – Adrian Carneiro
1 upvote
  flag
Hmm no dice. It took 6 mins before hand. I added the index to id and rev individually and it took 4 mins.. together it takes 3:53 and both takes 3:52. Interesting the 2 methods are that different. I wonder what circumstance the searching for row with nothing greater is faster? – whiteatom
upvote
  flag
@whiteatom funny, that should have done it. have you tried creating DESC index for rev? Perhaps you should post your data structure and estimated data load in a new question. Please let me know – Adrian Carneiro
upvote
  flag
Where is null is sometimes slow. Regardless, I find the 2nd solution interesting and clever. Great answer! – Lonnie Best
3 upvote
  flag
How do I get it to return only one row per group though? Don't these answers return every row in each group that has a compare value equal to the maximum value? For instance, suppose there was a second row in the OP's dataset with id = 1, rev = 3. Wouldn't it return both rows with id=1, rev=3? – Michael Lang
upvote
  flag
@MichaelLang You are talking about ties. This solution does not do tie-breaking: "If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches." That's by design, since this solution is generalist. you are the one responsible for doing the tie breaker, meaning you have to decide what you are going to use as tie breaker when two max values are equal (another field perhaps?). Then you can tweak the second approach to take that into consideration. – Adrian Carneiro
upvote
  flag
I am just wondering if the query I entered in my answer below is less efficient or not equivalent to the queries entered in this answer, and if so, why? – inor
upvote
  flag
@AdrianCarneiro so you're saying that its not possible to do tie breaking with the first version? – Robert Christ
upvote
  flag
Worth mentioning that if you are using the code for random lookups (e.g.: as a basis of a view), the subquery based solution is painfully slow (as subquery is executed for each lookup), while the Left join solution plays nice... There would also be a correlated subquery based solution for the problem (which avoids leftjoin's problems with large group sizes). See my answer. Feel free to add it to your answer... – Vajk Hermecz
2 upvote
  flag
@RobertChrist to arbitrarily break ties with the first version, just add DISTINCT ON (yt.id) after the initial SELECT. That made my query take twice as long though. So, I don't tie-break since ties are practically impossible in my case. – ma11hew28
2 upvote
  flag
Why would the first solution work? Won't max function run per each group consisting of a single row instead of all the rows as a whole. – Gherman
1 upvote
  flag
I appreciate that second solution, because Doctrine DQL won't support subqueries in JOIN statements (like the first solution), so this is a very helpful work around! – Chadwick Meyer
upvote
  flag
@German I'm struggling to understand the query myself, but I believe the GROUP BY clause causes all rows to be returned instead of just one. – Nate
upvote
  flag
Using MariaDB here. Not sure why using HAVING does not yield correct results. i.e. SELECT a, b, max(b) as max_b FROM table GROUP BY a HAVING b=max_b. The query works but certain rows are simply missing – Mikhail
upvote
  flag
@adrian-carneiro : Thanks so much for your answer, but for your first solution, which I believe is supposed to be SELECT id, MAX(rev), rev, I get: "Column 'rev' must be in the GROUP BY list". Is my database lacking (not mysql), is this not standard sql, or is something else going on? Thanks! – rjcarr
1 upvote
  flag
I would have triple-upvoted this if we lived in a world where that is possible without making fake accouns! Thank you. – Tom
1 upvote
  flag
second approach is not working to me. it display the same result as if i just run simple select from my table – Akmal Salikhov
1 upvote
  flag
What is so problematic in adding the column content? why isnt that working? – TheLogicGuy
upvote
  flag
I've used both of these solutions in the past, and for small data sets, they work fine - however - they do not scale well. Even with indexing, etc... It's far better to just break the query down into two smaller steps and populate a temp table. – photocode

My preference is to use as little code as possible...

You can do it using IN try this:

SELECT * 
FROM t1 WHERE (id,rev) IN 
( SELECT id, MAX(rev)
  FROM t1
  GROUP BY id
)

to my mind it is less complicated... easier to read and maintain.

16 upvote
  flag
Curious - which database engine can we use this type of WHERE clause in? This is not supported in SQL Server. – Kash
11 upvote
  flag
oracle & mysql (not sure about other databases sorry) – Kevin Burton
12 upvote
  flag
Works on PostgreSQL too. – lcguida
6 upvote
  flag
Confirmed working in DB2 – coderatchet
8 upvote
  flag
Does not work with SQLite. – Marcel Pfeiffer
upvote
  flag
and the answer supplied is valid ANSI\ISO SQL !!! – Kevin Burton
1 upvote
  flag
Awesome, thanks. Worked in SQLDeveloper (Oracle) – Arthur Collé
upvote
  flag
Good answer :^) – Andrew
1 upvote
  flag
Remember if you're using Mysql and this query returns something like SELECT list is not in GROUP BY clause and contains nonaggregated column 'db.t1.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by you need to disable the ONLY_FULL_GROUP_BY sql mode (eg: SET sql_mode = '') – Adriel Santos
upvote
  flag
In MySQL, this construct performs poorly: WHERE (a,b) IN ( SELECT ... ) for two reasons -- the (a,b) and the IN(SELECT. – Rick James
1 upvote
  flag
Update: Percona 5.6.22 optimizes this in a reasonable way - Order(N), where N is number of rows in table. MySQL 5.5.43 does a terrible job - Order(N*N). MariaDB 10.0.28 does a superior job -- no table scan! – Rick James
upvote
  flag
What about adding an additional: GROUP BY id at the end of the query? This seems to prevent multiple records per id. Is this a safe approach? – Oriol
upvote
  flag
Works on spark sql too. :) – Hassaan Salik

Something like this?

SELECT yourtable.id, rev, content
FROM yourtable
INNER JOIN (
    SELECT id, max(rev) as maxrev FROM yourtable
    WHERE yourtable
    GROUP BY id
) AS child ON (yourtable.id = child.id) AND (yourtable.rev = maxrev)
upvote
  flag
The join-less ones wouldn't cut it? – Majid Fouladpour
1 upvote
  flag
If they work, then they're fine too. – Marc B
7 upvote
  flag
What does WHERE yourtable do? – Brian McCutchon
SELECT * FROM t1 ORDER BY rev DESC LIMIT 1;
1 upvote
  flag
This is totally incorrect. It's just getting the first row where the result set was sorted in descending order. – man910
upvote
  flag
The OP was after "one row per id", not only the single row with the highest rev. – Ignitor

I can't vouch for the performance, but here's a trick inspired by the limitations of Microsoft Excel. It has some good features

GOOD STUFF

  • It should force return of only one "max record" even if there is a tie (sometimes useful)
  • It doesn't require a join

APPROACH

It is a little bit ugly and requires that you know something about the range of valid values of the rev column. Let us assume that we know the rev column is a number between 0.00 and 999 including decimals but that there will only ever be two digits to the right of the decimal point (e.g. 34.17 would be a valid value).

The gist of the thing is that you create a single synthetic column by string concatenating/packing the primary comparison field along with the data you want. In this way, you can force SQL's MAX() aggregate function to return all of the data (because it has been packed into a single column). Then you have to unpack the data.

Here's how it looks with the above example, written in SQL

SELECT id, 
       CAST(SUBSTRING(max(packed_col) FROM 2 FOR 6) AS float) as max_rev,
       SUBSTRING(max(packed_col) FROM 11) AS content_for_max_rev 
FROM  (SELECT id, 
       CAST(1000 + rev + .001 as CHAR) || '---' || CAST(content AS char) AS packed_col
       FROM yourtable
      ) 
GROUP BY id

The packing begins by forcing the rev column to be a number of known character length regardless of the value of rev so that for example

  • 3.2 becomes 1003.201
  • 57 becomes 1057.001
  • 923.88 becomes 1923.881

If you do it right, string comparison of two numbers should yield the same "max" as numeric comparison of the two numbers and it's easy to convert back to the original number using the substring function (which is available in one form or another pretty much everywhere).

10 upvote
  flag
Upvoted for inspired hackiness. – Barry Kelly
upvote
  flag
Great solution, it performs much faster than join and other proposed solutions. – danial
upvote
  flag
This SO post made my month. I've used it a few times now, and it is much faster than the join method. Very clever. – vamin
select * from yourtable
group by id
having rev=max(rev);
upvote
  flag
This is not working for me in mysql. Could you point out in the documentation how should this work? Thx – Vajk Hermecz
1 upvote
  flag
This doesn't work in PostgreSQL. It returns: ERROR: column "yourtable.content" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: select * from messages – ma11hew28
upvote
  flag
Also does not work in SQL Server. That having clause is bogus, you can't reference the column rev in it. This answer should be deleted. – Anssssss

How about this:

select all_fields.*  
from  (select id, MAX(rev) from yourtable group by id) as max_recs  
left outer join yourtable as all_fields  
on max_recs.id = all_fields.id

Yet another solution is to use a correlated subquery:

select yt.id, yt.rev, yt.contents
    from YourTable yt
    where rev = 
        (select max(rev) from YourTable st where yt.id=st.id)

Having an index on (id,rev) renders the subquery almost as a simple lookup...

Following are comparisons to the solutions in @AdrianCarneiro's answer (subquery, leftjoin), based on MySQL measurements with InnoDB table of ~1million records, group size being: 1-3.

While for full table scans subquery/leftjoin/correlated timings relate to each other as 6/8/9, when it comes to direct lookups or batch (id in (1,2,3)), subquery is much slower then the others (Due to rerunning the subquery). However I couldnt differentiate between leftjoin and correlated solutions in speed.

One final note, as leftjoin creates n*(n+1)/2 joins in groups, its performance can be heavily affected by the size of groups...

5 upvote
  flag
+1 for simplicity and clarity – xagyg
upvote
  flag
This is the only one so far that worked in the way I needed it, thanks (needed to match by name, not by id) – Doomed Mind
upvote
  flag
I dont think this works if rev is not unique. – Pita
upvote
  flag
@Pita no. it works even if rev is not unique – Pradeep Kumar Prabaharan
upvote
  flag
Good point for mentioning index required for simple lookup (apparently cannot plus 1 in comments anymore) – Jared Becksfort

This solution makes only one selection from YourTable, therefore it's faster. It works only for MySQL and SQLite(for SQLite remove DESC) according to test on sqlfiddle.com. Maybe it can be tweaked to work on other languages which I am not familiar with.

SELECT *
FROM ( SELECT *
       FROM ( SELECT 1 as id, 1 as rev, 'content1' as content
              UNION
              SELECT 2, 1, 'content2'
              UNION
              SELECT 1, 2, 'content3'
              UNION
              SELECT 1, 3, 'content4'
            ) as YourTable
       ORDER BY id, rev DESC
   ) as YourTable
GROUP BY id
upvote
  flag
This doesn't appear to work for the general case. And, it doesn't work at all in PostgreSQL, returning: ERROR: column "your table.reb" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT * – ma11hew28
upvote
  flag
Sorry I didn't clarify the first time at which language it worked. – plavozont

NOT mySQL, but for other people finding this question and using SQL, another way to resolve the problem is using Cross Apply in MS SQL

WITH DocIds AS (SELECT DISTINCT id FROM docs)

SELECT d2.id, d2.rev, d2.content
FROM DocIds d1
CROSS APPLY (
  SELECT Top 1 * FROM docs d
  WHERE d.id = d1.id
  ORDER BY rev DESC
) d2

Here's an example in SqlFiddle

Since this is most popular question with regard to this problem, I'll re-post another answer to it here as well:

It looks like there is simpler way to do this (but only in MySQL):

select *
from (select * from mytable order by id, rev desc ) x
group by id

Please credit answer of user Bohemian in this question for providing such a concise and elegant answer to this problem.

EDIT: though this solution works for many people it may not be stable in the long run, since MySQL doesn't guarantee that GROUP BY statement will return meaningful values for columns not in GROUP BY list. So use this solution at your own risk

5 upvote
  flag
Except that it's wrong, as there is no guarantee that the order of the inner query means anything, nor is the GROUP BY always guaranteed to take the first encountered row. At least in MySQL and I would assume all others. In fact I was under the assumption that MySQL would simply ignore the whole ORDER BY. Any future version or a change in configuration might break this query. – Jannes
upvote
  flag
@Jannes this is interesting remark :) I welcome you to answer my question providing proofs: //allinonescript.com/questions/26301877/… – Yura
1 upvote
  flag
@Jannes concerning GROUP BY not guaranteed to take the first encountered row - you are totally right - found this issue bugs.mysql.com/bug.php?id=71942 which asks to provide such guarantees. Will update my answer now – Yura
upvote
  flag
I think I remember where I got the ORDER BY being discarded from: MySQL does that with UNIONs if you ORDER BY the inner queries, it's just ignore: dev.mysql.com/doc/refman/5.0/en/union.html says "If ORDER BY appears without LIMIT in a SELECT, it is optimized away because it will have no effect anyway." I haven't seen such a statement for the query in question here, but I don't see why it couldn't do that. – Jannes
upvote
  flag
@Jannes hmmm, that's interesting – Yura

I like to use a NOT EXIST-based solution for this problem:

SELECT id, rev
FROM YourTable t
WHERE NOT EXISTS (
   SELECT * FROM YourTable t WHERE t.id = id AND rev > t.rev
)
1 upvote
  flag
yes, not exists like this has generally been the preferred way rather than a left join. In older versions of SQL server it was faster, although i think now it makes no difference. I normally do SELECT 1 instead of SELECT *, again because in prior versions it was faster. – EGP

A third solution I hardly ever see mentioned is MySQL specific and looks like this:

SELECT id, MAX(rev) AS rev
 , 0+SUBSTRING_INDEX(GROUP_CONCAT(numeric_content ORDER BY rev DESC), ',', 1) AS numeric_content
FROM t1
GROUP BY id

Yes it looks awful (converting to string and back etc.) but in my experience it's usually faster than the other solutions. Maybe that just for my use cases, but I have used it on tables with millions of records and many unique ids. Maybe it's because MySQL is pretty bad at optimizing the other solutions (at least in the 5.0 days when I came up with this solution).

One important thing is that GROUP_CONCAT has a maximum length for the string it can build up. You probably want to raise this limit by setting the group_concat_max_len variable. And keep in mind that this will be a limit on scaling if you have a large number of rows.

Anyway, the above doesn't directly work if your content field is already text. In that case you probably want to use a different separator, like \0 maybe. You'll also run into the group_concat_max_len limit quicker.

Here is a nice way of doing that

Use following code :

with temp as  ( 
select count(field1) as summ , field1
from table_name
group by field1 )
select * from temp where summ = (select max(summ) from temp)

This works for me in sqlite3:

SELECT *, MAX(rev) FROM t1 GROUP BY id

With *, you get a duplicate rev column, but that's not much of a problem.

I would use this:

select t.*
from test as t
join
   (select max(rev) as rev
    from test
    group by id) as o
on o.rev = t.rev

Subquery SELECT is not too eficient maybe, but in JOIN clause seems to be usable. I'm not an expert in optimizing queries, but I've tried at MySQL, PostgreSQL, FireBird and it does work very good.

You can use this schema in multiple joins and with WHERE clause. It is my working example (solving identical to yours problem with table "firmy"):

select *
from platnosci as p
join firmy as f
on p.id_rel_firmy = f.id_rel
join (select max(id_obj) as id_obj
      from firmy
      group by id_rel) as o
on o.id_obj = f.id_obj and p.od > '2014-03-01'

It is asked on tables having teens thusands of records, and it takes less then 0,01 second on really not too strong machine.

I wouldn't use IN clause (as it is mentioned somewhere above). IN is given to use with short lists of constans, and not as to be the query filter built on subquery. It is because subquery in IN is performed for every scanned record which can made query taking very loooong time.

upvote
  flag
I think using that subquery as a CTE might at least improve performance – mmcrae

I like to do this by ranking the records by some column. In this case, rank rev values grouped by id. Those with higher rev will have lower rankings. So highest rev will have ranking of 1.

select id, rev, content
from
 (select
    @rowNum := if(@prevValue = id, @rowNum+1, 1) as row_num,
    id, rev, content,
    @prevValue := id
  from
   (select id, rev, content from YOURTABLE order by id asc, rev desc) TEMP,
   (select @rowNum := 1 from DUAL) X,
   (select @prevValue := -1 from DUAL) Y) TEMP
where row_num = 1;

Not sure if introducing variables makes the whole thing slower. But at least I'm not querying YOURTABLE twice.

upvote
  flag
Only tried approach in MySQL. Oracle has a similar function for ranking records. Idea should work too. – user5124980

If you have many fields in select statement and you want latest value for all of those fields through optimized code:

select * from
(select * from table_name
order by id,rev desc) temp
group by id 
upvote
  flag
This works OK for small tables, but takes 6 passes over the entire dataset, so not fast for large tables. – Rick James

If anyone is looking for a Linq verson, this seems to work for me:

public static IQueryable<BlockVersion> LatestVersionsPerBlock(this IQueryable<BlockVersion> blockVersions)
{
    var max_version_per_id = blockVersions.GroupBy(v => v.BlockId)
        .Select( v => new { BlockId = v.Key, MaxVersion = v.Max(x => x.Version) } );    

    return blockVersions.Where( v => max_version_per_id.Any(x => x.BlockId == v.BlockId && x.MaxVersion == v.Version) );
}

Sorted the rev field in reverse order and then grouped by id which gave the first row of each grouping which is the one with the highest rev value.

SELECT * FROM (SELECT * FROM table1 ORDER BY id, rev DESC) X GROUP BY X.id;

Tested in http://sqlfiddle.com/ with the following data

CREATE TABLE table1
    (`id` int, `rev` int, `content` varchar(11));

INSERT INTO table1
    (`id`, `rev`, `content`)
VALUES
    (1, 1, 'One-One'),
    (1, 2, 'One-Two'),
    (2, 1, 'Two-One'),
    (2, 2, 'Two-Two'),
    (3, 2, 'Three-Two'),
    (3, 1, 'Three-One'),
    (3, 3, 'Three-Three')
;

This gave the following result in MySql 5.5 and 5.6

id  rev content
1   2   One-Two
2   2   Two-Two
3   3   Three-Two
upvote
  flag
This technique used to work, but no longer. See mariadb.com/kb/en/mariadb/… – Rick James
upvote
  flag
The original question tag is "mysql" and I have stated very clearly that my solution was tested with both Mysql 5.5 and 5.6 in sqlfiddle.com. I have provided all steps to independently verify the solution. I have not made any false claims that my solution works with Mariadb. Mariadb is not Mysql, its just a drop-in replacement for Mysql, owned by 2 different companies. Your comment will help anyone that is trying to implement it in Mariadb but my post in no way deserve a negative vote as it clearly answers the question that was asked. – blokeish
1 upvote
  flag
Yes, it works in older versions. And I have used that technique in the past, only to be burned when it stopped working. Also MySQL (in 5.7?) will also be ignoring the ORDER BY in a subquery. Since lots of people will read your answer, I am trying to steer them away from a technique that will break in their future. (And I did not give you the -1 vote.) – Rick James

I am flabbergasted that no answer offered SQL window function solution:

SELECT a.id, a.rev, a.contents
  FROM (SELECT id, rev, contents,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) rank
          FROM YourTable) a
 WHERE a.rank = 1 

Added in SQL standard ANSI/ISO Standard SQL:2003 and later extended with ANSI/ISO Standard SQL:2008, window (or windowing) functions are available with all major vendors now. There are more types of rank functions available to deal with a tie issue: RANK, DENSE_RANK, PERSENT_RANK.

upvote
  flag
I think it is less intuitive and potentially less clear - but it can definitely work/be a solution. – mmcrae
2 upvote
  flag
intuition is tricky thing. I find it more intuitive than other answers as it builds explicit data structure that answers the question. But, again, intuition is the other side of bias... – topchef
2 upvote
  flag
This might work in MariaDB 10.2 and MySQL 8.0.2, but not before. – Rick James
2 upvote
  flag
At last, I was beginning to wonder why this wasn't here. This is far more "intuitive" than the vast majority of the "old hat" answers on this page, and way more efficient in almost all cases as it requires just a single pass of the data. Most databases now support these standard window functions (MySQL is late but will from v8 onward). – Used_By_Already

I think this is the easiest solution :

SELECT *
FROM
    (SELECT *
    FROM Employee
    ORDER BY Salary DESC)
AS employeesub
GROUP BY employeesub.Salary;
  • SELECT * : Return all fields.
  • FROM Employee : Table searched on.
  • (SELECT *...) subquery : Return all people, sorted by Salary.
  • GROUP BY employeesub.Salary: : Force the top-sorted, Salary row of each employee to be the returned result.

If you happen to need just the one row, it's even easier :

SELECT *
FROM Employee
ORDER BY Employee.Salary DESC
LIMIT 1

I also think it's the easiest to break down, understand, and modify to other purposes:

  • ORDER BY Employee.Salary DESC: Order the results by the salary, with highest salaries first.
  • LIMIT 1: Return just one result.

Understanding this approach, solving any of these similar problems becomes trivial: get employee with lowest salary (change DESC to ASC), get top-ten earning employees (change LIMIT 1 to LIMIT 10), sort by means of another field (change ORDER BY Employee.Salary to ORDER BY Employee.Commission), etc..

upvote
  flag
This does not answer the question. The question is asking how to get the data for one row (as was asked, "one row per ID") in a group query where value x is the max within each group of rows. For example a customer order table with multiple orders per customer where you want to retrieve the largest order for each customer. Your query might very well return more than one row per customer (if, for example, the two largest orders were placed by the same customer). – Aaron J Spetner
upvote
  flag
"one row per ID" <-- keep reading, please, and you'll see "and only the greatest". That is logically equivalent to just the greatest. – HoldOffHunger
upvote
  flag
Yes, but it says "and". Which means the requirements are BOTH one row per ID AND only the greatest. Using this answer will not satisfy the first requirement. Additionally, the question implies the need to retrieve a single record for ALL of the IDs. This answer requires knowledge of the number of IDs beforehand (in order to configure the LIMIT), which will require additional code. The question's goal is stated specifically as seeking a SQL-only solution. Finally, even if you know the number of unique IDs, if there are multiple occurrences of the MAX value, the LIMIT clause will be wrong. – Aaron J Spetner
1 upvote
  flag
I did not have the exact same situation like in the original post but this is the most easy to understand and straightforward and working solution i came across so far for my problem. I am amazed how all the geeks and freaks try to overtake each other by bragging with complex / weird queries. – sba
upvote
  flag
@Aaron J Spetner: I have updated with a solution that directly addresses OP's needs. – HoldOffHunger

Many, if not all, of the other answers here are fine for small datasets. For scaling, more care is needed. See here.

It discusses multiple faster ways to do groupwise max and top-N per group.

here is another solution hope it will help someone

Select a.id , a.rev, a.content from Table1 a
inner join 
(SELECT id, max(rev) rev FROM Table1 GROUP BY id) x on x.id =a.id and x.rev =a.rev

None of these answers have worked for me.

This is what worked for me.

with score as (select max(score_up) from history)
select history.* from score, history where history.score_up = score.max

SELECT * FROM Employee where Employee.Salary in (select max(salary) from Employee group by Employe_id) ORDER BY Employee.Salary

1 upvote
  flag
You should format your query and add explanation with it. – JRG

Here's another solution to retrieving the records only with a field that has the maximum value for that field. This works for SQL400 which is the platform I work on. In this example, the records with the maximum value in field FIELD5 will be retrieved by the following SQL statement.

SELECT A.KEYFIELD1, A.KEYFIELD2, A.FIELD3, A.FIELD4, A.FIELD5
  FROM MYFILE A
 WHERE RRN(A) IN
   (SELECT RRN(B) 
      FROM MYFILE B
     WHERE B.KEYFIELD1 = A.KEYFIELD1 AND B.KEYFIELD2 = A.KEYFIELD2
     ORDER BY B.FIELD5 DESC
     FETCH FIRST ROW ONLY)

I used the below to solve a problem of my own. I first created a temp table and inserted the max rev value per unique id.

CREATE TABLE #temp1
(
    id varchar(20)
    , rev int
)
INSERT INTO #temp1
SELECT a.id, MAX(a.rev) as rev
FROM 
    (
        SELECT id, content, SUM(rev) as rev
        FROM YourTable
        GROUP BY id, content
    ) as a 
GROUP BY a.id
ORDER BY a.id

I then joined these max values (#temp1) to all of the possible id/content combinations. By doing this, I naturally filter out the non-maximum id/content combinations, and am left with the only max rev values for each.

SELECT a.id, a.rev, content
FROM #temp1 as a
LEFT JOIN
    (
        SELECT id, content, SUM(rev) as rev
        FROM YourTable
        GROUP BY id, content
    ) as b on a.id = b.id and a.rev = b.rev
GROUP BY a.id, a.rev, b.content
ORDER BY a.id

Not the answer you're looking for? Browse other questions tagged or ask your own question.