ITKeyword,专注技术干货聚合推荐

注册 | 登录

解决sql - Grouped LIMIT in PostgreSQL: show the first N rows for each group?

itPublisher 分享于

2020腾讯云双十一活动,全年最低!!!(领取3500元代金券),
地址https://cloud.tencent.com/act/cps/redirect?redirect=1073

【阿里云】双十一活动,全年抄底价,限时3天!(老用户也有),
入口地址https://www.aliyun.com/1111/home

推荐:How to select the first least max top N rows from each group in SQL

从分组中选择极值查询/前N项查询 是经常会遇到的问题 ,下面通过简单举例展示这种SQL的写法 举例表 type variety price apple gala 2.79 apple fuji 0.24 apple l

I need to take the first N rows for each group, ordered by custom column.

Given the following table:

db=# SELECT * FROM xxx;
 id | section_id | name
----+------------+------
  1 |          1 | A
  2 |          1 | B
  3 |          1 | C
  4 |          1 | D
  5 |          2 | E
  6 |          2 | F
  7 |          3 | G
  8 |          2 | H
(8 rows)

I need the first 2 rows (ordered by name) for each section_id, i.e. a result similar to:

 id | section_id | name
----+------------+------
  1 |          1 | A
  2 |          1 | B
  5 |          2 | E
  6 |          2 | F
  7 |          3 | G
(5 rows)

I am using PostgreSQL 8.3.5.

sql postgresql
|
  this question
edited Jul 14 '09 at 10:41 asked Jul 14 '09 at 10:37 Kouber Saparev 3,399 1 14 22

 | 

5 Answers
5

解决方法

New solution (PostgreSQL 8.4)

SELECT
  * 
FROM (
  SELECT
    ROW_NUMBER() OVER (PARTITION BY section_id ORDER BY name) AS r,
    t.*
  FROM
    xxx t) x
WHERE
  x.r <= 2;

|
  this answer
edited Aug 11 '16 at 9:08 ngspkinga 381 4 16 answered May 19 '11 at 19:55 Dave 1,366 1 8 2 8   This works with PostgreSQL 8.4 too (window functions start with 8.4). –  Bruno Mar 2 '12 at 15:48 2   Too good to use his example :) –  Justin Thomas Oct 10 '13 at 22:34      This is definitely the right answer - Mind blown –  dineth Nov 9 '14 at 21:22 1   Textbook answer to do grouped limit –  piggybox Jan 6 '15 at 22:57 1   Awesome! It works flawlessly. I am curious though, is there a way to do this with group by? –  NurShomik Oct 19 '16 at 18:06

 | 

Here's another solution (PostgreSQL <= 8.3).

SELECT
  *
FROM
  xxx a
WHERE (
  SELECT
    COUNT(*)
  FROM
    xxx
  WHERE
    section_id = a.section_id
  AND
    name <= a.name
) <= 2

|
  this answer
edited Feb 15 '16 at 11:26 answered Jul 17 '09 at 14:41 Kouber Saparev 3,399 1 14 22

 | 
SELECT  x.*
FROM    (
        SELECT  section_id,
                COALESCE
                (
                (
                SELECT  xi
                FROM    xxx xi
                WHERE   xi.section_id = xo.section_id
                ORDER BY
                        name, id
                OFFSET 1 LIMIT 1
                ),
                (
                SELECT  xi
                FROM    xxx xi
                WHERE   xi.section_id = xo.section_id
                ORDER BY 
                        name DESC, id DESC
                LIMIT 1
                )
                ) AS mlast
        FROM    (
                SELECT  DISTINCT section_id
                FROM    xxx
                ) xo
        ) xoo
JOIN    xxx x
ON      x.section_id = xoo.section_id
        AND (x.name, x.id) <= ((mlast).name, (mlast).id)

|
  this answer
edited Jul 14 '09 at 15:35 answered Jul 14 '09 at 10:48 Quassnoi 276k 55 446 504      I am getting: ERROR: syntax error at or near "JOIN" –  Kouber Saparev Jul 14 '09 at 12:16      @Kouber: see post update –  Quassnoi Jul 14 '09 at 12:29      The query is very close to the one I need, except that it is not showing sections with less than 2 rows, i.e. the row with ID=7 isn't returned. Otherwise I like your approach. –  Kouber Saparev Jul 14 '09 at 15:29      @Kouber: see post update –  Quassnoi Jul 14 '09 at 15:35      Thank you, I just came to the same solution with COALESCE, but you were faster. :-) –  Kouber Saparev Jul 14 '09 at 15:41  |  show more comments
        -- ranking without WINDOW functions
-- EXPLAIN ANALYZE
WITH rnk AS (
        SELECT x1.id
        , COUNT(x2.id) AS rnk
        FROM xxx x1
        LEFT JOIN xxx x2 ON x1.section_id = x2.section_id AND x2.name <= x1.name
        GROUP BY x1.id
        )
SELECT this.*
FROM xxx this
JOIN rnk ON rnk.id = this.id
WHERE rnk.rnk <=2
ORDER BY this.section_id, rnk.rnk
        ;

        -- The same without using a CTE
-- EXPLAIN ANALYZE
SELECT this.*
FROM xxx this
JOIN ( SELECT x1.id
        , COUNT(x2.id) AS rnk
        FROM xxx x1
        LEFT JOIN xxx x2 ON x1.section_id = x2.section_id AND x2.name <= x1.name
        GROUP BY x1.id
        ) rnk
ON rnk.id = this.id
WHERE rnk.rnk <=2
ORDER BY this.section_id, rnk.rnk
        ;

|
  this answer
answered Dec 7 '12 at 20:53 wildplasser 25.4k 3 28 49      CTEs and Window functions were introduced with the same version, so I don't see the benefit of the first solution. –  a_horse_with_no_name Dec 7 '12 at 20:56      The post is three years old. Besides, there may still be implementations that lack them (nudge nudge say no more). It could also be considered an exercise in old-fashoned querybuilding. (though CTEs are not very old-fashoned) –  wildplasser Dec 7 '12 at 21:00      The post is tagged "postgresql" and the PostgreSQL version that introduced CTEs also introduced windowing functions. Hence my comment (I did see it's that old - and PG 8.3 did have neither) –  a_horse_with_no_name Dec 7 '12 at 21:01      The post mentions 8.3.5, and I believe they were introduced in 8.4. Besides: it is also good to know about alternative scenarios, IMHO. –  wildplasser Dec 7 '12 at 21:03      That's exactly what I mean: 8.3 neither had CTEs nor windowing functions. So the first solution won't work on 8.3 –  a_horse_with_no_name Dec 7 '12 at 21:05  |  show more comment

Since v9.3 you can do a lateral join

select distinct t_outer.section_id, t_top.id, t_top.name from t t_outer
join lateral (
    select * from t t_inner
    where t_inner.section_id = t_outer.section_id
    order by t_inner.name
    limit 2
) t_top on true
order by t_outer.section_id;

It might be faster but, of course, you should test performance specifically on your data and use case.


|
  this answer
answered Jun 16 '16 at 14:25 poshest 2,078 1 13 25      I does not really work. I get exactly the same output when I execute rest of you query excluding the lateral join part, i.e. taking only the first and last lines of your query and executing them. These two following return me the same result set: select distinct a.customer_name, a.order_date from sales a join lateral ( select * from sales b where b.customer_name = a.customer_name order by order_date desc limit 3 ) c on true order by a.customer_name; and select distinct customer_name, order_date from sales order by customer_name; –  NurShomik Oct 19 '16 at 18:37      Oh, my mistake. It worked, I had to use c.order_date instead of a.order_date. –  NurShomik Oct 19 '16 at 18:53

 | 


相关阅读排行


相关内容推荐

最新文章

×

×

请激活账号

为了能正常使用评论、编辑功能及以后陆续为用户提供的其他产品,请激活账号。

您的注册邮箱: 修改

重新发送激活邮件 进入我的邮箱

如果您没有收到激活邮件,请注意检查垃圾箱。