A couple of days ago someone posted a question on the forum which at the first glance seemed old, boring, beaten up and down. Here it is:
There is a news feed. All news are divided into 10 categories (Politics, sport, auto, real estate, etc).
I need to get top 4 news sorted by time descending for each category with 1 query.
If you sort the results – you get 4 politics news, then 4 sport news etc.
But the task was to make it optimal, and the standard solution with usual TopN using row_number can not be called optimal in any way, especially in case of big tables, relatively small number of categories and uneven distribution or just overall low selectivity.
So after getting several more or less acceptable variants, and spotting the solution using PostgreSQL (I didn’t dig deep into it, as I saw recursion, min and predicate, and that was enough), I got a great variant.
But first things first:
1. Getting distinct values from the index Continue reading


