top of page
  • Writer's picturekyle Hailey

Exploring GCP's Postgres Throughput: Insights from Explain Analyze

Was wondering what the the throughput was on Cloud SQL so ran some tests.

I have about 20 partitions ranging from 10-100GB.

I iterated through them without parallel query, then with parallel 2 and parallel 8.

I also ran each twice in row so that the second runs were generally cached (the 100GB partition did not get cached but others did)



  • 0-10% data cached

    • 200MB/s

      • no matter the degree of parallelism is around 200MB/s ( I did a test with parallel degree 64 and got up to 500MB/s but I lack detail on these tests now)

  • 100% data cached

    • 1-2GB/s singled thread

    • 4-6GB/s 2 PQ processes

    • 14-20GB/s 8 PQ processes

On 96 vCPU 624 GB RAM Postgres Cloud SQL

Partitions are 20-100GB


I run a full table scan on foo using un-index column "bar". Not sure if it matters but "bar" is the last column in table.

explain (analyze, buffers)  
select ' foo_23_04_25'  , max(bar) from  foobar_23_04_25;

output looks like

              QUERY PLAN                                                                       
 Finalize Aggregate  (cost=2487057.14..2487057.15 rows=1 width=36) (actual time=91460.903..91464.937 rows=1 loops=1)
   Buffers: shared hit=103064 read=2364857 dirtied=40
   I/O Timings: read=802683.170
   ->  Gather  (cost=2487056.31..2487057.12 rows=8 width=4) (actual time=91460.311..91464.926 rows=9 loops=1)
         Workers Planned: 8
         Workers Launched: 8
         Buffers: shared hit=103064 read=2364857 dirtied=40
         I/O Timings: read=802683.170
         ->  Partial Aggregate  (cost=2486056.31..2486056.32 rows=1 width=4) (actual time=91450.332..91450.334 rows=1 loops=9)
               Buffers: shared hit=103064 read=2364857 dirtied=40
               I/O Timings: read=802683.170
               ->  Parallel Seq Scan on foobar_23_04_08  (cost=0.00..2482429.25 rows=1450825 width=4) (actual time=0.531..90752.789 rows=1202620 loops=9)
                     Buffers: shared hit=103064 read=2364857 dirtied=40
                     I/O Timings: read=802683.170
   Buffers: shared hit=125
 Planning Time: 0.849 ms
 Execution Time: 91465.048 ms

The I dump all the outputs in one file and parse the file with

NOTE: the script was written by ChatGPT! I consider it a hack. I have only actually read about half of it. My goal was to get it to run as fast as possible. If I find it long term useful, I'll look into expanding it.

Topology Schema Map: One thing I really want to do is process a batch of SQL and make like a topology map of which tables are used most and which tables those tables are joined to the most often. The topology map is useful for many things. One is being able to see what parts of the schema are separate and what parts are used together. This can be powerful when it comes to separating a monolithic schema into small schemas.

Heat map field usage: Similarly it would be cool to have some sort of heat map on columns in a table, i.e how often they are referenced (and if they are never referenced)

Visual SQL Tuning: Visual SQL Tuning based on Dan Tow's SQL tuning method.

It's super cool and would love to did again for a web interface . I implemented it at Embarcadero with clunky fat client. It was awesome. Would love to implement it in a web interface or even a cool simple tool like PASH-veiwer

GraphGPT looks like possible way to easily implement it in a lossy way which may or may not be accepetable depending on the situation.

I called the parsing file "" because I created this parser after using depesz on a bunch of traces and realized needed batch processing.

The the lines starting with "foo" the object name are for human reading.

The last 4 lines are to drop into a spread sheet representing the % data cached and the MB/s

Full parsed data:

97 views0 comments


bottom of page