How does one check if there are missing indexes in a database?
Is there a programatic way to do it?
Explain Plan Most blogs on the subject will say look at the explain plan for your query. Basic concept is to look for full table scans in the explain plan and the look and see if there is a predicate filter (i.e. SQL text of form “where table.field=value”) with a full table scan or a full table scan in a join ( i.e. tableA.field = tableB.field). There are other considerations, things to check, and other places for indexes, but this is a good start.
Explain Plans for whole database What if you don’t have a particular query and instead want to check the whole running database? Is there a programatic way to do it? If the answer to finding missing indexes for one query is looking at the explain plan then by extension the answer for the whole database is looking at the explain plans for the whole database.
Execution Plan Cache On Oracle and SQL Server the databases have a cache of execution plans (the plans actually used by the optimizer as opposed to the plan the optimizer thinks it would use the “explain” plan. Execution plans are from executions and explain plans from the explain command without executing the statement. They should be the same but for many reason the might be different). Thus on Oracle (V$SQL_PLAN) and SQL Server ( sys.dm_exec_query_plan) the execution plans can be programmatically queried and analyzed.
Explain plan on SQL from SQL cache What about PostgreSQL and MySQL. Neither have a query plan cache. Both PostgreSQL (pg_stat_statements) and MySQL (events_statements_history) do have the text for executed queries. With the SQL queries one can programmatically collect the SQL queries and run the “explain” command and get the explain plan. Another issue is getting the full SQL text. Without the full SQL text, the Explain plan can’t be run. On MySQL the SQL text length is limited by default to 1K but can be adjusted with parameter performance_schema_max_sql_text_length system variable at server startup. Postgres is also limited to 1K by default can be increased with the parameter track_activity_query_size.
Explain Analyze For select statements (i.e. queries that don’t modify data) one could even consider running “explain analyze” which both PostgreSQL and MySQL have. Explain Analyze which runs the query to give the execution plan as well as the estimated costs and the actual execution statistics which is useful as we will discuss later.
Explain on SQL with bind variables Unfortunately the Explain command breaks with SQL that have variable parameters. In order to get Explain to work requires substituting real values for variables.
Bind variable capture Bind variables can be captured on PostgreSQL with extensions like PG_qualstats. PG_qualstats also captures other information that is perfect for finding missing indexes. Bind variable capture is not possible on MySQL even with slow query log.
So on MySQL without execution plans caches and without bind variables we can’t programatic index advisor on MySQL.