raven/devdocs/specs/core-performance.txt

PERFORMANCE SPECS AND USEFUL INFO
//this is handy:
http://okigiveup.net/what-postgresql-tells-you-about-its-performance/
Useful queries that indicate how indexes are being used in PostgreSQL
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
2020-05-21 THIS SUPERSEDES BELOW
HOW TO FIND SHITTY INDEXES: https://gist.github.com/jberkus/6b1bcaf7724dfc2a54f3
=-=-=-=-
OK: this is the best query to use to find unused indexes.
Run it and look for any indexes that are *not* primary keys (names ending in pkey; I want the ones ending in idx, which are mine)
that have zero in idx_scan: that means they are completely unused (provided a lot of full test runs that exercise all routes and features have happened prior to checking).
SELECT * FROM pg_stat_user_indexes order by idx_scan asc
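A variant of the same query that jumps straight to the suspects; the LIKE pattern is an assumption based on the pkey/idx naming convention described above:

```sql
-- Indexes with zero scans that are not primary keys.
-- Naming convention assumed: my indexes end in idx, primary keys end in pkey.
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%pkey'
ORDER BY relname, indexrelname;
```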
The following queries might be useful, but for different purposes...
This is a test query I used when analyzing widget and name-fetching performance:
explain analyze SELECT m.name
FROM awidget AS m
WHERE m.id = 12989
LIMIT 1
//All index data collected by postgresql
select * from pg_stat_user_indexes
Reveals Unused indices
=-=-=-=-=-=-=-=-=-=-=-
SELECT * FROM pg_stat_user_indexes order by idx_scan asc
The basic technique is to look at pg_stat_user_indexes for rows where idx_scan,
the count of how many times that index has been used to answer queries, is zero or at least very low.
Detecting missing indexes
=-=-=-=-=-=-=-=-=-=-=-=-=
SELECT relname, seq_scan, seq_tup_read,
       idx_scan, idx_tup_fetch,
       seq_tup_read / seq_scan AS avg_seq_tup_read
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC;
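When this surfaces a table doing heavy sequential scans, the usual fix is an index on the columns the hot queries filter on. A sketch only, reusing the awidget table from the test query above; the index name and column choice are assumptions:

```sql
-- Hypothetical fix for a table showing a high seq_tup_read / seq_scan ratio.
-- CONCURRENTLY builds the index without blocking writes to the table.
CREATE INDEX CONCURRENTLY awidget_id_name_idx ON awidget (id, name);
```

After creating it, re-run the stats queries (and the workload) to confirm idx_scan starts climbing and seq_scan stops.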
Shows info on all indices
=-=-=-=-=-=-=-=-=-=-=-=-=-
SELECT
    t.tablename,
    indexname,
    c.reltuples AS num_rows,
    pg_size_pretty(pg_relation_size(quote_ident(t.tablename)::text)) AS table_size,
    pg_size_pretty(pg_relation_size(quote_ident(indexrelname)::text)) AS index_size,
    CASE WHEN indisunique THEN 'Y' ELSE 'N' END AS UNIQUE,
    idx_scan AS number_of_scans,
    idx_tup_read AS tuples_read,
    idx_tup_fetch AS tuples_fetched
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename = c.relname
LEFT OUTER JOIN (
    SELECT c.relname AS ctablename, ipg.relname AS indexname,
           x.indnatts AS number_of_columns,
           idx_scan, idx_tup_read, idx_tup_fetch,
           indexrelname, indisunique
    FROM pg_index x
    JOIN pg_class c ON c.oid = x.indrelid
    JOIN pg_class ipg ON ipg.oid = x.indexrelid
    JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid
) AS foo ON t.tablename = foo.ctablename
WHERE t.schemaname = 'public'
ORDER BY 7, 1, 2;
Show performance of indices that are being used
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
SELECT indexrelname,
       cast(idx_tup_read AS numeric) / idx_scan AS avg_tuples,
       idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE idx_scan > 0;
WORK IN PROGRESS:
=-=-=-=-=-=-
Search result list NAME FETCHER:
//Before attempt to optimize name fetcher (unknown number of results)
//22548, 21187, 20462, 22336, 20094 - AVG = 21325
14244 results with index scan: 24141, 29549, 23366, 24085, 23335 AVG: 24895 = 1.7ms per result
Removed index but kept data:
14244 results without index scan: 23391, 22623, 21428, 22607, 23106 ANOMALOUS, disregarding
### 14244 results without index scan (after a restart of server): 24124, 21157, 21178, 21187, 21932 AVG: 21915 = 1.53 per result #####
14244 results without index scan (after a restart of server and using a fresh aycontext for each query): 32336, 31794...clearly much slower, abandoning this avenue
14244 results without index scan (after a restart of server and using AsNoTracking for each query): 24625, 21387, 21905, 22190 ... not a dramatic difference; keeping the no-tracking code in as it makes sense, but need to look elsewhere
14244 results without index scan (after a restart of server and bypassing EF entirely with a direct query INITIAL NAIVE ATTEMPT): 13955, 13365, 13421, 13445, 13271
### 14244 results without index scan (after a restart of server and bypassing EF entirely with a direct query OPTIMIZED TO REUSE CONNECTION): 12707, 12341, 12733, 12487, 12452 AVG: 12,544 = .88ms per result ####
Now I'm going to try it with the index put back in and data regenerated
### 14244 results with index in place (after a restart of server and bypassing EF entirely with a direct query OPTIMIZED TO REUSE CONNECTION): 11229, 15480, 13763, 13051, 13178 AVG: 13,340 = .936 per result
Now fresh test but without the index being created
### 14244 results with index in place (after a restart of server, and bypassing EF entirely with a direct query OPTIMIZED TO REUSE CONNECTION): 14270 results - 13176, 12688, 13179, 12994, 12272 AVG: 12,861 = .90 per result
index put back in and data regenerated
### 14255 results with index in place (after a restart of server and bypassing EF entirely with a direct query OPTIMIZED TO REUSE CONNECTION): 12461, 12040, 11171, 11141, 11214 AVG: 11605 = .81 per result
OK, this tells me that it's faster with the index in place and intuitively that just makes sense.
Also verified it's actually using the index scan instead of table scan.
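That verification is just EXPLAIN on the hot query; a sketch reusing the test query from earlier in these notes (the index name shown in the plan will be whatever index exists on awidget):

```sql
EXPLAIN ANALYZE
SELECT m.name
FROM awidget AS m
WHERE m.id = 12989
LIMIT 1;
-- An "Index Scan using ... on awidget" node means the index is being used;
-- a "Seq Scan on awidget" node means it is not.
```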
I'm going to enact a policy of indexing (id, name) on all objects that have many columns; if they only have a name and id and not much else, there seems little benefit.
### results ("final" id,name indexes on user table and widget table, freshly generated data), 14202 RESULTS: 13295, 14502, 11774, 12521, 12101, 13169 AVG: 12,893 = .91 per result
OK, it just makes logical sense to keep the indexes even if slightly slower; I can revisit this later, the difference is minuscule. I suspect that with a bigger database there would definitely be better performance.
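The per-result figures above are just the mean of the elapsed-ms samples divided by the result count; a quick sketch of that arithmetic using the samples from the reused-connection run recorded above:

```python
# Recompute avg and per-result cost from the raw timings noted above.
# Samples are total elapsed ms for the 14244-result name-fetch run
# (direct query bypassing EF, reused connection).
samples = [12707, 12341, 12733, 12487, 12452]
result_count = 14244

avg_ms = sum(samples) / len(samples)     # mean of the five runs
per_result_ms = avg_ms / result_count    # cost per fetched name

print(avg_ms, round(per_result_ms, 2))   # 12544.0 0.88
```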
=-=-=-=-=-=-=-=-=-=-=-=-
LOCALE ITEM
Before running multiple subset fetches:
pg_stat_user_indexes rows (relid, indexrelid, schemaname, relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch):
256287 256301 "public" "alocaleitem" "alocaleitem_localeid_key_idx" "2350" "13667" "5172"
256287 256302 "public" "alocaleitem" "alocaleitem_localeid_key_display_idx" "2618" "2618" "2618"
256287 256294 "public" "alocaleitem" "alocaleitem_pkey" "7085" "7090" "7085"
pg_stat_user_tables row (relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch, seq_tup_read/seq_scan):
"alocaleitem" "24" "133010" "12053" "14875" "5542"
After I run a bunch of fetch-subset tests over and over:
256287 256301 "public" "alocaleitem" "alocaleitem_localeid_key_idx" "2398" "13715" "5220"
256287 256302 "public" "alocaleitem" "alocaleitem_localeid_key_display_idx" "2618" "2618" "2618"
256287 256294 "public" "alocaleitem" "alocaleitem_pkey" "7085" "7090" "7085"
pg_stat_user_tables row (seq_tup_read unchanged, idx_scan climbing):
"alocaleitem" "24" "133010" "12073" "14895" "5542"
More runs:
"alocaleitem" "24" "133010" "12141" "14963" "5542"
256287 256301 "public" "alocaleitem" "alocaleitem_localeid_key_idx" "2438" "13755" "5260"
256287 256302 "public" "alocaleitem" "alocaleitem_localeid_key_display_idx" "2618" "2618" "2618"
256287 256294 "public" "alocaleitem" "alocaleitem_pkey" "7085" "7090" "7085"
Fresh regen; removed alocaleitem_localeid_key_idx, kept alocaleitem_localeid_key_display_idx.
Before any test runs fetching subsets:
256514 256521 "public" "alocaleitem" "alocaleitem_pkey" "0" "0" "0"
256514 256528 "public" "alocaleitem" "alocaleitem_localeid_key_display_idx" "1667" "1667" "1667"