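PostgreSQL from Beginner to Expert - Lecture 29: Execution Plans and Cost Estimation

2023-10-11 Author: gth163 Source: www.gth163.com

"PostgreSQL from Beginner to Expert" is a tutorial series that starts from the basics and builds up step by step. It covers an introduction to PG, installation and use, roles and privileges, maintenance and administration, and more. We hope it helps everyone who likes PG and is learning it. Follow the CUUG PG technical lectures for further installments.

Lecture 29: Execution Plans and Cost Estimation

Topic 1: Query processing in PostgreSQL

Topic 2: Cost estimation for a full table scan

Topic 3: Cost estimation for an index scan

· The five stages of SQL statement execution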

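Parser

The first subsystem. The parser reads an SQL statement in plain text and generates a parse tree from it.

Analyzer/Analyser

The analyzer performs a semantic analysis of the parse tree and generates a query tree.

Rewriter

The rewriter implements the rule system. When necessary, it transforms the query tree according to the rules stored in the system catalog (the pg_rewrite catalog, which the pg_rules view presents in readable form).

Views in PostgreSQL are implemented through the rule system. When a view is defined with the CREATE VIEW command, the corresponding rule is generated automatically and stored in the catalog.

Suppose the following view has been defined, so that its rule is already stored in the catalog: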

CREATE VIEW employees_list
    AS SELECT e.id, e.name, d.name AS department
       FROM employees AS e, departments AS d
       WHERE e.department_id = d.id;

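Planner and Executor

The planner receives the query tree from the rewriter and generates the (query) plan tree that the executor can process most efficiently.

The pg_hint_plan tool

PostgreSQL does not support planner hints in SQL, and the project does not intend to add them. To use hints in queries, consider the pg_hint_plan extension.

Execution plans

· EXPLAIN displays the execution plan of an SQL statement

As in other RDBMSs, the EXPLAIN command in PostgreSQL displays the plan tree itself.

For example: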

testdb=# EXPLAIN SELECT * FROM tbl_a WHERE id < 300 ORDER BY data;
                          QUERY PLAN
---------------------------------------------------------------
 Sort  (cost=182.34..183.09 rows=300 width=8)
   Sort Key: data
   ->  Seq Scan on tbl_a  (cost=0.00..170.00 rows=300 width=8)
         Filter: (id < 300)
(4 rows)

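[Figure: the relationship among the executor, the buffer manager, and temporary files during execution]

Query cost estimation

· Cost estimation in queries

Query optimization in PostgreSQL is cost-based. Costs are dimensionless values: they are not absolute measures of performance, but indicators for comparing the relative performance of operations.

Every operation performed by the executor has a corresponding cost function.

There are three kinds of cost: start-up, run, and total.

The start-up cost is the cost spent before the first tuple is fetched. For example, the start-up cost of an index scan node is the cost of reading the index pages needed to reach the first tuple in the target table.

The run cost is the cost of fetching all the tuples.

The total cost is the sum of the start-up cost and the run cost.

The EXPLAIN command shows the start-up cost and the total cost of each operation. The simplest example: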

testdb=# EXPLAIN SELECT * FROM tbl;
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on tbl  (cost=0.00..145.00 rows=10000 width=8)

Line 4 of the output shows the information about the sequential scan. The cost section contains two values, 0.00 and 145.00: here the start-up cost is 0.00 and the total cost is 145.00.
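To make the layout of the cost section concrete, here is a minimal Python sketch that pulls the start-up cost, total cost, row estimate and width out of one plan line. The function name `parse_plan_line` and the regular expression are illustrative assumptions, not part of PostgreSQL; the sketch only handles the plain text format shown in this article.

```python
import re

# Matches the "(cost=START..TOTAL rows=N width=W)" part of an EXPLAIN line.
COST_RE = re.compile(
    r"\(cost=(?P<startup>\d+\.\d+)\.\.(?P<total>\d+\.\d+)"
    r" rows=(?P<rows>\d+) width=(?P<width>\d+)\)"
)

def parse_plan_line(line):
    """Return the cost fields of one EXPLAIN plan line as a dict (hypothetical helper)."""
    m = COST_RE.search(line)
    if m is None:
        raise ValueError("no cost section found")
    startup = float(m.group("startup"))
    total = float(m.group("total"))
    return {
        "startup": startup,
        "total": total,
        "run": total - startup,  # because total = start-up + run
        "rows": int(m.group("rows")),
        "width": int(m.group("width")),
    }

node = parse_plan_line("Seq Scan on tbl  (cost=0.00..145.00 rows=10000 width=8)")
print(node["startup"], node["total"], node["rows"])
```

The run cost is not printed by EXPLAIN; the sketch recovers it as the difference between the two displayed values.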

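Query cost estimation: sequential scan

· Sequential scan cost estimation

The cost of a sequential scan is estimated by the cost_seqscan() function. We will look at how the sequential scan cost of the following query is estimated:

testdb=# SELECT * FROM tbl WHERE id < 8000;

In a sequential scan the start-up cost is 0, and the run cost is defined by the following equation:

'run cost' = (cpu_tuple_cost + cpu_operator_cost) × N_tuple + seq_page_cost × N_page

Here seq_page_cost, cpu_tuple_cost and cpu_operator_cost are configuration parameters whose default values are 1.0, 0.01 and 0.0025, and N_tuple and N_page are the numbers of tuples and pages of the table.

Look up the number of tuples and pages of the target table: N_tuple = 10000 (1) and N_page = 45 (2).

From (1) and (2):

'run cost' = (0.01 + 0.0025) × 10000 + 1.0 × 45 = 170.0

Total cost:

'total cost' = 0.0 + 170.0 = 170.0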
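The arithmetic above can be replayed with a short Python sketch. It assumes the PostgreSQL default cost constants and a WHERE clause with a single operator; the function name `seq_scan_cost` and the `has_filter` flag are illustrative, and the real cost_seqscan() handles more cases than this.

```python
# PostgreSQL default planner constants (assumed unchanged here).
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025

def seq_scan_cost(n_tuple, n_page, has_filter=True):
    """Return (startup, run, total) for a sequential scan (simplified sketch).

    has_filter adds cpu_operator_cost per tuple for one WHERE condition.
    """
    per_tuple = CPU_TUPLE_COST + (CPU_OPERATOR_COST if has_filter else 0.0)
    cpu_run = per_tuple * n_tuple          # cost of processing every tuple
    disk_run = SEQ_PAGE_COST * n_page      # cost of reading every page in order
    startup = 0.0
    run = cpu_run + disk_run
    return startup, run, startup + run

print(seq_scan_cost(10000, 45))                    # scan with a filter
print(seq_scan_cost(10000, 45, has_filter=False))  # plain SELECT * FROM tbl
```

Up to floating-point rounding, the first call gives a total of 170.0 and the second 145.0, the value that EXPLAIN reported for the unfiltered scan of tbl.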

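· Index scan cost estimation

For the following query, we estimate the cost of accessing the table through the index:

testdb=# SELECT id, data FROM tbl WHERE data < 240;

First, look up the number of tuples and pages of the index, N_(index,tuple) and N_(index,page): N_(index,tuple) = 10000 (3) and N_(index,page) = 30 (4).

Start-up cost formula:

'start-up cost' = {ceil(log2(N_(index,tuple))) + (H_index + 1) × 50} × cpu_operator_cost

H_index is the height of the index tree.

Start-up cost, with H_index = 1:

'start-up cost' = {ceil(log2(10000)) + (1 + 1) × 50} × 0.0025 = 0.285 (5)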
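As a rough check of the start-up cost, the following Python sketch evaluates the B-tree descent formula {ceil(log2(N_index_tuple)) + (H_index + 1) × 50} × cpu_operator_cost. The function name is illustrative, and H_index = 1 is an assumption: it is the height that reproduces the value 0.285 used for this example.

```python
import math

CPU_OPERATOR_COST = 0.0025  # PostgreSQL default

def index_startup_cost(n_index_tuple, h_index):
    """Start-up cost of a B-tree index scan (sketch of the descent cost)."""
    comparisons = math.ceil(math.log2(n_index_tuple))  # binary-search comparisons
    per_level = (h_index + 1) * 50                      # fixed charge per page visited
    return (comparisons + per_level) * CPU_OPERATOR_COST

# 10000 index tuples, assumed tree height 1
print(round(index_startup_cost(10000, 1), 3))  # prints 0.285
```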

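Run cost formula:

The run cost of an index scan is the sum of the CPU costs and the IO (input/output) costs of both the table and the index:

'run cost' = ('index cpu cost' + 'table cpu cost') + ('index IO cost' + 'table IO cost')

The first three costs (index cpu cost, table cpu cost, and index IO cost) are calculated as follows:

'index cpu cost' = Selectivity × N_(index,tuple) × (cpu_index_tuple_cost + qual_op_cost)

'table cpu cost' = Selectivity × N_tuple × cpu_tuple_cost

'index IO cost' = ceil(Selectivity × N_(index,page)) × random_page_cost

Here cpu_index_tuple_cost and random_page_cost are configuration parameters (defaults 0.005 and 4.0), qual_op_cost is the cost of evaluating the index condition (0.0025 in this example), and Selectivity is the proportion of the index that the WHERE clause searches, a value between 0 and 1.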

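· Selectivity

The MCV (Most Common Value) statistics of every column are stored in the pg_stats view as a pair of columns, most_common_vals and most_common_freqs.

most_common_vals is a list of the most common values (MCVs) in the column.

most_common_freqs is a list of the frequencies of those MCVs.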

mydb=# \x
Expanded display is on.
mydb=# SELECT most_common_vals, most_common_freqs
         FROM pg_stats
        WHERE tablename = 'countries' AND attname='continent';
-[ RECORD 1 ]-----+---------------------------------------------------------------------
most_common_vals  | {Africa,Europe,Asia,"North America",Oceania,"South America"}
most_common_freqs | {0.2746114,0.24352331,0.22797927,0.119170986,0.07253886,0.062176164}

Consider the following query, whose WHERE clause is continent = 'Asia':

testdb=# SELECT * FROM countries WHERE continent = 'Asia';

The stored frequencies can be compared with the actual share of each continent in the table:

testdb=# SELECT continent, count(*) AS "number of countries",
                (count(*)/(SELECT count(*) FROM countries)::real) AS "number of countries / all countries"
           FROM countries GROUP BY continent ORDER BY "number of countries" DESC;

   continent   | number of countries | number of countries / all countries
---------------+---------------------+-------------------------------------
 Africa        |                  53 |                 0.27461139896373055
 Europe        |                  47 |                 0.24352331606217617
 Asia          |                  44 |                 0.22797927461139897
 North America |                  23 |                 0.11917098445595854
 Oceania       |                  14 |                 0.07253886010362694
 South America |                  12 |                 0.06217616580310881
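The agreement between the statistics and the table can be verified by hand. The Python sketch below copies the counts and the MCV arrays from the two query results above and compares the stored frequency for 'Asia' with the counted fraction; the helper name `mcv_selectivity` is an illustrative assumption, not a PostgreSQL function.

```python
# Country counts per continent, copied from the query result above.
counts = {
    "Africa": 53, "Europe": 47, "Asia": 44,
    "North America": 23, "Oceania": 14, "South America": 12,
}
# MCV statistics as reported by pg_stats.
most_common_vals = ["Africa", "Europe", "Asia",
                    "North America", "Oceania", "South America"]
most_common_freqs = [0.2746114, 0.24352331, 0.22797927,
                     0.119170986, 0.07253886, 0.062176164]

def mcv_selectivity(value):
    """Selectivity of `column = value`, or None if value is not an MCV."""
    mcv = dict(zip(most_common_vals, most_common_freqs))
    return mcv.get(value)

total = sum(counts.values())          # number of rows in countries
actual = counts["Asia"] / total       # fraction counted from the table
estimated = mcv_selectivity("Asia")   # fraction stored in the statistics
print(total, round(actual, 6), round(estimated, 6))  # prints 193 0.227979 0.227979
```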

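Summary:

The frequency in most_common_freqs that corresponds to 'Asia' is 0.227979, so 0.227979 is used as the selectivity in this estimate.

When the values of a column are highly distinct and the MCV cannot be used, the histogram_bounds of the target column is used to estimate the cost.

· histogram_bounds

histogram_bounds is a list of values that divide the column's values into groups of approximately equal population.

· Buckets and histogram_bounds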

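testdb=# SELECT histogram_bounds
           FROM pg_stats
          WHERE tablename = 'tbl' AND attname = 'data';

By default, the histogram is divided into 100 buckets. The query above returns the bounds of the buckets for this example. Buckets are numbered from 0, and every bucket holds (approximately) the same number of tuples. The values in histogram_bounds are the boundaries of the corresponding buckets. For example, the 0th value of histogram_bounds is 1, which means that 1 is the minimum value of the tuples stored in bucket_0; the 1st value is 100, the minimum value of the tuples stored in bucket_1; and so on.

· Selectivity

Selectivity of WHERE data < 240: the value 240 falls in bucket_2, and linear interpolation within that bucket gives

Selectivity = (2 + (240 - hb[2]) / (hb[3] - hb[2])) / 100 = 0.024 (6)

where hb[i] is the i-th value of histogram_bounds.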
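The interpolation can be sketched in a few lines of Python. The bounds list below is an assumption (1, 100, 200, ..., 10000, which is consistent with the 0th and 1st values quoted above), and `histogram_selectivity` is an illustrative helper; PostgreSQL's own estimator is more elaborate than this.

```python
from bisect import bisect_right

def histogram_selectivity(bounds, value):
    """Estimate the fraction of rows with column < value.

    bounds holds len(bounds) - 1 buckets of roughly equal population.
    """
    n_buckets = len(bounds) - 1
    if value <= bounds[0]:
        return 0.0
    if value >= bounds[-1]:
        return 1.0
    i = bisect_right(bounds, value) - 1      # index of the bucket containing value
    lo, hi = bounds[i], bounds[i + 1]
    fraction = (value - lo) / (hi - lo)       # position of value inside that bucket
    return (i + fraction) / n_buckets

# Assumed bounds: 1, 100, 200, ..., 10000 (101 values, 100 buckets).
bounds = [1] + [100 * k for k in range(1, 101)]
print(round(histogram_selectivity(bounds, 240), 3))  # prints 0.024
```

Two full buckets lie below 240, and 240 sits 40% of the way through bucket_2, so the estimate is (2 + 0.4) / 100.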

From (1), (3), (4) and (6), the index cpu cost, table cpu cost and index IO cost are:

'index cpu cost' = 0.024 × 10000 × (0.005 + 0.0025) = 1.8 (7)

'table cpu cost' = 0.024 × 10000 × 0.01 = 2.4 (8)

'index IO cost' = ceil(0.024 × 30) × 4.0 = 4.0 (9)

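The table IO cost is calculated as:

'table IO cost' = max_IO_cost + indexCorrelation^2 × (min_IO_cost - max_IO_cost)

max_IO_cost is the IO cost in the worst case, in which every page of the table is read by random access:

max_IO_cost = N_page × random_page_cost = 45 × 4.0 = 180.0 (10)

min_IO_cost is the IO cost in the best case, in which the selected pages are read sequentially:

min_IO_cost = 1 × random_page_cost + (ceil(Selectivity × N_page) - 1) × seq_page_cost = 1 × 4.0 + (ceil(0.024 × 45) - 1) × 1.0 = 5.0 (11)

· indexCorrelation

In this example:

indexCorrelation = 1.0 (12)

From (10), (11) and (12):

'table IO cost' = 180.0 + 1.0^2 × (5.0 - 180.0) = 5.0 (13)

From (7), (8), (9) and (13), the run cost is:

'run cost' = (1.8 + 2.4) + (4.0 + 5.0) = 13.2 (14)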
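Putting the pieces together, the run cost of the index scan can be replayed with the Python sketch below. It assumes the PostgreSQL default cost constants and a qual cost equal to cpu_operator_cost; the function name and parameter names are illustrative. The loop also shows how strongly the result depends on indexCorrelation.

```python
import math

# PostgreSQL default planner constants (assumed unchanged).
SEQ_PAGE_COST = 1.0
RANDOM_PAGE_COST = 4.0
CPU_TUPLE_COST = 0.01
CPU_INDEX_TUPLE_COST = 0.005
CPU_OPERATOR_COST = 0.0025

def index_scan_run_cost(selectivity, n_tuple, n_page,
                        n_index_tuple, n_index_page, correlation):
    """Run cost of an index scan: CPU and IO cost of index and table (sketch)."""
    index_cpu = selectivity * n_index_tuple * (CPU_INDEX_TUPLE_COST + CPU_OPERATOR_COST)
    table_cpu = selectivity * n_tuple * CPU_TUPLE_COST
    index_io = math.ceil(selectivity * n_index_page) * RANDOM_PAGE_COST
    # Worst case: every table page is fetched by random access.
    max_io = n_page * RANDOM_PAGE_COST
    # Best case: one random read, then the remaining selected pages in sequence.
    min_io = RANDOM_PAGE_COST + (math.ceil(selectivity * n_page) - 1) * SEQ_PAGE_COST
    table_io = max_io + correlation ** 2 * (min_io - max_io)
    return (index_cpu + table_cpu) + (index_io + table_io)

for corr in (1.0, 0.125874, 0.0):
    cost = index_scan_run_cost(0.024, 10000, 45, 10000, 30, corr)
    print(corr, round(cost, 3))
```

With a correlation of 1.0 the sketch gives a run cost of 13.2; with a correlation of 0.0 the table IO cost rises to its worst case and the run cost becomes 188.2.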

· Querying the indexCorrelation of table columns

testdb=# \d tbl_corr
    Table "public.tbl_corr"
  Column  |  Type   | Modifiers
----------+---------+-----------
 col      | text    |
 col_asc  | integer |
 col_desc | integer |
 col_rand | integer |
 data     | text    |
Indexes:
    "tbl_corr_asc_idx" btree (col_asc)
    "tbl_corr_desc_idx" btree (col_desc)
    "tbl_corr_rand_idx" btree (col_rand)

testdb=# select * from tbl_corr;
   col    | col_asc | col_desc | col_rand | data
----------+---------+----------+----------+------
 Tuple_1  |       1 |       12 |        3 |
 Tuple_2  |       2 |       11 |        8 |
 Tuple_3  |       3 |       10 |        5 |
 Tuple_4  |       4 |        9 |        9 |
 Tuple_5  |       5 |        8 |        7 |
 Tuple_6  |       6 |        7 |        2 |
 Tuple_7  |       7 |        6 |       10 |
 Tuple_8  |       8 |        5 |       11 |
 Tuple_9  |       9 |        4 |        4 |
 Tuple_10 |      10 |        3 |        1 |
 Tuple_11 |      11 |        2 |       12 |
 Tuple_12 |      12 |        1 |        6 |
(12 rows)

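indexCorrelation is the statistical correlation between the physical order of the rows and the logical order of the column values. It ranges from -1 to +1 and is stored in the correlation column of pg_stats: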

testdb=# SELECT tablename, attname, correlation FROM pg_stats
          WHERE tablename = 'tbl_corr';
 tablename | attname  | correlation
-----------+----------+-------------
 tbl_corr  | col_asc  |           1
 tbl_corr  | col_desc |          -1
 tbl_corr  | col_rand |    0.125874
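The three values can be reproduced from the twelve rows listed for tbl_corr. The Python sketch below computes a plain Pearson correlation between each row's physical position (1 to 12) and its column value. This is an illustration under the assumption that the rows are stored in the order shown; PostgreSQL derives the statistic from a sample taken by ANALYZE, which for a table this small is the whole table.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

position = list(range(1, 13))              # physical order of Tuple_1 .. Tuple_12
col_asc = list(range(1, 13))
col_desc = list(range(12, 0, -1))
col_rand = [3, 8, 5, 9, 7, 2, 10, 11, 4, 1, 12, 6]

for name, col in (("col_asc", col_asc), ("col_desc", col_desc), ("col_rand", col_rand)):
    print(name, round(pearson(position, col), 6))
```

For col_rand the covariance sum is 18 and both variance sums are 143, so the coefficient is 18/143, which rounds to the 0.125874 reported by pg_stats.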

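· Total cost

From (5) and (14), the total cost of accessing the table through the index is:

(5): the start-up cost

(14): the run cost of the access through the index

'total cost' = 0.285 + 13.2 = 13.485 (15)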

testdb=# EXPLAIN SELECT id, data FROM tbl WHERE data < 240;
                                QUERY PLAN
---------------------------------------------------------------------------
 Index Scan using tbl_data_idx on tbl  (cost=0.29..13.49 rows=240 width=8)
   Index Cond: (data < 240)

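· Parameters: seq_page_cost and random_page_cost

The default values assume an HDD, where a random page read is taken to cost four times as much as a sequential one. On an SSD the two costs are close, so random_page_cost can be lowered.

HDD:

seq_page_cost=1.0

random_page_cost=4.0

SSD:

seq_page_cost=1.0

random_page_cost=1.0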

Query cost estimation: sorting

· Sort

The sort cost formulas are applied to the following query:

testdb=# SELECT id, data FROM tbl WHERE data < 240 ORDER BY id;

· Sort cost estimation

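-->> For the remaining material, contact CUUG customer service.

That is all for Lecture 29 of "PostgreSQL from Beginner to Expert", Execution Plans and Cost Estimation. You are welcome to join the group to discuss it.

Group: 35822460 (with videos available only to group members)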
