分層查詢 SQL 效能故障這是避免坑的可靠指南！

關於作者：

江，雲曲網路科技聯合創始人，Oracle ACE，11G OCM，擁有多年的Oracle設計、管理和實施經驗，精通資料庫優化、Oracle CBO和並行原理。雲曲鷹眼監控核心設計師和開發者，資深python Web開發者。

最近，我們在分層查詢 SQL 中頻繁遇到效能問題，基於歷史故障案例，我們總結了 Connect By 場景中常見的一些效能故障型別，並在本文中分享。

100 幫助計畫 1.將結果中的篩選條件或生成樹中的篩選條件放置在其中後，在結果樹生成完成後裁剪葉節點將其放置在連線方式中後，它用於在生成樹過程中修剪子樹。

常見的現象是，業務邏輯實際上並不需要形成結果樹然後進行過濾，因為開發人員對放置在不同位置（where 之後和連線後）的過濾條件導致的過濾效果感到困惑，導致效能低效。

以下 SQL 語句是乙個典型的示例。使用者反饋，zzzz.sys_rc_route_detail表上有生產環境中的 3000+ 條資料，但 SQL 語句無法執行結果：

select xxxxx

from zzzz.sys_rc_route_detail t

where t.route_id = (select a.route_id

from xxx.sys_rc_route a, xxx.g_wo_base b

where a.route_id = b.route_id

and b.work_order = 'yyyyyyyyy')

start with t.node_type = '0'

connect by nocycle prior next_node_id = node_id

讓客戶執行 sql 一分鐘後取消，爬取的監控報告如下：

問題很明顯，表中有很多 nextnodeid = node id 的重複值，導致結果集數量很大。在SQL執行的一分鐘內，Connect By在生成完整樹之前就有3000W+的資料，所以我們開始思考在構建完整樹後是否有必要進行過濾。

與業務部門溝通後，發現不需要。

以下資料可用於測試 3000 行的資料量，但 count（*）會很慢。

sql> create table test1 as

select

mod(rownum,2) id,mod(rownum +1 ,2) id2

fromdual

connect by level <= 3000

table created.

sql> set timing on

sql> select count(*)from test1 where id =0 start with id =0 connect by nocycle prior id = id2 ;

count(*)

elapsed: 00:09:26.88

sql>

如上圖過濾結果，耗時9 min;生成樹篩選僅使用 03s：

sql> select count(*)from test1 start with id =0 connect by nocycle prior id = id2 and id = 0 ;

count(*)

elapsed: 00:00:00.31

在許多情況下，兩種拼寫的結果集可能相同，如下所示：

create table test2 as

select

rownum id,rownum +1 id2,rownum + 2 id3

fromdual

connect by level <= 3000;

sql> select id from test2 where id3 < 10 start with id = 3 connect by nocycle prior id2 = id;id

7 rows selected.

sql> select id from test2 start with id = 1 connect by nocycle prior id2 = id and id3 <10;id

7 rows selected.

但實際上，這兩種型別的寫作在語義上有很大的不同，結果集也可能不同，如下：

sql> select id from test2 where id3 = 10 start with id = 3 connect by nocycle prior id2 = id;id

elapsed: 00:00:00.13

sql> select id from test2 start with id = 3 connect by nocycle prior id2 = id and id3=10;id

elapsed: 00:00:00.00

其次，國會預算辦公室的估計不準確

分層查詢中SQL語句的常見問題是CBO估計返回的結果集有偏差，導致執行計畫不準確。雖然統計資訊已經收集到桌面上，但 CBO 對結果集的估計與實際值相差很大（幾百或幾千倍），但這不能完全歸咎於 CBO，畢竟考慮了多少層遞迴查詢和要修剪的資料，結果確實是無法計算的。

針對CBO估計不準確的問題，我們考慮了結果集中比較特殊的引數，在SQL文字上進行了區分，應用特殊引數的標識來執行帶有提示的SQL，並通過提示指定返回結果集。這種情況與普通的資料傾斜不同，不可能通過基線提供不涉及應用轉型的解決方案。

3. 並行處理

分層查詢的 SQL 語句直接使用並行的提示，會遇到並行序列化的問題，即不能真正並行。對於一些重要且耗時的分層查詢，可以考慮pipelined table function重寫 SQL 方式。

以下指令碼測試參考了陳煥生的童鞋部落格和oracle相關文件（doc id 2168864）。1）：

drop table t1;

t1 with 100,000 rows

create table t1

asselect

rownum id,lpad(rownum, 10, '0') v1,trunc((rownum - 1)/100) n1,rpad(rownum, 100) padding

fromdual

connect by level <= 100000

begindbms_stats.gather_table_stats(user,'t1');

end;select /*+ monitor */

count(*)

fromselect

connect_by_root ltrim(id) root_id,connect_by_isleaf is_leaf,level as t1_level,a.v1

from t1 a

start with a.id <=1000

connect by nocycle id = prior id + 1000

create or replace package refcur_pkg

astype r_rec is record (row_id rowid);

type refcur_t is ref cursor return r_rec;

end;create or replace package connect_by_parallel

as/* n**iagates a shallow hiearchy in parallel, where we do a tree walk for each root */

cursor c1 (p_rowid rowid) is --cursor done for each subtree. this select is provided by the customer

select connect_by_root ltrim(id) root_id, connect_by_isleaf is_leaf, level as t1_level, a.v1

from t1 a

start with rowid = p_rowid

connect by nocycle id = prior id + 1000;

type t1_tab is table of c1%rowtype;

function treewalk (p_ref refcur_pkg.refcur_t) return t1_tab

pipelined

parallel_enable(partition p_ref by any);

end connect_by_parallel;

create or replace package body connect_by_parallel

as function treewalk (p_ref refcur_pkg.refcur_t) return t1_tab

pipelined parallel_enable(partition p_ref by any)

isin_rec p_ref%rowtype;

beginexecute immediate 'alter session set "_old_connect_by_enabled"=true';

loop --for each root

fetch p_ref into in_rec;

exit when p_ref%notfound;

for c1rec in c1(in_rec.row_id) loop --retrieve rows of subtree

pipe row(c1rec);

end loop;

execute immediate 'alter session set "_old_connect_by_enabled"=false';

return;

end treewalk;

end connect_by_parallel;

select

+ monitor */

count(*)

from table(connect_by_parallel.treewalk (cursor

select /*+ parallel (a 100) */

rowid from t1 a where id <= 100)))b;

分層查詢的SQL在整個SQL優化場景中佔比相對較小，但這類SQL優化往往比較麻煩，本文分享的三個案例在實戰中進行了總結，對預言機分層查詢的SQL優化有很大的借鑑意義，尤其是陳煥生提供的做並行的案例，含金量高，感興趣的童鞋可以測試。

分層查詢 SQL 效能故障這是避免坑的可靠指南！

相關問題答案

思維水平：努力？

奧迪SQ7是豪華與效能的完美結合，引領SUV新潮流

如何填寫大學水平？

新款奧迪SQ5上市，價格不變，接近60萬

CCC認證查詢