SUBQUERY SOLUTION
----------------------
SELECT st.stor_name AS 'Store',
       ISNULL((SELECT SUM(bs.qty)
               FROM big_sales AS bs
               WHERE bs.stor_id = st.stor_id), 0)
       AS 'Books Sold'
FROM stores AS st
WHERE st.stor_id IN
      (SELECT DISTINCT stor_id
       FROM big_sales)

JOIN SOLUTION
----------------------
SELECT st.stor_name AS 'Store',
       SUM(bs.qty) AS 'Books Sold'
FROM stores AS st
JOIN big_sales AS bs
  ON bs.stor_id = st.stor_id
WHERE st.stor_id IN
      (SELECT DISTINCT stor_id
       FROM big_sales)
GROUP BY st.stor_name

SUBQUERY SOLUTION
----------------------
SQL Server parse and compile time:
    CPU time = 28 ms
    elapsed time = 28 ms
SQL Server Execution Times:
    CPU time = 145 ms
    elapsed time = 145 ms
Table 'big_sales'. Scan count 14, logical reads 1884, physical reads 0, read-ahead reads 0.
Table 'stores'. Scan count 12, logical reads 24,

JOIN SOLUTION
----------------------
SQL Server parse and compile time:
    CPU time = 50 ms
    elapsed time = 54 ms
SQL Server Execution Times:
    CPU time = 109 ms
    elapsed time = 109 ms
Table 'big_sales'. Scan count 14, logical reads 966, physical reads 0, read-ahead reads 0.
Table 'stores'. Scan count 12, logical reads 24,
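
Without digging any deeper, we can see that the join is faster in both CPU time and total elapsed time, and that it needs only about half the logical reads of the subquery solution (966 versus 1884 on big_sales). Both queries return the same result set, though in a different order: the plan for the join query sorts on stor_name to satisfy its GROUP BY clause, which acts as an implicit ORDER BY (a side effect of this plan rather than a guarantee):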

Store                                 Books Sold
-------------------------------------------------
Barnum's                              154125
Bookbeat                              518080
Doc-U-Mat: Quality Laundry and Books  581130
Eric the Read Books                   76931
Fricative Bookshop                    259060
News & Brews                          161090
(6 row(s) affected)

Store                                 Books Sold
-------------------------------------------------
Eric the Read Books                   76931
Barnum's                              154125
News & Brews                          161090
Doc-U-Mat: Quality Laundry and Books  581130
Fricative Bookshop                    259060
Bookbeat                              518080
(6 row(s) affected)

Query plan, subquery solution:

|--Compute Scalar(DEFINE:([Expr1006]=isnull([Expr1004], 0)))
     |--Nested Loops(Left Outer Join, OUTER REFERENCES:([st].[stor_id]))
          |--Nested Loops(Inner Join, OUTER REFERENCES:([big_sales].[stor_id]))
          |    |--Stream Aggregate(GROUP BY:([big_sales].[stor_id]))
          |    |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]), ORDERED FORWARD)
          |    |--Clustered Index Seek(OBJECT:([pubs].[dbo].[stores].[UPK_storeid] AS [st]),
          |         SEEK:([st].[stor_id]=[big_sales].[stor_id]) ORDERED FORWARD)
          |--Stream Aggregate(DEFINE:([Expr1004]=SUM([bs].[qty])))
               |--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales] AS [bs]),
                    SEEK:([bs].[stor_id]=[st].[stor_id]) ORDERED FORWARD)

Query plan, join solution:

|--Stream Aggregate(GROUP BY:([st].[stor_name]) DEFINE:([Expr1004]=SUM([partialagg1005])))
     |--Sort(ORDER BY:([st].[stor_name] ASC))
          |--Nested Loops(Left Semi Join, OUTER REFERENCES:([st].[stor_id]))
               |--Nested Loops(Inner Join, OUTER REFERENCES:([bs].[stor_id]))
               |    |--Stream Aggregate(GROUP BY:([bs].[stor_id]) DEFINE:([partialagg1005]=SUM([bs].[qty])))
               |    |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales] AS [bs]), ORDERED FORWARD)
               |    |--Clustered Index Seek(OBJECT:([pubs].[dbo].[stores].[UPK_storeid] AS [st]),
               |         SEEK:([st].[stor_id]=[bs].[stor_id]) ORDERED FORWARD)
               |--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
                    SEEK:([big_sales].[stor_id]=[st].[stor_id]) ORDERED FORWARD)

The join is the more efficient solution. It does not need the extra stream aggregate that the subquery requires to sum the big_sales.qty column.
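
The timing and I/O figures quoted in this comparison are in the format that SQL Server prints when statistics output is switched on for the session. The following is a minimal sketch of how such a comparison might be captured. It assumes a pubs sample database that already contains a big_sales table like the one used in this article; the article itself does not show these steps.

```sql
-- Sketch: capture timing, I/O, and plan text for one candidate query.
-- Assumption: pubs sample database with the article's big_sales table present.
USE pubs;
GO
SET STATISTICS TIME ON;   -- prints parse/compile and execution times
SET STATISTICS IO ON;     -- prints scan counts and logical/physical reads
GO
SELECT st.stor_name AS 'Store',
       SUM(bs.qty)  AS 'Books Sold'
FROM stores AS st
JOIN big_sales AS bs
  ON bs.stor_id = st.stor_id
WHERE st.stor_id IN
      (SELECT DISTINCT stor_id
       FROM big_sales)
GROUP BY st.stor_name;
GO
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
GO
-- SHOWPLAN_TEXT must be set in its own batch; while it is ON,
-- queries are not executed and only the plan text is returned.
SET SHOWPLAN_TEXT ON;
GO
SELECT st.stor_name AS 'Store',
       SUM(bs.qty)  AS 'Books Sold'
FROM stores AS st
JOIN big_sales AS bs
  ON bs.stor_id = st.stor_id
GROUP BY st.stor_name;
GO
SET SHOWPLAN_TEXT OFF;
GO
```

Running each candidate query several times and comparing logical reads tends to give a steadier picture than elapsed time alone, since elapsed time depends on whatever else the server is doing.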
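
UNION vs UNION ALL

Use UNION ALL instead of UNION whenever you can. The difference is that UNION has the side effect of eliminating duplicate rows and sorting the results, which UNION ALL does not do. Producing a result without duplicates requires building a temporary worktable, storing all rows in it, and sorting them before output. (Displaying the query plan for a SELECT DISTINCT query reveals a stream aggregate that consumes more than 30 percent of the resources spent on the query.) Use UNION when you know that duplicate removal is exactly what you need. If you expect no duplicate rows in the result set, use UNION ALL. It simply selects from one table or join, then selects from another, and appends the second result to the bottom of the first. UNION ALL needs no worktable and no sort (unless some other condition calls for them). In most cases UNION ALL is the more efficient choice. One potentially dangerous problem is that UNION can flood the database with huge temporary worktables (in SQL Server, worktables are created in tempdb). This can happen when you expect a large result set from a UNION query.

Example

The following queries select all store IDs from the sales table in the pubs database, together with all store IDs from big_sales, a table to which we added more than 70,000 rows. The only difference between the two solutions is the use of UNION versus UNION ALL, yet adding the ALL keyword makes a big difference in the plan. The first solution needs stream aggregates over ordered input before it can return the result set to the client. The second query is more efficient, especially on large tables. In this example both queries return the same store IDs, though in a different order and, for UNION ALL, with duplicate rows retained. Our test involved two temporary tables; your results may differ slightly.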

UNION SOLUTION
-----------------------
SELECT stor_id FROM big_sales
UNION
SELECT stor_id FROM sales

UNION ALL SOLUTION
-----------------------
SELECT stor_id FROM big_sales
UNION ALL
SELECT stor_id FROM sales

Query plan, UNION solution:

|--Merge Join(Union)
     |--Stream Aggregate(GROUP BY:([big_sales].[stor_id]))
     |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]), ORDERED FORWARD)
     |--Stream Aggregate(GROUP BY:([sales].[stor_id]))
          |--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales]), ORDERED FORWARD)

Query plan, UNION ALL solution:

|--Concatenation
     |--Index Scan(OBJECT:([pubs].[dbo].[big_sales].[ndx_sales_ttlID]))
     |--Index Scan(OBJECT:([pubs].[dbo].[sales].[titleidind]))

UNION SOLUTION
-----------------------
Table 'sales'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.
Table 'big_sales'. Scan count 1, logical reads 463, physical reads 0, read-ahead reads 0.

UNION ALL SOLUTION
-----------------------
Table 'sales'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0.
Table 'big_sales'. Scan count 1, logical reads 224, physical reads 0, read-ahead reads 0.
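
Although the two result sets are treated as interchangeable in this example, you can see that the UNION ALL statement needed roughly half the logical reads of the UNION statement (224 versus 463 on big_sales). So anticipate your result set, and use UNION ALL once you are sure it contains no duplicates.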
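
To make the behavioural difference concrete, here is a small self-contained sketch. The temporary tables #a and #b and their rows are hypothetical stand-ins and are not the article's sales and big_sales tables.

```sql
-- Sketch with hypothetical temp tables; not the article's pubs tables.
CREATE TABLE #a (stor_id CHAR(4) NOT NULL);
CREATE TABLE #b (stor_id CHAR(4) NOT NULL);

INSERT INTO #a (stor_id) VALUES ('6380');
INSERT INTO #a (stor_id) VALUES ('6380');  -- duplicate within #a
INSERT INTO #a (stor_id) VALUES ('7066');
INSERT INTO #b (stor_id) VALUES ('7066');  -- duplicate across #a and #b
INSERT INTO #b (stor_id) VALUES ('8042');

-- UNION removes duplicates: 3 distinct rows (6380, 7066, 8042).
SELECT stor_id FROM #a
UNION
SELECT stor_id FROM #b;

-- UNION ALL appends the second result to the first: all 5 rows.
SELECT stor_id FROM #a
UNION ALL
SELECT stor_id FROM #b;

DROP TABLE #a;
DROP TABLE #b;
```

UNION ALL is a safe substitute only when the inputs cannot overlap or when the caller tolerates repeated rows; otherwise the two statements return different results.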
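
Functions and Expressions That Suppress Indexes

When you apply a built-in function or an expression to an indexed column, the optimizer cannot use the index on that column. Try to rewrite such conditions so that the indexed column does not appear inside an expression.

Example

You should help SQL Server by removing any expression wrapped around an indexed numeric column. The query below selects a single row from the jobs table by the key value of its unique clustered index. If you apply an expression to this column, the index is not used. Once you change the condition 'job_id-2=0' to 'job_id=2', the optimizer performs a seek on the clustered index.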

QUERY WITH SUPPRESSED INDEX
-----------------------
SELECT *
FROM jobs
WHERE (job_id-2) = 0

|--Clustered Index Scan(OBJECT:([pubs].[dbo].[jobs].[PK__jobs__117F9D94]),
     WHERE:(Convert([jobs].[job_id])-2=0))

OPTIMIZED QUERY USING INDEX
-----------------------
SELECT *
FROM jobs
WHERE job_id = 2

|--Clustered Index Seek(OBJECT:([pubs].[dbo].[jobs].[PK__jobs__117F9D94]),
     SEEK:([jobs].[job_id]=Convert([@1])) ORDERED FORWARD)

Note that a SEEK is much better than a SCAN, as in the previous query.

The table below lists several other kinds of queries that suppress the use of a column index, together with rewrites that give better performance.

QUERY WITH SUPPRESSED INDEX                | OPTIMIZED QUERY USING INDEX
------------------------------------------ | ------------------------------------------
DECLARE @job_id VARCHAR(5)                 | DECLARE @job_id VARCHAR(5)
SELECT @job_id = '2'                       | SELECT @job_id = '2'
SELECT *                                   | SELECT *
FROM jobs                                  | FROM jobs
WHERE CONVERT( VARCHAR(5),                 | WHERE job_id = CONVERT(
      job_id ) = @job_id                   |       SMALLINT, @job_id )
------------------------------------------ | ------------------------------------------
SELECT *                                   | SELECT *
FROM authors                               | FROM authors
WHERE au_fname + ' ' + au_lname            | WHERE au_fname = 'Johnson'
      = 'Johnson White'                    |   AND au_lname = 'White'
------------------------------------------ | ------------------------------------------
SELECT *                                   | SELECT *
FROM authors                               | FROM authors
WHERE SUBSTRING( au_lname, 1, 2 ) = 'Wh'   | WHERE au_lname LIKE 'Wh%'
------------------------------------------ | ------------------------------------------
CREATE INDEX employee_hire_date            | CREATE INDEX employee_hire_date
ON employee ( hire_date )                  | ON employee ( hire_date )
GO                                         | GO
-- Get all employees hired                 | -- Get all employees hired
-- in the 1st quarter of 1990:             | -- in the 1st quarter of 1990:
SELECT *                                   | SELECT *
FROM employee                              | FROM employee
WHERE DATEPART( year, hire_date ) = 1990   | WHERE hire_date >= '1/1/1990'
  AND DATEPART( quarter, hire_date ) = 1   |   AND hire_date < '4/1/1990'
------------------------------------------ | ------------------------------------------
-- Suppose that hire_date may              | -- Suppose that hire_date may
-- contain time other than 12AM            | -- contain time other than 12AM
-- Who was hired on 2/21/1990?             | -- Who was hired on 2/21/1990?
SELECT *                                   | SELECT *
FROM employee                              | FROM employee
WHERE CONVERT( CHAR(10),                   | WHERE hire_date >= '2/21/1990'
      hire_date, 101 ) = '02/21/1990'      |   AND hire_date < '2/22/1990'
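
The same principle applies to plain arithmetic: moving the calculation to the constant side leaves the indexed column standing alone in the predicate. The sketch below uses a hypothetical temporary table #sales_demo and a hypothetical index name, ix_sales_demo_qty, so that it can be run without touching pubs. On a table this small the optimizer may well choose a scan for both queries; the point is the shape of the predicate, which matters once the table is large.

```sql
-- Sketch with a hypothetical temp table and index; not part of pubs.
CREATE TABLE #sales_demo (
    sale_id INT NOT NULL PRIMARY KEY,
    qty     INT NOT NULL
);
CREATE INDEX ix_sales_demo_qty ON #sales_demo (qty);

INSERT INTO #sales_demo (sale_id, qty) VALUES (1, 40);
INSERT INTO #sales_demo (sale_id, qty) VALUES (2, 55);
INSERT INTO #sales_demo (sale_id, qty) VALUES (3, 75);

SET STATISTICS IO ON;

-- Arithmetic on the indexed column: the expression has to be
-- evaluated for every row, so the index cannot be used for a seek.
SELECT sale_id, qty
FROM #sales_demo
WHERE qty + 10 > 60;

-- Equivalent filter with the arithmetic moved to the constant side:
-- the bare column can be matched against the index on qty.
SELECT sale_id, qty
FROM #sales_demo
WHERE qty > 50;

SET STATISTICS IO OFF;
DROP TABLE #sales_demo;
```

Both queries return the rows with qty 55 and 75. Comparing their plans with SET SHOWPLAN_TEXT ON on a larger table is one way to check whether the rewrite changes a scan into a seek.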
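
SET NOCOUNT ON

The speed gain that SET NOCOUNT ON brings to T-SQL code surprises and puzzles many SQL Server developers and database administrators. You have probably noticed that every successful query returns a system message reporting the number of rows affected. In many cases you do not need this information. The SET NOCOUNT ON command suppresses the message for all subsequent queries in your session until you issue SET NOCOUNT OFF.

This option has more than a cosmetic effect on the output. It reduces the amount of information passed from the server to the client, so it helps lower network traffic and improves the overall response time of your transactions. The time needed to pass a single message is negligible, but consider a script that executes queries in a loop and sends many kilobytes of useless information to the user.

As an example, take a file containing a T-SQL batch that inserts 9999 rows into the big_sales table.

Run with SET NOCOUNT OFF, the batch took 5176 ms of elapsed time. Run with SET NOCOUNT ON, it took 1620 ms. If you do not need the row counts in the output, consider adding SET NOCOUNT ON at the start of every stored procedure and script.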
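
The batch used for that measurement is not reproduced in the article. The following is only a sketch of what such a test might look like, with a hypothetical temporary table #nocount_demo standing in for big_sales, whose column list is not given here.

```sql
-- Sketch: a loop of single-row inserts, the kind of script where
-- SET NOCOUNT ON saves one "(1 row(s) affected)" message per statement.
-- #nocount_demo is a hypothetical stand-in for big_sales.
SET NOCOUNT ON;

CREATE TABLE #nocount_demo (
    id  INT NOT NULL,
    qty INT NOT NULL
);

DECLARE @i INT;
SET @i = 1;

WHILE @i <= 9999
BEGIN
    INSERT INTO #nocount_demo (id, qty) VALUES (@i, @i % 50);
    SET @i = @i + 1;
END;

-- Confirm that the loop inserted 9999 rows.
SELECT COUNT(*) AS rows_inserted FROM #nocount_demo;

DROP TABLE #nocount_demo;
SET NOCOUNT OFF;
```

To compare the two settings, run the script once as written and once with the first line changed to SET NOCOUNT OFF, timing both runs from the client. @@ROWCOUNT is still updated when NOCOUNT is ON, so code that checks it keeps working.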
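
TOP and SET ROWCOUNT

The TOP clause of a SELECT statement limits the number of rows returned by a single query, whereas SET ROWCOUNT limits the number of rows affected by all subsequent queries. Both are highly efficient for many programming tasks.

SET ROWCOUNT sets the maximum number of rows that SELECT, INSERT, UPDATE, or DELETE statements can affect. The setting takes effect as soon as the command executes and applies only to the current session. To remove the limit, execute SET ROWCOUNT 0.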
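
The difference in scope between the two can be seen in a short sketch. The temporary table #rowcount_demo is hypothetical and exists only for this illustration.

```sql
-- Sketch with a hypothetical temp table: scope of SET ROWCOUNT vs TOP.
CREATE TABLE #rowcount_demo (id INT NOT NULL);

INSERT INTO #rowcount_demo (id) VALUES (1);
INSERT INTO #rowcount_demo (id) VALUES (2);
INSERT INTO #rowcount_demo (id) VALUES (3);
INSERT INTO #rowcount_demo (id) VALUES (4);
INSERT INTO #rowcount_demo (id) VALUES (5);

-- SET ROWCOUNT applies to every following statement in the session.
SET ROWCOUNT 2;
SELECT id FROM #rowcount_demo ORDER BY id;        -- 2 rows: 1, 2
SELECT id FROM #rowcount_demo ORDER BY id DESC;   -- still 2 rows: 5, 4

-- Remove the limit again.
SET ROWCOUNT 0;
SELECT id FROM #rowcount_demo ORDER BY id;        -- all 5 rows

-- TOP limits only the statement that contains it.
SELECT TOP 2 id FROM #rowcount_demo ORDER BY id;  -- 2 rows: 1, 2
SELECT id FROM #rowcount_demo ORDER BY id;        -- all 5 rows

DROP TABLE #rowcount_demo;
```

Because the limit stays in force until it is reset, forgetting SET ROWCOUNT 0 can silently truncate later statements in the same session. Newer SQL Server documentation also recommends TOP over SET ROWCOUNT for INSERT, UPDATE, and DELETE.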
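
Some practical tasks are more efficient to program with TOP or SET ROWCOUNT than with standard SQL commands. Let us demonstrate this with a few examples:

TOP n

One of the most popular queries in almost any database is a request for the first N items of a list. In the pubs database, we can look for the five best-selling titles. Compare three solutions that use TOP, SET ROWCOUNT, and ANSI SQL.

Pure ANSI SQL: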
    SELECT title, ytd_sales
    FROM titles a
    WHERE (SELECT COUNT(*)
           FROM titles b
           WHERE b.ytd_sales > a.ytd_sales
          ) < 5
    ORDER BY ytd_sales DESC
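
This pure ANSI SQL solution executes a correlated subquery that can be very inefficient, especially in this example, where no index supports ytd_sales. In addition, this pure standard SQL command does not filter out NULL values in ytd_sales, nor does it distinguish the case where several titles are tied.

Using SET ROWCOUNT:

    SET ROWCOUNT 5
    SELECT title, ytd_sales
    FROM titles
    ORDER BY ytd_sales DESC
    SET ROWCOUNT 0

Using TOP n:

    SELECT TOP 5 title, ytd_sales
    FROM titles
    ORDER BY ytd_sales DESC

The second solution uses SET ROWCOUNT to stop the SELECT query, while the third uses TOP n to stop once it has found the first five rows. In this case we also have an ORDER BY clause, which forces a sort of the entire table before any result is returned. The query plans of the two queries are practically identical. However, the key advantage of TOP over SET ROWCOUNT is that SET has to process the worktable required by the ORDER BY clause, whereas TOP does not.

On a large table, we can create an index on ytd_sales to avoid the sort. The query then uses the index to find the first five rows and stops. Compare this with the first solution, which scans the whole table and executes a correlated subquery for every row. On a small table the performance difference is slight. On a large table, however, the first solution may run for hours, while the other two finish in seconds.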
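
A sketch of the indexed variant follows. It works on a hypothetical temporary table #titles_demo with made-up rows rather than on pubs.titles, and the index name ix_titles_demo_ytd is likewise an assumption. With so few rows the optimizer may still prefer a scan and sort; the benefit described above is expected only on large tables.

```sql
-- Sketch with a hypothetical temp table, made-up rows, and a
-- hypothetical index name; pubs.titles itself is left untouched.
CREATE TABLE #titles_demo (
    title     VARCHAR(80) NOT NULL,
    ytd_sales INT         NULL
);

INSERT INTO #titles_demo (title, ytd_sales) VALUES ('Title A', 4000);
INSERT INTO #titles_demo (title, ytd_sales) VALUES ('Title B', 22000);
INSERT INTO #titles_demo (title, ytd_sales) VALUES ('Title C', NULL);
INSERT INTO #titles_demo (title, ytd_sales) VALUES ('Title D', 15000);

-- A descending index lets the engine read the highest values first
-- instead of sorting the whole table.
CREATE INDEX ix_titles_demo_ytd ON #titles_demo (ytd_sales DESC);

-- Top 2 by ytd_sales, with NULL sales excluded explicitly.
SELECT TOP 2 title, ytd_sales
FROM #titles_demo
WHERE ytd_sales IS NOT NULL
ORDER BY ytd_sales DESC;

DROP TABLE #titles_demo;
```

The query returns Title B (22000) and Title D (15000). The WHERE clause addresses the NULL handling that the pure ANSI SQL version leaves open.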
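
When you work out what a query has to deliver, consider whether you need only a few of the rows. If so, the TOP clause can save a great deal of time.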
(北京鑄銳數碼科技有限公司 www.InnovateDigital.com)


Source: http://21958978.spaces.live.com/Blog/cns!A7DF246804AD47BB!208.entry