Query performance optimization of Vertica

2013-07-23 15:51
Don't fetch rows or columns that you don't need. Retrieving extra data increases network, I/O, memory, and CPU overhead on the server. For example, if you need only a few columns, name them explicitly:

AT EPOCH LATEST

SELECT fi.name, fi.InvestmentKey,id.VendorId,id.CUSIP,id.ISIN,id.DomicileCountryId,id.CurrencyId

FROM dbo.FixedIncome fi

INNER JOIN dbo.InvestmentIdDimension id ON id.InvestmentKey = fi.InvestmentKey

WHERE id.InvestmentId = 'B000023K1X'

But do not use:

AT EPOCH LATEST

SELECT fi.*, id.*

FROM dbo.FixedIncome fi

INNER JOIN dbo.InvestmentIdDimension id ON id.InvestmentKey = fi.InvestmentKey

WHERE id.InvestmentId = 'B000023K1X'

To avoid blocking Vertica's write processes, we always prefix queries with "AT EPOCH LATEST", which makes the query a snapshot read. For example, use:

AT EPOCH LATEST SELECT ... FROM ...

But do not use:

SELECT ... FROM ...

Break a complex query up into several simpler queries.

Decompose joins where possible. Sometimes an "IN" clause or a subquery can replace part of a complex "JOIN" clause. For example, use:

AT EPOCH LATEST

SELECT s1.CompanyId, id.InvestmentId, s1.InvestmentKey,id.VendorId,id.CUSIP,id.ISIN,id.DomicileCountryId,id.CurrencyId

FROM ( SELECT CompanyId,InvestmentKey FROM dbo.FixedIncome WHERE CompanyId = '0C00000BDL') s1

INNER JOIN dbo.InvestmentIdDimension id ON id.InvestmentKey = s1.InvestmentKey

WHERE id.VendorId = 101 OR id.VendorId = 102;

But do not use:

AT EPOCH LATEST

SELECT fi.CompanyId, id.InvestmentId, fi.InvestmentKey,id.VendorId,id.CUSIP,id.ISIN,id.DomicileCountryId,id.CurrencyId

FROM dbo.FixedIncome fi

INNER JOIN dbo.InvestmentIdDimension id ON id.InvestmentKey = fi.InvestmentKey

WHERE fi.CompanyId = '0C00000BDL' AND ( id.VendorId = 101 OR id.VendorId = 102 );

Try to use a temporary table to cache intermediate data, which avoids scanning the same physical table several times.
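As a rough sketch of the temporary-table tip, the filtered FixedIncome rows from the earlier example could be stored once and then reused by several queries. The table name tmp_company_fi is a hypothetical name chosen for illustration, and the two follow-up queries are assumptions about what a session might need.

```sql
-- Hypothetical temp table: cache the filtered rows once per session.
-- (AT EPOCH LATEST is left out of this sketch; check the manual before
-- combining it with temporary tables.)
CREATE LOCAL TEMPORARY TABLE tmp_company_fi ON COMMIT PRESERVE ROWS AS
SELECT CompanyId, InvestmentKey
FROM dbo.FixedIncome
WHERE CompanyId = '0C00000BDL';

-- First reuse: no second scan of dbo.FixedIncome is needed.
SELECT t.CompanyId, id.InvestmentId, id.VendorId
FROM tmp_company_fi t
INNER JOIN dbo.InvestmentIdDimension id ON id.InvestmentKey = t.InvestmentKey
WHERE id.VendorId = 101;

-- Second reuse against the same cached rows.
SELECT t.CompanyId, id.InvestmentId, id.CUSIP, id.ISIN
FROM tmp_company_fi t
INNER JOIN dbo.InvestmentIdDimension id ON id.InvestmentKey = t.InvestmentKey
WHERE id.VendorId = 102;

-- Drop the cache when it is no longer needed.
DROP TABLE tmp_company_fi;
```

Whether this pays off depends on how expensive the original scan is and how often the cached rows are reused.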

Try to push the outer predicate into the inner subquery so that it is evaluated before the analytic computation.
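The predicate tip might look like the following sketch. The ROW_NUMBER ranking, the alias ranked, and the cutoff of 10 are illustrative assumptions; only the table and column names come from the queries above. Moving a predicate inside the subquery preserves the result only when it filters on the PARTITION BY columns, as it does here.

```sql
-- Sketch: the CompanyId filter sits inside the subquery, so the analytic
-- function only processes rows for that one company.
AT EPOCH LATEST
SELECT ranked.CompanyId, ranked.InvestmentKey, ranked.rn
FROM (
    SELECT CompanyId,
           InvestmentKey,
           ROW_NUMBER() OVER (PARTITION BY CompanyId ORDER BY InvestmentKey) AS rn
    FROM dbo.FixedIncome
    WHERE CompanyId = '0C00000BDL'   -- moved here from the outer WHERE clause
) ranked
WHERE ranked.rn <= 10;               -- depends on rn, so it must stay outside
```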

For a Top-K query, omit the ORDER BY clause if possible; otherwise, add a filter condition so that fewer rows have to be sorted.
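A minimal sketch of both Top-K variants, assuming a LIMIT of 10 and reusing the CompanyId filter from the earlier examples (both values are illustrative):

```sql
-- Variant 1: any 10 rows will do, so no ORDER BY and no sort.
AT EPOCH LATEST
SELECT InvestmentKey, name
FROM dbo.FixedIncome
LIMIT 10;

-- Variant 2: the order matters, so a filter shrinks the set to be sorted.
AT EPOCH LATEST
SELECT InvestmentKey, name
FROM dbo.FixedIncome
WHERE CompanyId = '0C00000BDL'
ORDER BY InvestmentKey
LIMIT 10;
```

Note that the first variant returns an arbitrary set of rows, so it only fits cases where the ranking itself is not needed.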

For sort operations, we can create pre-sorted projections so that Vertica can choose the faster GROUPBY PIPELINED algorithm over GROUPBY HASH.
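As a hedged illustration, a projection sorted on the grouping column could be declared as below. The projection name FixedIncome_by_company, the column list, and the segmentation choice are assumptions made for this sketch, not settings taken from a real schema.

```sql
-- Hypothetical projection, pre-sorted on the column used in GROUP BY.
CREATE PROJECTION dbo.FixedIncome_by_company AS
SELECT CompanyId, InvestmentKey, name
FROM dbo.FixedIncome
ORDER BY CompanyId, InvestmentKey
SEGMENTED BY HASH(InvestmentKey) ALL NODES;

-- Populate the new projection from the existing table data.
SELECT START_REFRESH();

-- Inspect the plan: look for GROUPBY PIPELINED rather than GROUPBY HASH.
EXPLAIN
SELECT CompanyId, COUNT(*)
FROM dbo.FixedIncome
GROUP BY CompanyId;
```

The optimizer decides which algorithm to use, so the EXPLAIN output is the place to confirm that the projection is actually being picked up.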

For details, please refer to the "Optimizing Query Performance" chapter in the Vertica reference manual, "Vertica Community Edition 6.0":

https://my.vertica.com/docs/CE/6.0.1/HTML/index.htm#12525.htm