Using partition by to write alternative SQL to get effdt and effseq items
2012-07-18 22:52
531 查看
Using partition by to write alternative SQL to get effdt and effseq items
Paritioning data is really useful for grouping data prior to data manipulation (as opposed to group by which does grouping after manipulation the data). This is because group by is grouping resulting data while partition by is actually partitioning thedata and returning the appropriate partitioned data. Simply said; a normal query is run and the results are retrieves by the Database, then Analytics are applied to the results and the Analytic function columns are computed.
This can be very usefull in peoplesoft when retrieving MAX(effdt) and MAX(effseq). Normally we would write a SQL like this to get the MAX(effdt) and MAX(effseq) for each EMPLID and EMPL_RCD.
SELECT A.EMPLID, A.EMPL_RCD, A.EFFDT, A.EFFSEQ FROM PS_JOB A WHERE ( A.EFFDT = (SELECT MAX(A_ED.EFFDT) FROM PS_JOB A_ED WHERE A.EMPLID = A_ED.EMPLID AND A.EMPL_RCD = A_ED.EMPL_RCD AND A_ED.EFFDT <= sysdate) AND A.EFFSEQ = (SELECT MAX(A_ES.EFFSEQ) FROM PS_JOB A_ES WHERE A.EMPLID = A_ES.EMPLID AND A.EMPL_RCD = A_ES.EMPL_RCD AND A.EFFDT = A_ES.EFFDT) )
As we can see in the explain plan, a lot is going on in the tables just to get the MAX(effdt) and MAX(effseq).
In the next example I have use the Partition by function to get the MAX(effdt) and MAX(effseq) It partitions job by EMPLID and EMPL_RCD, and returns the maximum dated row for each partition.
SELECT J.EMPLID, J.EMPL_RCD, J.EFFDT, J.EFFSEQ FROM (SELECT EMPLID, EMPL_RCD, EFFDT, EFFSEQ, MAX (EFFDT) over (partition BY EMPLID, EMPL_RCD) AS MAX_JOB_EFFDT, MAX (EFFSEQ) over (partition BY emplid, EMPL_RCD, EFFDT) AS MAX_JOB_EFFSEQ FROM PS_JOB WHERE effdt <= SYSDATE) J WHERE J.EFFDT = J.MAX_JOB_EFFDT AND j.EFFSEQ = J.MAX_JOB_EFFSEQ;
In this explain plan, a lot less is going on but it returns the exact same data.
Also you might notice that the CPU Cost is much lower. This doesn’t have to be the case with all partition by data. Because we are using analytic functions, when processing this on 300,000 rows in job the cost might be much higher then the cost of a normal
query. The big difference will be in the execution time. For example; when executing the normal query this might cost 2.500 seconds, while in analythis this will be below 1.000 seconds. This is something to keep in mind.
Analytic functions sometimes cost more CPU when processig large amounts of rows, but execution time will also be much, much faster.
Viewed 1324 times by 542 visitors
相关文章推荐
- CodeSnip: How to Get Id of the Record Using ASP.NET and SQL Server 2000
- Adobe--Without using /,% ,+and * operators. write a function to check whether a number is divisible by 3 or nor?
- How To Read and Write BLOB Data by Using ADO.NET with Visual C# .NET
- How To Read and Write BLOB Data by Using ADO.NET with Visual C# .NET
- SQL: How to get a series of dates in sql using connect by
- How To Display Hierarchical Data by Using Nested Repeater Controls and Visual C# .NET(ZT)
- How To Open Physical Standby For Read Write Testing and Flashback
- How to write to an event log by using Visual C#
- How to mount partition with ntfs file system and read write access
- virtual memory layout and how to get it by the correspoinding functuon
- how to get Multilevel BOM in Windchill by using...
- Using Reflection and Expression to get Property with better performance.
- How To Display Hierarchical Data by Using Nested Repeater Controls and Visual
- How to write to TO CHAR and TO DATE SQL SERVER
- Issue when try to get select count using Linq to SQL
- Read and Write to a Keyboard device in Linux using C++
- Read and Write sample on Cassandra by using Aquiles...
- Using Presto to combine data from Hive and MySQL in one SQL-like query
- SQL Query to Linq (Group By and Sum)
- Collecting performance counters and using SQL Server to analyze the data