感谢以下博文
http://www.2cto.com/database/201210/162822.html
percent_rank()
percent_rank函数以0到1之间的分数形式返回某个值在数据分区中的排名。 建表:Sql代码 create table SMALL_CUSTOMERS(CUSTOMER_ID NUMBER,SUM_ORDERS NUMBER); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (1000, 10); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (1000, 20); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (1000, 30); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (800, 5); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (800, 10); insert into SMALL_CUSTOMERS (CUSTOMER_ID, SUM_ORDERS) values (800, 1); 具体用法:Sql代码 select customer_id, sum_orders, percent_rank() over(order by sum_orders) percentRank from small_customers t;
percentile_cont()
官网文档地址:http://docs.oracle.com/cd/E11882_01/server.112/e41084/functions127.htm#SQLRF00687
Purpose
PERCENTILE_CONT is an inverse distribution function that assumes a continuous distribution model. It takes a percentile value and a sort specification, and returns an interpolated value that would fall into that percentile value with respect to the sort specification. Nulls are ignored in the calculation.
This function takes as an argument any numeric data type or any nonnumeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
The first expr must evaluate to a numeric value between 0 and 1, because it is a percentile value. This expr must be constant within each aggregation group. The ORDER BY clause takes a single expression that must be a numeric or datetime value, as these are the types over which Oracle can perform interpolation.
The result of PERCENTILE_CONT is computed by linear interpolation between values after ordering them. Using the percentile value (P) and the number of rows (N) in the aggregation group, you can compute the row number you are interested in after ordering the rows with respect to the sort specification. This row number (RN) is computed according to the formula RN = (1+(P*(N-1)). The final result of the aggregate function is computed by linear interpolation between the values from rows at row numbers CRN = CEILING(RN) and FRN = FLOOR(RN).
The final result will be:
If (CRN = FRN = RN) then the result is(value of expression from row at RN)Otherwise the result is(CRN - RN) * (value of expression for row at FRN) +(RN - FRN) * (value of expression for row at CRN)
You can use the PERCENTILE_CONT function as an analytic function. You can specify only the query_partitioning_clause in its OVER clause. It returns, for each row, the value that would fall into the specified percentile among a set of values within each partition.
The MEDIAN function is a specific case of PERCENTILE_CONT where the percentile value defaults to 0.5. For more information, refer to MEDIAN.
percentile
percentile_cont:percentile_cont函数对于计算内插值是非常有用的。percentile_cont函数接收一个0到1之间的几率值并返回与声明了排序的percent_rank函数计算值相等的内插值百分比。
语法:percentile_con(expr) within group(sort-clause) over(partition-clause order-by-clause)具体用法:Sql代码 select customer_id, sum_orders, percent_rank() over(order by sum_orders) percentRank, percentile_cont(0.4) within group(order by sum_orders) over(partition by customer_id) percentileCont from small_customers t;
取 p=0.4 可以看出n=3
rn=1.8 crn=2 frn=1 得出 (2-1.8)*1+(1.8-1)*5=4.2 (2-1.8)*10+(1.8-1)*20=18
percentile_disc()
PERCENTILE_DISC
这里就不做详细分析。
Syntax
Description of the illustration \’\’percentile_disc.gif\’\’
Purpose
PERCENTILE_DISC is an inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification and returns an element from the set. Nulls are ignored in the calculation.
This function takes as an argument any numeric data type or any nonnumeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
The first expr must evaluate to a numeric value between 0 and 1, because it is a percentile value. This expression must be constant within each aggregate group. The ORDER BY clause takes a single expression that can be of any type that can be sorted.
For a given percentile value P, PERCENTILE_DISC sorts the values of the expression in the ORDER BY clause and returns the value with the smallest CUME_DIST value (with respect to the same sort specification) that is greater than or equal to P.
Aggregate Example
See aggregate example for PERCENTILE_CONT.
Analytic Example
The following example calculates the median discrete percentile of the salary of each employee in the sample table hr.employees:
SELECT last_name, salary, department_id,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY salary DESC)OVER (PARTITION BY department_id) \"Percentile_Disc\",CUME_DIST() OVER (PARTITION BY department_idORDER BY salary DESC) \"Cume_Dist\"FROM employeesWHERE department_id in (30, 60)ORDER BY last_name, salary, department_id;LAST_NAME SALARY DEPARTMENT_ID Percentile_Disc Cume_Dist------------------------- ---------- ------------- --------------- ----------Austin 4800 60 4800 .8Baida 2900 30 2900 .5Colmenares 2500 30 2900 1Ernst 6000 60 4800 .4Himuro 2600 30 2900 .833333333Hunold 9000 60 4800 .2Khoo 3100 30 2900 .333333333Lorentz 4200 60 4800 1Pataballa 4800 60 4800 .8Raphaely 11000 30 2900 .166666667Tobias 2800 30 2900 .666666667
The median value for Department 30 is 2900, which is the value whose corresponding percentile (Cume_Dist) is the smallest value greater than or equal to 0.5. The median value for Department 60 is 4800, which is the value whose corresponding percentile is the smallest value greater than or equal to 0.5.
percentile_disc:percentile_disc函数在功能上类似于percentile_cont函数,只是percentile_cont函数使用了连续分布模型,而percentile_disc函数使用了离期 分布模型。当没有值与指定的percent_rank精确匹配的时候,percentile_cont(0.5)会计算两个离得最近的值的平均值。相反,在升序排列的情况下,percentile_disc函数只取比所传递的参数percent_rank值更大的值。在降序排列的时候,percentile_disc函数只取比所传递的参数percent_rank值更小的值。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29074224/viewspace-2123194/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/29074224/viewspace-2123194/