IT박스

T-SQL의 PIVOT 함수 이해

itboxs 2020. 11. 18. 08:49
반응형

T-SQL의 PIVOT 함수 이해


저는 SQL을 처음 접했습니다.

다음과 같은 테이블이 있습니다.

ID | TeamID | UserID | ElementID | PhaseID | Effort
-----------------------------------------------------
1  |   1    |  1      |   3       |  5     |   6.74
2  |   1    |  1      |   3       |  6     |   8.25
3  |   1    |  1      |   4       |  1     |   2.23
4  |   1    |  1      |   4       |  5     |   6.8
5  |   1    |  1      |   4       |  6     |   1.5

그리고 나는 이와 같은 데이터를 얻으라고 들었습니다.

ElementID | PhaseID1 | PhaseID5 | PhaseID6
--------------------------------------------
    3     |   NULL   |   6.74   |   8.25
    4     |   2.23   |   6.8    |   1.5

PIVOT 기능을 사용해야한다는 것을 알고 있습니다. 그러나 그것을 명확하게 이해할 수 없습니다. 누군가가 위의 경우에 설명 할 수 있다면 큰 도움이 될 것입니다. (또는 대안이 있다면)


PIVOT한 열의 데이터를 여러 열로 회전 하는 데 사용됩니다.

예를 들어 다음은 회전하려는 열을 하드 코딩하는 것을 의미하는 STATIC Pivot입니다.

create table temp
(
  id int,
  teamid int,
  userid int,
  elementid int,
  phaseid int,
  effort decimal(10, 5)
)

insert into temp values (1,1,1,3,5,6.74)
insert into temp values (2,1,1,3,6,8.25)
insert into temp values (3,1,1,4,1,2.23)
insert into temp values (4,1,1,4,5,6.8)
insert into temp values (5,1,1,4,6,1.5)

select elementid
  , [1] as phaseid1
  , [5] as phaseid5
  , [6] as phaseid6
from
(
  select elementid, phaseid, effort
  from temp
) x
pivot
(
  max(effort)
  for phaseid in([1], [5], [6])
)p

다음은 작동하는 버전 SQL 데모 입니다.

열 목록을 동적으로 생성하고 PIVOT을 수행하는 동적 PIVOT을 통해이 작업을 수행 할 수도 있습니다.

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX);

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(c.phaseid) 
            FROM temp c
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'SELECT elementid, ' + @cols + ' from 
            (
                select elementid, phaseid, effort
                from temp
           ) x
            pivot 
            (
                 max(effort)
                for phaseid in (' + @cols + ')
            ) p '


execute(@query)

두 가지 모두에 대한 결과 :

ELEMENTID   PHASEID1    PHASEID5    PHASEID6
3           Null        6.74        8.25
4           2.23        6.8         1.5

이것들은 아주 기본적인 피벗 예제입니다.

SQL SERVER – PIVOT 및 UNPIVOT 테이블 예

제품 테이블에 대한 위 링크의 예 :

SELECT PRODUCT, FRED, KATE
FROM (
SELECT CUST, PRODUCT, QTY
FROM Product) up
 PIVOT (SUM(QTY) FOR CUST IN (FRED, KATE)) AS pvt
ORDER BY PRODUCT

렌더링 :

 PRODUCT FRED  KATE
 --------------------
 BEER     24    12
 MILK      3     1
 SODA   NULL     6
 VEG    NULL     5

유사한 예제는 블로그 게시물 SQL Server의 피벗 테이블에서 찾을 수 있습니다 . 간단한 샘플


호환성 오류를 설정하려면

피벗 기능을 사용하기 전에 이것을 사용하십시오

ALTER DATABASE [dbname] SET COMPATIBILITY_LEVEL = 100  

    SELECT <non-pivoted column>,
    [first pivoted column] AS <column name>,
    [second pivoted column] AS <column name>,
    ...
    [last pivoted column] AS <column name>
FROM
    (<SELECT query that produces the data>)
    AS <alias for the source query>
PIVOT
(
    <aggregation function>(<column being aggregated>)
FOR
[<column that contains the values that will become column headers>]
    IN ( [first pivoted column], [second pivoted column],
    ... [last pivoted column])
) AS <alias for the pivot table>
<optional ORDER BY clause>;

USE AdventureWorks2008R2 ;
GO
SELECT DaysToManufacture, AVG(StandardCost) AS AverageCost 
FROM Production.Product
GROUP BY DaysToManufacture;

    DaysToManufacture          AverageCost
0                          5.0885
1                          223.88
2                          359.1082
4                          949.4105

    -- Pivot table with one row and five columns
SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days, 
[0], [1], [2], [3], [4]
FROM
(SELECT DaysToManufacture, StandardCost 
    FROM Production.Product) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])
) AS PivotTable;




Here is the result set.
Cost_Sorted_By_Production_Days    0         1         2           3       4       
AverageCost                       5.0885    223.88    359.1082    NULL    949.4105

나는 이것에 익숙하지 않았고 그것에 대한 좋은 게시물을 작성했습니다 ... 내 문제는 집계를 올바르게 적용하는 방법을 이해하는 것이었고 여기에 내 게시물이 있습니다. http://jaider.net/posts/1176-pivot-in-sql-server -올바른 집계 결과 /

@bluefeet 솔루션에서 elementid"invisible"의 핵심 열 임을 언급하는 것이 중요합니다 Group By. 또한 대체 할 수 elementid또는 같은 열을 추가 userid.

enter image description here


아무도 언급하지 않은 여기에 추가 할 것이 있습니다.

The pivot function works great when the source has 3 columns: One for the aggregate, one to spread as columns with for, and one as a pivot for row distribution. In the product example it's QTY, CUST, PRODUCT.

However, if you have more columns in the source it will break the results into multiple rows instead of one row per pivot based on unique values per additional column (as Group By would do in a simple query).

See this example, ive added a timestamp column to the source table:

enter image description here

Now see its impact:

SELECT CUST, MILK

FROM Product
-- FROM (SELECT CUST, Product, QTY FROM PRODUCT) p
PIVOT (
    SUM(QTY) FOR PRODUCT IN (MILK)
) AS pvt

ORDER BY CUST

enter image description here


In order to fix this, you can either pull a subquery as a source as everyone has done above - with only 3 columns (this is not always going to work for your scenario, imagine if you need to put a where condition for the timestamp).

Second solution is to use a group by and do a sum of the pivoted column values again.

SELECT 
CUST, 
sum(MILK) t_MILK

FROM Product
PIVOT (
    SUM(QTY) FOR PRODUCT IN (MILK)
) AS pvt

GROUP BY CUST
ORDER BY CUST

GO

enter image description here


A pivot is used to convert one of the columns in your data set from rows into columns (this is typically referred to as the spreading column). In the example you have given, this means converting the PhaseID rows into a set of columns, where there is one column for each distinct value that PhaseID can contain - 1, 5 and 6 in this case.

These pivoted values are grouped via the ElementID column in the example that you have given.

Typically you also then need to provide some form of aggregation that gives you the values referenced by the intersection of the spreading value (PhaseID) and the grouping value (ElementID). Although in the example given the aggregation that will be used is unclear, but involves the Effort column.

Once this pivoting is done, the grouping and spreading columns are used to find an aggregation value. Or in your case, ElementID and PhaseIDX lookup Effort.

Using the grouping, spreading, aggregation terminology you will typically see example syntax for a pivot as:

WITH PivotData AS
(
    SELECT <grouping column>
        , <spreading column>
        , <aggregation column>
    FROM <source table>
)
SELECT <grouping column>, <distinct spreading values>
FROM PivotData
    PIVOT (<aggregation function>(<aggregation column>)
        FOR <spreading column> IN <distinct spreading values>));

This gives a graphical explanation of how the grouping, spreading and aggregation columns convert from the source to pivoted tables if that helps further.

참고URL : https://stackoverflow.com/questions/10428993/understanding-pivot-function-in-t-sql

반응형