IT박스

SQL Server : 첫 번째 행에 조인하는 방법

itboxs 2020. 9. 30. 09:02
반응형

SQL Server : 첫 번째 행에 조인하는 방법


구체적이지만 가설적인 예를 사용하겠습니다.

주문 에는 일반적으로 하나의 광고 항목 만 있습니다 .

명령:

OrderGUID   OrderNumber
=========   ============
{FFB2...}   STL-7442-1      
{3EC6...}   MPT-9931-8A

품목 :

LineItemGUID   Order ID Quantity   Description
============   ======== ========   =================================
{098FBE3...}   1        7          prefabulated amulite
{1609B09...}   2        32         spurving bearing

그러나 때때로 두 개의 광고 항목이있는 주문이 있습니다.

LineItemID   Order ID    Quantity   Description
==========   ========    ========   =================================
{A58A1...}   6,784,329   5          pentametric fan
{0E9BC...}   6,784,329   5          differential girdlespring 

일반적으로 사용자에게 주문을 표시 할 때 :

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    INNER JOIN LineItems 
    ON Orders.OrderID = LineItems.OrderID

주문에 단일 항목을 표시하고 싶습니다. 그러나이 가끔 순서가 항목을 두를 포함하는 (또는 그 이상)과 함께 주문 것이라고 표시복제 :

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         spurving bearing
KSG-0619-81   5          panametric fan
KSG-0619-81   5          differential girdlespring

내가 정말 원하는 것은 SQL 서버가하는 것입니다 단지 하나를 선택 이 될 수있는 바와 같이, 충분한 :

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         differential girdlespring
KSG-0619-81   5          panametric fan

모험심이 생기면 사용자에게 둘 이상의 항목이 있음을 나타내는 줄임표를 표시 할 수 있습니다.

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         differential girdlespring
KSG-0619-81   5          panametric fan, ...

그래서 질문은

  • "중복"행 제거
  • 중복을 피하기 위해 행 중 하나에 만 조인

첫번째 시도

첫 번째 순진한 시도는 " TOP 1 "광고 항목 에만 참여하는 것이 었습니다 .

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    INNER JOIN (
       SELECT TOP 1 LineItems.Quantity, LineItems.Description
       FROM LineItems
       WHERE LineItems.OrderID = Orders.OrderID) LineItems2
    ON 1=1

그러나 그것은 오류를 제공합니다.

열 또는 접두사 'Orders'가 쿼리에 사용 된
테이블 이름 또는 별칭 이름과 일치 하지 않습니다
.

아마도 내부 선택이 외부 테이블을 보지 못하기 때문일 것입니다.


SELECT   Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM     Orders
JOIN     LineItems
ON       LineItems.LineItemGUID =
         (
         SELECT  TOP 1 LineItemGUID 
         FROM    LineItems
         WHERE   OrderID = Orders.OrderID
         )

SQL Server 2005이상, 당신은 대체 할 수 INNER JOIN와 함께 CROSS APPLY:

SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
CROSS APPLY
        (
        SELECT  TOP 1 LineItems.Quantity, LineItems.Description
        FROM    LineItems
        WHERE   LineItems.OrderID = Orders.OrderID
        ) LineItems2

주의하시기 바랍니다 TOP 1없이 ORDER BY이 쿼리는 당신에게 주문 당 한 줄의 항목을 얻을 것이다, 그러나 하나가 될 것이다 정의되지 않은 : 결정되지 않습니다.

쿼리를 여러 번 호출하면 기본이 변경되지 않았더라도 동일한 주문에 대해 서로 다른 광고 항목이 제공 될 수 있습니다.

If you want deterministic order, you should add an ORDER BY clause to the innermost query.


I know this question was answered a while ago, but when dealing with large data sets, nested queries can be costly. Here is a different solution where the nested query will only be ran once, instead of for each row returned.

SELECT 
  Orders.OrderNumber,
  LineItems.Quantity, 
  LineItems.Description
FROM 
  Orders
  INNER JOIN (
    SELECT
      Orders.OrderNumber,
      Max(LineItem.LineItemID) AS LineItemID
    FROM
      Orders INNER JOIN LineItems
      ON Orders.OrderNumber = LineItems.OrderNumber
    GROUP BY Orders.OrderNumber
  ) AS Items ON Orders.OrderNumber = Items.OrderNumber
  INNER JOIN LineItems 
  ON Items.LineItemID = LineItems.LineItemID

You could do:

SELECT 
  Orders.OrderNumber, 
  LineItems.Quantity, 
  LineItems.Description
FROM 
  Orders INNER JOIN LineItems 
  ON Orders.OrderID = LineItems.OrderID
WHERE
  LineItems.LineItemID = (
    SELECT MIN(LineItemID) 
    FROM   LineItems
    WHERE  OrderID = Orders.OrderID
  )

This requires an index (or primary key) on LineItems.LineItemID and an index on LineItems.OrderID or it will be slow.


@Quassnoi answer is good, in some cases (especially if the outer table is big), a more efficient query might be with using windowed functions, like this:

SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
LEFT JOIN 
        (
        SELECT  LineItems.Quantity, LineItems.Description, OrderId, ROW_NUMBER()
                OVER (PARTITION BY OrderId ORDER BY (SELECT NULL)) AS RowNum
        FROM    LineItems

        ) LineItems2 ON LineItems2.OrderId = Orders.OrderID And RowNum = 1

Sometimes you just need to test which query gives better performance.


,Another aproach using common table expression:

with firstOnly as (
    select Orders.OrderNumber, LineItems.Quantity, LineItems.Description, ROW_NUMBER() over (partiton by Orders.OrderID order by Orders.OrderID) lp
    FROM Orders
        join LineItems on Orders.OrderID = LineItems.OrderID
) select *
  from firstOnly
  where lp = 1

or, in the end maybe you would like to show all rows joined?

comma separated version here:

  select *
  from Orders o
    cross apply (
        select CAST((select l.Description + ','
        from LineItems l
        where l.OrderID = s.OrderID
        for xml path('')) as nvarchar(max)) l
    ) lines

Correlated sub queries are sub queries that depend on the outer query. It’s like a for loop in SQL. The sub-query will run once for each row in the outer query:

select * from users join widgets on widgets.id = (
    select id from widgets
    where widgets.user_id = users.id
    order by created_at desc
    limit 1
)

From SQL Server 2012 and onwards I think this will do the trick:

SELECT DISTINCT
    o.OrderNumber ,
    FIRST_VALUE(li.Quantity) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Quantity ,
    FIRST_VALUE(li.Description) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Description
FROM    Orders AS o
    INNER JOIN LineItems AS li ON o.OrderID = li.OrderID

EDIT: nevermind, Quassnoi has a better answer.

For SQL2K, something like this:

SELECT 
  Orders.OrderNumber
, LineItems.Quantity
, LineItems.Description
FROM (  
  SELECT 
    Orders.OrderID
  , Orders.OrderNumber
  , FirstLineItemID = (
      SELECT TOP 1 LineItemID
      FROM LineItems
      WHERE LineItems.OrderID = Orders.OrderID
      ORDER BY LineItemID -- or whatever else
      )
  FROM Orders
  ) Orders
JOIN LineItems 
  ON LineItems.OrderID = Orders.OrderID 
 AND LineItems.LineItemID = Orders.FirstLineItemID

My favorite way to run this query is with a not exists clause. I believe this is the most efficient way to run this sort of query:

select o.OrderNumber,
       li.Quantity,
       li.Description
from Orders as o
inner join LineItems as li
on li.OrderID = o.OrderID
where not exists (
    select 1
    from LineItems as li_later
    where li_later.OrderID = o.OrderID
    and li_later.LineItemGUID > li.LineItemGUID
    )

But I have not tested this method against other methods suggested here.


Tried the cross, works nicely, but takes slightly longer. Adjusted line columns to have max and added group which kept speed and dropped the extra record.

Here's the adjusted query:

SELECT Orders.OrderNumber, max(LineItems.Quantity), max(LineItems.Description)
FROM Orders
    INNER JOIN LineItems 
    ON Orders.OrderID = LineItems.OrderID
Group by Orders.OrderNumber

I solve a similar problem by using LEFT JOIN and GROUP BY Orders.OrderNumber. Is there a reason not to do it this way?

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    LEFT JOIN LineItems 
    ON Orders.OrderID = LineItems.OrderID
GROUP BY Orders.OrderNumber

I'll answer your answer question with an answer in your own question:

Orders             LineItems
+-------------+    +---------+----------+---------------+
| OrderNumber |    | OrderID | Quantity | Description   |
+-------------+    +---------+----------+---------------+
| 22586       |    | 22586   | 17       | Trunion       |
+-------------+    | 22586   | 3        | Girdle Spring |
                   +---------+----------+---------------+

Joining the two together on OrderNumber gives:

OrderNumber  Quantity  Description
-----------  --------  -------------
22586        17        Trunion
22586        3         Girdle Spring

2 row(s) affected

Where we wanted it to to return only one row:

OrderNumber  Quantity  Description
-----------  --------  -------------
22586        17        Trunion

1 row(s) affected

This is why I use GROUP BY Orders.OrderNumber which only returns one row per OrderNumber.

참고URL : https://stackoverflow.com/questions/2043259/sql-server-how-to-join-to-first-row

반응형