Delete duplicate records in SQL Server?

Consider a table named Employee with a column named EmployeeName. The goal is to delete repeated records, based on the EmployeeName field.

EmployeeName
------------
Anand
Anand
Anil
Dipak
Anil
Dipak
Dipak
Anil

Using one query, I want to delete the records that are repeated.

How can this be done with T-SQL in SQL Server?

Solutions

==============================
1. You can do this with window functions. The query orders the dupes by empId and deletes all but the first.
```
delete x from (
  select *, rn=row_number() over (partition by EmployeeName order by empId)
  from Employee 
) x
where rn > 1;
```
Run it as a select to see what would be deleted:
```
select *
from (
  select *, rn=row_number() over (partition by EmployeeName order by empId)
  from Employee 
) x
where rn > 1;
```
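If you would rather rehearse the delete itself than read the preview, one option is to run it inside a transaction and roll it back. This is a minimal sketch, assuming the same Employee table, EmployeeName column, and empId column as above; the RowsThatWouldBeDeleted alias is only illustrative.

```sql
BEGIN TRANSACTION;

-- Same delete as above: keep the first row per EmployeeName, ordered by empId.
DELETE x FROM (
  SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY EmployeeName ORDER BY empId)
  FROM Employee
) x
WHERE rn > 1;

-- @@ROWCOUNT reports how many rows the previous statement affected.
SELECT @@ROWCOUNT AS RowsThatWouldBeDeleted;

-- Undo the delete; swap in COMMIT TRANSACTION once the count looks right.
ROLLBACK TRANSACTION;
```

The delete holds its locks until the rollback, so on a busy table it is probably best to keep the transaction short.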
==============================
2. Assuming that your Employee table also has a unique column (ID in the example below), the following will work:
```
delete from Employee 
where ID not in
(
    select min(ID)
    from Employee 
    group by EmployeeName 
);
```
This will leave the version with the lowest ID in the table.
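To confirm that no duplicates remain, a grouped count can help. This is a sketch assuming the same Employee table and EmployeeName column; the Copies alias is made up for the example. It should return no rows once the delete has run.

```sql
-- List every EmployeeName that still occurs more than once.
SELECT EmployeeName, COUNT(*) AS Copies
FROM Employee
GROUP BY EmployeeName
HAVING COUNT(*) > 1;
```

Run before the delete, the same query lists each duplicated name with its number of copies. For the sample data in the question that would be Anand with 2, Anil with 3, and Dipak with 3.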

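Edit, re McGyver's comment: as of SQL 2012, MIN can also be used with uniqueidentifier columns.

In 2008 R2 and earlier, MIN does not work with GUIDs.

For 2008 R2 you'll need to cast the GUID to a type that is supported by MIN, e.g.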
```
delete from GuidEmployees
where CAST(ID AS binary(16)) not in
(
    select min(CAST(ID AS binary(16)))
    from GuidEmployees
    group by EmployeeName 
);
```
SqlFiddle for various types in SQL 2008

SqlFiddle for various types in SQL 2012
==============================
3. You could also try something like the following:
```
delete T1
from MyTable T1, MyTable T2
where T1.dupField = T2.dupField
and T1.uniqueField > T2.uniqueField  
```
(This assumes that you have an integer-based unique field.)

Personally, though, I'd say you would be better off fixing the fact that duplicate entries are being added to the database before it happens, rather than cleaning up afterwards.
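One way to stop duplicates at the source, once the existing ones are gone, is a unique constraint on the duplicated column. This is a minimal sketch that reuses the MyTable and dupField names from the query above; the constraint name UQ_MyTable_dupField is hypothetical.

```sql
-- Assumes the existing duplicates have already been deleted;
-- SQL Server refuses to create the constraint while duplicates remain.
ALTER TABLE MyTable
ADD CONSTRAINT UQ_MyTable_dupField UNIQUE (dupField);
```

After this, an insert that would repeat an existing dupField value fails with a constraint violation instead of creating a new duplicate. The column has to be of a type that can be indexed, so a varchar(max) column would need a different approach.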

==============================

4.

DELETE
FROM MyTable
WHERE ID NOT IN (
     SELECT MAX(ID)
     FROM MyTable
     GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3);

WITH TempUsers (FirstName, LastName, duplicateRecordCount)
AS
(
    SELECT FirstName, LastName,
    ROW_NUMBER() OVER (PARTITION BY FirstName, LastName ORDER BY FirstName) AS duplicateRecordCount
    FROM dbo.Users
)
DELETE
FROM TempUsers
WHERE duplicateRecordCount > 1

==============================

5.

WITH CTE AS
(
   SELECT EmployeeName, 
          ROW_NUMBER() OVER(PARTITION BY EmployeeName ORDER BY EmployeeName) AS R
   FROM employee_table
)
DELETE CTE WHERE R > 1;

The magic of common table expressions.

==============================

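6. Try the following. SQL Server has no built-in rowid pseudo-column, so this assumes the table has a unique column named rowid: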

DELETE
FROM employee
WHERE rowid NOT IN (SELECT MAX(rowid) FROM employee
GROUP BY EmployeeName);

==============================

7. If you're looking for a way to remove duplicates but a foreign key points to the table that contains them, you could take the following approach using a slow yet effective cursor.

It re-points the foreign key rows at the record that is kept before deleting each duplicate.

create table #properOlvChangeCodes(
    id int not null,
    name nvarchar(max) not null
)

DECLARE @name VARCHAR(MAX);
DECLARE @id INT;
DECLARE @newid INT;
DECLARE @oldid INT;

DECLARE OLVTRCCursor CURSOR FOR SELECT id, name FROM Sales_OrderLineVersionChangeReasonCode; 
OPEN OLVTRCCursor;
FETCH NEXT FROM OLVTRCCursor INTO @id, @name;
WHILE @@FETCH_STATUS = 0  
BEGIN  
        -- determine if it should be replaced (is already in temptable with name)
        if(exists(select * from #properOlvChangeCodes where Name=@name)) begin
            -- if it is, look up the id that was kept for this name
            Select  top 1 @newid = id
            from    #properOlvChangeCodes
            where   Name = @name

            -- replace terminationreasoncodeid in olv for the new terminationreasoncodeid
            update Sales_OrderLineVersion set ChangeReasonCodeId = @newid where ChangeReasonCodeId = @id

            -- delete the record from the terminationreasoncode
            delete from Sales_OrderLineVersionChangeReasonCode where Id = @id
        end else begin
            -- insert into temp table if new
            insert into #properOlvChangeCodes(Id, name)
            values(@id, @name)
        end

        FETCH NEXT FROM OLVTRCCursor INTO @id, @name;
END;
CLOSE OLVTRCCursor;
DEALLOCATE OLVTRCCursor;

drop table #properOlvChangeCodes
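
If the cursor turns out to be too slow, the same idea can probably be expressed as two set-based statements. The following is a sketch, not the author's method: it reuses the table and column names from the cursor above, and it assumes that keeping the lowest Id per Name is acceptable. The Keepers, Ranked, KeepId, and rn names are made up for the example.

```sql
BEGIN TRANSACTION;

-- Step 1: re-point child rows from each duplicate to the lowest Id sharing its Name.
WITH Keepers AS
(
    SELECT Id,
           MIN(Id) OVER (PARTITION BY Name) AS KeepId
    FROM Sales_OrderLineVersionChangeReasonCode
)
UPDATE olv
SET    olv.ChangeReasonCodeId = k.KeepId
FROM   Sales_OrderLineVersion AS olv
JOIN   Keepers AS k ON k.Id = olv.ChangeReasonCodeId
WHERE  k.Id <> k.KeepId;

-- Step 2: delete every reason code except the lowest Id per Name.
WITH Ranked AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Id) AS rn
    FROM Sales_OrderLineVersionChangeReasonCode
)
DELETE FROM Ranked
WHERE rn > 1;

COMMIT TRANSACTION;
```

Both statements run in one transaction so that the child rows are never left pointing at a deleted reason code. If other tables also reference the reason codes, each of them would need its own update before the delete.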

==============================

8. The following is a good way of deduping records in a table that has an identity column, based on a desired primary key that you can define at runtime. Before I start, I populate a sample data set to work with, using the following code:

if exists (select 1 from sys.all_objects where type='u' and name='_original')
drop table _original

declare @startyear int = 2017
declare @endyear int = 2018
declare @iterator int = 1
declare @income money = cast((SELECT round(RAND()*(5000-4990)+4990 , 2)) as money)
declare @salesrepid int = cast(floor(rand()*(9100-9000)+9000) as varchar(4))
create table #original (rowid int identity, monthyear varchar(max), salesrepid int, sale money)
while @iterator<=50000 begin
insert #original 
select (Select cast(floor(rand()*(@endyear-@startyear)+@startyear) as varchar(4))+'-'+ cast(floor(rand()*(13-1)+1) as varchar(2)) ),  @salesrepid , @income
set  @salesrepid  = cast(floor(rand()*(9100-9000)+9000) as varchar(4))
set @income = cast((SELECT round(RAND()*(5000-4990)+4990 , 2)) as money)
set @iterator=@iterator+1
end  
update #original
set monthyear=replace(monthyear, '-', '-0') where  len(monthyear)=6

select * into _original from #original

Next I create a type called ColumnNames:

create type ColumnNames AS table   
(Columnnames varchar(max))

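Finally, I create a stored procedure with the following 3 caveats:
1. The proc takes a required parameter, @tablename, that names the table you are deleting from in your database.
2. The proc has an optional parameter, @columns, that you can use to define the fields making up the desired primary key you are deleting against. If it is left empty, all fields other than the identity column are assumed to make up the desired primary key.
3. When duplicate records are deleted, the record with the lowest value in its identity column is kept.

Here is my delete_dupes stored proc: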

 create proc delete_dupes (@tablename varchar(max), @columns columnnames readonly) 
 as
 begin

declare @table table (iterator int, name varchar(max), is_identity int)
declare @tablepartition table (idx int identity, type varchar(max), value varchar(max))
declare @partitionby varchar(max)  
declare @iterator int= 1 


if exists (select 1 from @columns)  begin
declare @columns1 table (iterator int, columnnames varchar(max))
insert @columns1
select 1, columnnames from @columns
set @partitionby = (select distinct 
                substring((Select ', '+t1.columnnames 
                From @columns1 t1
                Where T1.iterator = T2.iterator
                ORDER BY T1.iterator
                For XML PATH ('')),2, 1000)  partition
From @columns1 T2 )

end

insert @table 
select 1, a.name, is_identity from sys.all_columns a join sys.all_objects b on a.object_id=b.object_id
where b.name = @tablename  

declare @identity varchar(max)= (select name from @table where is_identity=1)

while @iterator>=0 begin 
insert @tablepartition
Select          distinct case when @iterator=1 then 'order by' else 'over (partition by' end , 
                substring((Select ', '+t1.name 
                From @table t1
                Where T1.iterator = T2.iterator and is_identity=@iterator
                ORDER BY T1.iterator
                For XML PATH ('')),2, 5000)  partition
From @table T2
set @iterator=@iterator-1
end 

declare @originalpartition varchar(max)

if @partitionby is null begin
select @originalpartition  = replace(b.value+','+a.type+a.value ,'over (partition by','')  from @tablepartition a cross join @tablepartition b where a.idx=2 and b.idx=1
select @partitionby = a.type+a.value+' '+b.type+a.value+','+b.value+') rownum' from @tablepartition a cross join @tablepartition b where a.idx=2 and b.idx=1
 end
 else
 begin
 select @originalpartition=b.value +','+ @partitionby from @tablepartition a cross join @tablepartition b where a.idx=2 and b.idx=1
 set @partitionby = (select 'OVER (partition by'+ @partitionby  + ' ORDER BY'+ @partitionby + ','+b.value +') rownum'
 from @tablepartition a cross join @tablepartition b where a.idx=2 and b.idx=1)
 end


exec('select row_number() ' + @partitionby +', '+@originalpartition+' into ##temp from '+ @tablename+'')


exec(
'delete a from '+@tablename+' a 
left join ##temp b on a.'+@identity+'=b.'+@identity+' and rownum=1  
where b.rownum is null')

drop table ##temp

end

Once this is compiled, you can delete all your duplicate records by running the proc. To delete dupes without defining a desired primary key, use this call:

exec delete_dupes '_original'

To delete dupes based on a defined desired primary key, use this call:

declare @table1 as columnnames
insert @table1
values ('salesrepid'),('sale')
exec delete_dupes '_original' , @table1

==============================

9.

delete from person 
where ID not in
(
        select t.id from 
        (select min(ID) as id from person 
         group by email 
        ) as t
);

==============================

10. See the below way of deleting too.

Declare @Employee table (EmployeeName varchar(10))

Insert into @Employee values 
('Anand'),('Anand'),('Anil'),('Dipak'),
('Anil'),('Dipak'),('Dipak'),('Anil')

Select * from @Employee

This creates a sample table named @Employee and loads it with the given data.

Delete  aliasName from (
Select  *,
        ROW_NUMBER() over (Partition by EmployeeName order by EmployeeName) as rowNumber
From    @Employee) aliasName 
Where   rowNumber > 1

Select * from @Employee

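Result: the table is left with one row per name (Anand, Anil, Dipak).

I know this was asked six years ago; I am posting just in case it is helpful for anyone.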

from https://stackoverflow.com/questions/3317433/delete-duplicate-records-in-sql-server by cc-by-sa and MIT license
