It's simple to remove duplicates from a table. Here is how.
Assume you have a table with duplicate rows:
Table: emp
id (Primary Key)
name
1
Ela
2
Ramesh
3
Ela
4
Ashok
5
Ela
In the table above, 'Ela' is present more than once with different ids. One way to remove the duplicate is GROUP the rows by name alone and then insert the results to a new table with newly generated ids. But the new ids will not match the old (at least for non-duplicate rows).
To ensure the ids will match, we need a workaround, which is comparatively simple.
Step 1:
Try to get the ids of the unique items present in the table. This can be done either by MIN or MAX function. For example,
SELECT
MIN(id)
FROM
emp
GROUP BY
name;
This will result in the following table:
min
1
3
5
Step 2:
Hah... this gives me the unique id values. This is enough to remove the other rows using a sub query. Here it is:
DELETE
FROM
emp
WHERE
id NOT IN
(
SELECT
MIN(id)
FROM
emp
GROUP BY
name
);
That's all folks! The result is a cleaned up table without duplicates:
No comments:
Post a Comment