This kind of questions are asked many times in many Interviews, so better we should know how to answer them.
Let’s create a table first and insert some dummy duplicate records:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
CREATE TABLE IF NOT EXISTS `employee` ( `emp_id` int(3) NOT NULL AUTO_INCREMENT, `emp_name` varchar(100) NOT NULL, `emp_salary` int(10) NOT NULL, PRIMARY KEY (`emp_id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=11 ; -- -- Dumping data for table `employee` -- INSERT INTO `employee` (`emp_id`, `emp_name`, `emp_salary`) VALUES (1, 'Roj', 100), (2, 'Fren', 50), (3, 'Rin', 300), (4, 'Yen', 150), (5, 'Roj', 500), (6, 'Noop', 250), (7, 'Fren', 600), (8, 'Rin', 120), (9, 'Yen', 160), (10, 'Con', 590); |
Run this above query in a SQL browser and you will get a table like below:
emp_id | emp_name | emp_salary |
1 | Roj | 100 |
2 | Fren | 50 |
3 | Rin | 300 |
4 | Yen | 150 |
5 | Roj | 500 |
6 | Noop | 250 |
7 | Fren | 600 |
8 | Rin | 120 |
9 | Yen | 160 |
10 | Con | 590 |
Before deleting the duplicate records, lets see how to filter out only the duplicate values from the above table, lets write a query:
1 2 3 4 |
SELECT emp1.emp_name FROM employee emp1, employee emp2 WHERE emp1.emp_name = emp2.emp_name AND emp1.emp_id > emp2.emp_id |
If you run the above query, you will get only the duplicate records, the query is not so tough, creating alias for the same table and filtering duplicate records by matching the duplicate names (=) and with a (>) condition on emp_id.
Now, to delete these records, instead of SELECT you are going to use DELETE, lets do it:
1 2 3 |
DELETE emp1 FROM employee emp1, employee emp2 WHERE emp1.emp_name = emp2.emp_name AND emp1.emp_id > emp2.emp_id |
If you compare the SELECT query and DELETE query, you can see a small change between them, ie., column name (emp1.emp_name) is removed in DELETE query, because we cannot delete a single column using DELETE, and actually we have the entire row, so you have to use only the alias name (emp1) in DELETE query.