Find Duplicate Rows in Database Table

Posted February 28, 2006
Filed under: Lessons Learned

This little chunk of SQL saved my ass today after I found out that most of the 223 students who had been automatically enrolled in an Activity Management system I just deployed were actually enrolled twice because the student list provided by another system was corrupt:

SELECT * FROM table_name WHERE field_name NOT IN (SELECT DISTINCT(field_name) FROM table_name)

Ideally the field in question would be UNIQUE, but in this scenario that is not the case because the field was a student number and this system supports multiple courses, meaning a single student number is allowed to exist multiple times so long as each instance is tied to a different course. New constraints and checks have been put in place to ensure this can’t happen again, but regardless, lesson learned 😉
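
Since uniqueness here depends on the student number and the course together, a duplicate check has to group on both. Below is a minimal sketch of that idea, run through Python's built-in sqlite3 module so it is self-contained. The `enrolment` table and its `student_number` and `course_id` columns are made-up names for illustration, not the real schema, and this is not necessarily the check that was put in place.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrolment ("
    " id INTEGER PRIMARY KEY,"
    " student_number TEXT,"
    " course_id TEXT)"
)
conn.executemany(
    "INSERT INTO enrolment (student_number, course_id) VALUES (?, ?)",
    [
        ("s001", "MATH101"),
        ("s001", "MATH101"),  # accidental double enrolment
        ("s001", "PHYS101"),  # same student, different course: allowed
        ("s002", "MATH101"),
    ],
)

# Group on every column that defines uniqueness, keep groups with more than one row.
DUPLICATES_SQL = """
SELECT student_number, course_id, COUNT(*) AS copies
FROM enrolment
GROUP BY student_number, course_id
HAVING COUNT(*) > 1
"""
duplicates = conn.execute(DUPLICATES_SQL).fetchall()
print(duplicates)
```

Grouping on `student_number` alone would also flag s001's legitimate PHYS101 enrolment, which is why both columns appear in the GROUP BY.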

16 comments so far

Relapse on March 22, 2006

Dunno if you realised this, but your code snippet appears to truncate rather than wrap. At least in FF 1.5.0.1 under XP.

That SQL statement looks like it’ll not return anything. I would’ve thought that:

SELECT * FROM table_name a WHERE 1

Reply
- Javame on December 18, 2012
  
  It gets really tricky when more than one column defines uniqueness. It’s important to include all of them in the GROUP BY clause, as shown in this sql article; leaving any one of them out will produce a logically wrong result set.
  
  Reply
Andrew K on March 22, 2006

Hey mate, yeah that’s because pretty much every theme on wordpress.com doesn’t have decent <pre> styling 😦

I’ll try another one…

The SQL statement works fine, it returns any row where ‘field_name’ is not distinct. I was actually running it through a ref cursor, but let’s not go there today… bad day in Oracle land 😉

Reply
Steve on June 21, 2006

I really can’t see that SQL query ever returning any results. Are you sure it’s right?

I am using SQL Server 2000 however, perhaps Oracle handles it differently.

It looks to me as though you are trying to select rows from the table where the value of the specified field is equal to a value for that field that isn’t in the table.

Reply
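
Steve's reading of the query can be checked directly. The sketch below uses Python's sqlite3 module rather than SQL Server or Oracle, so the engine is an assumption, though the result follows from ordinary NOT IN semantics: every value in the table is by definition in the table's own list of distinct values. The sample rows are invented for the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (field_name TEXT)")
conn.executemany(
    "INSERT INTO table_name (field_name) VALUES (?)",
    [("a",), ("a",), ("b",)],  # "a" is duplicated on purpose
)

# The query from the post: no value can be missing from the table's own
# list of distinct values, so nothing passes the NOT IN filter.
not_in_rows = conn.execute(
    "SELECT * FROM table_name "
    "WHERE field_name NOT IN (SELECT DISTINCT field_name FROM table_name)"
).fetchall()

# Grouping and counting finds the duplicated value instead.
grouped_rows = conn.execute(
    "SELECT field_name, COUNT(*) FROM table_name "
    "GROUP BY field_name HAVING COUNT(*) > 1"
).fetchall()

print(not_in_rows)   # []
print(grouped_rows)  # [('a', 2)]
```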
amar on July 26, 2006

select name,count(*) from X1
group by name
having count(*)>1

I think it works..

Reply
Blue on September 20, 2006

I could be way off, but won’t this query take a looong time to complete? My thinking is that for a table with 10000 rows, mostly distinct, it will make about 10000^2 comparisons. Estimating (based on little knowledge) about 10 comparisons a second, it may still take about 116 days to complete. Again, I could be way off. If I am, don’t hesitate to call me on it.

Reply
AA on September 20, 2006

I used the query posted above:

select name,count(*) from X1
group by name
having count(*)>1

on a table with 7,000 records and it took less than 1 second to run. From my experience, it’s a great query and quick! Thanks for posting!

Reply
jami on January 18, 2007

select name,count(*) from X1
group by name
having count(*)>1

is working well.

Reply
Rajnii Sharma on October 25, 2007

I need to find duplicate records where the company name differs only slightly (either punctuation or a typo), so that two different company IDs were created.

Reply
Газовик Станислав on May 20, 2009

Well, how can that be the only way? I’m looking for how to clarify this topic.

Reply
muyiwa on November 3, 2009

I like it. Straight to the point.

Reply
Torres on April 4, 2010

If you wrote a movie, what would the title be?

Reply
pinoyden on March 17, 2012

Your query doesn’t work for me but amar’s post from above works. Here it is.

select name,count(*) from X1 group by name having count(*)>1

Reply
ravi on April 17, 2012

Thanks buddy..

Reply
Nirmal on August 4, 2012

How do I check for duplicate items in a database, so that if the data is already in the database the new data doesn’t get inserted?

Reply
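
On Nirmal's question, one common approach is to let the database refuse the duplicate rather than checking first. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module, the `enrolment` table and its columns are hypothetical, and `INSERT OR IGNORE` is SQLite syntax, so other databases need their own equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the UNIQUE constraint covers both columns together,
# so one student can still be enrolled in several different courses.
conn.execute(
    "CREATE TABLE enrolment ("
    " student_number TEXT NOT NULL,"
    " course_id TEXT NOT NULL,"
    " UNIQUE (student_number, course_id))"
)

def enrol(student_number, course_id):
    """Insert the pair; SQLite silently skips it if the pair already exists."""
    conn.execute(
        "INSERT OR IGNORE INTO enrolment (student_number, course_id) VALUES (?, ?)",
        (student_number, course_id),
    )

enrol("s001", "MATH101")
enrol("s001", "MATH101")  # same pair again: ignored by the constraint
enrol("s001", "PHYS101")  # different course: inserted

row_count = conn.execute("SELECT COUNT(*) FROM enrolment").fetchone()[0]
print(row_count)  # 2
```

Relying on the constraint also avoids the race between a separate "does it exist?" query and the insert that follows it.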
