Find Duplicate Rows in Database Table

This little chunk of SQL saved my ass today after I found out that most of the 223 students who had been automatically enrolled in an Activity Management system I just deployed were actually enrolled twice because the student list provided by another system was corrupt:

SELECT * FROM table_name WHERE field_name NOT IN (SELECT DISTINCT(field_name) FROM table_name)

Ideally the field in question would be UNIQUE, but in this scenario that is not the case because the field was a student number and this system supports multiple courses — meaning a single student number is allowed to exist multiple times so long as each instance is tied to a different course. New constraints and checks have been put in place to ensure this can’t happen again, but regardless; lesson learned 😉

16 comments so far

  1. Relapse on

    Dunno if you realised this, but your code snippet appears to truncate rather than wrap. At least in FF 1.5.0.1 under XP.

    That SQL statement looks like it’ll not return anything. I would’ve thought that:

    SELECT * FROM table_name a WHERE 1

    • Javame on

      It gets real tricky when you have more than one column which defines uniqueness, its important to include both of them in group by clause as shown in this sql article, leaving any one of them will result in logical wrong result set.

  2. Andrew K on

    Hey mate, yeah that’s because pretty much every theme on wordpress.com doesn’t have decent <pre> styling 😦

    I’ll try another one…

    The SQL statement works fine, it returns any row where ‘field_name’ is not distinct. I was actually running it through a ref cursor, but lets not go there today… bad day in Oracle land 😉

  3. Steve on

    I really can’t see that SQL query ever returning any results. Are you sure it’s right?

    I am using SQL Server 2000 however, perhaps Oracle handles it differently.

    It looks to me as though you are trying to select rows from the table where the value of the speicfied field is equal to a value for that field that isn’t in the table.

  4. amar on

    select name,count(*) from X1
    group by name
    having count(*)>1

    I thnk it wrks..

  5. Blue on

    I could be way off, but won’t this query take a looong time to complete. My thinking is that for a table with 10000 rows, mostly distinct it will make about 10000^2 comparisons. Estimating (based on little knowledge) about 10 comparisons a second, it may still take about 116 days to complete. Again, I could be way off. If I am don’t hesitate to call me on it.

  6. AA on

    I used the query posted above:

    select name,count(*) from X1
    group by name
    having count(*)>1

    on a table with 7,000 records and it took less than a 1 second to run. From my experience, it’s a great query and quick! Thanks for posting!

  7. jami on

    select name,count(*) from X1
    group by name
    having count(*)>1 is working well.

  8. Rajnii Sharma on

    I need to find out those duplicate records like company name having slight difference in their name (either punctuation or typo error) and two differen company id are created.

  9. Ну как же только так? Ищу, как можно уточнить данную тему.

  10. jflklhhxyelg on

    nlgerbsgwpik

  11. muyiwa on

    i like it.straight to the point.

  12. Torres on

    If you wrote a movie, what would the title be?

  13. pinoyden on

    Your query doesnt work for me but amar’s post from above works. Here it is.

    select name,count(*) from X1
    group by name
    having count(*)>1

  14. ravi on

    Thnks buddy..

  15. Nirmal on

    How do i check for duplicate items in a database
    if previus data already in database so doesn’t insert new data


Leave a reply to jami Cancel reply