Given a table like tx_example_domain_model_resourcelocalized
, some localization entries are duplicated for whatever reason. An easy way to clean them out is using SQL as shown below.
Our magic to this SQL starts with the JOIN
command. In particular, INNER JOIN
helps us insist upon a one-to-one relation between database table records.
Next, we use WHERE
conditions to further set the join points upon unique identifiers of duplicated entries; resource
, lang
, and title
. resource
is used for our primary INNER JOIN
since it’s indexed or easier to lookup for the database.
The last line of the WHERE
clause pulling the larger uids gives us the duplicated entries. If l.uid < l2.uid
was used, then we’d get back our original entries.
Run a SQL command like the following to demonstrate to yourself that only the duplicates are selected.
SELECT l.uid, l.resource FROM tx_example_domain_model_resourcelocalized l INNER JOIN tx_example_domain_model_resourcelocalized l2 ON l2.resource = l.resource WHERE l.lang = 2 AND l2.lang = l.lang AND l2.title = l.title AND l.uid > l2.uid ;
Once happy with the results, run the SQL command again but using DELETE
than SELECT
.
DELETE l FROM tx_example_domain_model_resourcelocalized l INNER JOIN tx_example_domain_model_resourcelocalized l2 ON l2.resource = l.resource WHERE l.lang = 2 AND l2.lang = l.lang AND l2.title = l.title AND l.uid > l2.uid ;
The DELETE
SQL clause will take about 4-times longer than the SELECT
, but heck, your table is duplicate free afterward.
References
- delete duplicate rows (and leaving one copy) in PostgreSQL, MS SQL, MySQL
- MySQL DELETE Syntax
- MySQL JOIN Syntax