A blog about software development, written by Daniel Diekmeier.

Why does RuboCop want me to use has_many :through instead of has_and_belongs_to_many?

March 19, 2023

A few weeks ago, I had to implement a Label feature. Picture GitHub’s labels for Issues and Pull Requests:

[Image: GitHub labels]

This is a classic many-to-many relationship: A label can be assigned to many issues, and an issue can have many labels.

For this project, we use Rails, so I created a Label model and connected it to our existing Issue model with has_and_belongs_to_many:

class Label < ApplicationRecord
  has_and_belongs_to_many :issues
end

class Issue < ApplicationRecord
  has_and_belongs_to_many :labels
end

This worked fine to create the many-to-many relationship I was after. I implemented the UI and was ready to call it a day.

I’m not sure how I missed it for so long, but I finally noticed that RuboCop was complaining about has_and_belongs_to_many. According to the rule Rails/HasAndBelongsToMany, you should never use it and always prefer has_many :through.

I found that surprising! The rule itself does not give a reason, so I looked around and found a few answers on Stack Overflow.

Most people argued that you will probably end up needing has_many :through at some point anyway, which makes it the future-proof choice:

You should use has_many :through if you need validations, callbacks, or extra attributes on the join model.

From my experience it's always better to use has_many :through because you can add timestamps to the table.

If you decided to use has_and_belongs_to_many, and want to add one simple datapoint or validation 2 years down the road, migrating this change will be extremely difficult and bug-prone. To be safe, default to has_many :through

While writing this post, I found the corresponding entry in the Rails Style Guide, which indeed explains:

Using has_many :through allows additional attributes and validations on the join model.

For myself, I ended up with the following conclusion: If you ever want to talk about the relationship itself (and you probably will!), you should use has_many :through. I expect this will even help me outside of Rails, because many-to-many relationships are a common pattern in many projects.

I rewrote my code to use has_many :through, and it was no problem, especially because RuboCop helped me catch this so early. I even ended up adding timestamps to the join table, which wouldn’t have been possible with has_and_belongs_to_many.