Polymorphic vs Shared Table: Is speed a valid concern?

What do you do when you find a model that can belong to multiple models?

In my case, I’m building the inventory tracking functionality for an MRP. An Inventory Item can be either related to a Product, or a Material – in summary, I need to find a way …


This content originally appeared on DEV Community and was authored by Lucas Kuhn

What do you do when you find a model that can belong to multiple models?

In my case, I'm building the inventory tracking functionality for an MRP. An Inventory Item can be either related to a Product, or a Material - in summary, I need to find a way to relate the two tables on the left, to the table on the right:

Image description

What usually comes to mind is using a polymorphic relationship. Let's call this poly_inventory_item:

Image description

This way, the item_id field is a foreign key and the item_type will let us know if it is a material or a product. This has some strong points:

  • Clean solution with only two fields
  • Flexibility to add more relations in the future

However, since there are only two options (either a material or a product), a table with two foreign keys is also a viable option. Let's call this shared_inventory_item:

Image description

This case is a bit unusual since it has nullable foreign keys, but it comes with some advantages:

  • Clearer relations, as you can see the foreign keys directly
  • Faster speeds, due to the indexes on the foreign keys

This second assumption, is what made me question my decision. How much faster can it be?

Testing the speed of both relations

Database setup

I'm using the standard indexes expected for polymorphic relations, and the shared table has indexes on both foreign keys:

# db/schema.rb (simplified)

create_table "poly_inventory_items", force: :cascade do |t|
  t.string "item_type", null: false
  t.integer "item_id", null: false
  t.index ["item_type", "item_id"], name: "index_poly_inventory_items_on_item"
end

create_table "shared_inventory_items", force: :cascade do |t|
  t.integer "product_id"
  t.integer "material_id"
  t.index ["material_id"], name: "index_shared_inventory_items_on_material_id"
  t.index ["product_id"], name: "index_shared_inventory_items_on_product_id"
end
add_foreign_key "shared_inventory_items", "materials"
add_foreign_key "shared_inventory_items", "products"

Models setup

The polymorphic relation is quite simple:

class PolyInventoryItem < ApplicationRecord
  belongs_to :item, polymorphic: true

  validates :item_id, uniqueness: {scope: :item_type}
  validates :item_type, inclusion: {in: %w[Product Material]}
end

The shared table is a bit more complex, as it needs to validate the presence of one of the foreign keys, and the absence of the other:

class SharedInventoryItem < ApplicationRecord
  belongs_to :product, optional: true
  belongs_to :material, optional: true

  validates :product_id, presence: true, unless: :material_id?
  validates :material_id, presence: true, unless: :product_id?

  validates :product_id, absence: true, if: :material_id?
  validates :material_id, absence: true, if: :product_id?

  validates :product_id, uniqueness: true, allow_nil: true
  validates :material_id, uniqueness: true, allow_nil: true
end

On the other side, the Material and Product can be very straightforward:

class Material < ApplicationRecord
  has_one :shared_inventory_item, dependent: :destroy
  has_one :poly_inventory_item, as: :item, dependent: :destroy
end 
class Product < ApplicationRecord
  has_one :shared_inventory_item, dependent: :destroy
  has_one :poly_inventory_item, as: :item, dependent: :destroy
end

Benchmarking

Since indexes matter more on a large database, I did all tests in a situation with only a thousand records, and again with 100K records. The tests were done using the benchmark-ips gem, considering creating records, reading from the association, and querying.

# --- Creating records
Benchmark.ips do |x|
  x.report("PolyInventoryItem") do
    material = Material.create!(name: "Material")
    product = Product.create!(name: "Product", sku: "SKU")
    material.create_poly_inventory_item!
    product.create_poly_inventory_item!
  end
  x.report("SharedInventoryItem") do
    material = Material.create!(name: "Material")
    product = Product.create!(name: "Product", sku: "SKU")
    material.create_shared_inventory_item!
    product.create_shared_inventory_item!
  end
  x.compare!
end
# --- Reading from association
product = Product.first
material = Material.first
Benchmark.ips do |x|
  x.report("PolyInventoryItem") do
    product.reload.poly_inventory_item
    material.reload.poly_inventory_item
  end
  x.report("SharedInventoryItem") do
    product.reload.shared_inventory_item
    material.reload.shared_inventory_item
  end

  x.compare!
end
# --- Querying with find_by
product = Product.first
material = Material.first
Benchmark.ips do |x|
  x.report("PolyInventoryItem") do
    PolyInventoryItem.find_by(item: product)
    PolyInventoryItem.find_by(item: material)
  end
  x.report("SharedInventoryItem") do
    SharedInventoryItem.find_by(product: product)
    SharedInventoryItem.find_by(material: material)
  end

  x.compare!
end

Results

Creating records

 --- 1K records
 SharedInventoryItem:    409.4 i/s
 PolyInventoryItem:      394.5 i/s - same-ish: difference falls within error
 --- 100K records
 SharedInventoryItem:    378.4 i/s
 PolyInventoryItem:      377.4 i/s - same-ish: difference falls within error

Reading from association

 --- 1K records
  SharedInventoryItem:     1982.0 i/s
    PolyInventoryItem:     1863.5 i/s - 1.06x  slower
 --- 100K records
  SharedInventoryItem:     1915.8 i/s
    PolyInventoryItem:     1761.8 i/s - 1.09x  slower

Querying with find_by

 --- 1K records
  SharedInventoryItem:     7471.5 i/s
    PolyInventoryItem:     4476.7 i/s - 1.67x  slower
 --- 100K records
  SharedInventoryItem:     6686.9 i/s
    PolyInventoryItem:     3862.5 i/s - 1.73x  slower

Conclusion

I was surprised to see that the speed difference was not as significant as I thought. The shared table is faster, but the difference is not significant enough to justify the extra complexity. It all comes down to the trade-offs you are willing to make.

I will stick with the polymorphic relation, as it is a cleaner solution. Hope this helps you make a decision in the future! 🙌


This content originally appeared on DEV Community and was authored by Lucas Kuhn


Print Share Comment Cite Upload Translate Updates
APA

Lucas Kuhn | Sciencx (2024-06-29T14:39:35+00:00) Polymorphic vs Shared Table: Is speed a valid concern?. Retrieved from https://www.scien.cx/2024/06/29/polymorphic-vs-shared-table-is-speed-a-valid-concern/

MLA
" » Polymorphic vs Shared Table: Is speed a valid concern?." Lucas Kuhn | Sciencx - Saturday June 29, 2024, https://www.scien.cx/2024/06/29/polymorphic-vs-shared-table-is-speed-a-valid-concern/
HARVARD
Lucas Kuhn | Sciencx Saturday June 29, 2024 » Polymorphic vs Shared Table: Is speed a valid concern?., viewed ,<https://www.scien.cx/2024/06/29/polymorphic-vs-shared-table-is-speed-a-valid-concern/>
VANCOUVER
Lucas Kuhn | Sciencx - » Polymorphic vs Shared Table: Is speed a valid concern?. [Internet]. [Accessed ]. Available from: https://www.scien.cx/2024/06/29/polymorphic-vs-shared-table-is-speed-a-valid-concern/
CHICAGO
" » Polymorphic vs Shared Table: Is speed a valid concern?." Lucas Kuhn | Sciencx - Accessed . https://www.scien.cx/2024/06/29/polymorphic-vs-shared-table-is-speed-a-valid-concern/
IEEE
" » Polymorphic vs Shared Table: Is speed a valid concern?." Lucas Kuhn | Sciencx [Online]. Available: https://www.scien.cx/2024/06/29/polymorphic-vs-shared-table-is-speed-a-valid-concern/. [Accessed: ]
rf:citation
» Polymorphic vs Shared Table: Is speed a valid concern? | Lucas Kuhn | Sciencx | https://www.scien.cx/2024/06/29/polymorphic-vs-shared-table-is-speed-a-valid-concern/ |

Please log in to upload a file.




There are no updates yet.
Click the Upload button above to add an update.

You must be logged in to translate posts. Please log in or register.