What are the best practices while using BigQuery?

BigQuery is affordable and fast, but it can quickly consume a lot of processing power, increasing the bills if not used properly. In BigQuery, slots decide the amount of processing power that directly impacts cost. First, let’s understand slots.


This content originally appeared on DEV Community and was authored by Kedar.K

BigQuery is affordable and fast, but it can quickly consume a lot of processing power, increasing the bills if not used properly. In BigQuery, slots decide the amount of processing power that directly impacts cost. First, let's understand slots.

Slots

In BigQuery, slots are the computational capacity required to execute a SQL query. Slots play a crucial role in pricing and resource allocation. BigQuery manages the slot allocation, and the number of slots allocated to a job depends on

  1. Query size: The amount of data processed
  2. Complexity: The amount of information shuffled

In supplement to consumption-based pricing, BigQuery also provides flat-rate pricing, that is, you can buy a fixed number of slots over some period which provides more granularity in planning capital.

BigQuery is fast because it uses heavy parallel processing by converting the declarative SQL into stages.

Without further delay, let's see the best practices in BigQuery for:

  • Cost controls
  • Query performance
  • Storage optimization

Cost controls

  • BigQuery is a columnar database. In other words, all columns from a table are stored separately at the physical level. Hence, we should always avoid using select * in our queries. This practice will limit the data processed.

  • For looking at the sample data, we should use the preview option instead of the query. The preview feature in BigQuery is free.

  • We should use a pricing calculator to check the costs before pushing the query to production. On-demand queries are charged based on the number of bytes read, and we can calculate the price based on bytes read.

  • We should use dashboard for viewing costs and logs. The dashboard can provide us with good insights on our usage of BigQuery.

  • Partitioning in BigQuery allows us to scan only the required data and reduce costs.

Query performance

  • We should avoid querying non-required partitions.
  • We should use external data sources suitably. There's no guarantee of good performance when using external sources.
  • We should avoid excessive use of wildcard tables, for example, FROM bigquery-public-data.noaa_gsod.good*.
  • JavaScript user-defined functions reduce overall performance. We should avoid them.
  • Certain joins, such as a cartesian product, can create more output than input, and should be avoided.
  • We should avoid DML statements with updating or inserting single rows.

Storage optimization

  • The expiration time for datasets/tables/partitions should be set. It allows us to control storage expenditures and optimize the usage of BigQuery storage.
  • Using BigQuery to process the data, we can also use it to store in tables and partitions for an extended period. It has lower charges for the data, which is not modified in the last 90 days.
  • Lastly, we can use the pricing calculator to check storage costs.

That's all for this post folks.
Don't forget to ❤ or 📑 this article if you enjoyed it.

Have a great day.
Happy Learning!!


This content originally appeared on DEV Community and was authored by Kedar.K


Print Share Comment Cite Upload Translate Updates
APA

Kedar.K | Sciencx (2022-02-19T11:35:37+00:00) What are the best practices while using BigQuery?. Retrieved from https://www.scien.cx/2022/02/19/what-are-the-best-practices-while-using-bigquery/

MLA
" » What are the best practices while using BigQuery?." Kedar.K | Sciencx - Saturday February 19, 2022, https://www.scien.cx/2022/02/19/what-are-the-best-practices-while-using-bigquery/
HARVARD
Kedar.K | Sciencx Saturday February 19, 2022 » What are the best practices while using BigQuery?., viewed ,<https://www.scien.cx/2022/02/19/what-are-the-best-practices-while-using-bigquery/>
VANCOUVER
Kedar.K | Sciencx - » What are the best practices while using BigQuery?. [Internet]. [Accessed ]. Available from: https://www.scien.cx/2022/02/19/what-are-the-best-practices-while-using-bigquery/
CHICAGO
" » What are the best practices while using BigQuery?." Kedar.K | Sciencx - Accessed . https://www.scien.cx/2022/02/19/what-are-the-best-practices-while-using-bigquery/
IEEE
" » What are the best practices while using BigQuery?." Kedar.K | Sciencx [Online]. Available: https://www.scien.cx/2022/02/19/what-are-the-best-practices-while-using-bigquery/. [Accessed: ]
rf:citation
» What are the best practices while using BigQuery? | Kedar.K | Sciencx | https://www.scien.cx/2022/02/19/what-are-the-best-practices-while-using-bigquery/ |

Please log in to upload a file.




There are no updates yet.
Click the Upload button above to add an update.

You must be logged in to translate posts. Please log in or register.