Basic SQL Optimization Suggestions that Programmers Should Know


This content originally appeared on Level Up Coding - Medium and was authored by lance

When we talk about system performance optimization, we usually think of targeted optimizations at the code level. Another very important lever is optimizing the performance of the database.

In an Internet system, as traffic and data volume keep growing, the pressure on the database becomes greater and greater. If the table structure is poorly designed or the SQL statements are poorly written, the application code may perform extremely well while the system as a whole is still dragged down by the database. Therefore, we programmers need to understand the database and how to optimize database access in order to design a high-performance system.

As programmers, we may not know the server hardware configuration of the production environment, and we cannot test and tune the database as professionally as DBAs do. However, we should have a good understanding of the business logic behind our SQL and of the tables and fields it accesses. In practice, we usually do not need to know the details of the database's high-availability architecture or how it accesses data internally; we only care whether our SQL returns results as quickly as possible. So how should programmers optimize the database? How can we quickly locate SQL performance problems and find the right direction for optimization? To answer these questions, I have summarized some basic optimization knowledge for programmers (this article is based on the MySQL database).

The basic direction of system performance optimization

In order to optimize the performance of a computer system, we need to know which devices the system runs on and quickly locate the performance bottleneck. In most cases, the slowest device is the bottleneck. As we all know, the CPU is much faster than its cache, the cache is much faster than main memory, and main memory is in turn much faster than disk and network. Therefore, in many cases, disk I/O and network I/O are the performance bottlenecks of the system.

Based on how a database works, the main work each of these devices does while the database is running is as follows:

  • CPU: Transaction control, concurrency control, SQL parsing, function or logic calculation.
  • Memory: Read/write of cached data.
  • Network: Data response and query result transmission.
  • Disk: Data reading and writing, logging, massive data sorting, table join.

For databases, the above points translate into the following optimization suggestions:

  • Reduce disk usage and data access (design appropriate table structures and create high-performance indexes)
  • Reduce network access (batch requests, return less data)
  • Reduce CPU overhead (reduce aggregate function calls and reasonably use sorting)
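As a rough sketch of the "reduce network access" suggestion, the statements below send several rows in one round trip and return only the columns the caller needs. The order_items table and its columns are hypothetical and exist only for illustration.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE order_items (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    order_id   BIGINT UNSIGNED NOT NULL,
    product_id BIGINT UNSIGNED NOT NULL,
    quantity   INT UNSIGNED    NOT NULL DEFAULT 1,
    PRIMARY KEY (id),
    KEY idx_order_id (order_id)
) ENGINE = InnoDB;

-- Batch request: one statement and one round trip for three rows,
-- instead of three separate single-row INSERTs.
INSERT INTO order_items (order_id, product_id, quantity)
VALUES (1001, 11, 2),
       (1001, 12, 1),
       (1002, 11, 5);

-- Return less data: name the needed columns and filter on the server,
-- rather than SELECT * followed by filtering in application code.
SELECT product_id, quantity
FROM order_items
WHERE order_id = 1001;
```

How large a batch should be depends on the workload and the server's packet size limits, so treat the three-row batch here as a placeholder.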

Appropriate table structure design

Good table structure design is the basis for high database performance. The core of table structure design is the selection of field data types. Selecting the correct data type is crucial. There are some general principles for selecting data types:

  • It is strongly recommended to create an auto-increment primary key for each table. An auto-increment primary key turns random I/O into sequential I/O: new rows are appended in key order, so index pages are allocated in order and stay compact, which reduces the performance impact of page splits.
  • As long as the range meets the requirements, smaller data types are better because they use less disk, memory, and CPU cache. For numeric data that is never negative, choose an unsigned type: it can store about twice as large a positive range as the signed type of the same size.
  • When the length of a character field is fixed, use char instead of varchar. For the same fixed length, char generally saves a little storage because varchar needs one or two extra bytes to record the length of each value.
  • If timestamp is enough, prefer it over datetime. A timestamp takes 4 bytes, while a datetime takes 8 bytes before MySQL 5.6.4 and 5 bytes from 5.6.4 onward (both plus up to 3 bytes for fractional seconds). Keep in mind that timestamp only covers the years 1970 to early 2038.
  • Try to avoid nullable fields. A nullable column in MySQL needs extra storage to record whether each value is NULL, and it makes indexes and index statistics more complex. Updating a NULL field to a non-NULL value makes the row grow, which can cause index page splits and affect performance. Where possible, declare the column NOT NULL and use a meaningful default value instead of NULL.
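The principles above could be combined in a table definition along the following lines. This is only a sketch: the user_account table, its columns, and the chosen sizes are assumptions made for illustration, not a schema from the article.

```sql
-- Hypothetical table illustrating the data type principles.
CREATE TABLE user_account (
    -- Auto-increment primary key: rows are appended in key order.
    id           BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT,
    -- Variable-length text: VARCHAR sized to the real requirement.
    username     VARCHAR(64)      NOT NULL,
    -- Fixed-length value (two-letter country code): CHAR.
    country_code CHAR(2)          NOT NULL,
    -- Small non-negative number: TINYINT UNSIGNED holds 0..255 in 1 byte.
    age          TINYINT UNSIGNED NOT NULL DEFAULT 0,
    -- NOT NULL with a meaningful default instead of a nullable flag.
    is_delete    TINYINT UNSIGNED NOT NULL DEFAULT 0,
    -- TIMESTAMP is compact, but only covers 1970 to early 2038.
    created_at   TIMESTAMP        NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
) ENGINE = InnoDB;
```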

Create high-performance indexes

As the name implies, a database index is an auxiliary data structure used to optimize queries. It can be seen as redundant data created to improve query speed. The concept is relatively simple: in MySQL's InnoDB engine, indexes are stored as B+tree structures. However, even though indexes are simple in principle, few people use them correctly on complex tables.

Indexes increase the cost of DML (update, insert, delete) on a table, because every index has to be maintained when rows change. Well-chosen indexes can improve database performance hundreds of times, but poorly chosen indexes can degrade it just as badly. Therefore, index creation must be balanced against business requirements. In general, the following rules of thumb help decide which fields to index:

  • Fields that are frequently used in query conditions, where the rows matched by a typical condition are only a small share of the table (roughly 10% or less).
  • Primary keys, foreign keys used for table joins, and fields that identify a record, such as username and email.
  • Low-selectivity status flags such as order_status, is_delete, and gender are poor candidates for indexes, as are large text fields, other large fields, and description fields.
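To make these rules of thumb concrete, here is a small sketch that indexes the identifying fields and leaves the low-selectivity and large fields alone. The customer table and the index names are hypothetical.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE customer (
    id       BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT,
    username VARCHAR(64)      NOT NULL,
    email    VARCHAR(128)     NOT NULL,
    gender   TINYINT UNSIGNED NOT NULL DEFAULT 0,  -- few distinct values: not indexed
    bio      TEXT             NOT NULL,            -- large text: not indexed
    PRIMARY KEY (id),
    -- Identifying field that is looked up often and matches at most one row.
    UNIQUE KEY uk_customer_email (email)
) ENGINE = InnoDB;

-- An index can also be added to an existing table later.
CREATE INDEX idx_customer_username ON customer (username);

-- EXPLAIN reports which index, if any, the optimizer chooses for a query.
EXPLAIN SELECT id, username FROM customer WHERE email = 'someone@example.com';
```

Checking the key column of the EXPLAIN output is a simple way to confirm that an index is actually being picked before relying on it.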

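In the following cases, an index may not be used even though it exists:

  • When the indexed field is filtered with <> or not in, such as index_column <> ?, the condition usually matches most of the table, so the optimizer often prefers a full table scan over the index. (is null is sometimes listed here too, but MySQL can generally use an index for index_column is null.) You can rewrite <> as two range conditions combined with union all. Whether this helps depends on how many rows each range matches, so check the result with explain.

For example, rewrite

select id, product_name from `order` where amount <> 1000

to

(select id, product_name from `order` where amount > 1000) union all (select id, product_name from `order` where amount < 1000)

Note that order is a reserved word in MySQL, so the table name has to be quoted with backticks.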

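  • A regular index cannot be used when the indexed field is wrapped in an arithmetic operation or a function, such as function(index_column) = ? or index_column + 1 = ?. Move the calculation to the other side of the comparison so that the field stands alone.
  • A LIKE pattern with a leading wildcard cannot use the index, such as index_column like '%?%'. A pattern that only has a trailing wildcard, such as index_column like '?%', can.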
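The last two cases can often be avoided by rewriting the condition so that the indexed field stands alone. The sketch below shows each problematic form next to a rewrite; the payment table, its columns, and its indexes are hypothetical.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE payment (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    payer_name VARCHAR(64)     NOT NULL,
    amount     INT UNSIGNED    NOT NULL,
    paid_at    DATETIME        NOT NULL,
    PRIMARY KEY (id),
    KEY idx_paid_at (paid_at),
    KEY idx_amount (amount),
    KEY idx_payer_name (payer_name)
) ENGINE = InnoDB;

-- Function on the indexed field: idx_paid_at cannot narrow the search.
SELECT id FROM payment WHERE DATE(paid_at) = '2022-09-22';
-- Same result as a range on the bare field: idx_paid_at can be used.
SELECT id FROM payment
WHERE paid_at >= '2022-09-22 00:00:00'
  AND paid_at <  '2022-09-23 00:00:00';

-- Arithmetic on the indexed field...
SELECT id FROM payment WHERE amount + 1 = 1000;
-- ...moved to the constant side of the comparison.
SELECT id FROM payment WHERE amount = 999;

-- Leading wildcard: idx_payer_name cannot narrow the search.
SELECT id FROM payment WHERE payer_name LIKE '%lan%';
-- Prefix pattern: idx_payer_name can be used. Note that this is NOT
-- equivalent to the query above; it only matches names starting with 'lan'.
SELECT id FROM payment WHERE payer_name LIKE 'lan%';
```

Prefixing any of these statements with EXPLAIN shows which access path the optimizer actually picks on a given data set.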

Pagination query

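When a query matches a large share of the table (about 30% is an often-quoted rule of thumb; the optimizer actually decides by estimated cost), MySQL tends not to use indexes, so paging queries are very important. However, you should also be careful with paging queries. For example, with select * from table_name limit 100000, 10;, MySQL reads the first 100010 records and discards the first 100000, so the query gets slower and slower as you page toward later pages. We can solve this problem with a delayed join (also known as a deferred join).

select t.* from table_name as t inner join (select id from table_name order by id limit 100000, 10) as page on t.id = page.id;

MySQL does not allow limit inside an in (...) subquery, so the ids are collected in a derived table and joined back. The inner query only reads ids from the index, so skipping the first 100000 entries does not require fetching full rows. Only the 10 rows that are actually returned are looked up through the clustered index (the primary key), which greatly reduces the number of back-to-table lookups and improves execution efficiency. The gain is largest when the ids can be read from a secondary index that covers the sort order.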
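The effect is easiest to see when the page is ordered by a field that has its own secondary index. The following sketch assumes a hypothetical article table with an index on created_at; in InnoDB a secondary index also stores the primary key, so the inner query can be answered from idx_created_at alone.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE article (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    title      VARCHAR(200)    NOT NULL,
    body       TEXT            NOT NULL,
    created_at DATETIME        NOT NULL,
    PRIMARY KEY (id),
    KEY idx_created_at (created_at)
) ENGINE = InnoDB;

-- Plain deep paging: the 100000 skipped entries may each require
-- fetching a full row (including body) before being discarded.
SELECT id, title, body, created_at
FROM article
ORDER BY created_at, id
LIMIT 100000, 10;

-- Delayed (deferred) join: the derived table walks only the secondary
-- index to find 10 ids, then the outer query fetches just those 10 rows.
SELECT a.id, a.title, a.body, a.created_at
FROM article AS a
INNER JOIN (
    SELECT id
    FROM article
    ORDER BY created_at, id
    LIMIT 100000, 10
) AS page ON a.id = page.id
ORDER BY a.created_at, a.id;
```

The outer ORDER BY is repeated because the join itself does not guarantee that rows come back in the order of the derived table.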

Reasonable use of sort

Database sorting is generally carried out in memory, and sorting is a CPU-intensive operation. Thanks to the high performance of modern CPUs, sorting tens of thousands of rows may have little impact on the database. However, once a query has to sort hundreds of thousands of rows, you need to think about how the sort is handled. Sorting large data sets not only consumes memory and CPU; if memory is insufficient, the sort spills to temporary files on disk and sorting performance drops sharply. Therefore, as a general rule, do not sort unless you have to. If you must sort, try to create an index on the sort field, because the index itself is ordered.

There are also some simple suggestions, such as returning only the data you need and processing requests in batches. This article has reviewed some common database optimization methods and offered some programmer-oriented suggestions for SQL optimization, in the hope of improving your SQL optimization skills. Thank you for reading.
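As an illustration of letting an index provide the order, the sketch below filters on one field and sorts on another, with a composite index covering both. The shipment table and the index name are hypothetical.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE shipment (
    id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    customer_id BIGINT UNSIGNED NOT NULL,
    shipped_at  DATETIME        NOT NULL,
    PRIMARY KEY (id),
    -- Equality field first, sort field second: within one customer_id
    -- the index entries are already ordered by shipped_at.
    KEY idx_customer_shipped (customer_id, shipped_at)
) ENGINE = InnoDB;

-- With the index above, MySQL can read the matching entries in index
-- order (backwards for DESC) and stop after 20 rows, without a sort step.
EXPLAIN
SELECT id, shipped_at
FROM shipment
WHERE customer_id = 42
ORDER BY shipped_at DESC
LIMIT 20;
```

If the Extra column of the EXPLAIN output shows "Using filesort", the sort is not being served by an index and is worth a closer look on large tables.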


lance | Sciencx (2022-09-22T15:35:57+00:00) Basic SQL Optimization Suggestions that Programmers Should Know. Retrieved from https://www.scien.cx/2022/09/22/basic-sql-optimization-suggestions-that-programmers-should-know/
