This content originally appeared on DEV Community and was authored by Cong Li
In the previous article, we introduced GCluster parameter optimization in the GBase database. Today, we will discuss GNode parameter optimization in the GBase 8a MPP Cluster. You can refer to the previous article here: GBase 8a MPP Cluster Performance Tuning: GCluster Parameter Optimization.
1 Memory Parameters of GNode
1.1. Heap Parameters (global)
-
gbase_heap_data
: Mainly designed for caching data (DC) and should be allocated the most memory. -
gbase_heap_large
: Used to manage memory that is infrequently requested or released. -
gbase_heap_temp
: Allocates temporary, small chunks of memory that are used less frequently. -
gbase_memory_pct_target
: Sets the available memory proportion, with a default value of 0.8.
Minimum Parameter Values:
-
gbase_heap_data
>= 512MB -
gbase_heap_large
>= 256MB -
gbase_heap_temp
>= 256MB
Maximum Parameter Values:
(gbase_heap_data + gbase_heap_large + gbase_heap_temp) <= total memory * gbase_memory_pct_target
Note:
- Default:
total memory = physical memory
- When
_gbase_memory_use_swap
is set to 1:total memory = physical memory + swap
1.2. Operator Buffers (session level)
Operator buffers are session-level settings. For example, if you set gbase_buffer_result=1G
and the concurrency is 30, the total memory usage for gbase_buffer_result
during execution will be 1G * 30 = 30G
, not accounting for other operator buffers.
Commonly used buffers:
-
gbase_buffer_distgrby
: Stores intermediate results ofDISTINCT
operations. -
gbase_buffer_hgrby
: Stores intermediate results ofHASH GROUP BY
operations. -
gbase_buffer_hj
: Stores intermediate results ofHASH JOIN
operations. -
gbase_buffer_insert
: Stores intermediate results ofINSERT VALUES
operations. -
gbase_buffer_result
: Stores materialized intermediate results. -
gbase_buffer_rowset
: Stores intermediate result sets fromJOIN
computations. -
gbase_buffer_sj
: Stores intermediate results ofSORT MERGE JOIN
. This is used when join conditions involvea >= b
ora <= b
. -
gbase_buffer_sort
: Stores intermediate results ofSORT
operations.
Buffer Setting Guidelines:
Under normal circumstances (non-high concurrency scenarios), operator buffers can be set according to the system memory size:
-
gbase_buffer_hgrby
andgbase_buffer_hj
should not exceed 4G. -
gbase_buffer_result
should not exceed 2G. -
gbase_buffer_rowset
should not exceed 1G. - Other buffers can be set using system estimates.
In high-concurrency scenarios, operator buffers should not be set too large. System auto-evaluation is usually sufficient, but in cases of extreme concurrency, manual adjustment to smaller buffer sizes may be necessary. The total buffer size multiplied by the concurrency should not exceed gbase_heap_large
, and in no case should it exceed the total system memory.
Other Situations to Adjust Operator Buffers:
If a particular SQL operation is slow due to a specific operator (as identified by single-node tracing), you can increase the corresponding buffer. For example, if tracing shows slow performance in a join operation, you can increase the value of gbase_buffer_hj
. However, care must be taken not to affect the execution of other SQL queries.
2. GNode Concurrency Control Parameters
2.1. gbase_parallel_execution
This parameter controls whether to enable parallel execution:
-
0
: Disabled (default). -
1
: Enabled.
When to use:
- If you find low CPU utilization, enabling parallel execution may help.
- Parallel execution works by splitting data into chunks and processing them simultaneously, merging the results at the end. This is ideal for complex SQL operations with low concurrency.
2.2. gbase_parallel_max_thread_in_pool
Configures the maximum number of threads in the parallel execution thread pool. By default, this is set to twice the number of CPU cores.
Value range: 0~4096
. When deploying a single GNode on a server, it's recommended to set this to 4–8 times the number of CPU cores.
Explanation:
- Threads in the pool are created when the database service starts. Threads are borrowed from the pool during SQL execution and returned afterward, which reduces the overhead of frequent thread creation and destruction.
2.3. gbase_parallel_degree
Controls the maximum degree of parallelism for each SQL query.
Value range: 0~gbase_parallel_max_thread_in_pool
. The maximum value cannot exceed the available threads in the thread pool.
- 1: No parallelism, i.e., single-thread execution.
-
0: The default degree of parallelism is half of the thread pool (
gbase_parallel_max_thread_in_pool / 2
). If the thread pool size is an odd number, the integer part is used.
That's all for today's content. Thank you for reading!
This content originally appeared on DEV Community and was authored by Cong Li
Cong Li | Sciencx (2024-09-25T02:18:54+00:00) GBase 8a MPP Cluster Performance Tuning: GNode Parameter Optimization. Retrieved from https://www.scien.cx/2024/09/25/gbase-8a-mpp-cluster-performance-tuning-gnode-parameter-optimization/
Please log in to upload a file.
There are no updates yet.
Click the Upload button above to add an update.