I am testing out Always On replication on SQL Server with a cool docker setup that includes primary and secondary https://github.com/rafaelrodrigues3092/docker-mssql-alwayson
When I created data on the primary there was never any lag on the secondary so I wanted to make sure my lag values were correct and wanted to introduce network latency between the primary and secondary to introduce lag time. For this there is a neat tool called “tc” that can introduced lag. For example on either or both of the docker containers , I can set up lag with
# add lag tc qdisc add dev eth0 root netem delay 100ms # delete lag tc qdisc del dev eth0 root netem delay 100ms #show lag tc qdisc
Problem is, there was no “tc” on my docker containers and yum couldn’t find it (it is part of iproute2)
# yum install iproute2
Last metadata expiration check: 1:34:54 ago on Wed Apr 13 20:10:03 2022.
No match for argument: iproute2
Error: Unable to find a match: iproute2
I figure there is something wrong with the way the yum repos are setup on the docker images and I poked around there for a while but couldn’t get it to work, so then I tried to install by hand which turned out more onerous than I thought, so here I’m just documenting what was needed
create directory tc_build
cd tc_build
curl -o iproute2.tz https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/snapshot/iproute2-5.17.0.tar.gz
curl -o bison.gz http://ftp.gnu.org/gnu/bison/bison-3.8.tar.gz
curl -o gettext.gz https://ftp.gnu.org/pub/gnu/gettext/gettext-0.21.tar.gz
curl -o texinfo-6.8.tar.gz https://ftp.gnu.org/gnu/texinfo/texinfo-6.8.tar.gz
wget https://github.com/westes/flex/releases/download/v2.6.4/flex-2.6.4.tar.gz
yum install make
yum install m4
yum install libtool
yum install diffutils
for i in *z; do echo $i ; tar xvf $i ; done
mkdir trash
mv *z trash
cd textinfo*
./autogen.sh
./configure
make
make install
cd ../gettext*
make
cd ../flex*
make
cd ../bison*
make
cd ../iproute2*
make
The above isn’t exact but should get you mostly there. When it comes to “make” in each directory its some variation of
autogen.sh
configure
make
make install
Not every directory needed all 4 commands but unfortunately didn’t take detailed enough notes to say which needed which. The most onerous part was “gettext” which was needed just for “autopoint” but I didn’t find a way just to install “autopoint” and “gettext” is big and takes a while to compile and install. Once it was all done “tc” was there. Then I just tar’ed the “tc” commands in /usr/sbin in tc.tar.gz which I can copy to a docker container and voila it’s there.
Had to modify the docker-compose.yml to include
cap_add:
- NET_ADMIN
then after starting the containers , on both the primary and secondary, I set network latency
tc qdisc add dev eth0 root netem delay 500ms
The docker primary comes with SALES database that is replicated to the secondary, do I go on the primary, use SALES and create a bunch of data:
use sales;
create table foo(id int, content varchar(2000)) ;
WITH mycte AS (
SELECT 1 DataValue
UNION all
SELECT DataValue + 1
FROM mycte
WHERE DataValue + 1 <= 100000
)
INSERT INTO foo(id,content)
SELECT DataValue,REPLICATE (NEWID() ,55) --
FROM mycte m
OPTION (MAXRECURSION 0)
;
There is a nice sql query for monitoring Always On here https://dba.stackexchange.com/questions/278324/log-send-queue-size-and-redo-queue-size-in-ag
Here’s a short version of the query
SELECT
--AG info
AGName = ag.name,
ReplicaServer = ar.replica_server_name,
DbName = adc.database_name,
Failover_ready = ar.failover_mode_desc,
-- State Info
IsLocal = drs.is_local,
IsPrimary = drs.is_primary_replica,
SynchronizationState = drs.synchronization_state_desc,
SynchronizationHealthState = drs.synchronization_health_desc,
-- Queue Size & rates
SendQueueSize = drs.log_send_queue_size,
RedoQueueSize = drs.redo_queue_size,
SendRateKb = drs.log_send_rate,
RedoRateKb = drs.redo_rate,
--Oh yeah, filestream, too
FileStreamSendRate = drs.filestream_send_rate,
drs.Secondary_lag_seconds
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_databases_cluster AS adc
ON drs.group_id = adc.group_id AND
drs.group_database_id = adc.group_database_id
JOIN sys.availability_groups AS ag
ON ag.group_id = drs.group_id
JOIN sys.availability_replicas AS ar
ON drs.group_id = ar.group_id AND
drs.replica_id = ar.replica_id
ORDER BY
-- ag.name, ar.replica_server_name, adc.database_name;
drs.log_send_queue_size + drs.redo_queue_size DESC;
primary
-[ RECORD 1 ]-------------------------
AGName | AG1
ReplicaServer | db2
DbName | SALES
Failover_ready | MANUAL
IsLocal | 0
IsPrimary | 0
SynchronizationState | SYNCHRONIZED
SynchronizationHealthState | HEALTHY
SendQueueSize | 6904
RedoQueueSize | 9544
SendRateKb | 5113
RedoRateKb | 41343
FileStreamSendRate | 0
Secondary_lag_seconds | 0
-[ RECORD 2 ]-------------------------
AGName | AG1
ReplicaServer | db1
DbName | SALES
Failover_ready | MANUAL
IsLocal | 1
IsPrimary | 1
SynchronizationState | SYNCHRONIZED
SynchronizationHealthState | HEALTHY
SendQueueSize | NULL
RedoQueueSize | NULL
SendRateKb | NULL
RedoRateKb | NULL
FileStreamSendRate | NULL
Secondary_lag_seconds | NULL
secondary
-[ RECORD 1 ]-------------------------
AGName | AG1
ReplicaServer | db2
DbName | SALES
Failover_ready | MANUAL
IsLocal | 1
IsPrimary | 0
SynchronizationState | SYNCHRONIZED
SynchronizationHealthState | HEALTHY
SendQueueSize | 60
RedoQueueSize | 29068
SendRateKb | 4421
RedoRateKb | 41462
FileStreamSendRate | 0
Secondary_lag_seconds | NULL
another thing you can do is read the logs for Always On info
EXEC xp_ReadErrorLog 0,1,"Always"
-[ RECORD 1 ]-------------------------
LogDate | 2022-04-13 18:47:56.440
ProcessInfo | spid27s
Text | Always On: The availability replica manager is starting. This is an informational message only. No user action is required.
-[ RECORD 2 ]-------------------------
LogDate | 2022-04-13 18:47:56.590
ProcessInfo | spid27s
Text | Always On: The availability replica manager is waiting for the instance of SQL Server to allow client connections. This is an informational message only. No user action is required.
-[ RECORD 3 ]-------------------------
LogDate | 2022-04-13 18:48:18.900
ProcessInfo | spid53
Text | Always On: The local replica of availability group 'AG1' is preparing to transition to the primary role. This is an informational message only. No user action is required.
-[ RECORD 4 ...
Comments