A dockerized mongod instance with authentication enabled

Here’s a quick walkthrough showing how to create a dockerized standalone MongoDB instance.

First, from within a terminal, create a folder to hold the Dockerfile…

mkdir Docker_MongoDB_Image
cd Docker_MongoDB_Image
touch Dockerfile

Edit the Dockerfile…

vi Dockerfile

Enter the following text. You may wish to modify the file slightly; for example, if you need to set any of the proxy values or change the MongoDB admin password.

FROM centos
#ENV http_proxy XXXXXXXXXXXXXXXXXX
#ENV https_proxy XXXXXXXXXXXXXXX

MAINTAINER Rhys Campbell no_mail@no_mail.cc

RUN echo $'[mongodb-org-3.4] \n\
name=MongoDB Repository \n\
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.4/x86_64/ \n\
gpgcheck=1 \n\
enabled=1 \n\
gpgkey=https://www.mongodb.org/static/pgp/server-3.4.asc ' > /etc/yum.repos.d/mongodb-org-3.4.repo

# Install the MongoDB server, shell and tools from the repo defined above
RUN yum clean all && yum install -y mongodb-org-server mongodb-org-shell mongodb-org-tools
# Create the data directory and hand ownership to the mongod user
RUN mkdir -p /data/db && chown -R mongod:mongod /data/db
# Start mongod temporarily, create the admin user, then shut it down again
RUN /usr/bin/mongod -f /etc/mongod.conf && sleep 5 && mongo admin --eval "db.createUser({user:\"admin\",pwd:\"secret\",roles:[{role:\"root\",db:\"admin\"}]}); db.shutdownServer()"
# Switch authentication on
RUN echo $'security: \n\
  authorization: enabled \n ' >> /etc/mongod.conf
# Listen on all interfaces and keep mongod in the foreground (required for Docker)
RUN sed -i 's/^  bindIp: 127\.0\.0\.1/  bindIp: \[127\.0\.0\.1,0\.0\.0\.0\]/' /etc/mongod.conf
RUN sed -i 's/^  fork: true/  fork: false/' /etc/mongod.conf
RUN chown mongod:mongod /etc/mongod.conf
# Print the final config so it shows up in the build output
RUN cat /etc/mongod.conf

EXPOSE 27017

ENTRYPOINT /usr/bin/mongod -f /etc/mongod.conf

Build the image from within the current directory…

docker build -t mongod-instance . --no-cache

Run the image and map port 27017…

docker run -p 27017:27017 --name mongod-instance -t mongod-instance
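
If you’d rather get your terminal back, the container can also be started in detached mode (just the standard -d flag; everything else stays the same)…

docker run -d -p 27017:27017 --name mongod-instance mongod-instance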

Inspect the mapped port with…

docker ps

The output should look something like this…

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                      NAMES
5e7e4a069f4a        mongod-instance     "/bin/sh -c '/usr/..."   2 hours ago         Up 2 hours          0.0.0.0:27017->27017/tcp   mongod-instance
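
If the container exits straight away, or you simply want to confirm mongod started cleanly, the standard Docker logs command will show the mongod output…

docker logs mongod-instance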

We can view the Docker machine’s IP address with this command…

docker-machine ls

Output looks like this…

NAME      ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER        ERRORS
default   *        virtualbox   Running   tcp://192.168.99.100:2376           v17.05.0-ce

You can connect to the dockerized mongod instance with this command…

mongo admin -u admin -p --port 27017 --host 192.168.99.100

When you are done with the instance it can be destroyed with…

docker stop mongod-instance
docker rm mongod-instance
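
If you no longer need the image locally, it can be removed as well…

docker rmi mongod-instance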

Update: I’ve added this to my Docker Hub account so you can grab the image directly from there.


Check MariaDB replication status inside Ansible

I needed a method to check replication status inside Ansible. The method I came up with uses the shell module…

---
- hosts: mariadb
  vars_prompt:
      - name: "mariadb_user"
        prompt: "Enter MariaDB user"
      - name: "mariadb_password"
        prompt: "Enter MariaDB user password"
 
  tasks:
    - name: "Check MariaDB replication state"
      shell: "test 2 -eq $(mysql -u '{{ mariadb_user }}' -p'{{ mariadb_password }}' -e 'SHOW SLAVE STATUS' --auto-vertical-output | grep -E 'Slave_IO_Running|Slave_SQL_Running' | grep Yes | wc -l)"
      register: replication_status
 
    - name: "Print replication status var"
      debug:
        var: replication_status

This is executed like so…

ansible-playbook mariadb_check_repl.yml -i inventories/my_mariadb_hosts

The playbook will prompt for a MariaDB user and password and will output something like below…

PLAY [mariadb] *****************************************************************

TASK [setup] *******************************************************************
ok: [slave2]
ok: [master2]
ok: [slave1]
ok: [master1]

TASK [Check MariaDB replication state] *****************************************
changed: [master1]
changed: [master2]
changed: [slave2]
changed: [slave1]

TASK [Print replication status var] ********************************************
ok: [master1] => {
    "replication_status": {
        "changed": true,
        "cmd": "test 2 -eq $(mysql -u 'mariadb_user' -p'secret' -e 'SHOW SLAVE STATUS' --auto-vertical-output | grep -E 'Slave_IO_Running|Slave_SQL_Running' | grep Yes | wc -l)",
        "delta": "0:00:00.009477",
        "end": "2017-06-02 16:52:51.293609",
        "rc": 0,
        "start": "2017-06-02 16:52:51.284132",
        "stderr": "",
        "stdout": "",
        "stdout_lines": [],
        "warnings": []
    }
}
ok: [slave1] => {
    "replication_status": {
        "changed": true,
        "cmd": "test 2 -eq $(mysql -u 'mariadb_user' -p'secret' -e 'SHOW SLAVE STATUS' --auto-vertical-output | grep -E 'Slave_IO_Running|Slave_SQL_Running' | grep Yes | wc -l)",
        "delta": "0:00:00.017658",
        "end": "2017-06-02 16:52:51.325027",
        "rc": 0,
        "start": "2017-06-02 16:52:51.307369",
        "stderr": "",
        "stdout": "",
        "stdout_lines": [],
        "warnings": []
    }
}
ok: [master2] => {
    "replication_status": {
        "changed": true,
        "cmd": "test 2 -eq $(mysql -u 'mariadb_user' -p'secret' -e 'SHOW SLAVE STATUS' --auto-vertical-output | grep -E 'Slave_IO_Running|Slave_SQL_Running' | grep Yes | wc -l)",
        "delta": "0:00:00.015469",
        "end": "2017-06-02 16:52:51.292966",
        "rc": 0,
        "start": "2017-06-02 16:52:51.277497",
        "stderr": "",
        "stdout": "",
        "stdout_lines": [],
        "warnings": []
    }
}
ok: [slave2] => {
    "replication_status": {
        "changed": true,
        "cmd": "test 2 -eq $(mysql -u 'mariadb_user' -p'secret' -e 'SHOW SLAVE STATUS' --auto-vertical-output | grep -E 'Slave_IO_Running|Slave_SQL_Running' | grep Yes | wc -l)",
        "delta": "0:00:00.014586",
        "end": "2017-06-02 16:52:51.291047",
        "rc": 0,
        "start": "2017-06-02 16:52:51.276461",
        "stderr": "",
        "stdout": "",
        "stdout_lines": [],
        "warnings": []
    }
}

PLAY RECAP *********************************************************************
master1   : ok=3    changed=1    unreachable=0    failed=0
master2   : ok=3    changed=1    unreachable=0    failed=0
slave1    : ok=3    changed=1    unreachable=0    failed=0
slave2    : ok=3    changed=1    unreachable=0    failed=0

N.B. There is the mysql_replication Ansible module that some may prefer to use, but it requires the MySQLdb Python package to be present on the remote host.
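
For reference, a rough sketch of that alternative is below. It assumes MySQLdb is present on the targets and that your Ansible version ships mysql_replication with the getslave mode; the exact return keys can vary between versions, so treat it as a starting point rather than a drop-in replacement…

---
- hosts: mariadb
  vars_prompt:
      - name: "mariadb_user"
        prompt: "Enter MariaDB user"
      - name: "mariadb_password"
        prompt: "Enter MariaDB user password"

  tasks:
    - name: "Fetch slave status"
      mysql_replication:
        mode: getslave
        login_user: "{{ mariadb_user }}"
        login_password: "{{ mariadb_password }}"
      register: slave_status

    # getslave merges the SHOW SLAVE STATUS fields into the registered result
    - name: "Check both replication threads are running"
      assert:
        that:
          - slave_status.Slave_IO_Running == "Yes"
          - slave_status.Slave_SQL_Running == "Yes"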


my: a command-line tool for MariaDB Clusters

I’ve posted the code for my MariaDB Cluster command-line tool called my. It does a bunch of stuff, but its main purpose is to let you easily monitor replication cluster-wide while working in the shell.

Here’s an example of this screen…

hostname port  cons  u_cons  role  repl_detail                                       lag  gtid    read_only
master1  3306  7     0       ms    master2.ucid.local:3306 mysql-bin.000046 7296621  0    0-2-4715491  OFF
master2  3306  33    20      ms    master1.ucid.local:3306 mysql-bin.000052 1031424  0    0-2-4715491  OFF
slave1   3306  5     0       ms    master1.ucid.local:3306 mysql-bin.000052 1031424  0    0-2-4715491  ON
slave2   3306  29    19      ms    master2.ucid.local:3306 mysql-bin.000046 7296621  0    0-2-4715491  ON
backup   3306  5     0       ms    master2.ucid.local:3306 mysql-bin.000046 7296621  0    0-2-4715491  ON

This screen handles hosts that are down, identifies ones where MariaDB isn’t running, and highlights replication lag or errors, as well as multi-master setups. See the README for more details on how to get started.


Notes from the field: CockroachDB Cluster Setup

Download the CockroachDB Binary

Perform on each node.

wget https://binaries.cockroachdb.com/cockroach-latest.linux-amd64.tgz
tar xvzf cockroach-latest.linux-amd64.tgz
mv cockroach-latest.linux-amd64/cockroach /usr/bin/
chmod +x /usr/bin/cockroach
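
You can confirm the binary is installed and on the path with…

cockroach version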

Create cockroach user and directories

Perform on each node.

groupadd cockroach
useradd -r -m -d /home/cockroach -g cockroach cockroach # -m ensures the home directory is created for the system account
su - cockroach
cd /home/cockroach
mkdir -p certs my-safe-directory cockroach_db

Check ntp status

Check NTP is running and configured correctly. CockroachDB relies on synchronised clocks to function correctly.

systemctl status ntpd
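
If ntpd is what’s running, it’s also worth checking that the daemon is actually synchronising against its peers (ntpq ships with the ntp package; hosts using chrony would use chronyc sources instead)…

ntpq -p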

Secure the Cluster

Perform on each node.

Copy all keys generated on the first host to the others, but regenerate the node certificates on each host (i.e. re-run the command containing create-node). For further details see Secure a Cluster.

cockroach cert create-ca --certs-dir=certs --ca-key=my-safe-directory/ca.key # These keys, in both dirs, need to be copied to each host
ls -l certs
cockroach cert create-node localhost $(hostname) --certs-dir=certs --ca-key=my-safe-directory/ca.key
ls -l certs
cockroach cert create-client root --certs-dir=certs --ca-key=my-safe-directory/ca.key --overwrite
ls -l certs
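
As an illustration of the copy step described above (the host names are the ones used later in these notes and the paths match the directories created earlier), the CA certificate and key from the first host could be pushed out like this, after which each node generates its own node certificate…

scp certs/ca.crt cockroach@node2:/home/cockroach/certs/
scp my-safe-directory/ca.key cockroach@node2:/home/cockroach/my-safe-directory/
scp certs/ca.crt cockroach@node3:/home/cockroach/certs/
scp my-safe-directory/ca.key cockroach@node3:/home/cockroach/my-safe-directory/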

Start the nodes

node1

su - cockroach
cockroach start --background --host=node1 --http-host=node1 --port=26257 --http-port=8080 --store=/home/cockroach/cockroach_db --certs-dir=/home/cockroach/certs;

node2

su - cockroach
cockroach start --background --host=node2 --http-host=node2 --port=26257 --http-port=8080 --store=/home/cockroach/cockroach_db --join=node1:26257 --certs-dir=/home/cockroach/certs

node3

su - cockroach
cockroach start --background --host=node3 --http-host=node3 --port=26257 --http-port=8080 --store=/home/cockroach/cockroach_db --join=node1.ucid.local:26257 --certs-dir=/home/cockroach/certs

Check the status of the cluster

sudo su - cockroach
cockroach node ls --certs-dir=certs --host node1
cockroach node status --certs-dir=certs --host node1
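
As a quick smoke test you can also open a SQL session against the cluster (the statement is just an example)…

cockroach sql --certs-dir=certs --host node1 -e "SHOW DATABASES;"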

Create a cron to start CockroachDB on boot

Create the file /etc/cron.d/cockroach_start with the below cron command for each node…

node1

@reboot cockroach       cockroach start --background --host=node1 --http-host=node1 --port=26257 --http-port=8080 --store=/home/cockroach/cockroach_db --join="node2:26257,node3:26257" --certs-dir=/home/cockroach/certs;

node2

@reboot cockroach       cockroach start --background --host=node2 --http-host=node2 --port=26257 --http-port=8080 --store=/home/cockroach/cockroach_db --join="node1:26257,node3:26257" --certs-dir=/home/cockroach/certs;

node3

@reboot cockroach       cockroach start --background --host=node3 --http-host=node3 --port=26257 --http-port=8080 --store=/home/cockroach/cockroach_db --join="node1:26257,node2:26257" --certs-dir=/home/cockroach/certs;

Reboot the nodes to ensure CockroachDB comes up and all nodes join the cluster successfully.


MongoDB: Making the most of a 2 Data-Centre Architecture

There’s a big initiative at my employers to improve the uptime of the services we provide. The goal is 100% uptime as perceived by the customer. There’s obviously a certain level of flexibility one could take in the interpretation of this; I choose to be as strict as I can about it to avoid any disappointments! I’ve decided to work on this in the context of our primary MongoDB Cluster. Here is a logical view of the current architecture, spread over two data centres:

[Figure: MongoDB Cluster Architecture across two data centres]

What happens with this architecture?

If DC1 goes down, shard0 and shard2 both become read-only while shard1 remains read/write. Only a single config server (the one in DC2) would be left, so some cluster meta-data operations will be unavailable. If DC2 goes down, shard0 and shard2 remain read/write while shard1 becomes read-only. Two of the three config servers are hosted in DC1, so cluster meta-data operations remain available.

What are the options we can consider when working within the constraints of a two data-centre architecture?

  1. Do nothing and depend on the failed data-centre coming back online quickly. Obviously not an ideal solution. If either data-centre goes down we suffer some form of impairment to the customer.
  2. Nominate one site as the PRIMARY data-centre which will contain enough shard members, for each shard, to achieve quorum should the secondary site go down. Under this architecture we remain fully available assuming the PRIMARY data centre remains up. The entire cluster becomes read-only if the main data-centre goes down. To achieve our goal, of 100% uptime, we have to hope our PRIMARY Data centre never goes down. We could request that application developers make some changes to cope with short periods of write unavailability.
  3. A third option, which would work in combination with either of the above approaches, is to force a replicaset to accept writes when only a single server is available. You need to be careful here and be sure the other nodes in the replicaset aren’t receiving any writes. We can follow the procedure Reconfigure a replicaset with unavailable members to make this happen; a sketch of a forced reconfiguration is shown after this list. This would require manual intervention and there would be some work to reconfigure the replicaset when the offline nodes returned.
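
As a rough sketch of what that forced reconfiguration might look like from the shell (the host name and credentials are illustrative, and the member index needs adjusting to whichever member survived), using the standard rs.reconfig with the force option…

mongo admin -u admin -p --host surviving-member.example.com --eval '
  cfg = rs.conf();
  cfg.members = [cfg.members[0]];   // keep only the surviving member
  rs.reconfig(cfg, {force: true});
'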

Clearly none of these options are ideal. In the event of a data-centre outage we are likely to suffer some form of impairment to service. The only exception here is if the SECONDARY data-centre goes down in #2. This strategy depends, to some extent, on luck. With fully documented and tested failover procedures we can minimise this downtime. The goal of 100% uptime seems pretty tricky to achieve without moving to a 3 data-centre architecture.