Technical Goals for 2016

Happy New Year!

Time waits for no one, least of all the busy IT worker! It’s important to constantly push your professional skills onwards by improving existing skills and learning new ones.

Here are my goals for 2016.

  • Python – I’ve dabbled in Python occasionally over the years but it’s time to take it to the next level. I’ll do this by starting a GitHub project or two, probably building tools for MongoDB, MariaDB or elasticsearch. Codecademy have a good free Python course; it starts off a little basic for my taste but is otherwise pretty good, and Python has some interesting language features you may not have seen before.
  • Puppet – While I’ve worked in environments using Puppet I’ve never got involved myself. This must change! Some of my existing ways of working seem a little stone-age compared to the Puppet approach! PuppetLabs have a lot of training materials on their site (some free).
  • elasticsearch – Become a more competent elasticsearch ‘DBA’. I’ve played with elasticsearch a little; it’s good fun and I want to increase my skills in both the IT and development areas. There are a lot of learning resources available from the company behind elasticsearch.
  • PostgreSQL – Become a more competent PostgreSQL DBA. I’ll focus more on the IT side here; backups / restores, configuration and high availability will be the main areas of focus. PostgreSQL seems to be gaining a lot of traction recently so I think it’s a worthy addition to any DBA’s skillset.

Using the $lookup operator in MongoDB 3.2

I often loiter over on the MongoDB User Google Group and there was an interesting question posted the other day. The poster wanted to form a document like this from two collections (where foo is a document pulled in from a second collection)…

{
    "_id" : ObjectId("ObjectId of this bar"),
    "name" : "Bar Name",
    "foo" : {
        "_id" : ObjectId("56534720e2359196cf20f791"),
        "name" : "Foo Name"
    }
}

Historically this would have been a multi-step operation. In MongoDB 3.2 we can use the $lookup operator to achieve this in a single query with the aggregation framework. Here’s a quick demo on how to create the above document.

Insert some test data…

use test
 
db.foo.insert({"name": "Foo Name", "lookup_id": 1});
db.bar.insert({"name" : "Bar Name", "lookup_id": 1 });

With this data the aim is to pull a document in from the foo collection by matching the lookup_id field. We do this by running an aggregate on the bar collection using the $lookup operator.

db.bar.aggregate([
    { "$match": { "lookup_id": 1 } },
    { "$project": { "_id": 1, "name": 1, "lookup_id": 1 } },
    { "$lookup": {
        "from": "foo",
        "localField": "lookup_id",
        "foreignField": "lookup_id",
        "as": "foo"
    } }
]);

This produces the following document…

{
	"_id" : ObjectId("565b20ca7288e2c4e2b3b148"),
	"name" : "Bar Name",
	"lookup_id" : 1,
	"foo" : [
		{
			"_id" : ObjectId("565b20c97288e2c4e2b3b147"),
			"name" : "Foo Name",
			"lookup_id" : 1
		}
	]
}

Note how the foo document is contained within an array. This isn’t quite what the poster wanted, but we can sort that out with the $unwind operator.

db.bar.aggregate([
    { "$match": { "lookup_id": 1 } },
    { "$project": { "_id": 1, "name": 1, "lookup_id": 1 } },
    { "$lookup": {
        "from": "foo",
        "localField": "lookup_id",
        "foreignField": "lookup_id",
        "as": "foo"
    } },
    { "$unwind": "$foo" },
    { "$project": { "_id": 1, "name": 1, "foo._id": 1, "foo.name": 1 } }
]);

This gives us our final document…

{
	"_id" : ObjectId("565b20ca7288e2c4e2b3b148"),
	"name" : "Bar Name",
	"foo" : {
		"_id" : ObjectId("565b20c97288e2c4e2b3b147"),
		"name" : "Foo Name"
	}
}

Progress bars in BASH with pv

Way back in 2009 I wrote a post about how to display progress bars in PowerShell. The same thing is possible in bash with pv. If it’s not available in your shell just do…

yum install pv

Or equivalent for your platform. The following example compresses a file into a tar archive reporting progress with a bar…

tar cf - mongodb-linux-x86_64-2.6.10.tgz | pv -s 116654065 | gzip -9 > rhys.tar.gz

Breaking this down…

tar cf - "file to compress" | pv -s "size of file to compress (bytes)" | gzip -9 > "gz archive to create"

This will show a progress bar looking something like…

111MiB 0:00:05 [20.7MiB/s] [==================================================================================================>] 100%
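Rather than hard-coding the byte count, you can let stat supply it. Here’s a sketch wrapped in a function (compress_with_progress is my own name for it; GNU stat shown, BSD/macOS stat uses -f %z instead), with a fallback for systems without pv:

```shell
# Tar and gzip a file, piping through pv with the size taken from
# stat so the percentage is accurate. Falls back to a plain pipe
# if pv is not installed.
compress_with_progress() {
    local file="$1" out="$2"
    local size
    size=$(stat -c %s "$file") || return 1   # GNU stat; BSD: stat -f %z
    if command -v pv >/dev/null 2>&1; then
        tar cf - "$file" | pv -s "$size" | gzip -9 > "$out"
    else
        tar cf - "$file" | gzip -9 > "$out"
    fi
}
```

Usage: `compress_with_progress mongodb-linux-x86_64-2.6.10.tgz rhys.tar.gz`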

You can display integer numbers for percentage progress if you prefer. Just do…

tar cf - mongodb-linux-x86_64-2.6.10.tgz | pv -s 116654065 -n | gzip -9 > rhys.tar.gz

Output…

17
36
53
71
89
100
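The integer stream is handy for driving your own display. Note that pv -n writes the numbers to standard error while the data still flows down the pipe. Here’s a minimal sketch of a renderer (show_progress is a made-up helper) that redraws a single line per figure:

```shell
# Read integers (one per line, 0-100) from stdin, as pv -n would
# feed us, and redraw a single-line percentage using carriage returns.
show_progress() {
    local pct
    while read -r pct; do
        printf '\rProgress: %3d%%' "$pct"
    done
    printf '\n'
}
```

Hook it up with process substitution, e.g. `tar cf - file | pv -s SIZE -n 2> >(show_progress) | gzip -9 > out.tar.gz`.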

Launch a MongoDB Cluster for testing

Here’s a bash script I use to create a sharded MongoDB cluster for testing purposes. The key functions are mongo_setup_cluster and mongo_teardown_cluster. The script will create a MongoDB cluster with 2 shards of 3 nodes each, 3 config servers and 3 mongos servers.

UPDATE 2015/10/02 I’ve found out about an undocumented option for setting the WiredTiger cache size in megabytes rather than gigabytes. Useful on test machines with limited RAM. The option is --wiredTigerEngineConfigString. To set the cache size to 200 megabytes you would do… --wiredTigerEngineConfigString="cache_size=200M"

set -e;
set -u;
 
function mongo_teardown_cluster()
{
	killall --quiet mongos && echo "mongos processes have been murdered.";
	killall --quiet mongod && echo "mongod processes have been murdered.";
	cd ~;
	rm -Rf rhys_sharded_cluster_test_temp && echo "Directory rhys_sharded_cluster_test_temp removed.";
}
 
function mongo_create_directories()
{
	cd ~;
	mkdir -p rhys_sharded_cluster_test_temp;
	cd rhys_sharded_cluster_test_temp;
	mkdir config1 config2 config3 mongos1 mongos2 mongos3 shard0_30001 shard0_30002 shard0_30003 shard1_30004 shard1_30005 shard1_30006;
}
 
function mongo_create_config_servers()
{
	mongod --configsvr --port 27019 --dbpath ./config1 --logpath config1.log --smallfiles --nojournal --fork
	mongod --configsvr --port 27020 --dbpath ./config2 --logpath config2.log --smallfiles --nojournal --fork
	mongod --configsvr --port 27021 --dbpath ./config3 --logpath config3.log --smallfiles --nojournal --fork
}
 
function mongo_create_mongos_servers()
{
		mongos --configdb "localhost.localdomain:27019,localhost.localdomain:27020,localhost.localdomain:27021" --logpath mongos1.log --port 27017 --fork
		mongos --configdb "localhost.localdomain:27019,localhost.localdomain:27020,localhost.localdomain:27021" --logpath mongos2.log --port 27018 --fork
		mongos --configdb "localhost.localdomain:27019,localhost.localdomain:27020,localhost.localdomain:27021" --logpath mongos3.log --port 27016 --fork
}
 
function mongo_create_mongod_shard_servers()
{
	# shard0 mongod instances
	mongod --smallfiles --nojournal --storageEngine wiredTiger --dbpath ./shard0_30001  --port 30001 --replSet "rs0" --logpath shard0_30001.log --fork
	mongod --smallfiles --nojournal --storageEngine wiredTiger --dbpath ./shard0_30002  --port 30002 --replSet "rs0" --logpath shard0_30002.log --fork
	mongod --smallfiles --nojournal --storageEngine wiredTiger --dbpath ./shard0_30003  --port 30003 --replSet "rs0" --logpath shard0_30003.log --fork
	# shard1 mongod instances
	mongod --smallfiles --nojournal --storageEngine wiredTiger --dbpath ./shard1_30004  --port 30004 --replSet "rs1" --logpath shard1_30004.log --fork
	mongod --smallfiles --nojournal --storageEngine wiredTiger --dbpath ./shard1_30005  --port 30005 --replSet "rs1" --logpath shard1_30005.log --fork
	mongod --smallfiles --nojournal --storageEngine wiredTiger --dbpath ./shard1_30006  --port 30006 --replSet "rs1" --logpath shard1_30006.log --fork	
}
 
function mongo_configure_replicaset_rs0()
{
	mongo --port 30001 <<EOF
	rs.initiate();
	while(rs.status()['myState'] != 1) {
		print("State is not yet PRIMARY. Waiting...");
		sleep(1000); // avoid busy-waiting; sleep() is a mongo shell built-in
	}
	rs.add("localhost.localdomain:30002");
	rs.add("localhost.localdomain:30003");
EOF
	STATUS=$?;
	return $STATUS;
}
 
function mongo_configure_replicaset_rs1()
{
	mongo --port 30004 <<EOF
	rs.initiate();
	while(rs.status()['myState'] != 1) {
		print("State is not yet PRIMARY. Waiting...");
		sleep(1000); // avoid busy-waiting; sleep() is a mongo shell built-in
	}
	rs.add("localhost.localdomain:30005");
	rs.add("localhost.localdomain:30006");
EOF
}
 
function mongo_configure_sharding()
{
	mongo <<EOF 
	sh.addShard( "rs0/localhost.localdomain:30001" );
	sh.addShard( "rs1/localhost.localdomain:30004" );
	sh.enableSharding("test");
	sh.shardCollection("test.test_collection", { "a": 1 } );
EOF
}
 
function mongo_setup_cluster()
{
	mongo_create_directories && echo "Successfully created directories";
	mongo_create_config_servers && echo "Successfully started configuration servers." && sleep 5;
	mongo_create_mongos_servers && echo "Successfully started mongos servers." && sleep 5;
	mongo_create_mongod_shard_servers && echo "Successfully started mongod shard servers";
	echo "Sleeping for sixty seconds before attempting replicaset & shard configuration." && sleep 60;
	mongo_configure_replicaset_rs0 && echo "Successfully configured replicaset rs0." && sleep 5;
	mongo_configure_replicaset_rs1 && echo "Successfully configured replicaset rs1." && sleep 5;
	mongo_configure_sharding && echo "Successfully configured sharding and sharded test.test_collection by a.";
	echo "TODO.... add function for loading data into test.test_collection.";
}

Usage is as follows…

. /path/to/mongo_cluster_script.sh # Source the above bash functions
mongo_setup_cluster # Fire up a sharded cluster

To destroy the cluster…

mongo_teardown_cluster

Note this will kill all mongo processes and remove all data directories!
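As an aside, the fixed sleep 60 in mongo_setup_cluster is a blunt wait; polling the ports could make startup quicker and more reliable. Here’s a sketch using bash’s built-in /dev/tcp redirection (wait_for_port is a name I’ve made up; the timeout default is arbitrary):

```shell
# Poll a TCP port until something is listening, or give up after
# $timeout seconds. Uses bash's /dev/tcp, so no extra tools needed.
wait_for_port() {
    local host="$1" port="$2" timeout="${3:-60}" waited=0
    # The probe runs in a subshell, so descriptor 3 is closed
    # automatically as soon as each attempt returns.
    until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
        sleep 1
        waited=$((waited + 1))
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
    done
    return 0
}
```

For example: `wait_for_port localhost 30001 60 && mongo_configure_replicaset_rs0`.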


Partitioning setup for Linux from Scratch in VirtualBox

I’ve finally taken the plunge and committed to untarring and compiling a bucket load of source code to complete Linux from Scratch. I’ll be documenting some of my setup here. I’m far from an expert, that’s why I’m doing this, but if you have any constructive criticism I’d be glad to hear it. I’m using VirtualBox and an installation of CentOS to build LFS.

The first task I’ll be undertaking is partitioning a disk ready for my LFS setup. I’ve designed my partition setup based on the advice in the LFS manual: Creating a New Partition.

Partition           Size   On Primary
/ (root partition)  10GB   1
/home               10GB   1
/usr                5GB    1
/opt                10GB   1
swap                2GB    0
/boot               100MB  0

* On Primary = 1 means the partition will be hosted inside the first, larger (extended) partition we create.
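As a quick sanity check, the four On Primary = 1 partitions should fit inside the extended partition we are about to create (a trivial arithmetic sketch, using the sizes above and the 40G extended partition created below):

```shell
# The On Primary = 1 partitions must fit inside the 40G extended
# partition; sizes are taken from the table above.
logical_total=$((10 + 10 + 5 + 10))   # /, /home, /usr, /opt in GB
extended_size=40
if [ "$logical_total" -le "$extended_size" ]; then
    echo "OK: ${logical_total}G of ${extended_size}G used"
fi
```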

Add a VirtualBox HDD

First add a VirtualBox hard disk. I added a 50GB drive to give me plenty of space for my LFS installation.

Linux From Scratch VirtualBox Disk

Identify the new device

Boot up the host OS, CentOS 7 in my case, and open a command prompt once logged in. The command lsblk can be used to quickly identify the new device. From the output we can easily identify the disk as sdb. It is 50GB and contains no partitions.

linux> lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0   50G  0 disk 
├─sda1            8:1    0  500M  0 part /boot
└─sda2            8:2    0 49.5G  0 part 
  ├─centos-swap 253:0    0    2G  0 lvm  [SWAP]
  └─centos-root 253:1    0 47.5G  0 lvm  /
sdb               8:16   0   50G  0 disk 
sr0              11:0    1 1024M  0 rom  

Create a large Primary Partition

First we will create a single large partition. This extended partition will be the container for the partitions marked above as On Primary = 1. We will be using fdisk for this. You’ll need to run fdisk as the root user.

fdisk /dev/sdb
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table
Building a new DOS disklabel with disk identifier 0xdf241aa7.

Command (m for help): 

Enter ‘n’ to create a new partition.
Enter ‘e’ for the partition type (extended).
Enter ‘1’ for partition number.
Accept the default for first sector.
Enter ‘+40G’ for the last sector.

This partition does not yet exist; we have to tell fdisk to write the changes before that happens. However, you can preview what will be done by entering the command ‘p’…

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048    83888127    41943040    5  Extended

Now we will add the four logical partitions inside the one above…

Enter ‘n’ to create a new partition.
Enter ‘l’ for logical.
Accept the default for first sector.
For last sector enter ‘+10G’

Enter ‘n’ to create a new partition.
Enter ‘l’ for logical.
Accept the default for first sector.
For last sector enter ‘+10G’

Enter ‘n’ to create a new partition.
Enter ‘l’ for logical.
Accept the default for first sector.
For last sector enter ‘+5G’

Enter ‘n’ to create a new partition.
Enter ‘l’ for logical.
Accept the default for first sector.
For last sector enter ‘+10G’

Enter the ‘p’ command to print out the partition table

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048    83888127    41943040    5  Extended
/dev/sdb5            4096    20975615    10485760   83  Linux
/dev/sdb6        20977664    41949183    10485760   83  Linux
/dev/sdb7        41951232    52436991     5242880   83  Linux
/dev/sdb8        52439040    73410559    10485760   83  Linux

Finally we will add the swap and boot partitions. While still in fdisk…

Enter ‘n’ to create a new partition.
Enter ‘p’ for primary.
Accept the default for first sector and partition number.
For last sector enter ‘+2G’

Enter ‘n’ to create a new partition.
Enter ‘p’ for primary.
Accept the default for first sector and partition number.
For last sector enter ‘+100M’

Enter ‘p’ to print the output. You should see something like the below.

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048    83888127    41943040    5  Extended
/dev/sdb2        83888128    88082431     2097152   83  Linux
/dev/sdb3        88082432    88287231      102400   83  Linux
/dev/sdb5            4096    20975615    10485760   83  Linux
/dev/sdb6        20977664    41949183    10485760   83  Linux
/dev/sdb7        41951232    52436991     5242880   83  Linux
/dev/sdb8        52439040    73410559    10485760   83  Linux

Finally enter ‘w’ to write the changes to disk and exit. You can use lsblk again to get a more human friendly view of the partition sizes.

NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0   50G  0 disk 
├─sda1            8:1    0  500M  0 part /boot
└─sda2            8:2    0 49.5G  0 part 
  ├─centos-swap 253:0    0    2G  0 lvm  [SWAP]
  └─centos-root 253:1    0 47.5G  0 lvm  /
sdb               8:16   0   50G  0 disk 
├─sdb1            8:17   0    1K  0 part 
├─sdb2            8:18   0    2G  0 part 
├─sdb3            8:19   0  100M  0 part 
├─sdb5            8:21   0   10G  0 part 
├─sdb6            8:22   0   10G  0 part 
├─sdb7            8:23   0    5G  0 part 
└─sdb8            8:24   0   10G  0 part 
sr0              11:0    1 1024M  0 rom  

Format the partitions and give them each a label

From the output of lsblk you can match up the devices using their sizes and give them a label.

sudo mkfs -v -t ext4 /dev/sdb7 -L usr 
sudo mkfs -v -t ext4 /dev/sdb3 -L boot
sudo mkfs -v -t ext4 /dev/sdb5 -L root
sudo mkfs -v -t ext4 /dev/sdb6 -L home
sudo mkfs -v -t ext4 /dev/sdb8 -L opt
sudo mkswap /dev/sdb2 -L swap

Mount the partitions

Next create some directories to mount the partitions on…

mkdir /mnt/lfs
chown -R rhys:users /mnt/lfs
export LFS=/mnt/lfs
mount -v -t ext4 /dev/sdb5 $LFS
mkdir /mnt/lfs/usr
mount -v -t ext4 /dev/sdb7 $LFS/usr
mkdir /mnt/lfs/home
mount -v -t ext4 /dev/sdb6 $LFS/home
mkdir /mnt/lfs/opt
mount -v -t ext4 /dev/sdb8 $LFS/opt
mkdir /mnt/lfs/boot
mount -v -t ext4 /dev/sdb3 $LFS/boot
swapon /dev/sdb2

You can view the new mounts with..

linux> df -h
...
/dev/sdb5                9.8G   37M  9.2G   1% /mnt/lfs
/dev/sdb7                4.8G   20M  4.6G   1% /mnt/lfs/usr
/dev/sdb6                9.8G   37M  9.2G   1% /mnt/lfs/home
/dev/sdb8                9.8G   37M  9.2G   1% /mnt/lfs/opt
/dev/sdb3                 93M  1.6M   85M   2% /mnt/lfs/boot
...

You may receive a warning about these mounts…

You just mounted an file system that supports labels which does not
contain labels, onto an SELinux box. It is likely that confined
applications will generate AVC messages and not be allowed access to
this file system.  For more details see restorecon(8) and mount(8).

Fix this with…

linux> restorecon -R /mnt

Making the mounts persistent

You can copy the output from /etc/mtab and add an edited version to your /etc/fstab file to make these mounts persistent.
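Since each filesystem was given a label above, label-based entries are another option; device names like sdb can change between boots but labels survive. Here’s a sketch that prints candidate fstab lines (the mount points and labels are the ones used in this post; note the swap entry would differ, e.g. LABEL=swap none swap sw 0 0):

```shell
# Emit candidate /etc/fstab lines using the labels created earlier.
# Label-based entries survive device renames (sdb becoming sdc, etc.).
gen_fstab_entries() {
    local label mountpoint
    while read -r label mountpoint; do
        printf 'LABEL=%-6s %-14s ext4 defaults 1 2\n' "$label" "$mountpoint"
    done <<'EOF'
root /mnt/lfs
usr /mnt/lfs/usr
home /mnt/lfs/home
opt /mnt/lfs/opt
boot /mnt/lfs/boot
EOF
}
gen_fstab_entries
```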

These mounts should probably be owned by the lfs user. I’ll update this section with more detail when I decide precisely what to do.