Posted on September 17, 2016, 12:46 pm, by Rhys, under Data, MongoDB.
Although the mongoexport tool has a --fields option, it still includes the _id field by default. You can remove it with a single line of sed. This was slightly modified from this sed expression. Given the following data…
{"_id":{"$oid":"57dd2809beed91a333ebe7d1"},"a":"Rhys"}
{"_id":{"$oid":"57dd2810beed91a333ebe7d2"},"a":"James"}
{"_id":{"$oid":"57dd2815beed91a333ebe7d3"},"a":"Campbell"}
This command-line expression will export and munge the data…
mongoexport --authenticationDatabase admin --db test […]
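The excerpt above is cut off before the sed expression, so as an alternative illustration here is a minimal Python sketch of the same idea: read mongoexport's one-document-per-line JSON and drop the top-level _id key. The function name `strip_id` is hypothetical, and this is not the author's sed one-liner.

```python
import json


def strip_id(lines):
    """Yield each JSON document re-serialised without its top-level _id key."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between documents
        doc = json.loads(line)
        doc.pop("_id", None)  # no error if _id is already absent
        yield json.dumps(doc, separators=(",", ":"))


# The three sample documents from the post.
sample = [
    '{"_id":{"$oid":"57dd2809beed91a333ebe7d1"},"a":"Rhys"}',
    '{"_id":{"$oid":"57dd2810beed91a333ebe7d2"},"a":"James"}',
    '{"_id":{"$oid":"57dd2815beed91a333ebe7d3"},"a":"Campbell"}',
]

cleaned = list(strip_id(sample))
for row in cleaned:
    print(row)
```

In practice the same function could be fed `sys.stdin`, so that mongoexport output is piped straight through it.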
Posted on March 31, 2015, 4:50 pm, by Rhys, under Big Data, Data.
If you’re playing with Elasticsearch on a single host, you may notice that your cluster health is always yellow. This is probably because your indexes are set to have one replica but there’s no other node to hold it. To confirm whether this is the case, you can look in elasticsearch-head. In the […]
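The excerpt stops before showing the author's fix. One common way to turn a single-node cluster green is to set the replica count to zero through the index settings API; the sketch below only builds that request. The helper name `build_zero_replica_request` and the `http://localhost:9200` address are assumptions for illustration.

```python
import json
import urllib.request


def build_zero_replica_request(base_url="http://localhost:9200"):
    """Build (but do not send) a PUT /_settings request setting replicas to 0."""
    body = json.dumps({"index": {"number_of_replicas": 0}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_zero_replica_request()
print(req.get_method(), req.full_url, req.data.decode("utf-8"))

# To apply it against a running node (assumed reachable at base_url):
# urllib.request.urlopen(req)
```

Dropping replicas is reasonable on a test box, but it removes redundancy, so it is not something to copy onto a multi-node production cluster.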
Posted on March 24, 2015, 5:47 pm, by Rhys, under Data, DBA, Linux.
If you’re playing with Kibana and notice pie charts splitting values incorrectly, for example on a hostname containing hyphens, then here’s the fix you need to apply. It’s actually something Elasticsearch does…
curl -XPUT http://localhost:9200/_template/syslog -d '
{ "template": "*syslog*", "settings" : { "number_of_shards" : 1 }, "mappings" : { "file" : { "properties" […]
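To see why a hyphenated hostname ends up as several pie slices, here is a rough sketch of what happens when a string field is analyzed into tokens before being counted. The regex is only an approximation of Elasticsearch's default tokenizer, and the hostnames are made up for illustration.

```python
import re
from collections import Counter


def rough_tokens(value):
    """Approximate an analyzed field: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", value.lower())


# Hypothetical hostnames, not taken from the post.
hostnames = ["web-server-01", "web-server-02", "db-server-01"]

# Analyzed field: the chart buckets on individual tokens.
analyzed = Counter(token for host in hostnames for token in rough_tokens(host))

# Unanalyzed field: the chart buckets on the whole hostname.
not_analyzed = Counter(hostnames)

print(analyzed)
print(not_analyzed)
```

With the analyzed field, "server" becomes its own bucket shared by all three hosts, which is the incorrect split described above; keeping the value whole gives one bucket per hostname.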
Here is an updated version of the instructions given at Free Alternative to Splunk Using Fluentd. The installation was performed on CentOS 6.5.
1. Install Elasticsearch
mkdir /opt/src
cd /opt/src
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.2.1.noarch.rpm
rpm -ivh elasticsearch-1.2.1.noarch.rpm
/sbin/chkconfig --add elasticsearch
service elasticsearch start
# Move default file locations if required
mkdir /data/elasticsearch
mkdir /data/elasticsearch/data
mkdir /data/elasticsearch/tmp
mkdir /data/elasticsearch/logs […]
Posted on February 27, 2010, 7:18 pm, by Rhys, under Data, Powershell.
The London Datastore has loads of datasets that we can use for free. One of them is a list of TfL Station Locations. The station location feed is a geo-coded KML feed of most London Underground, DLR and London Overground stations. Here’s a PowerShell script that will extract this data from a […]
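The PowerShell script itself is not included in this excerpt. As a rough sketch of the general approach, the Python below pulls each Placemark's name and coordinates out of a KML document. The sample KML, the station name and coordinates in it, the namespace URI and the function name `parse_stations` are all illustrative assumptions, not taken from the real feed.

```python
import xml.etree.ElementTree as ET

# Assumed namespace; the real feed may declare a different KML version.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Made-up sample document standing in for the station location feed.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example Station</name>
      <Point><coordinates>-0.1000,51.5000,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def parse_stations(kml_text):
    """Return a list of {name, lat, lon} dicts, one per Placemark with a Point."""
    root = ET.fromstring(kml_text)
    stations = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext(
            "kml:Point/kml:coordinates", default="", namespaces=KML_NS
        )
        if not coords.strip():
            continue  # skip placemarks without a point location
        # KML stores coordinates as longitude,latitude[,altitude].
        lon, lat = (float(part) for part in coords.strip().split(",")[:2])
        stations.append({"name": name.strip(), "lat": lat, "lon": lon})
    return stations


stations = parse_stations(SAMPLE_KML)
print(stations)
```

Note the longitude-first ordering in KML, which is easy to get backwards when loading the values into a database or mapping tool.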