Channel: Nickebo.net » fifo

Setting up a local Dataset API repository for SmartOS


The Dataset API is a repository for dataset metadata and files in SmartDataCenter, the commercial product from Joyent for managing a private cloud based on SmartOS. The implementation of this API isn’t open, which means you can’t use it with FiFo and/or your own SmartOS nodes, for example. Since I use FiFo to manage my private cloud, I really wanted a local repository for my datasets, especially the Windows-based ones, since they aren’t available from Joyent’s repository nor from datasets.at. Fortunately for me, Daniel “MerlinDMC” Marlon has developed a free Dataset API server (called dsapid), based on CouchDB and Node.js, which I can use. With some help from him I’ve set up my server at home, and I thought I’d share my experiences on how to make it work.

Installing a new VM to host dsapid

Create a new JSON file for the dsapid VM:

{
  "brand": "joyent",
  "alias": "dsapid",
  "tmpfs": 1024,
  "image_uuid": "cf7e2f40-9276-11e2-af9a-0bad2233fb0b",
  "filesystems": [
    {
      "type": "lofs",
      "source": "/zones/dsapi-server-data",
      "target": "/database"
    }
  ],
  "nics": [
    {
      "nic_tag": "admin",
      "ip": "192.168.1.123",
      "netmask": "255.255.255.0",
      "gateway": "192.168.1.1"
    }
  ]
}

This one is based on SmartMachine base64 1.9.1. I’m using a lofs mount to get direct access to a directory under /zones in the global zone (GZ); this directory will hold the CouchDB database. You might also want to use a bigger tmpfs if you plan on using big datasets, since each dataset is uploaded to /tmp before it’s imported into CouchDB. I use an 8 GB tmpfs, but even that isn’t always big enough (as you’ll see if you continue reading).
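With the JSON above saved to a file (dsapid.json is just my name for it), the zone is created from the global zone. Creating the lofs source directory first is an assumption based on the filesystems entry in the manifest:

```shell
# run in the global zone; dsapid.json is the manifest shown above
mkdir -p /zones/dsapi-server-data    # backing directory for the lofs mount
vmadm create -f dsapid.json
```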

Installing software in the new dsapid zone

It’s time to install the software needed.

pkgin in couchdb nginx node gcc47 gmake scmgit
mkdir /opt/dsapi-ui
chown -R couchdb:couchdb /database

The software is installed and the proper directories created. The dsapid server will be installed later once the database, web server etc. are configured.

Configuration of the installed software

Add the following to /opt/local/etc/couchdb/local.ini under [couchdb]:

database_dir = /database
view_index_dir = /database

To make sure we get periodic syncing from the dataset source (which can be the Joyent repo, datasets.at or some other Dataset API server), add the following to root’s crontab:
0 * * * * /opt/dsapi/sbin/dsapi-sync-manifests
0 * * * * /opt/dsapi/sbin/dsapi-sync-files

Time to configure nginx, replace the contents of /opt/local/etc/nginx/nginx.conf with the following:
user   www  www;
worker_processes  1;

events {
  # After increasing this value You probably should increase limit
  # of file descriptors (for example in start_precmd in startup script)
  worker_connections  1024;
}

http {
  include       /opt/local/etc/nginx/mime.types;
  default_type  application/octet-stream;

  sendfile        on;
  #tcp_nopush     on;
  tcp_nodelay     on;

  #keepalive_timeout  0;
  keepalive_timeout  65;

  gzip               on;
  gzip_http_version 1.1;
  gzip_proxied      any;
  gzip_vary          on;
  gzip_types text/plain text/html text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/javascript text/x-js;
  gzip_disable "MSIE [1-6]\.(?!.*SV1)";
  gzip_buffers    16 8k;

  client_max_body_size 1024m;

  server {
    listen       80;
    server_name  localhost;
    root         /opt/dsapi-ui;

    location = /ping {
      proxy_pass http://localhost:8080;
    }

    location = /stats {
      proxy_pass http://localhost:8080;
    }

    location = /datasets {
      proxy_pass http://localhost:8080;
      proxy_read_timeout 500;
      proxy_connect_timeout 500;
    }

    location ~* ^/datasets/[a-f0-9-]+$ {
      proxy_pass http://localhost:8080;
      proxy_read_timeout 500;
      proxy_connect_timeout 500;
    }

    location ~* ^/datasets/[a-f0-9-]+/.+ {
      proxy_pass http://localhost:5984;
      proxy_read_timeout 500;
      proxy_connect_timeout 500;
    }
  }
}

Increase client_max_body_size to match your tmpfs size if you increased that earlier. I’ve also set high timeouts to make sure there’s enough time to upload big datasets.
Now it’s time to start the services we’ve configured.
svcadm enable epmd:default
svcadm enable couchdb:default

Installing dsapid

It’s time to install the dsapid server.

git clone git://github.com/MerlinDMC/smartos-public-dsapi.git /opt/dsapi
cd /opt/dsapi
npm install
svccfg import /opt/dsapi/smf/dsapid.xml
svcadm enable dsapid:default
svcadm enable nginx:default

Adding remote dataset API servers for syncing

If you want to sync from an external dataset server, do the following (use -f to fetch the dataset image files as well as the manifests).

# fetch only manifests; image files are still served from the Joyent server
/opt/dsapi/bin/add-sync-source joyent https://datasets.joyent.com/datasets
# fetch both manifests and files
/opt/dsapi/bin/add-sync-source joyent https://datasets.joyent.com/datasets -f

Installing the web GUI

There’s a web GUI for the dataset server. Although it’s made for datasets.at, it will work internally (with some things still referring to datasets.at).

curl -O https://dl.dropbox.com/u/2265989/SmartOS/dsapi-ui.tar.bz2
tar -xjf dsapi-ui.tar.bz2 -C /opt/dsapi-ui

The web GUI should be available at http://[hostname]

Uploading your own datasets

To start with, you need a username and an associated password to be able to upload.

/opt/dsapi/bin/grant-upload [username] [password]

If you haven’t created a dataset before, read my blog post about how to do it. Once you have your manifest file and dataset image file, it’s time to upload them. This is done with curl, for example:
curl -X PUT -u [username]:[password] -F manifest=@[manifest name].dsmanifest -F [image name]=@[image name] http://[dsapi server]/datasets/[dataset UUID]

The “image name” will be something like winserver.zvol.gz or testzone.zfs.bz2. The dataset UUID is the same UUID that you gave the dataset during creation.

If for some reason a dataset doesn’t get fully uploaded, you’ll have to delete it manually through Futon, CouchDB’s web management GUI. If you don’t want to change the configuration on the server (CouchDB only listens on localhost), you can set up an SSH tunnel to access Futon. You’ll probably have to reconfigure sshd to allow root login with a password, or add your public key for SSH login.

ssh -f -L 127.0.0.1:5984:127.0.0.1:5984 root@[dsapi server] -N

Now you can access Futon on http://localhost:5984/_utils/ and delete the document created for the dataset (named its UUID).
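If you’d rather stay on the command line than click through Futon, the same cleanup can be done against CouchDB’s HTTP API over the tunnel. A sketch; the database name “datasets” is my assumption here, so check in Futon which database dsapid actually uses:

```shell
# look up the current revision of the broken dataset's document
# (the "datasets" database name is an assumption -- verify it in Futon)
rev=$(curl -s http://localhost:5984/datasets/[dataset UUID] \
      | sed -n 's/.*"_rev":"\([^"]*\)".*/\1/p')
# CouchDB requires the current revision when deleting a document
curl -X DELETE "http://localhost:5984/datasets/[dataset UUID]?rev=$rev"
```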

This is everything you need to use the dsapi server. To use it with imgadm:

echo "http://[dsapi server]/datasets" > /var/db/imgadm/sources.list
imgadm update

Installing very big datasets

Datasets that are very big will need a very big tmpfs. If you’re dataset is 4 GB you will need over 8 GB of tmps. If you don’t have enough memory you can bypass nginx, which is usually used, and communicate directly with dsapid. By doing this you will only need ~4 GB tmpfs if you are importing a 4 GB dataset.
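As a rule of thumb from the sizes above: going through nginx, plan for roughly twice the compressed image size in tmpfs (my reading is that the upload is staged once by nginx and once by dsapid, though I haven’t verified that in the code), while talking to dsapid directly needs roughly the image size itself. A quick sketch of the arithmetic, with a dummy 4 MB file standing in for a 4 GB image:

```shell
# create a dummy "image" and estimate the tmpfs needed to upload it
dd if=/dev/zero of=/tmp/fake-dataset.zvol.gz bs=1M count=4 2>/dev/null
size_mb=$(( $(wc -c < /tmp/fake-dataset.zvol.gz) / 1024 / 1024 ))
echo "via nginx: need more than $((size_mb * 2)) MB of tmpfs"
echo "direct to dsapid: need about ${size_mb} MB of tmpfs"
# prints: via nginx: need more than 8 MB of tmpfs
```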

Start by disabling nginx.

svcadm disable nginx

The next step is to get dsapid to listen on 0.0.0.0 instead of 127.0.0.1. You can disable dsapid the way you did with nginx, set the environment variable DSAPI_HOST to 0.0.0.0, and start dsapid manually. This won’t persist across restarts; if you want it to be persistent:
  • Disable dsapid
  • Remove dsapid using svccfg delete dsapid
  • Edit /opt/dsapi/smf/dsapid.xml and change the DSAPI_HOST variable in the xml file
  • Import the xml file again using svccfg import /opt/dsapi/smf/dsapid.xml
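The steps above boil down to the following commands (the dsapid.xml path is the one used during installation; edit the file by hand since the exact XML layout may differ between versions):

```shell
# make the 0.0.0.0 listener persistent across service restarts
svcadm disable dsapid
svccfg delete dsapid
vi /opt/dsapi/smf/dsapid.xml    # set the DSAPI_HOST value to 0.0.0.0
svccfg import /opt/dsapi/smf/dsapid.xml
svcadm enable dsapid:default
```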

Now you can install the big dataset; use port 8080 with curl. When done, you can switch back to nginx. You could (probably) compile nginx with an upload module, but since the nginx available in pkgin doesn’t have one, I haven’t tried it.

Using the local dsapid with FiFO

dsapid is now working with imgadm in native SmartOS, but as I mentioned at the beginning of the post, I use FiFo to manage my VMs. To get this working you need some extra tweaks. To start with, you need to add the following to the nginx.conf on your dsapid server.

if ($request_method = 'OPTIONS') {
   add_header 'Access-Control-Allow-Origin' '*';
   add_header 'Access-Control-Allow-Credentials' 'true';
   add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS';
   add_header 'Access-Control-Allow-Headers' 'DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type';
   add_header 'Access-Control-Max-Age' 1728000;
   add_header 'Content-Type' 'text/plain charset=UTF-8';
   add_header 'Content-Length' 0;
   return 204;
}

if ($request_method = 'POST') {
   add_header 'Access-Control-Allow-Origin' '*';
   add_header 'Access-Control-Allow-Credentials' 'true';
   add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS';
   add_header 'Access-Control-Allow-Headers' 'DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type';
}

if ($request_method = 'GET') {
   add_header 'Access-Control-Allow-Origin' '*';
   add_header 'Access-Control-Allow-Credentials' 'true';
   add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS';
   add_header 'Access-Control-Allow-Headers' 'DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type';
}
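For example, the “location = /datasets” block from earlier ends up looking like this (a sketch; only the OPTIONS case is written out, the POST and GET blocks follow the same pattern):

```nginx
location = /datasets {
  proxy_pass http://localhost:8080;
  proxy_read_timeout 500;
  proxy_connect_timeout 500;

  if ($request_method = 'OPTIONS') {
     add_header 'Access-Control-Allow-Origin' '*';
     add_header 'Access-Control-Allow-Credentials' 'true';
     add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS';
     add_header 'Access-Control-Allow-Headers' 'DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type';
     add_header 'Access-Control-Max-Age' 1728000;
     add_header 'Content-Type' 'text/plain charset=UTF-8';
     add_header 'Content-Length' 0;
     return 204;
  }

  # ...followed by the POST and GET blocks from above, unchanged
}
```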

Add this inside the last three “location” blocks that reference /datasets, and restart nginx. Now it’s time for the last change: tell FiFo to use the local dsapi server. Edit /opt/local/jingles/app/scripts/config.js and change the following line:
datasets: 'datasets.at',

Replace datasets.at with your server’s hostname.
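After editing, the line should look something like this (dsapi.home.example is a placeholder for your own hostname):

```javascript
datasets: 'dsapi.home.example',
```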

Done. Enjoy using your local Dataset API Server!

