Migrating data into Drupal over port forwarding

Migrate is a framework for getting data into a Drupal website from all sorts of remote data sources. When those sources are MySQL databases, they can be represented in your site's settings.php.

However, sometimes you'll need to test your migrations outside of the trusted server(s) that can connect to your data sources. For example, say you're importing from a corporate blog outside of Drupal, into your live Drupal site. You won't want just any old server to be able to view data in the corporate blog, so your sysadmins will have already locked down access, typically with firewalls first of all (and much more.) Your local development environment probably won't be able to get through that firewall on the MySQL port. And what if you're developing from home? Worst of all, when Migrate checks all of its data sources and finds one it can't connect to, it breaks hard: a watchdog() message, a PHP exception and a HTTP 500 error.

If you can't reconfigure the firewall (for whatever reason: ours was that the third-party host administrator was on holiday for a few days!) then here's a three-part hack to get around gradually more complicated access restrictions.

1. SSH port forwarding

You need to be able to SSH to a server that your data source's firewall "trusts" - typically the production server, but it could be any server that has access through the firewall. Do so, and when you do it, add the following -L argument.

ssh account@trusted.example.com -L9991:firewalled1.example.com:3306

This forwards traffic on your local :9991 port to the MySQL port on firewalled1.example.com, as seen from the trusted server.

Repeat this for every remote server: firewalled2, firewalled3 etc. Increment the "9991" every time, so each server has a different port assigned to it. Use high-numbered ports as that keeps your port forwarding out of the way of other services that might already be running on your own local machine.

(There are tricks to forward multiple ports over a single SSH connection, but this is the most straightforward way to do it.)

2. Drupal data sources

Now you need to change your Drupal data sources. We keep settings.php in version control - making it harder to lose any special configuration like cacheing or Domain Access, if you have a serious outage - so we try not to modify the file directly.

At the end of settings.php, add the following include() directive:

<?php
/* ... end of settings.php */
include('settings-local.php');

Then, in settings-local.php, put the following code:

<?php
$port_forwards = array(
  'firewalled1.example.com' => '9991',
  'firewalled2.example.com' => '9992',
  'firewalled3.example.com' => '9993',
);
 
foreach($databases as $name => &$db) {
  // Is this a port-forwarded host
  if (!isset($port_forwards[$db['default']['host']])) {
    continue;
  }
 
  // Substitute in the host and port, keeping a record of
  // the old host for debugging purposes
  $db['default']['port'] = $port_forwards[$db['default']['host']];
  $db['default']['_old_host'] = $db['default']['host'];
  $db['default']['host'] = 'localhost'; // We'll change this in a bit!
}

Typically you'll have your local default-default Drupal database, followed by multiple remote Migrate data sources. So we distinguish between them by seeing whether or not the host is one of our firewalled hosts.

You'll see that we've set the new connection host to be localhost. This might not work as we'll see.

3. Tricking mysql-client with /etc/hosts

If you've been given a MySQL user for your migration, it will probably have come with a password, and it'll be represented as a Drupal data source something like this:

<?php
// Primary connection
$databases['migrate_data_source']['default'] = array (
  'database' => 'remote_db',
  'username' => 'drupal',
  'password' => 'jozxyqk',
  'host' => 'firewalled1.example.com',
  'driver' => 'mysql',
);

However, the drupal user in our example has almost certainly only been given permission in MySQL to access from trusted hosts and nowhere else: maybe that's the live server, which firewalled1 sees as being called "trusted.internal.example.com". That means the SQL used to grant permissions was something like:

GRANT ALL ON remote_db.* TO 'drupal'@'trusted.internal.example.com' IDENTIFIED BY 'jozxyqk';

This is good "defence in depth", but it will also lead to the following error, even after you've got your port forwarding set up:

mysql -u drupal -h localhost -P 9993 remote_db -p
Enter password: 
ERROR 1045 (28000): Access denied for user 'drupal'@'localhost' (using password: YES)

How to trick mysql-client into thinking you're coming from trusted.internal.example.com? Add this line to /etc/hosts:

127.0.0.1 trusted.internal.example.com

and connect using host=trusted.internal.example.com , not host=localhost:

mysql -u migration -h trusted.internal.example.com -P 9993 aadblog -p
Enter password: 
Reading table information for completion of table and column names
...

Update the code snippet defined above in 2, to use the new host, and you should be away. Once you're able to navigate to the Migrate admin screen, that means Drupal has successfully accessed all MySQL data sources.

That was a bit of a roundabout way of setting up a migration, wasn't it?

Absolutely. If at all possible, you should get a sysadmin to reconfigure the server instead. But sometimes, for whatever reason, that just isn't possible. In those cases, it's good to know that there exists a workaround - however tortuous.