Multi database doctrine-symfony based project

9 min readDec 17, 2023

Let’s say that we need a site with separated databases depending on logic rules. And let’s say that the number of databases can grow quickly.

Just copying the standard single database application is not an option for maintainability reasons. May be possible that allocating resources for all the application instance is not possible, or for instance, each application instance has low workload and it’s not worthy maintaining each one separately.

Also operations costs can grow easily. Imagine that you have one hundred copies and on every hot fix you have to update each instance. If for some reason the deployment fails, all the logs has to be analyzed and determine which ones needs to be replayed.

Data grows out of control — Generated by dall-e

In this context seems interesting having a single application instance and some logic to decide to which database each request goes.

The starting point

Doctrine supports multiple connections natively, but they must defined at develop time, as a list of entity managers. This means that every time that a new database is added, someone must update that file and update the application.

Searching for some help I found this thread, and that was how I started developing this approach, but quickly I found some issues.

The checklist

First of all I need some logic to decide what database to use on each request. In this example, this logic just takes the first component of the domain received in the origin header and sets the database.

It no database is available if falls back to ‘app_dev’ database if in development environment or throws an exception in production environment.

I need some method of creating databases on the go, and being sure that this databases area immediately available so…

I need a command that updates all the databases. As the number of databases grows up running the command by hand 50, 60 o hundred of times becomes not possible.

Setting up the database connection

As Karol Dabrowski shows in his Stackoverflow answer, you can change the connection configuration using a wrapper class. This class extends from the standard Doctrine connection class an ads a method to update configuration settings. It uses the getParams method to get the connection params and then updates the ‘dbname’ parameter. Also closes the connection if it’s open and creates new one.

I also implemented the getDatabases method to get a list of available databases that start with a given prefix. This prefix ensures that only databases that start with that prefix will be updated, and system databases like mysql, performance_schema… will never selected by mistake.

namespace App\Connection;


use Doctrine\DBAL\Connection;
use Doctrine\DBAL\Driver;

class DoctrineMultidatabaseConnection extends Connection {
    public function changeDatabase(string $dbName): bool {
        $params = $this->getParams();
        if ($params['dbname'] != $dbName) {
            if ($this->isConnected()) {
                $this->close();
            }
            $params['url'] = "mysql://" . $params['user'] . ":" . $params['password'] . "@" . $params['host'] . ":" . $params['port'] . "/" . $dbName;
            $params['dbname'] = $dbName;
            parent::__construct(
                $params,
                $this->_driver,
                $this->_config,
                $this->_eventManager
            );
            return true;
        }
        return false;
    }

    public function getDatabases(string $prefix = 'app_') {
        $dbs = $this->fetchAllAssociative('show databases;');
        $res = [];
        foreach ($dbs as $key => $dbName) {
            if (strpos($dbName['Database'], $prefix) === 0) {
                $res[] = $dbName['Database'];
            }
        }
        return $res;
    }
}

Once created the wrapper class it must be configured in config/packages/doctrine.yaml

doctrine:
  dbal:
    connections:
      default:
        wrapper_class: App\Connection\DoctrineMultidatabaseConnection
        url: '%env(resolve:DATABASE_URL)%'

Selecting the database

As said before, on each connection I’ll take the origin header, explode it by ‘.’ and get the first component as the database. So for instance if the request comes from customer_a.myapp.com. the resulting database name will be app_customer_a.

To get this I’ll be using a kernel event listener, so on each request, the origin header will be analyzed and the database name will be set.

<?php

namespace App\Events;

use App\Kernel;
use Doctrine\Persistence\ManagerRegistry;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpKernel\Event\KernelEvent;
use Symfony\Component\EventDispatcher\Attribute\AsEventListener;
use Symfony\Component\HttpKernel\KernelEvents;
use Symfony\Component\HttpKernel\KernelInterface;

/**
 *
 */
class KernelEventListener {

    /**
     * @param ManagerRegistry $doctrine
     * @param KernelInterface $kernel
     */
    public function __construct(
        private ManagerRegistry $doctrine,
        private KernelInterface $kernel
    ) {
    }

    /**
     * @param KernelEvent $event
     * @return void
     * @throws \Exception
     */
    #[AsEventListener(event: KernelEvents::REQUEST)]
    public function onRequest(KernelEvent $event): void {
        $dbName = $this->getDatabaseNameFromOrigin($event->getRequest());
        if (in_array($dbName, $this->doctrine->getConnection()->getDatabases())) {
            $this->doctrine->getConnection()->changeDatabase($dbName);
        } else {
            if ($this->kernel->getEnvironment() === 'prod') {
                throw new \Exception('No database available for given origin');
            }
        }

    }

    /**
     * @param Request $request
     * @return string
     */
    private function getDatabaseNameFromOrigin(Request $request): string {
        $origin = $request->headers->get('origin');
        $exploded = explode('.', $origin);
        return 'app_' . $exploded[0];
    }
}

This event listener has a private method that gets the first segment of the origin header, then prepends the ‘app_’ prefix and returns it.

The listener methods gets this database name and check if there is a database available. If so updates de connection. If not depending on the enviroment falls back to app_dev database or throws an exception.

The app_dev database is important in development environment. This one is used to build the migrations, and is needed in order the web profiler to work.

As the AsEventListener attribute is used, no extra configuration is needed.

Managing databases

Firs I created a command to create databases. This command is some kind of wrapper for doctrine:database:create command.

<?php

namespace App\Command;

use App\Connection\DoctrineMultidatabaseConnection;
use App\Kernel;
use Doctrine\Persistence\ManagerRegistry;
use Symfony\Bundle\FrameworkBundle\Console\Application;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\ArrayInput;
use Symfony\Component\Console\Input\InputArgument;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\HttpKernel\KernelInterface;

/**
 *
 */
#[AsCommand(
    name: 'app:database:create',
    description: 'Creates a new database',
    aliases: ['app:create-database'],
    hidden: false
)]
class CreateDatabaseCommand extends Command {


    /**
     * @param ManagerRegistry $doctrine
     * @param KernelInterface $kernel
     */
    public function __construct(private ManagerRegistry $doctrine, private KernelInterface $kernel) {
        parent::__construct();
    }

    /**
     * @return void
     */
    protected function configure(): void {
        $this
            ->addArgument('databaseName', InputArgument::REQUIRED, 'The name of the database');
    }

    /**
     * @param InputInterface $input
     * @param OutputInterface $output
     * @return int
     * @throws \Exception
     */
    protected function execute(InputInterface $input, OutputInterface $output): int {
        $dbName = $input->getArgument('databaseName');
        if (strpos($dbName, 'app_') === false) {
            $dbName = 'app_' . $dbName;
        }

        /** @var DoctrineMultidatabaseConnection $doctrineConnection */
        $doctrineConnection = $this->doctrine->getConnection();
        $doctrineConnection->changeDatabase($dbName);

        $application = new Application($this->kernel);
        $application->setAutoExit(false);

        $arguments = [
            'command'          => 'doctrine:database:create',
            '--if-not-exists'  => null,
            '--no-interaction' => null
        ];

        $commandInput = new ArrayInput($arguments);
        $application->run($commandInput, $output);
        unset($application);
        unset($kernel);

        return Command::SUCCESS;
    }

}

This command sets up the database connection and executes de dontrine:database:command.

The I created a similar command to manage update databases using doctrine:migrations:migrate.

<?php

namespace App\Command;

use App\Connection\DoctrineMultidatabaseConnection;
use App\Kernel;
use Doctrine\Persistence\ManagerRegistry;
use Symfony\Bundle\FrameworkBundle\Console\Application;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\ArrayInput;
use Symfony\Component\Console\Input\InputArgument;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Input\InputOption;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\HttpKernel\KernelInterface;

/**
 *
 */
#[AsCommand(
    name: 'app:database:update',
    description: 'Updates an existing database.',
    aliases: ['app:update-database'],
    hidden: false
)]
class UpdateDatabaseCommand extends Command {
    /**
     * @param ManagerRegistry $doctrine
     * @param KernelInterface $kernel
     */
    public function __construct(private ManagerRegistry $doctrine, private KernelInterface $kernel) {
        parent::__construct();
    }

    /**
     * @return void
     */
    protected function configure(): void {
        $this
            ->addArgument('databaseName', InputArgument::OPTIONAL, 'The name of the database')
            ->addOption('all', 'a', InputOption::VALUE_NONE, 'Update all databases');
    }

    /**
     * @param InputInterface $input
     * @param OutputInterface $output
     * @return int
     * @throws \Exception
     */
    protected function execute(InputInterface $input, OutputInterface $output) {
        if ($input->getOption('all')) {
            $output->writeln('Updating all databases');
            $dbs = $this->doctrine->getConnection()->getDatabases();
            foreach ($dbs as $db) {
                $kernel = new Kernel(
                    $this->kernel->getEnvironment(),
                    $this->kernel->isDebug()
                );
                $application = new Application($kernel);
                $application->setAutoExit(false);

                $arguments = [
                    'command'      => 'app:database:update',
                    'databaseName' => $db
                ];

                $greetInput = new ArrayInput($arguments);
                $application->run($greetInput, $output);
            }
        } else {
            if ($input->getArgument('databaseName')) {
                $output->writeln('Updating ' . $input->getArgument('databaseName') . ' database');
                $this->updateSingleDatabase($input->getArgument('databaseName'), $output);
            }
        }

        return Command::SUCCESS;
    }

    /**
     * @param string $databaseName
     * @param OutputInterface $output
     * @return void
     * @throws \Exception
     */
    private function updateSingleDatabase(string $databaseName, OutputInterface $output) {
        $this->doctrine->getConnection()->changeDatabase($databaseName);

        $application = new Application($this->kernel);
        $application->setAutoExit(false);

        $arguments = [
            'command'              => 'doctrine:migrations:migrate',
            '--no-interaction'     => '',
            '--no-debug'           => '',
            '--allow-no-migration' => ''
        ];

        $commandInput = new ArrayInput($arguments);
        $commandInput->setInteractive(false);
        $application->run($commandInput, $output);
    }
}

This command has two behaviours. If a single database is specified, that database is updated. If the all option is used, it will query Doctrine to get all the databases that stars with ‘app_’ and then loop over them to update each one.

At this point there is something for what I don’t have a complete explanation but a suspect.

In a previous version of this command, when I tried to loop, the database was updated, but the doctrine_migrations_table was only updated at the dev database. That made impossible to run further migrations on single databases as they will fail. After some tries I realized that booting a new kernel on each step the migrations table was updated on each database.

I haven’t investigated deeper, but I suspect that there must be some cache at kernel that is used by Doctrine migrations. ¿Any idea about this?

Testing the solution

Once the project is up (using docker compose) the dev database is created.

php bin/console doctrine:database:create

In this case I’m using the standard Doctrine migration command to use the standard Doctrine configuration (the DATABASE_URL environment variable).

Then, I created an entity, a simple user class with only one attribute (username) and created the migration file using doctrine:migration:diff.

Now is the moment to create and update the new databases, so:

php bin/console app:database:create app_customer_a
php bin/console app:database:create app_customer_b
php bin/console app:database:create app_customer_c

And then in mysql console:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| app_customer_a     |
| app_customer_b     |
| app_customer_c     |
| app_dev            |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+

Let’s update the databases with

php bin/console app:database:update --all

and check

select app_customer_a;
show tables;
+-----------------------------+
| Tables_in_app_customer_a    |
+-----------------------------+
| doctrine_migration_versions |
| messenger_messages          |
| user_entity                 |
+-----------------------------+
select app_customer_b;
show tables;
+-----------------------------+
| Tables_in_app_customer_b    |
+-----------------------------+
| doctrine_migration_versions |
| messenger_messages          |
| user_entity                 |
+-----------------------------+
select app_customer_c;
show tables;
+-----------------------------+
| Tables_in_app_customer_c    |
+-----------------------------+
| doctrine_migration_versions |
| messenger_messages          |
| user_entity                 |
+-----------------------------+

Every databases are updated with tables.

Now let’s go to application. I created a simple controller that creates users and exposes and end-point to check if they exists or not.

POST localhost:8081/api/user
Origin: customer_b.localhost
Content-Type: application/json

{
    "username": "John Appleseed"
}

POST localhost:8081/api/user
Origin: customer_a.localhost
Content-Type: application/json

{
    "username": "John Dow"
}

And finally, checking the database:

select app_customer_b;
mysql> select * from user_entity;
+----+----------------+
| id | username       |
+----+----------------+
|  1 | John Appleseed |
+----+----------------+
1 row in set (0.01 sec)

mysql> use app_customer_a;
mysql> select * from user_entity;
+----+----------+
| id | username |
+----+----------+
|  1 | John Doe |
+----+----------+
1 row in set (0.00 sec)

mysql> use app_customer_c;
mysql> select * from user_entity;
Empty set (0.00 sec)

So everyone is on it’s place.

I could be better

More complex use cases can be solved expanding this solution. For instance, the database list could be separated to an external storage like Redis o DynamoDB and then physically separate databases into different servers. This could be interesting for scalability or security reasons.

The request-database mapping procedure could be binded to users using an external user management solution. For instance using AWS Coginito. The database identifier could be a user attribute stored at the Cognito user data, and the on each request passed in to the request.

And so on…

The source code can be found here

Final notes

I have implemented this solution in a production state project. At the very beginning of that project, it was more or less clear that:

The number of databases will grow up quickly
The workload of each one will not be so heavy

At this moment that project has more than 80 databases, this means that with a single database application I would have 80 instances of the app running, with it’s provisioned resources.

To update that applications I would need to build the docker image, and deploy it 80 times, updating the database on each deployment.

Now with this aproach the deployments are far more simple and fast. I just run a single command that loops over every database and updates them, making sure that no one is skipped.

As the number of databases grows this command will take longer time, and at some point it will become imposible to run as it is right now. At that point it can be easily refactorized to use a messaging queue (RabbitMq, AWS SQS…) to put each update in a queue, so the “human” (me) doesn’t need to stay in front of the screen for hours, and best of all, the process can be parallelized using many consumers to that queue.