Example: Before/After showcase
This page compares the classic code you would write by hand with the same code written using the yokai/batch library.
What are we trying to do?
We have a JSONL file containing the data we want, and we must import it into our database via doctrine/orm.
{"code":"camcorders","attributes":["description","image_stabilizer","name","optical_zoom","picture","power_requirements","price","release_date","sensor_type","sku","total_megapixels","weight"],"attribute_as_label":"name","attribute_as_image":"picture","labels":{"en_US":"Camcorders","fr_FR":"Cam\u00e9scopes num\u00e9riques","de_DE":"Digitale Videokameras"}}
{"code":"digital_cameras","attributes":["auto_exposure","auto_focus_assist_beam","auto_focus_lock","auto_focus_modes","auto_focus_points","camera_brand","camera_model_name","camera_type","description","focus","focus_adjustement","image_resolutions","iso_sensitivity","iso_sensitivity_max","iso_sensitivity_min","lens_mount_interface","light_exposure_corrections","light_exposure_modes","light_metering","max_image_resolution","name","optical_zoom","picture","power_requirements","price","release_date","sensor_type","short_description","sku","supported_aspect_ratios","supported_image_format","total_megapixels","weight"],"attribute_as_label":"name","attribute_as_image":"picture","labels":{"en_US":"Digital cameras","fr_FR":"Cam\u00e9ras digitales","de_DE":"Digitale Kameras"}}
{"code":"headphones","attributes":["description","headphone_connectivity","name","picture","power_requirements","price","release_date","sku","snr","thd","weight"],"attribute_as_label":"name","attribute_as_image":"picture","labels":{"en_US":"Headphones","fr_FR":"Casques audio","de_DE":"Kopfh\u00f6rer"}}
Note
The real file is obviously much larger than these 3 lines: you might have thousands of lines to process.
Before: Without Yokai Batch
The easiest way to do this is to create the one script you have already written thousands of times:
<?php

namespace App\Command;

use App\Entity\Family;
use Doctrine\Persistence\ManagerRegistry;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\Validator\Validator\ValidatorInterface;

#[AsCommand(name: 'app:import')]
final class ImportCommand extends Command
{
    public function __construct(
        private readonly ValidatorInterface $validator,
        private readonly ManagerRegistry $doctrine,
    ) {
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $path = __DIR__ . '/families.jsonl';
        $file = @\fopen($path, 'r');
        if ($file === false) {
            throw new \RuntimeException(\sprintf('Cannot open %s for reading.', $path));
        }

        $manager = $this->doctrine->getManagerForClass(Family::class);
        \assert($manager !== null);

        $families = [];
        // Compare against false explicitly so a falsy line (e.g. "0") does not end the loop early.
        while (($line = \fgets($file)) !== false) {
            try {
                $data = \json_decode($line, true, 512, \JSON_THROW_ON_ERROR);
                \assert(\is_array($data));
            } catch (\JsonException) {
                continue; // skip lines that are not valid JSON
            }

            $family = Family::fromData($data);
            $violations = $this->validator->validate($family);
            if (\count($violations) > 0) {
                continue; // skip entities that fail validation
            }

            $families[] = $family;
            // Write to the database every 500 entities.
            if (\count($families) % 500 === 0) {
                foreach ($families as $family) {
                    $manager->persist($family);
                }
                $manager->flush();
                $families = [];
            }
        }
        \fclose($file);

        // Write the last, incomplete batch.
        if (\count($families) > 0) {
            foreach ($families as $family) {
                $manager->persist($family);
            }
            $manager->flush();
        }

        return self::SUCCESS;
    }
}
Warning
There are many little things you have to think about when doing batch processing.
And chances are that these little things are scattered across your application.
As your team grows, it will become more important to avoid duplicating code like this.
Sooner or later someone will forget one of those little things, and the code will start acting funny.
After: With Yokai Batch
Now, using yokai/batch, we can factor out most of this code and keep only the business part:
<?php

namespace App\Command;

use App\Entity\Family;
use Doctrine\Persistence\ManagerRegistry;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\Validator\Validator\ValidatorInterface;
use Yokai\Batch\Bridge\Doctrine\Persistence\ObjectWriter;
use Yokai\Batch\Bridge\Symfony\Validator\SkipInvalidItemProcessor;
use Yokai\Batch\Job\Item\ItemJob;
use Yokai\Batch\Job\Item\Processor\CallbackProcessor;
use Yokai\Batch\Job\Item\Processor\ChainProcessor;
use Yokai\Batch\Job\Item\Reader\Filesystem\JsonLinesReader;
use Yokai\Batch\Job\Parameters\StaticValueParameterAccessor;
use Yokai\Batch\JobExecution;
use Yokai\Batch\Storage\NullJobExecutionStorage;

#[AsCommand(name: 'app:import')]
final class ImportCommand extends Command
{
    public function __construct(
        private readonly ValidatorInterface $validator,
        private readonly ManagerRegistry $doctrine,
    ) {
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        (new ItemJob(
            // Write to the database every 500 items.
            batchSize: 500,
            // Read the file one JSON line at a time.
            reader: new JsonLinesReader(new StaticValueParameterAccessor(__DIR__ . '/families.jsonl')),
            // Build the entity from the decoded line, then skip it if validation fails.
            processor: new ChainProcessor([
                new CallbackProcessor(fn(array $data) => Family::fromData($data)),
                new SkipInvalidItemProcessor($this->validator),
            ]),
            // Persist and flush the entities through Doctrine.
            writer: new ObjectWriter($this->doctrine),
            executionStorage: new NullJobExecutionStorage(),
        ))->execute(JobExecution::createRoot('1', 'import'));

        return self::SUCCESS;
    }
}
Note
Most of the classes in that snippet come from the Yokai\Batch namespace, and you will reuse a lot of them along the way.
After all, batch processing is almost always the same. The only things that change are:
the data source you are reading from
some transformations you are performing on that source
the data source you are writing to
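To make these three moving parts concrete, here is a minimal sketch in plain PHP, without any library. It is only an illustration of the idea: runBatch and its parameters are hypothetical names invented for this example, not part of yokai/batch, and the source, transformation, and destination are deliberately trivial (an in-memory array, an uppercase conversion, and an array that collects the batches).

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of yokai/batch): reads items from $source,
 * transforms each one, and hands the results to $write in batches.
 *
 * @param iterable<mixed>        $source    the data source you are reading from
 * @param callable(mixed): mixed $transform the transformation; return null to skip an item
 * @param callable(array): void  $write     the data source you are writing to, one batch at a time
 */
function runBatch(iterable $source, callable $transform, callable $write, int $batchSize): void
{
    $batch = [];
    foreach ($source as $item) {
        $result = $transform($item);
        if ($result === null) {
            continue; // the transformation asked to skip this item
        }

        $batch[] = $result;
        if (\count($batch) >= $batchSize) {
            $write($batch);
            $batch = [];
        }
    }

    // One of the "little things": the last, incomplete batch must be written too.
    if ($batch !== []) {
        $write($batch);
    }
}

// Usage: swap any of the three arguments without touching the loop itself.
$written = [];
runBatch(
    ['camcorders', '', 'digital_cameras', 'headphones'],
    static fn (string $code): ?string => $code === '' ? null : \strtoupper($code),
    static function (array $batch) use (&$written): void {
        $written[] = $batch;
    },
    2,
);
```

With a batch size of 2, the empty string is skipped and $written ends up holding two batches: the first with CAMCORDERS and DIGITAL_CAMERAS, the second with HEADPHONES alone. The loop never changes; only the three arguments do, which is the same separation the reader, processor, and writer express in the snippet above.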