Commit

Merge branch 'master' into irfan-dahir-patch-1
irfan-dahir authored Nov 13, 2024
2 parents 8d1e935 + 2fa3958 commit 9d572e4
Showing 30 changed files with 458 additions and 84 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/container-image-release.yml
@@ -69,7 +69,9 @@ jobs:

- name: Build and push by digest
id: build
uses: docker/build-push-action@v5
uses: docker/build-push-action@v6
env:
DOCKER_BUILD_NO_SUMMARY: true
with:
context: .
platforms: ${{ matrix.platform }}
24 changes: 19 additions & 5 deletions COMMANDS.MD
@@ -14,14 +14,15 @@ For an entire list of commands, you can run `php artisan list`
- [Indexer](#indexer)
- [Anime](#indexer-anime)
- [Manga](#indexer-manga)

- [Incremental](#indexer-incremental)

## Commands

### Serve
Command: `serve`
Example: `php artisan serve`

Serve the application on the PHP development server
Serve the application on the PHP development server

### Queue

@@ -66,7 +67,7 @@ Example: `cache:method queue`
Since v4 uses MongoDB as a means to index cache on some endpoints, having a built cache is important since it
works best for endpoints like search or top.

`Indexer:Anime` uses [https://github.com/seanbreckenridge/mal-id-cache](https://github.com/seanbreckenridge/mal-id-cache) to fetch available MAL IDs and indexes them.
`Indexer:Anime` uses [https://github.com/purarue/mal-id-cache](https://github.com/purarue/mal-id-cache) to fetch available MAL IDs and indexes them.

This command only needs to be run once. Afterwards, an entry's cache is refreshed automatically when it has expired and a client requests that entry.

@@ -90,15 +91,15 @@ This translates to running entries that previously failed to index or update, in
Since v4 uses MongoDB as a means to index cache on some endpoints, having a built cache is important since it
works best for endpoints like search or top.

`Indexer:Manga` uses [https://github.com/seanbreckenridge/mal-id-cache](https://github.com/seanbreckenridge/mal-id-cache) to fetch available MAL IDs and indexes them.
`Indexer:Manga` uses [https://github.com/purarue/mal-id-cache](https://github.com/purarue/mal-id-cache) to fetch available MAL IDs and indexes them.

This command only needs to be run once. Afterwards, an entry's cache is refreshed automatically when it has expired and a client requests that entry.

⚠ This is strictly for performance and experience and providing better search functionality. Don't build your own anime database as that's against MyAnimeList's Terms of Service.

Command:
```
indexer:anime
indexer:manga
{--failed : Run only entries that failed to index last time}
{--resume : Resume from the last position}
{--reverse : Start from the end of the array}
@@ -109,3 +110,16 @@ indexer:anime
Example: `indexer:manga`

This simply translates to running the indexer without any additional configuration.

#### Indexer: Incremental
Incrementally indexes media entries from MAL.
This command compares the latest MAL IDs from the [mal-id-cache](https://github.com/purarue/mal-id-cache)
GitHub repository with the IDs downloaded during the previous run. If no IDs from a previous run are found, a full indexing session is started.

Command:
```
indexer:incremental {mediaType*}
{--failed : Run only entries that failed to index last time}
{--resume : Resume from the last position}
{--delay=3 : Set a delay between requests}
```
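
The comparison described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `idsToIndex` and the sample IDs are assumptions, while the `sfw`/`nsfw` grouping follows the structure that the indexer code in this commit reads from the cache files.

```php
<?php

/**
 * Hypothetical helper: return the IDs present in $latest but missing
 * from $previous, grouped by the "sfw" and "nsfw" keys.
 * With no previous snapshot, everything in $latest is returned.
 */
function idsToIndex(array $latest, ?array $previous): array
{
    if ($previous === null || count($previous) === 0) {
        return $latest; // first run: full indexing session
    }

    $result = [];
    foreach (['sfw', 'nsfw'] as $group) {
        // array_diff keeps the values of the first array that are absent
        // from the second; array_values re-indexes the result from 0.
        $result[$group] = array_values(
            array_diff($latest[$group] ?? [], $previous[$group] ?? [])
        );
    }

    return $result;
}

// Sample data (made up for illustration).
$previous = ['sfw' => [1, 5, 6], 'nsfw' => [20]];
$latest   = ['sfw' => [1, 5, 6, 7], 'nsfw' => [20, 21]];

echo json_encode(idsToIndex($latest, $previous)), PHP_EOL;
```

The argument order matters: the latest list goes first so that the result contains IDs that are new since the previous run, rather than IDs that have disappeared.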
1 change: 0 additions & 1 deletion README.MD
@@ -52,7 +52,6 @@ For any additional help, join our [Discord server](http://discord.jikan.moe/).
| TypeScript | [jikants](https://github.com/Julien-Broyard/jikants) by Julien Broyard<br>[jikan-client](https://github.com/javi11/jikan-client) by Javier Blanco<br>🆕 **(v4)** [jikan-ts](https://github.com/tutkli/jikan-ts) by Clara Castillo |
| PHP | [jikan-php](https://github.com/janvernieuwe/jikan-jikanPHP) by Jan Vernieuwe |
| .NET | 🆕 **(v4)** [Jikan.net](https://github.com/Ervie/jikan.net) by Ervie |
| Elixir | [JikanEx](https://github.com/seanbreckenridge/jikan_ex) by Sean Breckenridge |
| Go | 🆕 **(v4)** [jikan-go](https://github.com/darenliang/jikan-go) by Daren Liang<br>[jikan2go](https://github.com/nokusukun/jikan2go) by nokusukun |
| Ruby | [Jikan.rb](https://github.com/Zerocchi/jikan.rb) by Zerocchi |
| Dart | [jikan-dart](https://github.com/charafau/jikan-dart) by Rafal Wachol |
9 changes: 4 additions & 5 deletions app/Console/Commands/Indexer/AnimeIndexer.php
@@ -2,7 +2,6 @@

namespace App\Console\Commands\Indexer;

use App\Exceptions\Console\CommandAlreadyRunningException;
use App\Exceptions\Console\FileNotFoundException;
use Illuminate\Console\Command;
use Illuminate\Support\Facades\Storage;
@@ -67,7 +66,7 @@ public function handle()
$index = (int)$index;
$delay = (int)$delay;

$this->info("Info: AnimeIndexer uses seanbreckenridge/mal-id-cache fetch available MAL IDs and updates/indexes them\n\n");
$this->info("Info: AnimeIndexer uses purarue/mal-id-cache to fetch available MAL IDs and updates/indexes them\n\n");

if ($failed && Storage::exists('indexer/indexer_anime.failed')) {
$this->ids = $this->loadFailedMalIds();
@@ -140,14 +139,14 @@ public function handle()

/**
* @return array
* @url https://github.com/seanbreckenridge/mal-id-cache
* @url https://github.com/purarue/mal-id-cache
*/
private function fetchMalIds() : array
{
$this->info("Fetching MAL ID Cache https://raw.githubusercontent.com/seanbreckenridge/mal-id-cache/master/cache/anime_cache.json...\n");
$this->info("Fetching MAL ID Cache https://raw.githubusercontent.com/purarue/mal-id-cache/master/cache/anime_cache.json...\n");

$ids = json_decode(
file_get_contents('https://raw.githubusercontent.com/seanbreckenridge/mal-id-cache/master/cache/anime_cache.json'),
file_get_contents('https://raw.githubusercontent.com/purarue/mal-id-cache/master/cache/anime_cache.json'),
true
);

6 changes: 3 additions & 3 deletions app/Console/Commands/Indexer/AnimeSweepIndexer.php
@@ -74,14 +74,14 @@ public function handle()

/**
* @return array
* @url https://github.com/seanbreckenridge/mal-id-cache
* @url https://github.com/purarue/mal-id-cache
*/
private function fetchMalIds(): array
{
$this->info("Fetching MAL ID Cache https://raw.githubusercontent.com/seanbreckenridge/mal-id-cache/master/cache/anime_cache.json...\n");
$this->info("Fetching MAL ID Cache https://raw.githubusercontent.com/purarue/mal-id-cache/master/cache/anime_cache.json...\n");

$ids = json_decode(
file_get_contents('https://raw.githubusercontent.com/seanbreckenridge/mal-id-cache/master/cache/anime_cache.json'),
file_get_contents('https://raw.githubusercontent.com/purarue/mal-id-cache/master/cache/anime_cache.json'),
true
);

228 changes: 228 additions & 0 deletions app/Console/Commands/Indexer/IncrementalIndexer.php
@@ -0,0 +1,228 @@
<?php

namespace App\Console\Commands\Indexer;

use Illuminate\Console\Command;
use Illuminate\Support\Facades\Storage;
use Illuminate\Support\Facades\Validator;

class IncrementalIndexer extends Command
{
/**
* @var bool
*/
private bool $cancelled = false;

/**
* The name and signature of the console command.
*
* @var string
*/
protected $signature = 'indexer:incremental {mediaType*}
{--delay=3 : Set a delay between requests}
{--resume : Resume from the last position}
{--failed : Run only entries that failed to index last time}';

protected function promptForMissingArgumentsUsing(): array
{
return [
'mediaType' => ['The media type to index.', 'Valid values: anime, manga']
];
}

private function getExistingIds(string $mediaType): array
{
$existingIdsHash = "";
$existingIdsRaw = "";

if (Storage::exists("indexer/incremental/$mediaType.json"))
{
$existingIdsRaw = Storage::get("indexer/incremental/$mediaType.json");
$existingIdsHash = sha1($existingIdsRaw);
}

return [$existingIdsHash, $existingIdsRaw];
}

private function getIdsToFetch(string $mediaType): array
{
$idsToFetch = [];
[$existingIdsHash, $existingIdsRaw] = $this->getExistingIds($mediaType);

if ($this->cancelled)
{
return [];
}

// Use {$var} interpolation; the ${var} form is deprecated as of PHP 8.2.
$newIdsRaw = file_get_contents("https://raw.githubusercontent.com/purarue/mal-id-cache/master/cache/{$mediaType}_cache.json");
$newIdsHash = sha1($newIdsRaw);

/** @noinspection PhpConditionAlreadyCheckedInspection */
if ($this->cancelled)
{
return [];
}

if ($newIdsHash !== $existingIdsHash)
{
$newIds = json_decode($newIdsRaw, true);
$existingIds = json_decode($existingIdsRaw, true);

if (is_null($existingIds) || count($existingIds) === 0)
{
$idsToFetch = $newIds;
}
else
{
foreach (["sfw", "nsfw"] as $t)
{
// Keep only the IDs that are new since the previous run.
$idsToFetch[$t] = array_diff($newIds[$t], $existingIds[$t]);
}
}

Storage::put("indexer/incremental/$mediaType.json.tmp", $newIdsRaw);
}

return $idsToFetch;
}

private function getFailedIdsToFetch(string $mediaType): array
{
return json_decode(Storage::get("indexer/incremental/{$mediaType}_failed.json"));
}

private function fetchIds(string $mediaType, array $idsToFetch, bool $resume): void
{
$index = 0;
$success = [];
$failedIds = [];
$idCount = count($idsToFetch);
if ($resume && Storage::exists("indexer/incremental/{$mediaType}_resume.save"))
{
$index = (int)Storage::get("indexer/incremental/{$mediaType}_resume.save");
$this->info("Resuming from index: $index");
}

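// Merge both groups; a missing group (e.g. when running with --failed) counts as empty.
$ids = array_merge($idsToFetch['sfw'] ?? [], $idsToFetch['nsfw'] ?? []);
// Count the merged IDs rather than the top-level "sfw"/"nsfw" keys.
$idCount = count($ids);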

if ($index > 0 && !isset($ids[$index]))
{
$index = 0;
$this->warn('Invalid index; set back to 0');
}

Storage::put("indexer/incremental/{$mediaType}_resume.save", 0);

$this->info("$idCount $mediaType entries available");

for ($i = $index; $i <= ($idCount - 1); $i++)
{
if ($this->cancelled)
{
return;
}

// Index with the loop counter so that each iteration processes the next ID.
$id = $ids[$i];

$url = env('APP_URL') . "/v4/$mediaType/$id";
$this->info("Indexing/Updating " . ($i + 1) . "/$idCount $url [MAL ID: $id]");

try
{
$response = json_decode(file_get_contents($url), true);
if (!isset($response['error']) || $response['status'] == 404)
{
continue;
}

$this->error("[SKIPPED] Failed to fetch $url - {$response['error']}");
}
catch (\Exception)
{
$this->warn("[SKIPPED] Failed to fetch $url");
$failedIds[] = $id;
// Write to the same path that handle() and getFailedIdsToFetch() read for --failed.
Storage::put("indexer/incremental/{$mediaType}_failed.json", json_encode($failedIds));
}

$success[] = $id;
// Save the current position so that --resume can continue from here.
Storage::put("indexer/incremental/{$mediaType}_resume.save", $i);
}

Storage::delete("indexer/incremental/{$mediaType}_resume.save");

$this->info("--- Indexing of $mediaType is complete.");
$this->info(count($success) . ' entries indexed or updated.');
if (count($failedIds) > 0)
{
$this->info(count($failedIds) . ' entries failed to index or update. Re-run with --failed to requeue failed entries only.');
}

// finalize the latest state
Storage::move("indexer/incremental/$mediaType.json.tmp", "indexer/incremental/$mediaType.json");
}

public function handle(): int
{
// validate inputs
$validator = Validator::make(
[
'mediaType' => $this->argument('mediaType'),
'delay' => $this->option('delay'),
'resume' => $this->option('resume') ?? false,
'failed' => $this->option('failed') ?? false
],
[
// mediaType* is an array argument, so validate it as an array of allowed values.
'mediaType' => 'required|array|in:anime,manga',
'delay' => 'integer|min:1',
'resume' => 'bool|prohibited_with:failed',
'failed' => 'bool|prohibited_with:resume'
]
);

if ($validator->fails()) {
$this->error($validator->errors()->toJson());
return 1;
}

// we want to handle signals from the OS
$this->trap([SIGTERM, SIGQUIT, SIGINT], fn () => $this->cancelled = true);

$resume = $this->option('resume') ?? false;
$onlyFailed = $this->option('failed') ?? false;

/**
* @var $mediaTypes array
*/
$mediaTypes = $this->argument("mediaType");

foreach ($mediaTypes as $mediaType)
{
$idsToFetch = [];

// if "--failed" option is specified just run the failed ones
if ($onlyFailed && Storage::exists("indexer/incremental/{$mediaType}_failed.json"))
{
$idsToFetch["sfw"] = $this->getFailedIdsToFetch($mediaType);
}
else
{
$idsToFetch = $this->getIdsToFetch($mediaType);
}

if ($this->cancelled)
{
return 127;
}

$idCount = count($idsToFetch);
if ($idCount === 0)
{
continue;
}

$this->fetchIds($mediaType, $idsToFetch, $resume);
}

return 0;
}
}
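
The resume handling in `fetchIds()` can be illustrated in isolation. The sketch below is hypothetical: `processIds`, the `$fetch` callback, and the `$saveProgress` callback are stand-ins for the HTTP request and the `Storage` calls, chosen so that the example runs without Laravel.

```php
<?php

/**
 * Hypothetical helper: process $ids starting at $startIndex.
 * $fetch returns true on success; $saveProgress receives the position
 * of the entry that was just handled.
 * Returns [successful IDs, failed IDs].
 */
function processIds(array $ids, int $startIndex, callable $fetch, callable $saveProgress): array
{
    $success = [];
    $failed = [];
    $count = count($ids);

    // Fall back to the beginning if the saved position is out of range.
    if ($startIndex < 0 || $startIndex >= $count) {
        $startIndex = 0;
    }

    for ($i = $startIndex; $i < $count; $i++) {
        $id = $ids[$i]; // index with the loop counter, not the start position

        try {
            if ($fetch($id)) {
                $success[] = $id;
            } else {
                $failed[] = $id;
            }
        } catch (\Throwable $e) {
            // An exception counts as a failure for this entry only.
            $failed[] = $id;
        }

        $saveProgress($i);
    }

    return [$success, $failed];
}

// Usage: resume at position 1 and pretend that ID 12 cannot be fetched.
$lastSaved = null;
[$ok, $failed] = processIds(
    [10, 11, 12, 13],
    1,
    fn (int $id): bool => $id !== 12,
    function (int $position) use (&$lastSaved): void {
        $lastSaved = $position;
    }
);

echo json_encode(['indexed' => $ok, 'failed' => $failed, 'last_position' => $lastSaved]), PHP_EOL;
```

Resuming at position 1 skips ID 10, and the failing ID ends up only in the failed list. Keeping success and failure in separate branches avoids counting an entry in both lists.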
8 changes: 4 additions & 4 deletions app/Console/Commands/Indexer/MangaIndexer.php
@@ -67,7 +67,7 @@ public function handle()
$index = (int)$index;
$delay = (int)$delay;

$this->info("Info: MangaIndexer uses seanbreckenridge/mal-id-cache fetch available MAL IDs and updates/indexes them\n\n");
$this->info("Info: MangaIndexer uses purarue/mal-id-cache to fetch available MAL IDs and updates/indexes them\n\n");

if ($failed && Storage::exists('indexer/indexer_manga.failed')) {
$this->ids = $this->loadFailedMalIds();
@@ -140,14 +140,14 @@ public function handle()

/**
* @return array
* @url https://github.com/seanbreckenridge/mal-id-cache
* @url https://github.com/purarue/mal-id-cache
*/
private function fetchMalIds() : array
{
$this->info("Fetching MAL ID Cache https://raw.githubusercontent.com/seanbreckenridge/mal-id-cache/master/cache/manga_cache.json...\n");
$this->info("Fetching MAL ID Cache https://raw.githubusercontent.com/purarue/mal-id-cache/master/cache/manga_cache.json...\n");

$ids = json_decode(
file_get_contents('https://raw.githubusercontent.com/seanbreckenridge/mal-id-cache/master/cache/manga_cache.json'),
file_get_contents('https://raw.githubusercontent.com/purarue/mal-id-cache/master/cache/manga_cache.json'),
true
);
