[FA-27] Need to test events but seems to mostly work

gamer147
2026-01-19 15:13:14 -05:00
parent 19ae4a8089
commit 1ecfd9cc99
26 changed files with 967 additions and 4 deletions

View File: README.md

@@ -0,0 +1,93 @@
# UserNovelDataService Backfill Scripts
SQL scripts for backfilling data from UserService and NovelService into UserNovelDataService.
## Prerequisites
1. **Run EF migrations** on the UserNovelDataService database to ensure all tables exist:
```bash
dotnet ef database update --project FictionArchive.Service.UserNovelDataService
```
This applies the `AddNovelVolumeChapter` migration, which creates the following tables (a rough DDL sketch follows the list):
- `Novels` table (Id, CreatedTime, LastUpdatedTime)
- `Volumes` table (Id, NovelId FK, CreatedTime, LastUpdatedTime)
- `Chapters` table (Id, VolumeId FK, CreatedTime, LastUpdatedTime)
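The sketch below is inferred, not copied from the migration: the `bigint` ids and `timestamptz` columns come from the dblink examples in the insert scripts, and `NOT NULL` on the FK columns is an assumption. The EF migration remains authoritative.
```sql
-- Rough equivalent of what AddNovelVolumeChapter creates (sketch only).
CREATE TABLE "Novels" (
    "Id"              bigint PRIMARY KEY,
    "CreatedTime"     timestamp with time zone NOT NULL,
    "LastUpdatedTime" timestamp with time zone NOT NULL
);

CREATE TABLE "Volumes" (
    "Id"              bigint PRIMARY KEY,
    "NovelId"         bigint NOT NULL REFERENCES "Novels" ("Id"),
    "CreatedTime"     timestamp with time zone NOT NULL,
    "LastUpdatedTime" timestamp with time zone NOT NULL
);

CREATE TABLE "Chapters" (
    "Id"              bigint PRIMARY KEY,
    "VolumeId"        bigint NOT NULL REFERENCES "Volumes" ("Id"),
    "CreatedTime"     timestamp with time zone NOT NULL,
    "LastUpdatedTime" timestamp with time zone NOT NULL
);
```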
## Execution Order
Run scripts in numeric order:
### Extraction (run against source databases)
1. `01_extract_users_from_userservice.sql` - Run against **UserService** DB
2. `02_extract_novels_from_novelservice.sql` - Run against **NovelService** DB
3. `03_extract_volumes_from_novelservice.sql` - Run against **NovelService** DB
4. `04_extract_chapters_from_novelservice.sql` - Run against **NovelService** DB
### Insertion (run against UserNovelDataService database)
5. `05_insert_users_to_usernoveldataservice.sql`
6. `06_insert_novels_to_usernoveldataservice.sql`
7. `07_insert_volumes_to_usernoveldataservice.sql`
8. `08_insert_chapters_to_usernoveldataservice.sql`
## Methods
Each script provides three options:
1. **SELECT for review** - Review data before export
2. **Generate INSERT statements** - Creates individual INSERT statements (good for small datasets)
3. **CSV export/import** - Use PostgreSQL `\copy` for bulk operations (recommended for large datasets)
## Example Workflow
### Using CSV Export/Import (Recommended)
```bash
# 1. Export from source databases
psql -h localhost -U postgres -d userservice -c "\copy (SELECT \"Id\", \"OAuthProviderId\", \"CreatedTime\", \"LastUpdatedTime\" FROM \"Users\" WHERE \"Disabled\" = false) TO '/tmp/users_export.csv' WITH CSV HEADER"
psql -h localhost -U postgres -d novelservice -c "\copy (SELECT \"Id\", \"CreatedTime\", \"LastUpdatedTime\" FROM \"Novels\") TO '/tmp/novels_export.csv' WITH CSV HEADER"
psql -h localhost -U postgres -d novelservice -c "\copy (SELECT \"Id\", \"NovelId\", \"CreatedTime\", \"LastUpdatedTime\" FROM \"Volume\" ORDER BY \"NovelId\", \"Id\") TO '/tmp/volumes_export.csv' WITH CSV HEADER"
psql -h localhost -U postgres -d novelservice -c "\copy (SELECT \"Id\", \"VolumeId\", \"CreatedTime\", \"LastUpdatedTime\" FROM \"Chapter\" ORDER BY \"VolumeId\", \"Id\") TO '/tmp/chapters_export.csv' WITH CSV HEADER"
# 2. Import into UserNovelDataService (order matters due to FK constraints!)
psql -h localhost -U postgres -d usernoveldataservice -c "\copy \"Users\" (\"Id\", \"OAuthProviderId\", \"CreatedTime\", \"LastUpdatedTime\") FROM '/tmp/users_export.csv' WITH CSV HEADER"
psql -h localhost -U postgres -d usernoveldataservice -c "\copy \"Novels\" (\"Id\", \"CreatedTime\", \"LastUpdatedTime\") FROM '/tmp/novels_export.csv' WITH CSV HEADER"
psql -h localhost -U postgres -d usernoveldataservice -c "\copy \"Volumes\" (\"Id\", \"NovelId\", \"CreatedTime\", \"LastUpdatedTime\") FROM '/tmp/volumes_export.csv' WITH CSV HEADER"
psql -h localhost -U postgres -d usernoveldataservice -c "\copy \"Chapters\" (\"Id\", \"VolumeId\", \"CreatedTime\", \"LastUpdatedTime\") FROM '/tmp/chapters_export.csv' WITH CSV HEADER"
```
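To make the import all-or-nothing, the four import commands can also live in a single script executed in one transaction. A sketch, using a hypothetical helper file `import_all.sql`:
```sql
-- import_all.sql (hypothetical helper). Run with:
--   psql -h localhost -U postgres -d usernoveldataservice -v ON_ERROR_STOP=1 -1 -f import_all.sql
-- The -1 flag wraps the file in a single transaction, so an FK violation on
-- any \copy rolls everything back instead of leaving a partial import.
\copy "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/users_export.csv' WITH CSV HEADER
\copy "Novels" ("Id", "CreatedTime", "LastUpdatedTime") FROM '/tmp/novels_export.csv' WITH CSV HEADER
\copy "Volumes" ("Id", "NovelId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/volumes_export.csv' WITH CSV HEADER
\copy "Chapters" ("Id", "VolumeId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/chapters_export.csv' WITH CSV HEADER
```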
**Important**: Insert order matters due to foreign key constraints (a pre-export orphan check is sketched after this list):
1. Users (no dependencies)
2. Novels (no dependencies)
3. Volumes (depends on Novels)
4. Chapters (depends on Volumes)
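Before exporting, it can be worth checking the source data for rows that would violate these constraints on import. A sketch, run against the **NovelService** DB:
```sql
-- Volumes whose parent novel is missing; should return 0 rows.
SELECT v."Id", v."NovelId"
FROM "Volume" v
LEFT JOIN "Novels" n ON n."Id" = v."NovelId"
WHERE n."Id" IS NULL;

-- Chapters whose parent volume is missing; should return 0 rows.
SELECT c."Id", c."VolumeId"
FROM "Chapter" c
LEFT JOIN "Volume" v ON v."Id" = c."VolumeId"
WHERE v."Id" IS NULL;
```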
### Using dblink (Cross-database queries)
If both databases are on the same PostgreSQL server, you can use the `dblink` extension for direct cross-database inserts. See the commented examples in each insert script.
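Before running the full inserts, a quick connectivity check can save a round of debugging (the connection parameters are placeholders, as in the scripts):
```sql
-- Run on the UserNovelDataService DB; should return a single row with ok = 1.
CREATE EXTENSION IF NOT EXISTS dblink;
SELECT ok
FROM dblink(
    'host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
    'SELECT 1'
) AS t(ok int);
```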
## Verification
After running the backfill, verify that the row counts match:
```sql
-- Run on UserService DB
SELECT COUNT(*) as user_count FROM "Users" WHERE "Disabled" = false;
-- Run on NovelService DB
SELECT COUNT(*) as novel_count FROM "Novels";
SELECT COUNT(*) as volume_count FROM "Volume";
SELECT COUNT(*) as chapter_count FROM "Chapter";
-- Run on UserNovelDataService DB
SELECT COUNT(*) as user_count FROM "Users";
SELECT COUNT(*) as novel_count FROM "Novels";
SELECT COUNT(*) as volume_count FROM "Volumes";
SELECT COUNT(*) as chapter_count FROM "Chapters";
```
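If `dblink` is set up, the NovelService counts can also be compared side by side in one query from the UserNovelDataService DB (a sketch; the same pattern works for the UserService counts):
```sql
-- Side-by-side source vs. target counts; the pairs should be equal.
SELECT 'Novels' AS tbl,
       (SELECT n FROM dblink('host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
                             'SELECT COUNT(*) FROM "Novels"') AS s(n bigint)) AS source_count,
       (SELECT COUNT(*) FROM "Novels") AS target_count
UNION ALL
SELECT 'Volumes',
       (SELECT n FROM dblink('host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
                             'SELECT COUNT(*) FROM "Volume"') AS s(n bigint)),
       (SELECT COUNT(*) FROM "Volumes")
UNION ALL
SELECT 'Chapters',
       (SELECT n FROM dblink('host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
                             'SELECT COUNT(*) FROM "Chapter"') AS s(n bigint)),
       (SELECT COUNT(*) FROM "Chapters");
```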

View File: 01_extract_users_from_userservice.sql

@@ -0,0 +1,28 @@
-- Extract Users from UserService database
-- Run this against: UserService PostgreSQL database
-- Output: review via SELECT, generate INSERT statements, or export CSV via \copy
-- Option 1: Simple SELECT for review/testing
SELECT
"Id",
"OAuthProviderId",
"CreatedTime",
"LastUpdatedTime"
FROM "Users"
WHERE "Disabled" = false
ORDER BY "CreatedTime";
-- Option 2: Generate INSERT statements (useful for small datasets)
SELECT format(
'INSERT INTO "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") VALUES (%L, %L, %L, %L) ON CONFLICT ("Id") DO NOTHING;',
"Id",
"OAuthProviderId",
"CreatedTime",
"LastUpdatedTime"
)
FROM "Users"
WHERE "Disabled" = false
ORDER BY "CreatedTime";
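-- A statement generated by Option 2 looks like this (values are illustrative only;
-- %L quotes each value as a SQL literal):
-- INSERT INTO "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") VALUES ('11111111-2222-3333-4444-555555555555', 'google-oauth2|1234567890', '2026-01-01 09:30:00+00', '2026-01-15 12:00:00+00') ON CONFLICT ("Id") DO NOTHING;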
-- Option 3: Export to CSV (run from psql)
-- \copy (SELECT "Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime" FROM "Users" WHERE "Disabled" = false ORDER BY "CreatedTime") TO '/tmp/users_export.csv' WITH CSV HEADER;

View File: 02_extract_novels_from_novelservice.sql

@@ -0,0 +1,24 @@
-- Extract Novels from NovelService database
-- Run this against: NovelService PostgreSQL database
-- Output: review via SELECT, generate INSERT statements, or export CSV via \copy
-- Option 1: Simple SELECT for review/testing
SELECT
"Id",
"CreatedTime",
"LastUpdatedTime"
FROM "Novels"
ORDER BY "Id";
-- Option 2: Generate INSERT statements
SELECT format(
'INSERT INTO "Novels" ("Id", "CreatedTime", "LastUpdatedTime") VALUES (%s, %L, %L) ON CONFLICT ("Id") DO NOTHING;',
"Id",
"CreatedTime",
"LastUpdatedTime"
)
FROM "Novels"
ORDER BY "Id";
-- Option 3: Export to CSV (run from psql)
-- \copy (SELECT "Id", "CreatedTime", "LastUpdatedTime" FROM "Novels" ORDER BY "Id") TO '/tmp/novels_export.csv' WITH CSV HEADER;

View File: 03_extract_volumes_from_novelservice.sql

@@ -0,0 +1,26 @@
-- Extract Volumes from NovelService database
-- Run this against: NovelService PostgreSQL database
-- Output: review via SELECT, generate INSERT statements, or export CSV via \copy
-- Option 1: Simple SELECT for review/testing
SELECT
"Id",
"NovelId",
"CreatedTime",
"LastUpdatedTime"
FROM "Volume"
ORDER BY "NovelId", "Id";
-- Option 2: Generate INSERT statements
SELECT format(
'INSERT INTO "Volumes" ("Id", "NovelId", "CreatedTime", "LastUpdatedTime") VALUES (%s, %s, %L, %L) ON CONFLICT ("Id") DO NOTHING;',
"Id",
"NovelId",
"CreatedTime",
"LastUpdatedTime"
)
FROM "Volume"
ORDER BY "NovelId", "Id";
-- Option 3: Export to CSV (run from psql)
-- \copy (SELECT "Id", "NovelId", "CreatedTime", "LastUpdatedTime" FROM "Volume" ORDER BY "NovelId", "Id") TO '/tmp/volumes_export.csv' WITH CSV HEADER;

View File: 04_extract_chapters_from_novelservice.sql

@@ -0,0 +1,26 @@
-- Extract Chapters from NovelService database
-- Run this against: NovelService PostgreSQL database
-- Output: review via SELECT, generate INSERT statements, or export CSV via \copy
-- Option 1: Simple SELECT for review/testing
SELECT
"Id",
"VolumeId",
"CreatedTime",
"LastUpdatedTime"
FROM "Chapter"
ORDER BY "VolumeId", "Id";
-- Option 2: Generate INSERT statements
SELECT format(
'INSERT INTO "Chapters" ("Id", "VolumeId", "CreatedTime", "LastUpdatedTime") VALUES (%s, %s, %L, %L) ON CONFLICT ("Id") DO NOTHING;',
"Id",
"VolumeId",
"CreatedTime",
"LastUpdatedTime"
)
FROM "Chapter"
ORDER BY "VolumeId", "Id";
-- Option 3: Export to CSV (run from psql)
-- \copy (SELECT "Id", "VolumeId", "CreatedTime", "LastUpdatedTime" FROM "Chapter" ORDER BY "VolumeId", "Id") TO '/tmp/chapters_export.csv' WITH CSV HEADER;

View File: 05_insert_users_to_usernoveldataservice.sql

@@ -0,0 +1,32 @@
-- Insert Users into UserNovelDataService database
-- Run this against: UserNovelDataService PostgreSQL database
--
-- PREREQUISITE: You must have extracted users from UserService first
-- using 01_extract_users_from_userservice.sql
-- Option 1: If you have a CSV file from export
-- \copy "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/users_export.csv' WITH CSV HEADER;
-- Option 2: Direct cross-database insert using dblink
-- First, install the dblink extension if not already present:
-- CREATE EXTENSION IF NOT EXISTS dblink;
-- Example using dblink (adjust connection string):
/*
INSERT INTO "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime")
SELECT
"Id"::uuid,
"OAuthProviderId",
"CreatedTime"::timestamp with time zone,
"LastUpdatedTime"::timestamp with time zone
FROM dblink(
'host=localhost port=5432 dbname=userservice user=postgres password=yourpassword',
'SELECT "Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime" FROM "Users" WHERE "Disabled" = false'
) AS t("Id" uuid, "OAuthProviderId" text, "CreatedTime" timestamp with time zone, "LastUpdatedTime" timestamp with time zone)
ON CONFLICT ("Id") DO UPDATE SET
"OAuthProviderId" = EXCLUDED."OAuthProviderId",
"LastUpdatedTime" = EXCLUDED."LastUpdatedTime";
*/
-- Option 3: Paste generated INSERT statements from extraction script here
-- INSERT INTO "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") VALUES (...) ON CONFLICT ("Id") DO NOTHING;

View File: 06_insert_novels_to_usernoveldataservice.sql

@@ -0,0 +1,31 @@
-- Insert Novels into UserNovelDataService database
-- Run this against: UserNovelDataService PostgreSQL database
--
-- PREREQUISITE:
-- 1. Ensure the Novels table exists (run EF migrations first if needed)
-- 2. Extract novels from NovelService using 02_extract_novels_from_novelservice.sql
-- Option 1: If you have a CSV file from export
-- \copy "Novels" ("Id", "CreatedTime", "LastUpdatedTime") FROM '/tmp/novels_export.csv' WITH CSV HEADER;
-- Option 2: Direct cross-database insert using dblink
-- First, install the dblink extension if not already present:
-- CREATE EXTENSION IF NOT EXISTS dblink;
-- Example using dblink (adjust connection string):
/*
INSERT INTO "Novels" ("Id", "CreatedTime", "LastUpdatedTime")
SELECT
"Id"::bigint,
"CreatedTime"::timestamp with time zone,
"LastUpdatedTime"::timestamp with time zone
FROM dblink(
'host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
'SELECT "Id", "CreatedTime", "LastUpdatedTime" FROM "Novels"'
) AS t("Id" bigint, "CreatedTime" timestamp with time zone, "LastUpdatedTime" timestamp with time zone)
ON CONFLICT ("Id") DO UPDATE SET
"LastUpdatedTime" = EXCLUDED."LastUpdatedTime";
*/
-- Option 3: Paste generated INSERT statements from extraction script here
-- INSERT INTO "Novels" ("Id", "CreatedTime", "LastUpdatedTime") VALUES (...) ON CONFLICT ("Id") DO NOTHING;

View File: 07_insert_volumes_to_usernoveldataservice.sql

@@ -0,0 +1,34 @@
-- Insert Volumes into UserNovelDataService database
-- Run this against: UserNovelDataService PostgreSQL database
--
-- PREREQUISITE:
-- 1. Ensure the Volumes table exists (run EF migrations first if needed)
-- 2. Novels must be inserted first (FK constraint)
-- 3. Extract volumes from NovelService using 03_extract_volumes_from_novelservice.sql
-- Option 1: If you have a CSV file from export
-- \copy "Volumes" ("Id", "NovelId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/volumes_export.csv' WITH CSV HEADER;
-- Option 2: Direct cross-database insert using dblink
-- First, install the dblink extension if not already present:
-- CREATE EXTENSION IF NOT EXISTS dblink;
-- Example using dblink (adjust connection string):
/*
INSERT INTO "Volumes" ("Id", "NovelId", "CreatedTime", "LastUpdatedTime")
SELECT
"Id"::bigint,
"NovelId"::bigint,
"CreatedTime"::timestamp with time zone,
"LastUpdatedTime"::timestamp with time zone
FROM dblink(
'host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
'SELECT "Id", "NovelId", "CreatedTime", "LastUpdatedTime" FROM "Volume"'
) AS t("Id" bigint, "NovelId" bigint, "CreatedTime" timestamp with time zone, "LastUpdatedTime" timestamp with time zone)
ON CONFLICT ("Id") DO UPDATE SET
"NovelId" = EXCLUDED."NovelId",
"LastUpdatedTime" = EXCLUDED."LastUpdatedTime";
*/
-- Option 3: Paste generated INSERT statements from extraction script here
-- INSERT INTO "Volumes" ("Id", "NovelId", "CreatedTime", "LastUpdatedTime") VALUES (...) ON CONFLICT ("Id") DO NOTHING;

View File: 08_insert_chapters_to_usernoveldataservice.sql

@@ -0,0 +1,34 @@
-- Insert Chapters into UserNovelDataService database
-- Run this against: UserNovelDataService PostgreSQL database
--
-- PREREQUISITE:
-- 1. Ensure the Chapters table exists (run EF migrations first if needed)
-- 2. Volumes must be inserted first (FK constraint)
-- 3. Extract chapters from NovelService using 04_extract_chapters_from_novelservice.sql
-- Option 1: If you have a CSV file from export
-- \copy "Chapters" ("Id", "VolumeId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/chapters_export.csv' WITH CSV HEADER;
-- Option 2: Direct cross-database insert using dblink
-- First, install the dblink extension if not already present:
-- CREATE EXTENSION IF NOT EXISTS dblink;
-- Example using dblink (adjust connection string):
/*
INSERT INTO "Chapters" ("Id", "VolumeId", "CreatedTime", "LastUpdatedTime")
SELECT
"Id"::bigint,
"VolumeId"::bigint,
"CreatedTime"::timestamp with time zone,
"LastUpdatedTime"::timestamp with time zone
FROM dblink(
'host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
'SELECT "Id", "VolumeId", "CreatedTime", "LastUpdatedTime" FROM "Chapter"'
) AS t("Id" bigint, "VolumeId" bigint, "CreatedTime" timestamp with time zone, "LastUpdatedTime" timestamp with time zone)
ON CONFLICT ("Id") DO UPDATE SET
"VolumeId" = EXCLUDED."VolumeId",
"LastUpdatedTime" = EXCLUDED."LastUpdatedTime";
*/
-- Option 3: Paste generated INSERT statements from extraction script here
-- INSERT INTO "Chapters" ("Id", "VolumeId", "CreatedTime", "LastUpdatedTime") VALUES (...) ON CONFLICT ("Id") DO NOTHING;