[FA-27] Need to test events but seems to mostly work

README.md
@@ -0,0 +1,93 @@
# UserNovelDataService Backfill Scripts

SQL scripts for backfilling data from UserService and NovelService into UserNovelDataService.

## Prerequisites

1. **Run EF migrations** on the UserNovelDataService database to ensure all tables exist:

```bash
dotnet ef database update --project FictionArchive.Service.UserNovelDataService
```

This will apply the `AddNovelVolumeChapter` migration, which creates the following tables (a rough schema sketch follows the list):

- `Novels` table (Id, CreatedTime, LastUpdatedTime)
- `Volumes` table (Id, NovelId FK, CreatedTime, LastUpdatedTime)
- `Chapters` table (Id, VolumeId FK, CreatedTime, LastUpdatedTime)
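
For orientation, the sketch below shows roughly what those tables look like. This is a hand-written approximation, not the EF migration's actual output: the column types are inferred from the dblink examples in the insert scripts (uuid user ids, bigint novel/volume/chapter ids, timestamptz timestamps), and the `Users` table is assumed to already exist from an earlier migration.

```sql
-- Approximate schema sketch (assumed types; not the generated migration)
CREATE TABLE "Novels" (
    "Id" bigint PRIMARY KEY,
    "CreatedTime" timestamp with time zone NOT NULL,
    "LastUpdatedTime" timestamp with time zone NOT NULL
);

CREATE TABLE "Volumes" (
    "Id" bigint PRIMARY KEY,
    "NovelId" bigint NOT NULL REFERENCES "Novels" ("Id"),
    "CreatedTime" timestamp with time zone NOT NULL,
    "LastUpdatedTime" timestamp with time zone NOT NULL
);

CREATE TABLE "Chapters" (
    "Id" bigint PRIMARY KEY,
    "VolumeId" bigint NOT NULL REFERENCES "Volumes" ("Id"),
    "CreatedTime" timestamp with time zone NOT NULL,
    "LastUpdatedTime" timestamp with time zone NOT NULL
);
```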

## Execution Order

Run the scripts in numeric order:

### Extraction (run against source databases)

1. `01_extract_users_from_userservice.sql` - Run against **UserService** DB
2. `02_extract_novels_from_novelservice.sql` - Run against **NovelService** DB
3. `03_extract_volumes_from_novelservice.sql` - Run against **NovelService** DB
4. `04_extract_chapters_from_novelservice.sql` - Run against **NovelService** DB

### Insertion (run against UserNovelDataService database)

5. `05_insert_users_to_usernoveldataservice.sql`
6. `06_insert_novels_to_usernoveldataservice.sql`
7. `07_insert_volumes_to_usernoveldataservice.sql`
8. `08_insert_chapters_to_usernoveldataservice.sql`

## Methods

Each script provides three options:

1. **SELECT for review** - Review the data before exporting
2. **Generate INSERT statements** - Creates individual INSERT statements (good for small datasets)
3. **CSV export/import** - Uses PostgreSQL `\copy` for bulk operations (recommended for large datasets; see the note after this list)
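
A note on the recommended CSV option: `\copy` is a psql client-side meta-command, so file paths refer to the machine where psql runs and no special database privileges are needed. The server-side `COPY ... FROM 'file'` form reads paths on the database server itself and requires superuser rights or membership in `pg_read_server_files`. A minimal comparison, using the same table and path as the scripts:

```sql
-- Client-side: the path is local to wherever psql is running
-- \copy "Users" FROM '/tmp/users_export.csv' WITH CSV HEADER

-- Server-side equivalent: the path must exist on the DB server and needs elevated rights
COPY "Users" FROM '/tmp/users_export.csv' WITH (FORMAT csv, HEADER);
```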

## Example Workflow

### Using CSV Export/Import (Recommended)

```bash
# 1. Export from source databases
psql -h localhost -U postgres -d userservice -c "\copy (SELECT \"Id\", \"OAuthProviderId\", \"CreatedTime\", \"LastUpdatedTime\" FROM \"Users\" WHERE \"Disabled\" = false) TO '/tmp/users_export.csv' WITH CSV HEADER"

psql -h localhost -U postgres -d novelservice -c "\copy (SELECT \"Id\", \"CreatedTime\", \"LastUpdatedTime\" FROM \"Novels\") TO '/tmp/novels_export.csv' WITH CSV HEADER"

psql -h localhost -U postgres -d novelservice -c "\copy (SELECT \"Id\", \"NovelId\", \"CreatedTime\", \"LastUpdatedTime\" FROM \"Volume\" ORDER BY \"NovelId\", \"Id\") TO '/tmp/volumes_export.csv' WITH CSV HEADER"

psql -h localhost -U postgres -d novelservice -c "\copy (SELECT \"Id\", \"VolumeId\", \"CreatedTime\", \"LastUpdatedTime\" FROM \"Chapter\" ORDER BY \"VolumeId\", \"Id\") TO '/tmp/chapters_export.csv' WITH CSV HEADER"

# 2. Import into UserNovelDataService (order matters due to FK constraints!)
psql -h localhost -U postgres -d usernoveldataservice -c "\copy \"Users\" (\"Id\", \"OAuthProviderId\", \"CreatedTime\", \"LastUpdatedTime\") FROM '/tmp/users_export.csv' WITH CSV HEADER"

psql -h localhost -U postgres -d usernoveldataservice -c "\copy \"Novels\" (\"Id\", \"CreatedTime\", \"LastUpdatedTime\") FROM '/tmp/novels_export.csv' WITH CSV HEADER"

psql -h localhost -U postgres -d usernoveldataservice -c "\copy \"Volumes\" (\"Id\", \"NovelId\", \"CreatedTime\", \"LastUpdatedTime\") FROM '/tmp/volumes_export.csv' WITH CSV HEADER"

psql -h localhost -U postgres -d usernoveldataservice -c "\copy \"Chapters\" (\"Id\", \"VolumeId\", \"CreatedTime\", \"LastUpdatedTime\") FROM '/tmp/chapters_export.csv' WITH CSV HEADER"
```

**Important**: Insert order matters due to foreign key constraints:

1. Users (no dependencies)
2. Novels (no dependencies)
3. Volumes (depends on Novels)
4. Chapters (depends on Volumes)
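
One caveat when re-running these imports: `\copy` has no conflict handling, so loading a CSV into a table that already contains some of the rows fails on the primary key. A staging-table pattern sidesteps this. The sketch below shows it for `Users` (the staging table name is mine, and the whole thing must run in a single psql session so the temp table survives); the same shape works for the other three tables:

```sql
-- Load the CSV into a throwaway staging table, then upsert into the real one
CREATE TEMP TABLE users_staging (LIKE "Users" INCLUDING DEFAULTS);

-- \copy users_staging ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/users_export.csv' WITH CSV HEADER

INSERT INTO "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime")
SELECT "Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime"
FROM users_staging
ON CONFLICT ("Id") DO NOTHING;
```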

### Using dblink (Cross-database queries)

If both databases are on the same PostgreSQL server, you can use the `dblink` extension for direct cross-database inserts. See the commented examples in each insert script.
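
Before relying on those examples, a quick connectivity smoke test from the UserNovelDataService database can save some head-scratching (a sketch; the connection string matches the placeholder used in the scripts and must be adjusted):

```sql
CREATE EXTENSION IF NOT EXISTS dblink;

-- Returns a single row with ok = 1 if the connection string works
SELECT *
FROM dblink(
    'host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
    'SELECT 1'
) AS t(ok integer);
```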

## Verification

After running the backfill, verify counts match:

```sql
-- Run on UserService DB
SELECT COUNT(*) as user_count FROM "Users" WHERE "Disabled" = false;

-- Run on NovelService DB
SELECT COUNT(*) as novel_count FROM "Novels";
SELECT COUNT(*) as volume_count FROM "Volume";
SELECT COUNT(*) as chapter_count FROM "Chapter";

-- Run on UserNovelDataService DB
SELECT COUNT(*) as user_count FROM "Users";
SELECT COUNT(*) as novel_count FROM "Novels";
SELECT COUNT(*) as volume_count FROM "Volumes";
SELECT COUNT(*) as chapter_count FROM "Chapters";
```
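
If dblink is available, source and target counts can also be compared in a single query run from the UserNovelDataService database. A sketch for the novel counts, using the same placeholder connection string as the insert scripts:

```sql
SELECT
    (SELECT COUNT(*) FROM "Novels") AS target_novels,
    (SELECT n
     FROM dblink(
         'host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
         'SELECT COUNT(*) FROM "Novels"'
     ) AS t(n bigint)) AS source_novels;
```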

01_extract_users_from_userservice.sql
@@ -0,0 +1,28 @@
-- Extract Users from UserService database
-- Run this against: UserService PostgreSQL database
-- Output: CSV or use COPY TO for bulk export

-- Option 1: Simple SELECT for review/testing
SELECT
    "Id",
    "OAuthProviderId",
    "CreatedTime",
    "LastUpdatedTime"
FROM "Users"
WHERE "Disabled" = false
ORDER BY "CreatedTime";

-- Option 2: Generate INSERT statements (useful for small datasets)
SELECT format(
    'INSERT INTO "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") VALUES (%L, %L, %L, %L) ON CONFLICT ("Id") DO NOTHING;',
    "Id",
    "OAuthProviderId",
    "CreatedTime",
    "LastUpdatedTime"
)
FROM "Users"
WHERE "Disabled" = false
ORDER BY "CreatedTime";
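
-- For reference, Option 2 emits one statement per row, shaped like this
-- (illustrative values only, not real data):
--   INSERT INTO "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") VALUES ('0f8e...-uuid', 'example-provider|12345', '2024-01-01 00:00:00+00', '2024-01-02 00:00:00+00') ON CONFLICT ("Id") DO NOTHING;
-- %L quotes each value as a SQL literal, so the output can be replayed directly.
-- To capture it as a runnable file, run just the Option 2 query with something like:
--   psql -At -d userservice -c "<option 2 query>" > users_inserts.sql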

-- Option 3: Export to CSV (run from psql)
-- \copy (SELECT "Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime" FROM "Users" WHERE "Disabled" = false ORDER BY "CreatedTime") TO '/tmp/users_export.csv' WITH CSV HEADER;

02_extract_novels_from_novelservice.sql
@@ -0,0 +1,24 @@
-- Extract Novels from NovelService database
-- Run this against: NovelService PostgreSQL database
-- Output: CSV or use COPY TO for bulk export

-- Option 1: Simple SELECT for review/testing
SELECT
    "Id",
    "CreatedTime",
    "LastUpdatedTime"
FROM "Novels"
ORDER BY "Id";

-- Option 2: Generate INSERT statements
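-- Note: in format(), %s inlines the numeric "Id" unquoted, while %L quotes the
-- timestamp values as SQL literals.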
SELECT format(
    'INSERT INTO "Novels" ("Id", "CreatedTime", "LastUpdatedTime") VALUES (%s, %L, %L) ON CONFLICT ("Id") DO NOTHING;',
    "Id",
    "CreatedTime",
    "LastUpdatedTime"
)
FROM "Novels"
ORDER BY "Id";

-- Option 3: Export to CSV (run from psql)
-- \copy (SELECT "Id", "CreatedTime", "LastUpdatedTime" FROM "Novels" ORDER BY "Id") TO '/tmp/novels_export.csv' WITH CSV HEADER;

03_extract_volumes_from_novelservice.sql
@@ -0,0 +1,26 @@
-- Extract Volumes from NovelService database
-- Run this against: NovelService PostgreSQL database
-- Output: CSV or use COPY TO for bulk export
-- Note: the source table here is singular ("Volume"); the target table in
-- UserNovelDataService is plural ("Volumes").

-- Option 1: Simple SELECT for review/testing
SELECT
    "Id",
    "NovelId",
    "CreatedTime",
    "LastUpdatedTime"
FROM "Volume"
ORDER BY "NovelId", "Id";

-- Option 2: Generate INSERT statements
SELECT format(
    'INSERT INTO "Volumes" ("Id", "NovelId", "CreatedTime", "LastUpdatedTime") VALUES (%s, %s, %L, %L) ON CONFLICT ("Id") DO NOTHING;',
    "Id",
    "NovelId",
    "CreatedTime",
    "LastUpdatedTime"
)
FROM "Volume"
ORDER BY "NovelId", "Id";

-- Option 3: Export to CSV (run from psql)
-- \copy (SELECT "Id", "NovelId", "CreatedTime", "LastUpdatedTime" FROM "Volume" ORDER BY "NovelId", "Id") TO '/tmp/volumes_export.csv' WITH CSV HEADER;

04_extract_chapters_from_novelservice.sql
@@ -0,0 +1,26 @@
-- Extract Chapters from NovelService database
-- Run this against: NovelService PostgreSQL database
-- Output: CSV or use COPY TO for bulk export
-- Note: the source table here is singular ("Chapter"); the target table in
-- UserNovelDataService is plural ("Chapters").

-- Option 1: Simple SELECT for review/testing
SELECT
    "Id",
    "VolumeId",
    "CreatedTime",
    "LastUpdatedTime"
FROM "Chapter"
ORDER BY "VolumeId", "Id";

-- Option 2: Generate INSERT statements
SELECT format(
    'INSERT INTO "Chapters" ("Id", "VolumeId", "CreatedTime", "LastUpdatedTime") VALUES (%s, %s, %L, %L) ON CONFLICT ("Id") DO NOTHING;',
    "Id",
    "VolumeId",
    "CreatedTime",
    "LastUpdatedTime"
)
FROM "Chapter"
ORDER BY "VolumeId", "Id";

-- Option 3: Export to CSV (run from psql)
-- \copy (SELECT "Id", "VolumeId", "CreatedTime", "LastUpdatedTime" FROM "Chapter" ORDER BY "VolumeId", "Id") TO '/tmp/chapters_export.csv' WITH CSV HEADER;

05_insert_users_to_usernoveldataservice.sql
@@ -0,0 +1,32 @@
-- Insert Users into UserNovelDataService database
-- Run this against: UserNovelDataService PostgreSQL database
--
-- PREREQUISITE: You must have extracted users from UserService first
-- using 01_extract_users_from_userservice.sql

-- Option 1: If you have a CSV file from export
-- \copy "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/users_export.csv' WITH CSV HEADER;

-- Option 2: Direct cross-database insert using dblink
-- First, install the dblink extension if not already done:
-- CREATE EXTENSION IF NOT EXISTS dblink;

-- Example using dblink (adjust connection string):
/*
INSERT INTO "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime")
SELECT
    "Id"::uuid,
    "OAuthProviderId",
    "CreatedTime"::timestamp with time zone,
    "LastUpdatedTime"::timestamp with time zone
FROM dblink(
    'host=localhost port=5432 dbname=userservice user=postgres password=yourpassword',
    'SELECT "Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime" FROM "Users" WHERE "Disabled" = false'
) AS t("Id" uuid, "OAuthProviderId" text, "CreatedTime" timestamp with time zone, "LastUpdatedTime" timestamp with time zone)
ON CONFLICT ("Id") DO UPDATE SET
    "OAuthProviderId" = EXCLUDED."OAuthProviderId",
    "LastUpdatedTime" = EXCLUDED."LastUpdatedTime";
*/
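
-- Note: the DO UPDATE branch deliberately leaves "CreatedTime" untouched, so re-running
-- the backfill refreshes "OAuthProviderId" and "LastUpdatedTime" without overwriting the
-- original creation timestamps. The Novels/Volumes/Chapters insert scripts follow the
-- same pattern.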

-- Option 3: Paste generated INSERT statements from extraction script here
-- INSERT INTO "Users" ("Id", "OAuthProviderId", "CreatedTime", "LastUpdatedTime") VALUES (...) ON CONFLICT ("Id") DO NOTHING;

06_insert_novels_to_usernoveldataservice.sql
@@ -0,0 +1,31 @@
-- Insert Novels into UserNovelDataService database
-- Run this against: UserNovelDataService PostgreSQL database
--
-- PREREQUISITE:
-- 1. Ensure the Novels table exists (run EF migrations first if needed)
-- 2. Extract novels from NovelService using 02_extract_novels_from_novelservice.sql

-- Option 1: If you have a CSV file from export
-- \copy "Novels" ("Id", "CreatedTime", "LastUpdatedTime") FROM '/tmp/novels_export.csv' WITH CSV HEADER;

-- Option 2: Direct cross-database insert using dblink
-- First, install the dblink extension if not already done:
-- CREATE EXTENSION IF NOT EXISTS dblink;

-- Example using dblink (adjust connection string):
/*
INSERT INTO "Novels" ("Id", "CreatedTime", "LastUpdatedTime")
SELECT
    "Id"::bigint,
    "CreatedTime"::timestamp with time zone,
    "LastUpdatedTime"::timestamp with time zone
FROM dblink(
    'host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
    'SELECT "Id", "CreatedTime", "LastUpdatedTime" FROM "Novels"'
) AS t("Id" bigint, "CreatedTime" timestamp with time zone, "LastUpdatedTime" timestamp with time zone)
ON CONFLICT ("Id") DO UPDATE SET
    "LastUpdatedTime" = EXCLUDED."LastUpdatedTime";
*/

-- Option 3: Paste generated INSERT statements from extraction script here
-- INSERT INTO "Novels" ("Id", "CreatedTime", "LastUpdatedTime") VALUES (...) ON CONFLICT ("Id") DO NOTHING;

07_insert_volumes_to_usernoveldataservice.sql
@@ -0,0 +1,34 @@
-- Insert Volumes into UserNovelDataService database
-- Run this against: UserNovelDataService PostgreSQL database
--
-- PREREQUISITE:
-- 1. Ensure the Volumes table exists (run EF migrations first if needed)
-- 2. Novels must be inserted first (FK constraint)
-- 3. Extract volumes from NovelService using 03_extract_volumes_from_novelservice.sql

-- Option 1: If you have a CSV file from export
-- \copy "Volumes" ("Id", "NovelId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/volumes_export.csv' WITH CSV HEADER;

-- Option 2: Direct cross-database insert using dblink
-- First, install the dblink extension if not already done:
-- CREATE EXTENSION IF NOT EXISTS dblink;

-- Example using dblink (adjust connection string):
/*
INSERT INTO "Volumes" ("Id", "NovelId", "CreatedTime", "LastUpdatedTime")
SELECT
    "Id"::bigint,
    "NovelId"::bigint,
    "CreatedTime"::timestamp with time zone,
    "LastUpdatedTime"::timestamp with time zone
FROM dblink(
    'host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
    'SELECT "Id", "NovelId", "CreatedTime", "LastUpdatedTime" FROM "Volume"'
) AS t("Id" bigint, "NovelId" bigint, "CreatedTime" timestamp with time zone, "LastUpdatedTime" timestamp with time zone)
ON CONFLICT ("Id") DO UPDATE SET
    "NovelId" = EXCLUDED."NovelId",
    "LastUpdatedTime" = EXCLUDED."LastUpdatedTime";
*/

-- Option 3: Paste generated INSERT statements from extraction script here
-- INSERT INTO "Volumes" ("Id", "NovelId", "CreatedTime", "LastUpdatedTime") VALUES (...) ON CONFLICT ("Id") DO NOTHING;

08_insert_chapters_to_usernoveldataservice.sql
@@ -0,0 +1,34 @@
-- Insert Chapters into UserNovelDataService database
-- Run this against: UserNovelDataService PostgreSQL database
--
-- PREREQUISITE:
-- 1. Ensure the Chapters table exists (run EF migrations first if needed)
-- 2. Volumes must be inserted first (FK constraint)
-- 3. Extract chapters from NovelService using 04_extract_chapters_from_novelservice.sql

-- Option 1: If you have a CSV file from export
-- \copy "Chapters" ("Id", "VolumeId", "CreatedTime", "LastUpdatedTime") FROM '/tmp/chapters_export.csv' WITH CSV HEADER;

-- Option 2: Direct cross-database insert using dblink
-- First, install the dblink extension if not already done:
-- CREATE EXTENSION IF NOT EXISTS dblink;

-- Example using dblink (adjust connection string):
/*
INSERT INTO "Chapters" ("Id", "VolumeId", "CreatedTime", "LastUpdatedTime")
SELECT
    "Id"::bigint,
    "VolumeId"::bigint,
    "CreatedTime"::timestamp with time zone,
    "LastUpdatedTime"::timestamp with time zone
FROM dblink(
    'host=localhost port=5432 dbname=novelservice user=postgres password=yourpassword',
    'SELECT "Id", "VolumeId", "CreatedTime", "LastUpdatedTime" FROM "Chapter"'
) AS t("Id" bigint, "VolumeId" bigint, "CreatedTime" timestamp with time zone, "LastUpdatedTime" timestamp with time zone)
ON CONFLICT ("Id") DO UPDATE SET
    "VolumeId" = EXCLUDED."VolumeId",
    "LastUpdatedTime" = EXCLUDED."LastUpdatedTime";
*/

-- Option 3: Paste generated INSERT statements from extraction script here
-- INSERT INTO "Chapters" ("Id", "VolumeId", "CreatedTime", "LastUpdatedTime") VALUES (...) ON CONFLICT ("Id") DO NOTHING;