MARC Data Import Guide#
The MARC Data Import tool loads MARC 21 bibliographic and authority records into FOLIO using FOLIO’s Data Import system via the change-manager APIs.
Overview#
MARC Data Import:
Parses binary MARC21 files (.mrc)
Uses FOLIO’s Data Import system (change-manager APIs)
Requires a Data Import Job Profile configured in FOLIO
Supports optional MARC record preprocessing before upload
Provides real-time progress tracking
Handles large files by splitting into smaller jobs
Saves job IDs for tracking in FOLIO’s Data Import logs
Basic Usage#
Command Line#
folio-data-import marc \
--gateway-url https://folio-snapshot-okapi.dev.folio.org \
--tenant-id diku \
--username diku_admin \
--password admin \
--marc-file-path records.mrc
The command will prompt you to select a Data Import Job Profile configured in your FOLIO tenant.
Using a Configuration File#
For complex or repeated MARC import jobs, use a JSON configuration file:
folio-data-import marc config.json
Example config.json:
{
"marc_files": ["records1.mrc", "records2.mrc"],
"import_profile_name": "Default - Create Instance and SRS MARC Bib",
"batch_size": 10,
"batch_delay": 1.0,
"split_files": false,
"split_size": 1000,
"no_progress": false,
"no_summary": false
}
Command-Line Parameters#
Connection Parameters#
Parameter |
Environment Variable |
Description |
|---|---|---|
|
|
FOLIO gateway URL (required) |
|
|
FOLIO tenant identifier (required) |
|
|
Username for authentication (required) |
|
|
User password (required) |
|
|
ECS member tenant ID (for multi-tenant environments) |
Job Configuration#
Parameter |
Default |
Description |
|---|---|---|
|
Required |
Path(s) to MARC file(s). Accepts multiple values and glob patterns. |
|
(interactive) |
Name of the FOLIO Data Import Job Profile. If not provided, prompts interactively. |
|
10 |
Number of records per batch (1-1000) |
|
0.0 |
Seconds to wait between batches |
|
(none) |
Comma-separated list of preprocessor names |
|
(none) |
Path to JSON file with preprocessor configuration |
|
|
Split large files into smaller jobs |
|
1000 |
Records per split (when using |
|
0 |
Number of splits to skip before starting |
|
|
Show file names in Data Import logs |
|
|
Path to save job IDs |
|
|
Disable progress bars |
|
|
Skip final job summary |
|
|
Don’t fail if summary cannot be retrieved |
|
(none) |
Path to JSON config file (overrides CLI parameters) |
Data Import Job Profiles#
The tool uses Data Import Job Profiles configured in your FOLIO tenant. These profiles define:
How MARC records are processed
What FOLIO records are created (Instances, Holdings, Items, etc.)
Field mapping rules
Match and update criteria
To view available profiles in FOLIO:
Go to Settings > Data Import > Job Profiles
Note the profile name you want to use
Common profiles:
Default - Create Instance and SRS MARC Bib - Creates bibliographic records
Default - Create SRS MARC Authority - Creates authority records
Inventory Single Record - Default Update Instance - Updates existing instances
Custom profiles created by your library
Specifying a Profile#
Provide the profile name on the command line:
folio-data-import marc \
--marc-file-path records.mrc \
--import-profile-name "Default - Create Instance and SRS MARC Bib"
Or let the tool prompt you interactively (only profiles with MARC dataType are shown):
folio-data-import marc --marc-file-path records.mrc
# Will display a list of available profiles to choose from
MARC Record Preprocessors#
Optional preprocessors can modify MARC records before they’re sent to FOLIO’s Data Import system. This is useful for:
Cleaning FOLIO-generated fields from records exported from another FOLIO system
Fixing structural issues in MARC records
Adapting records from specific sources (e.g., ABES SUDOC)
Adding prefixes to control numbers
Available Preprocessors#
Preprocessor |
Description |
|---|---|
|
Remove 999 fields with indicators ff (FOLIO-generated) |
|
Remove 999 ff fields and move other 999 fields to 945 |
|
Remove fields with no subfields or only empty subfields |
|
Fix invalid record type and bibliographic level in leader |
|
Fix invalid record type in authority record leader |
|
Set authority type to personal name |
|
Set authority type to corporate name |
|
Set authority type to meeting name |
|
Set authority type to geographic name |
|
Add prefix to 001 control number (requires config) |
|
Process ABES SUDOC records for superseding |
Basic Preprocessor Usage#
Specify preprocessors by name (comma-separated):
folio-data-import marc \
--marc-file-path records.mrc \
--preprocessor "strip_999_ff_fields,fix_bib_leader"
Preprocessor Configuration#
Some preprocessors require configuration. Provide a JSON file:
folio-data-import marc \
--marc-file-path records.mrc \
--preprocessor "prepend_prefix_001,clean_empty_fields" \
--preprocessor-config preprocessor_config.json
Example preprocessor_config.json:
{
"prepend_prefix_001": {
"prefix": "LOCAL-"
}
}
For complete preprocessor documentation, see MARC Preprocessors Guide.
File Handling#
Multiple Files#
Process multiple MARC files:
folio-data-import marc \
--marc-file-path file1.mrc \
--marc-file-path file2.mrc \
--marc-file-path file3.mrc
Glob Patterns#
Use wildcards to process multiple files:
folio-data-import marc --marc-file-path "marc_exports/*.mrc"
Files are sorted before processing.
Splitting Large Files#
For very large files, split into smaller jobs to avoid timeouts:
folio-data-import marc \
--marc-file-path large_file.mrc \
--split-files \
--split-size 1000
Resume a split job from a specific offset:
folio-data-import marc \
--marc-file-path large_file.mrc \
--split-files \
--split-size 1000 \
--split-offset 5 # Skip first 5 splits (5000 records)
Output Files#
The importer creates several output files in the same directory as the input MARC files:
File |
Description |
|---|---|
|
Job IDs for tracking in FOLIO (default location) |
|
Records that failed to parse |
|
Records from batches that failed to send |
Custom Job ID File Location#
folio-data-import marc \
--marc-file-path records.mrc \
--job-ids-file-path /path/to/job_ids.txt
Progress and Job Tracking#
Real-Time Progress#
Progress bars show:
Records uploaded (sent to FOLIO)
Records processed (imported by FOLIO)
Processing speed and time
Disable for CI/CD environments:
folio-data-import marc --marc-file-path records.mrc --no-progress
Job IDs#
Job IDs are saved to marc_import_job_ids.txt (or custom path). Use these to check job status in FOLIO:
Go to Data Import > View all logs
Find your job by ID
View detailed import results
Viewing in Data Import Logs#
Enable file names in the Data Import logs:
folio-data-import marc \
--marc-file-path records.mrc \
--file-names-in-di-logs
Workflow Example#
# 1. Set environment variables (optional)
export FOLIO_GATEWAY_URL="https://folio-snapshot-okapi.dev.folio.org"
export FOLIO_TENANT_ID="diku"
export FOLIO_USERNAME="diku_admin"
export FOLIO_PASSWORD="admin"
# 2. Import MARC records with preprocessing
folio-data-import marc \
--marc-file-path library_bibs.mrc \
--import-profile-name "Default - Create Instance and SRS MARC Bib" \
--preprocessor "strip_999_ff_fields,fix_bib_leader" \
--batch-size 10 \
--batch-delay 0.5
# 3. Check results
# - View job IDs in marc_import_job_ids.txt
# - Check Data Import logs in FOLIO UI
# - Review bad_marc_records_*.mrc for parsing failures
# - Review failed_batches_*.mrc for upload failures
Complete Example#
Full-featured MARC import with all options:
folio-data-import marc \
--gateway-url https://folio-snapshot-okapi.dev.folio.org \
--tenant-id diku \
--username diku_admin \
--password admin \
--marc-file-path "exports/*.mrc" \
--import-profile-name "Default - Create Instance and SRS MARC Bib" \
--preprocessor "strip_999_ff_fields,clean_empty_fields,fix_bib_leader" \
--preprocessor-config preprocessor_config.json \
--batch-size 10 \
--batch-delay 0.5 \
--split-files \
--split-size 1000 \
--file-names-in-di-logs \
--job-ids-file-path /var/log/folio/job_ids.txt
Common Issues#
Job Summary Not Available#
FOLIO’s import logs can be unreliable. If the job summary fails to load:
Check Data Import > View all logs in FOLIO
Look up your job ID from
marc_import_job_ids.txtUse
--let-summary-failto continue without failing:
folio-data-import marc \
--marc-file-path records.mrc \
--let-summary-fail
Use
--no-summaryto skip summary retrieval entirely:
folio-data-import marc \
--marc-file-path records.mrc \
--no-summary
Profile Not Found#
If the import profile name isn’t found:
Check exact spelling in Settings > Data Import > Job Profiles
Run without
--import-profile-nameto see available profilesProfile names are case-sensitive
Only profiles with MARC dataType are shown in the interactive list
Batch Timeouts#
For slow networks or large record batches:
folio-data-import marc \
--marc-file-path records.mrc \
--batch-size 5 \
--batch-delay 2.0
Invalid MARC Records#
Records that fail to parse are written to bad_marc_records_TIMESTAMP.mrc. Common issues:
Incorrect directory entries
Encoding issues
Truncated records
These records will require external remediation before they can be loaded.
Environment Variables#
All FOLIO connection parameters can be set via environment variables:
export FOLIO_GATEWAY_URL="https://folio-snapshot-okapi.dev.folio.org"
export FOLIO_TENANT_ID="diku"
export FOLIO_USERNAME="diku_admin"
export FOLIO_PASSWORD="admin"
Then run without repeating connection parameters:
folio-data-import marc --marc-file-path records.mrc
See Also#
DI Log Retriever - Retrieve error logs and failed MARC records from Data Import jobs