
AozoraEpub3-JDK21

AozoraEpub3 - Aozora Bunko to EPUB 3 Converter (JDK 21)

Development Guide

Developer documentation for contributing to AozoraEpub3 or understanding its internal implementation.


Development Environment Setup

Requirements

Clone Repository

git clone https://github.com/Harusame64/AozoraEpub3-JDK21.git
cd AozoraEpub3-JDK21

IDE Setup

VS Code

  1. Install Java Extension Pack
  2. Open folder
  3. Gradle tasks auto-detected

IntelliJ IDEA

  1. File → Open → Select build.gradle
  2. “Import Gradle Project” auto-configures

Eclipse

./gradlew eclipse

Then: Import → Existing Projects


Building and Testing

Basic Build

# Clean build
./gradlew clean build

# Create FAT JAR (with all dependencies)
./gradlew jar

# Create distribution packages (ZIP + TAR.GZ)
./gradlew dist

Important: The distZip task is disabled. Use the dist task to create distribution packages.

Running Tests

# Run all tests
./gradlew test

# Generate test report
./gradlew test --rerun-tasks
# → build/reports/tests/test/index.html

Build Artifacts

Running the Application

# Launch GUI (no arguments)
java -jar build/libs/AozoraEpub3.jar

# CLI usage (convert UTF-8 text to EPUB)
java -jar build/libs/AozoraEpub3.jar -of -d out input.txt

# Vertical writing sample
java -jar build/libs/AozoraEpub3.jar -enc UTF-8 test_data/test_title.txt

# Horizontal writing sample
java -jar build/libs/AozoraEpub3.jar -enc UTF-8 -y test_data/test_yoko.txt

Project Structure

AozoraEpub3/
├── src/                       # Main source code
│   ├── AozoraEpub3.java       # CLI entry point
│   ├── AozoraEpub3Applet.java # GUI entry point
│   └── com/github/hmdev/      # Package root
│       ├── converter/         # Text→EPUB conversion
│       ├── epub/              # EPUB specification
│       ├── io/                # File/archive handling
│       ├── image/             # Image processing
│       ├── config/            # Configuration parsing
│       └── pipeline/          # Conversion pipeline
│
├── test/                      # Test code
│   ├── AozoraEpub3SmokeTest.java
│   ├── IniCssIntegrationTest.java
│   └── com/github/hmdev/      # Package tests
│
├── template/                  # Velocity templates
│   ├── mimetype               # EPUB mimetype file
│   ├── META-INF/
│   │   └── container.xml      # EPUB container definition
│   └── OPS/
│       ├── package.vm         # package.opf generation
│       ├── toc.ncx.vm         # NCX TOC generation
│       └── css/               # CSS templates
│           ├── vertical_text.vm
│           └── horizontal_text.vm
│
├── test_data/                 # Test fixtures
│   ├── test_title.txt         # Title/author tests
│   ├── test_ruby.txt          # Ruby conversion tests
│   ├── test_gaiji.txt         # Gaiji conversion tests
│   └── img/                   # Test images
│
├── presets/                   # Device presets
│   ├── kobo__full.ini         # Kobo maximum size
│   ├── kindle_pw.ini          # Kindle Paperwhite
│   └── reader.ini             # Sony Reader
│
├── chuki_*.txt                # Notation definition files
│   ├── chuki_tag.txt          # Notation → Tag conversion
│   ├── chuki_alt.txt          # Gaiji → Alternative chars
│   ├── chuki_utf.txt          # Gaiji → UTF-8
│   ├── chuki_ivs.txt          # Gaiji → IVS
│   └── chuki_latin.txt        # Latin character conversion
│
├── build.gradle               # Gradle build definition
├── gradlew, gradlew.bat       # Gradle Wrapper
├── README.md                  # User documentation
└── DEVELOPMENT.md             # Original dev documentation

Key Classes

Entry Points

Conversion Pipeline

I/O & Archives

Image Processing

Configuration


Code Architecture

Recent Improvements (v1.2.4+)

Modularization and Class Extraction

Responsibilities were split out of the large AozoraEpub3.java (originally 645 lines) into dedicated classes to improve maintainability:

Extracted Classes:

  1. OutputNamer (com.github.hmdev.io): Filename generation logic (50 lines)
    • Creator/title-based auto-naming
    • Filename sanitization (invalid character removal)
    • Default extension handling
  2. WriterConfigurator (com.github.hmdev.pipeline): Writer configuration aggregation (110 lines)
    • Image parameter configuration
    • TOC nesting configuration
    • Style settings (margins, line height, fonts)
  3. ArchiveTextExtractor (com.github.hmdev.io): Unified archive handling (90 lines)
    • Text extraction from zip/rar/txt
    • Text file counting
    • Cache mechanism integration
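
As a rough illustration of the filename handling listed for OutputNamer, the sketch below builds a name from creator and title, strips characters that are invalid on common file systems, and appends a default extension. The class name OutputNameSketch, the "[creator] title" pattern, the set of removed characters, and the "untitled" fallback are all assumptions made for illustration; they are not taken from the project's implementation.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch; not the project's OutputNamer. */
final class OutputNameSketch {
    // Assumed set: characters rejected by Windows file systems, plus control characters.
    private static final Pattern INVALID = Pattern.compile("[\\\\/:*?\"<>|\\p{Cntrl}]");

    /** Removes invalid characters and trims surrounding whitespace. */
    static String sanitize(String name) {
        return INVALID.matcher(name).replaceAll("").trim();
    }

    /** Builds "[creator] title.ext"; the bracket part is omitted when creator is blank. */
    static String buildName(String creator, String title, String ext) {
        String base = (creator == null || creator.isBlank())
                ? title
                : "[" + creator + "] " + title;
        String clean = sanitize(base);
        if (clean.isEmpty()) {
            clean = "untitled"; // assumed fallback when nothing survives sanitization
        }
        String suffix = ext.startsWith(".") ? ext : "." + ext;
        // Default extension handling: append only when it is not already present.
        return clean.endsWith(suffix) ? clean : clean + suffix;
    }
}
```

Keeping this logic in small static methods makes it easy to unit test without touching the converter.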

Refactoring Results:

Details: notes/refactor-plan.md

Performance Optimization 🚀

Problem: Each input file triggered four separate archive scans:

  1. Text file count
  2. Book information retrieval
  3. Image list loading
  4. Actual conversion

Solution: Archive caching mechanism
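
A minimal sketch of the caching idea, assuming a zip input: the archive is read once, entry contents are kept in memory, and later questions (text file count, image list, file content) are answered from the cache instead of rescanning. The class ArchiveCacheSketch and its method names are hypothetical, rar handling is omitted, and the project's actual cache may be structured differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical sketch: scans a zip stream once and serves later lookups from memory. */
final class ArchiveCacheSketch {
    // Entry name -> entry bytes, in archive order.
    private final Map<String, byte[]> entries = new LinkedHashMap<>();

    /** Single pass over the archive; every non-directory entry is read into memory. */
    static ArchiveCacheSketch scan(InputStream in) throws IOException {
        ArchiveCacheSketch cache = new ArchiveCacheSketch();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry e;
            while ((e = zip.getNextEntry()) != null) {
                if (!e.isDirectory()) {
                    cache.entries.put(e.getName(), zip.readAllBytes());
                }
            }
        }
        return cache;
    }

    private static boolean hasExt(String name, String... exts) {
        String lower = name.toLowerCase(Locale.ROOT);
        for (String ext : exts) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    /** Number of .txt entries, answered without touching the archive again. */
    int textFileCount() {
        int n = 0;
        for (String name : entries.keySet()) {
            if (hasExt(name, ".txt")) {
                n++;
            }
        }
        return n;
    }

    /** Names of image entries (assumed extension list). */
    List<String> imageNames() {
        List<String> names = new ArrayList<>();
        for (String name : entries.keySet()) {
            if (hasExt(name, ".png", ".jpg", ".jpeg", ".gif")) {
                names.add(name);
            }
        }
        return names;
    }

    /** Cached bytes of one entry, or null when the entry does not exist. */
    byte[] content(String name) {
        return entries.get(name);
    }
}
```

The trade-off is memory: holding every entry in memory avoids repeated scans but makes large archives more expensive, so a real cache would likely need a size limit or an explicit release step.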

Optimization Results:

Memory Management:

Details: notes/archive-cache-optimization.md


Template System (Velocity)

Design Principles

Epub3Writer supports VelocityEngine injection to avoid global initialization dependencies.

Constructors

// Recommended: Inject VelocityEngine (testable, customizable)
new Epub3Writer(templatePath, velocityEngine)

// Or legacy style (backward compatible)
new Epub3Writer(templatePath)  // Uses global initialization

Test Usage Example

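import java.io.StringWriter;
import java.util.Properties;

import org.apache.velocity.Template;
import org.apache.velocity.VelocityContext;
import org.apache.velocity.app.VelocityEngine;

// Configure a file resource loader rooted at template/OPS.
// projectRoot is assumed to be a java.nio.file.Path pointing at the repository root.
Properties p = new Properties();
p.setProperty("resource.loaders", "file");
p.setProperty("resource.loader.file.class", 
    "org.apache.velocity.runtime.resource.loader.FileResourceLoader");
p.setProperty("resource.loader.file.path", 
    projectRoot.resolve("template").resolve("OPS").toString());
VelocityEngine ve = new VelocityEngine(p);

// Populate the context; bookInfoObject and sectionList are supplied by the test.
VelocityContext ctx = new VelocityContext();
ctx.put("title", "Sample Title");
ctx.put("bookInfo", bookInfoObject);
ctx.put("sections", sectionList);
// ... other context setup

// Render package.vm into a string for assertions.
Template t = ve.getTemplate("package.vm", "UTF-8");
StringWriter out = new StringWriter();
t.merge(ctx, out);
String opf = out.toString();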

Template Files

| Template | Purpose | Output File |
|---|---|---|
| package.vm | EPUB metadata/manifest | package.opf |
| toc.ncx.vm | Table of contents (NCX) | toc.ncx |
| css/vertical_text.vm | Vertical text CSS | vertical_text.css |
| css/horizontal_text.vm | Horizontal text CSS | horizontal_text.css |
| xhtml/*.vm | Content XHTML | *.xhtml |

INI Values to CSS

INI settings (font_size, line_height, etc.) are placed in the Velocity context, and the templates emit them as CSS custom properties (variables).

Example (vertical_text.vm):

:root {
  --font-size: ${fontSize}%;
  --line-height: ${lineHeight};
}
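
To make the flow concrete, here is a small sketch that reads key=value settings and derives the camelCase names the template refers to (font_size becomes fontSize). The snake_case to camelCase mapping and the class IniContextSketch are assumptions inferred from the example above, not confirmed project behavior. The result is a plain Map whose entries could be copied into a VelocityContext with put.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical sketch: turns INI-style settings into template variable names. */
final class IniContextSketch {
    /** Converts snake_case to camelCase, e.g. "line_height" to "lineHeight". */
    static String toCamelCase(String key) {
        StringBuilder sb = new StringBuilder();
        boolean upperNext = false;
        for (char c : key.toCharArray()) {
            if (c == '_') {
                upperNext = true;
            } else if (upperNext) {
                sb.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Parses key=value lines with java.util.Properties (section headers such as
     * "[style]" are not handled here) and returns values keyed by camelCase name.
     */
    static Map<String, String> contextValues(String iniText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(iniText));
        Map<String, String> ctx = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            ctx.put(toCamelCase(name), props.getProperty(name).trim());
        }
        return ctx;
    }
}
```

With the input "font_size=100" and "line_height=1.8" on separate lines, the map holds fontSize and lineHeight, matching the ${fontSize} and ${lineHeight} references in the CSS template.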

Related Tests:


EPUB 3.3 Compliance

Validation Items (Automated in CI)

mimetype File

package.opf (OPF File)

container.xml

Kindle (iPhone) Support

In package.vm, line 60 (emitted for ImageOnly + Kindle output):

<meta name="primary-writing-mode" content="horizontal-rl"/>

Preservation Test: PackageTemplateKindleMetaTest.java

epubcheck Validation

# Custom Gradle task
./gradlew epubcheck \
  -PepubDir=build/epub_local \
  -PepubcheckJar=build/tools/epubcheck-5.3.0/epubcheck.jar

Get epubcheck:

mkdir -p build/tools
cd build/tools
curl -L -o epubcheck-5.3.0.zip \
  https://github.com/w3c/epubcheck/releases/download/v5.3.0/epubcheck-5.3.0.zip
unzip epubcheck-5.3.0.zip

Aozora Notation Support

Basic Notation (Configuration Files)

Programmatically Processed Notation

- Page left/right centering
- Auto ruby notation conversion ［＃「○」に「△」のルビ］ → ｜○《△》
- Auto emphasis conversion ［＃「○」に×傍点］ → × ruby
- Complex indentation (fold/indent calculation)
- Warichu line break addition
- Page break by 底本：
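
The auto ruby conversion in the list above can be sketched with a regular expression: the annotation names the base text that precedes it, so a backreference finds that text and rewrites the pair into the inline ruby form. The class RubyNoteSketch is hypothetical and handles only the simplest case (base text directly before the annotation, one annotation style); the project's converter is likely more involved. Unicode escapes are used so the source stays ASCII-only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of the annotation-to-inline ruby rewrite. */
final class RubyNoteSketch {
    // Group 1: base text (no corner brackets or full-width square brackets).
    // Then: full-width "[", full-width "#", the quoted base text (backreference),
    // hiragana "ni", the quoted ruby text (group 2), "no", katakana "rubi", full-width "]".
    private static final Pattern RUBY_NOTE = Pattern.compile(
            "([^\u300C\u300D\uFF3B\uFF3D]+)"
            + "\uFF3B\uFF03\u300C\\1\u300D\u306B"
            + "\u300C([^\u300D]+)\u300D"
            + "\u306E\u30EB\u30D3\uFF3D");

    /** Rewrites "base[#...ruby note...]" into full-width bar + base + double angle brackets around ruby. */
    static String convert(String line) {
        return RUBY_NOTE.matcher(line).replaceAll(r ->
                Matcher.quoteReplacement(
                        "\uFF5C" + r.group(1) + "\u300A" + r.group(2) + "\u300B"));
    }
}
```

Text that carries no matching annotation passes through unchanged, so the method can be applied to every line.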

Unsupported Notation

- Corrections and "ママ" (ignored)
- Left ruby
- Inline bottom alignment
- Two-column layout

External Character Handling

Extract and convert gaiji (external characters) from Aozora notation:

※［＃「字名」、U+6DB6］                          → Direct Unicode
※［＃「字名」、第3水準1-85-57］                  → JIS code → UTF-8
※［＃「さんずい＋垂」、UCS6DB6］                → UCS format

Gaiji that have no corresponding character code are replaced with alternative characters via chuki_alt.txt.
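
As an illustration of the two code-bearing forms (U+XXXX and UCSXXXX), the following sketch replaces such a gaiji annotation with the character it names. The class GaijiCodeSketch is hypothetical; JIS-coded annotations need a lookup table and are left untouched here, as is the chuki_alt.txt fallback. Unicode escapes keep the source ASCII-only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: resolves gaiji annotations that carry a Unicode code point. */
final class GaijiCodeSketch {
    // Reference mark (U+203B), full-width "[" and "#", any description up to an
    // ideographic comma (U+3001), then "U+" or "UCS" with 4-6 hex digits, full-width "]".
    private static final Pattern CODE_NOTE = Pattern.compile(
            "\u203B\uFF3B\uFF03[^\uFF3D]*?\u3001(?:U\\+|UCS)([0-9A-Fa-f]{4,6})\uFF3D");

    /** Replaces each matching annotation with the character for its code point. */
    static String resolve(String text) {
        return CODE_NOTE.matcher(text).replaceAll(r -> {
            int codePoint = Integer.parseInt(r.group(1), 16);
            if (!Character.isValidCodePoint(codePoint)) {
                // Out-of-range value: keep the annotation as written.
                return Matcher.quoteReplacement(r.group());
            }
            return Matcher.quoteReplacement(new String(Character.toChars(codePoint)));
        });
    }
}
```

Annotations without a U+ or UCS code, such as the JIS form, do not match the pattern and are returned unchanged for a later stage to handle.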


GitHub Actions CI

Workflows

ci.yml (Build, Test, EPUB Generation, Validation)

Automated processes:

Manual Execution:
GitHub → Actions → CI → Run workflow

Optional inputs:

test.yml (Unit Tests Only)


Contributing Guidelines

Bug Reports & Feature Requests

Provide the following information in GitHub Issues:

For Bugs

For Feature Requests

Pull Requests

  1. Fork repository and create feature branch
  2. Make changes and add tests
  3. Verify all tests pass: ./gradlew test
  4. Create PR (with description and related issue links)

Coding Conventions


Common Issues

Velocity Resource Resolution Failure

Symptom: template/*.vm not found during test execution

Cause: The Gradle worker process runs tests in a different working directory

Solution:
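
One way to make template lookup independent of the working directory, sketched here under the assumption that tests always start somewhere at or below the project root, is to walk up the directory tree until template/OPS is found and pass that absolute path to the Velocity file resource loader. The class TemplateLocatorSketch is hypothetical and not part of the project.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical sketch: finds template/OPS regardless of the current working directory. */
final class TemplateLocatorSketch {
    /**
     * Walks from start toward the filesystem root and returns the first
     * template/OPS directory found, or an empty Optional when none exists.
     */
    static Optional<Path> findTemplateDir(Path start) {
        for (Path dir = start.toAbsolutePath().normalize(); dir != null; dir = dir.getParent()) {
            Path candidate = dir.resolve("template").resolve("OPS");
            if (Files.isDirectory(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

A test could call findTemplateDir(Path.of("")) and use the result as the value of resource.loader.file.path, failing fast with a clear message when the Optional is empty.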

Windows Path Issues

Symptom: Backslashes (\) in Windows paths cause escaping failures

Solution:

Test Instability

Symptom: Passes locally but fails in CI

Causes:

Solutions:


Performance Optimization Tips

Large Text Conversion

Image Processing

EPUB Generation Speed



License

GPL v3 - See README for details